Module Seq

Represent a sequence or mutable sequence, with an alphabet.

Classes
  Seq
A read-only sequence object (essentially a string with an alphabet).
  MutableSeq
An editable sequence object (with an alphabet).
Functions
 
transcribe(dna)
Transcribes a DNA sequence into RNA.
 
back_transcribe(rna)
Back-transcribes an RNA sequence into DNA.
 
_translate_str(sequence, table, stop_symbol='*', to_stop=False, pos_stop='X')
Helper function to translate a nucleotide string (PRIVATE).
 
translate(sequence, table='Standard', stop_symbol='*', to_stop=False)
Translate a nucleotide sequence into amino acids.
 
reverse_complement(sequence)
Returns the reverse complement sequence of a nucleotide string.
 
_test()
Run the Bio.Seq module's doctests.
Variables
  __package__ = 'Bio'
Function Details

transcribe(dna)

Transcribes a DNA sequence into RNA.

If given a string, returns a new string object.

Given a Seq or MutableSeq, returns a new Seq object with an RNA alphabet.

Trying to transcribe a protein or RNA sequence raises an exception.

e.g.
>>> transcribe("ACTGN")
'ACUGN'
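
For plain strings, the behaviour described above amounts to replacing thymine with uracil. A minimal sketch, assuming string input only; transcribe_str is a hypothetical name, not part of Bio.Seq, and it performs no alphabet check:

```python
def transcribe_str(dna):
    """Return an RNA copy of a DNA string (T -> U, case preserved)."""
    # Strings are immutable, so replace() always yields a new object.
    return dna.replace("T", "U").replace("t", "u")


print(transcribe_str("ACTGN"))  # ACUGN
```

Ambiguity codes such as N pass through unchanged, which matches the example above.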

back_transcribe(rna)

Back-transcribes an RNA sequence into DNA.

If given a string, returns a new string object.

Given a Seq or MutableSeq, returns a new Seq object with a DNA alphabet.

Trying to back-transcribe a protein or DNA sequence raises an exception.

e.g.
>>> back_transcribe("ACUGN")
'ACTGN'
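
On plain strings, back-transcription is the inverse of transcription. The sketch below checks that round trip; both helper names are hypothetical stand-ins written for illustration, not the Bio.Seq functions themselves:

```python
def transcribe_str(dna):
    """DNA string -> RNA string (T -> U)."""
    return dna.replace("T", "U").replace("t", "u")


def back_transcribe_str(rna):
    """RNA string -> DNA string (U -> T)."""
    return rna.replace("U", "T").replace("u", "t")


dna = "ACTGN"
rna = transcribe_str(dna)
# Going to RNA and back should reproduce the original DNA string.
assert back_transcribe_str(rna) == dna
print(rna, back_transcribe_str(rna))  # ACUGN ACTGN
```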

_translate_str(sequence, table, stop_symbol='*', to_stop=False, pos_stop='X')

Helper function to translate a nucleotide string (PRIVATE).

sequence    - a string
table       - a CodonTable object (NOT a table name or id number)
stop_symbol - a single character string, what to use for terminators.
to_stop     - boolean, should translation terminate at the first
              in-frame stop codon?  If there is no in-frame stop codon
              then translation continues to the end.
pos_stop    - a single character string for a possible stop codon
              (e.g. TAN or NNN)

Returns a string.

e.g.
>>> from Bio.Data import CodonTable
>>> table = CodonTable.ambiguous_dna_by_id[1]
>>> _translate_str("AAA", table)
'K'
>>> _translate_str("TAR", table)
'*'
>>> _translate_str("TAN", table)
'X'
>>> _translate_str("TAN", table, pos_stop="@")
'@'
>>> _translate_str("TA?", table)
Traceback (most recent call last):
   ...
TranslationError: Codon 'TA?' is invalid
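
One way to reproduce the outcomes in this example is to expand an ambiguous codon into every concrete codon it could stand for and then compare the results. The sketch below is an illustration under stated assumptions, not the helper's actual implementation: translate_codon, IUPAC and STANDARD are hypothetical names, the table is the standard genetic code built by hand, ValueError stands in for TranslationError, and any mix of different amino acids is reported simply as "X".

```python
from itertools import product

# Standard genetic code, listed in NCBI order (bases T, C, A, G).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa
            for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# IUPAC nucleotide codes mapped to the bases they can represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def translate_codon(codon, stop_symbol="*", pos_stop="X"):
    """Translate one (possibly ambiguous) DNA codon to a single letter."""
    if len(codon) != 3:
        raise ValueError("Codon %r is invalid" % codon)
    try:
        options = [IUPAC[base] for base in codon.upper()]
    except KeyError:
        # Characters such as '?' or '-' are not nucleotide codes.
        raise ValueError("Codon %r is invalid" % codon)
    # Every concrete codon the ambiguous codon could stand for.
    results = {STANDARD["".join(c)] for c in product(*options)}
    if results == {"*"}:
        return stop_symbol      # definitely a stop codon
    if "*" in results:
        return pos_stop         # possibly a stop codon
    if len(results) == 1:
        return results.pop()    # unambiguous amino acid
    return "X"                  # several different amino acids


for codon in ("AAA", "TAR", "TAN"):
    print(codon, translate_codon(codon))
```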

translate(sequence, table='Standard', stop_symbol='*', to_stop=False)

Translate a nucleotide sequence into amino acids.

If given a string, returns a new string object.
Given a Seq or MutableSeq, returns a Seq object with a protein
alphabet.

table - Which codon table to use?  This can be either a name (string) or
        an NCBI identifier (integer).  Defaults to the "Standard" table.
stop_symbol - Single character string, what to use for terminators.
        This defaults to the asterisk, "*".
to_stop - Boolean, defaults to False meaning do a full translation
        continuing on past any stop codons (translated as the
        specified stop_symbol).  If True, translation is terminated
        at the first in-frame stop codon (and the stop_symbol is
        not appended to the returned protein sequence).

A simple string example using the default (standard) genetic code:

>>> coding_dna = "GTGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
>>> translate(coding_dna)
'VAIVMGR*KGAR*'
>>> translate(coding_dna, stop_symbol="@")
'VAIVMGR@KGAR@'
>>> translate(coding_dna, to_stop=True)
'VAIVMGR'
 
Now using NCBI table 2, where TGA is not a stop codon:

>>> translate(coding_dna, table=2)
'VAIVMGRWKGAR*'
>>> translate(coding_dna, table=2, to_stop=True)
'VAIVMGRWKGAR'

Note that if the sequence has no in-frame stop codon, then the to_stop
argument has no effect:

>>> coding_dna2 = "GTGGCCATTGTAATGGGCCGC"
>>> translate(coding_dna2)
'VAIVMGR'
>>> translate(coding_dna2, to_stop=True)
'VAIVMGR'

NOTE - Ambiguous codons like "TAN" or "NNN" could be an amino acid
or a stop codon.  These are translated as "X".  Any invalid codon
(e.g. "TA?" or "T-A") will raise a TranslationError.

NOTE - Does NOT support gapped sequences.

It will however translate either DNA or RNA.
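
The stop_symbol and to_stop behaviour described above can be sketched in pure Python. This is an illustration only, assuming unambiguous input: translate_sketch, STANDARD and VERT_MITO are hypothetical names, the tables are written out by hand rather than looked up by name or NCBI identifier, and trailing bases that do not fill a codon are ignored.

```python
from itertools import product

# Standard genetic code (NCBI table 1), listed in NCBI order (T, C, A, G).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa
            for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Vertebrate mitochondrial code (NCBI table 2) as differences from table 1.
VERT_MITO = dict(STANDARD, TGA="W", ATA="M", AGA="*", AGG="*")


def translate_sketch(sequence, table=STANDARD, stop_symbol="*", to_stop=False):
    """Translate an unambiguous DNA or RNA string codon by codon."""
    seq = sequence.upper().replace("U", "T")  # treat RNA like DNA
    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        amino_acid = table[seq[i:i + 3]]  # KeyError on ambiguous codons
        if amino_acid == "*":
            if to_stop:
                break                     # stop symbol is not appended
            amino_acid = stop_symbol
        protein.append(amino_acid)
    return "".join(protein)


coding_dna = "GTGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
print(translate_sketch(coding_dna))                   # VAIVMGR*KGAR*
print(translate_sketch(coding_dna, to_stop=True))     # VAIVMGR
print(translate_sketch(coding_dna, table=VERT_MITO))  # VAIVMGRWKGAR*
```

Breaking out of the loop before appending is what keeps the stop symbol out of the result when to_stop is True.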

reverse_complement(sequence)

Returns the reverse complement sequence of a nucleotide string.

If given a string, returns a new string object. Given a Seq or a MutableSeq, returns a new Seq object with the same alphabet.

Supports unambiguous and ambiguous nucleotide sequences.

e.g.
>>> reverse_complement("ACTGN")
'NCAGT'
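
For DNA strings, a reverse complement that also handles the IUPAC ambiguity codes can be sketched with a translation table. This is a minimal illustration, assuming DNA input only (no U); reverse_complement_str and COMPLEMENT are hypothetical names, not part of Bio.Seq:

```python
# Each IUPAC DNA code paired with its complement (S, W and N map to
# themselves); lower case is handled the same way.
COMPLEMENT = str.maketrans(
    "ACGTRYKMBVDHSWNacgtrykmbvdhswn",
    "TGCAYRMKVBHDSWNtgcayrmkvbhdswn",
)


def reverse_complement_str(dna):
    """Complement every base, then reverse the string."""
    return dna.translate(COMPLEMENT)[::-1]


print(reverse_complement_str("ACTGN"))  # NCAGT
```

Applying the function twice returns the original string, which is a quick sanity check on the complement table.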