
Module Bio.expressions.genbank

Martel based parser to read GenBank formatted files.

This is a huge regular expression for GenBank, built using the 'regular expressions on steroids' capabilities of Martel.

Documentation for GenBank format that I found:

o GenBank/EMBL feature tables are described at: http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html

o There are also descriptions of different GenBank lines at: http://www.ibc.wustl.edu/standards/gbrel.txt

Function Summary
  define_block(identifier, block_tag, block_data, std_block_tag, std_tag)
Define a Martel grouping which can parse a block of text.

Variable Summary
Group accession = <Martel.Expression.Group instance at 0xb7020...
Group accession_block = <Martel.Expression.Group instance at 0...
Group authors_block = <Martel.Expression.Group instance at 0xb...
Group base_count = <Martel.Expression.Group instance at 0xb703...
Group base_count_line = <Martel.Expression.Group instance at 0...
Group base_number = <Martel.Expression.Group instance at 0xb70...
Str big_indent_space = <Martel.Expression.Str instance at 0x...
MaxRepeat blank_space = <Martel.Expression.MaxRepeat instance at 0...
Group comment_block = <Martel.Expression.Group instance at 0xb...
Group consrtm_block = <Martel.Expression.Group instance at 0xb...
Group contig_block = <Martel.Expression.Group instance at 0xb7...
Group contig_location = <Martel.Expression.Group instance at 0...
Group data_file_division = <Martel.Expression.Group instance a...
Group date = <Martel.Expression.Group instance at 0xb701cf0c>
Group db_source_block = <Martel.Expression.Group instance at 0...
Group definition_block = <Martel.Expression.Group instance at ...
list divisions = [<Martel.Expression.Str instance at 0xb701ce...
Group feature = <Martel.Expression.Group instance at 0xb7034fc...
Group feature_block = <Martel.Expression.Group instance at 0xb...
Group feature_key = <Martel.Expression.Group instance at 0xb70...
int FEATURE_KEY_INDENT = 5                                                                     
Group feature_key_line = <Martel.Expression.Group instance at ...
int FEATURE_QUALIFIER_INDENT = 21                                                                    
Group features_line = <Martel.Expression.Group instance at 0xb...
ParseRecords format = <Martel.Expression.ParseRecords instance at 0xb...
Group gi = <Martel.Expression.Group instance at 0xb7020c2c>
Seq header = <Martel.Expression.Seq instance at 0xb703ac8c>
int INDENT = 12                                                                    
Group journal_block = <Martel.Expression.Group instance at 0xb...
Group keywords_block = <Martel.Expression.Group instance at 0x...
Group location = <Martel.Expression.Group instance at 0xb70347...
Group locus = <Martel.Expression.Group instance at 0xb701ca2c>
Group locus_line = <Martel.Expression.Group instance at 0xb702...
Group medline_line = <Martel.Expression.Group instance at 0xb7...
HeaderFooter ncbi_format = <Martel.Expression.HeaderFooter instance a...
Group nid = <Martel.Expression.Group instance at 0xb702096c>
Group nid_line = <Martel.Expression.Group instance at 0xb70209...
Group organism = <Martel.Expression.Group instance at 0xb70279...
Group organism_block = <Martel.Expression.Group instance at 0x...
Group origin_line = <Martel.Expression.Group instance at 0xb70...
Group pid = <Martel.Expression.Group instance at 0xb7020a8c>
Group pid_line = <Martel.Expression.Group instance at 0xb7020b...
Group primary = <Martel.Expression.Group instance at 0xb703462...
Group primary_line = <Martel.Expression.Group instance at 0xb7...
Group primary_ref_line = <Martel.Expression.Group instance at ...
Group pubmed_line = <Martel.Expression.Group instance at 0xb70...
Group qualifier = <Martel.Expression.Group instance at 0xb7034...
Alt qualifier_space = <Martel.Expression.Alt instance at 0xb...
Str quote = <Martel.Expression.Str instance at 0xb7034a6c>
Group quoted_chars = <Martel.Expression.Group instance at 0xb7...
Seq quoted_string = <Martel.Expression.Seq instance at 0xb70...
Group record = <Martel.Expression.Group instance at 0xb703ac2c...
Group record_end = <Martel.Expression.Group instance at 0xb703...
Group reference = <Martel.Expression.Group instance at 0xb702e...
Group reference_bases = <Martel.Expression.Group instance at 0...
Group reference_line = <Martel.Expression.Group instance at 0x...
Group reference_num = <Martel.Expression.Group instance at 0xb...
Group remark_block = <Martel.Expression.Group instance at 0xb7...
list residue_prefixes = [<Martel.Expression.Str instance at 0...
Group residue_type = <Martel.Expression.Group instance at 0xb7...
list residue_types = [<Martel.Expression.Str instance at 0xb7...
Group segment = <Martel.Expression.Group instance at 0xb70273a...
Group segment_line = <Martel.Expression.Group instance at 0xb7...
Group sequence = <Martel.Expression.Group instance at 0xb703a3...
Group sequence_entry = <Martel.Expression.Group instance at 0x...
Group sequence_line = <Martel.Expression.Group instance at 0xb...
Group sequence_plus_spaces = <Martel.Expression.Group instance...
Group size = <Martel.Expression.Group instance at 0xb701caec>
Str small_indent_space = <Martel.Expression.Str instance at ...
Group source_block = <Martel.Expression.Group instance at 0xb7...
Group taxonomy = <Martel.Expression.Group instance at 0xb70277...
Group title_block = <Martel.Expression.Group instance at 0xb70...
Seq unquoted_string = <Martel.Expression.Seq instance at 0xb...
list valid_divisions = ['PRI', 'ROD', 'MAM', 'VRT', 'INV', 'P...
list valid_residue_prefixes = ['ss-', 'ds-', 'ms-']
list valid_residue_types = ['DNA', 'RNA', 'mRNA', 'tRNA', 'rR...
Group version = <Martel.Expression.Group instance at 0xb7020b6...
Group version_line = <Martel.Expression.Group instance at 0xb7...

Function Details

define_block(identifier, block_tag, block_data, std_block_tag=None, std_tag=None)

Define a Martel grouping which can parse a block of text.

Many of the GenBank lines we'll want to process are grouped into a block like:

IDENTIFIER Blah blah blah

Where "Blah blah blah" can wrap over multiple lines. This function makes it easy to define these blocks consistently.

Arguments:

o identifier - The identifier that begins the block (like DEFINITION).

o block_tag - A callback tag for the entire block.

o block_data - A callback tag for the data in the block (i.e. the stuff you are interested in).

o std_block_tag - A Bio.Std Martel tag used to register the entire block as being a "standard" type of information.

o std_tag - A Bio.Std Martel tag used to register just the information in the block as being "standard".
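
As a rough illustration of the block layout this function targets, the sketch below uses Python's standard re module rather than Martel, so it is not how define_block itself is implemented. The helper name parse_block and the sample text are hypothetical. It assumes the identifier starts in the first column and that wrapped lines are indented by 12 spaces, matching the module's INDENT constant.

```python
import re

INDENT = 12  # same value as the module-level INDENT constant


def parse_block(identifier, text):
    """Return the data of the first `identifier` block in `text`, or None.

    Hypothetical helper, not part of Bio.expressions.genbank: it assumes
    the identifier starts in column 1 and that wrapped lines are indented
    by INDENT spaces.
    """
    pattern = re.compile(
        r"^%s[ ]+(?P<first>.*)\n"        # IDENTIFIER followed by its data
        r"(?P<rest>(?:[ ]{%d}.*\n)*)"    # zero or more continuation lines
        % (re.escape(identifier), INDENT),
        re.MULTILINE,
    )
    match = pattern.search(text)
    if match is None:
        return None
    # Join the first line and every continuation line into one string.
    lines = [match.group("first")]
    lines.extend(match.group("rest").splitlines())
    return " ".join(part.strip() for part in lines)


sample = (
    "DEFINITION  Blah blah blah\n"
    "            blah blah.\n"
    "ACCESSION   X00000\n"
)
print(parse_block("DEFINITION", sample))
```

With this sample, the DEFINITION block comes back as a single joined string and the ACCESSION line is left for a separate call, which mirrors the split between a tag for the whole block and a tag for the data inside it.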

Variable Details

accession

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb702078c>                       

accession_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb70208ec>                       

authors_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb702e02c>                       

base_count

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb703a0cc>                       

base_count_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb703a14c>                       

base_number

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb703a36c>                       

big_indent_space

Type:
Str
Value:
<Martel.Expression.Str instance at 0xb701c94c>                         

blank_space

Type:
MaxRepeat
Value:
<Martel.Expression.MaxRepeat instance at 0xb701c8ec>                   

comment_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb70341ac>                       

consrtm_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb702e2cc>                       

contig_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb703a94c>                       

contig_location

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb703a70c>                       

data_file_division

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb70201ec>                       

date

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb701cf0c>                       

db_source_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb702708c>                       

definition_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb702072c>                       

divisions

Type:
list
Value:
[<Martel.Expression.Str instance at 0xb701ceec>,
 <Martel.Expression.Str instance at 0xb701cf8c>,
 <Martel.Expression.Str instance at 0xb701cfac>,
 <Martel.Expression.Str instance at 0xb701cfcc>,
 <Martel.Expression.Str instance at 0xb701cfec>,
 <Martel.Expression.Str instance at 0xb702002c>,
 <Martel.Expression.Str instance at 0xb702004c>,
 <Martel.Expression.Str instance at 0xb702006c>,
...                                                                    

feature

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb7034fcc>                       

feature_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb703a04c>                       

feature_key

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb703476c>                       

FEATURE_KEY_INDENT

Type:
int
Value:
5                                                                     

feature_key_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb7034a0c>                       

FEATURE_QUALIFIER_INDENT

Type:
int
Value:
21                                                                    
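
To make the two feature-table indents concrete, here is a small sketch in plain Python (not Martel). It assumes, as in the GenBank flat-file layout, that a feature key line starts with FEATURE_KEY_INDENT spaces and that a qualifier line starts with FEATURE_QUALIFIER_INDENT spaces. The function name classify_feature_line and the sample lines are made up for illustration.

```python
FEATURE_KEY_INDENT = 5         # values copied from this module
FEATURE_QUALIFIER_INDENT = 21


def classify_feature_line(line):
    """Classify a feature-table line by its leading indentation.

    Hypothetical helper: returns ("key", key, location),
    ("qualifier", text) or ("other", line).
    """
    stripped = line.lstrip(" ")
    indent = len(line) - len(stripped)
    if indent == FEATURE_KEY_INDENT and stripped:
        # The key is padded with spaces up to the qualifier column.
        key = line[FEATURE_KEY_INDENT:FEATURE_QUALIFIER_INDENT].strip()
        location = line[FEATURE_QUALIFIER_INDENT:].strip()
        return ("key", key, location)
    if indent == FEATURE_QUALIFIER_INDENT and stripped:
        return ("qualifier", stripped.rstrip())
    return ("other", line)


# Sample lines are built programmatically so the column widths are exact.
key_width = FEATURE_QUALIFIER_INDENT - FEATURE_KEY_INDENT
lines = [
    " " * FEATURE_KEY_INDENT + "CDS".ljust(key_width) + "1..206",
    " " * FEATURE_QUALIFIER_INDENT + '/gene="example"',
    "ORIGIN",
]
for line in lines:
    print(classify_feature_line(line))
```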

features_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb70346cc>                       

format

Type:
ParseRecords
Value:
<Martel.Expression.ParseRecords instance at 0xb703ad4c>                

gi

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb7020c2c>                       

header

Type:
Seq
Value:
<Martel.Expression.Seq instance at 0xb703ac8c>                         

INDENT

Type:
int
Value:
12                                                                    

journal_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb702e80c>                       

keywords_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb702732c>                       

location

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb703474c>                       

locus

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb701ca2c>                       

locus_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb702024c>                       

medline_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb702e84c>                       

ncbi_format

Type:
HeaderFooter
Value:
<Martel.Expression.HeaderFooter instance at 0xb704002c>                

nid

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb702096c>                       

nid_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb70209ec>                       

organism

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb702794c>                       

organism_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb7027b0c>                       

origin_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb703a2ec>                       

pid

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb7020a8c>                       

pid_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb7020b0c>                       

primary

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb703462c>                       

primary_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb70342ac>                       

primary_ref_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb70344ec>                       

pubmed_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb702e9ec>                       

qualifier

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb7034e6c>                       

qualifier_space

Type:
Alt
Value:
<Martel.Expression.Alt instance at 0xb701c9ac>                         

quote

Type:
Str
Value:
<Martel.Expression.Str instance at 0xb7034a6c>                         

quoted_chars

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb7034acc>                       

quoted_string

Type:
Seq
Value:
<Martel.Expression.Seq instance at 0xb7034c8c>                         

record

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb703ac2c>                       

record_end

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb703aa2c>                       

reference

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb702eeac>                       

reference_bases

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb7027c2c>                       

reference_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb7027d2c>                       

reference_num

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb7027b8c>                       

remark_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb702edec>                       

residue_prefixes

Type:
list
Value:
[<Martel.Expression.Str instance at 0xb701cbac>,
 <Martel.Expression.Str instance at 0xb701cbcc>,
 <Martel.Expression.Str instance at 0xb701cbec>]                       

residue_type

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb701cdac>                       

residue_types

Type:
list
Value:
[<Martel.Expression.Str instance at 0xb701cc0c>,
 <Martel.Expression.Str instance at 0xb701cc2c>,
 <Martel.Expression.Str instance at 0xb701cc4c>,
 <Martel.Expression.Str instance at 0xb701cc6c>,
 <Martel.Expression.Str instance at 0xb701cc8c>,
 <Martel.Expression.Str instance at 0xb701ccac>,
 <Martel.Expression.Str instance at 0xb701cccc>,
 <Martel.Expression.Str instance at 0xb701ccec>,
...                                                                    

segment

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb70273ac>                       

segment_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb70275ec>                       

sequence

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb703a32c>                       

sequence_entry

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb703a6ac>                       

sequence_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb703a5cc>                       

sequence_plus_spaces

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb703a54c>                       

size

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb701caec>                       

small_indent_space

Type:
Str
Value:
<Martel.Expression.Str instance at 0xb701c84c>                         

source_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb70278ec>                       

taxonomy

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb702770c>                       

title_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb702e56c>                       

unquoted_string

Type:
Seq
Value:
<Martel.Expression.Seq instance at 0xb7034d6c>                         

valid_divisions

Type:
list
Value:
['PRI', 'ROD', 'MAM', 'VRT', 'INV', 'PLN', 'BCT', 'RNA', 'VRL']        

valid_residue_prefixes

Type:
list
Value:
['ss-', 'ds-', 'ms-']                                                  

valid_residue_types

Type:
list
Value:
['DNA', 'RNA', 'mRNA', 'tRNA', 'rRNA', 'uRNA', 'scRNA', 'snRNA', 'snoRNA']
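
The three valid_* lists hold plain strings, which makes them easy to experiment with outside Martel. The sketch below shows one way such lists might be turned into alternations; it uses the standard re module, and the names alternation, residue_pattern, and division_pattern are hypothetical rather than taken from this module.

```python
import re

# Values copied from the module's valid_* variables.
valid_divisions = ['PRI', 'ROD', 'MAM', 'VRT', 'INV', 'PLN', 'BCT', 'RNA', 'VRL']
valid_residue_prefixes = ['ss-', 'ds-', 'ms-']
valid_residue_types = ['DNA', 'RNA', 'mRNA', 'tRNA', 'rRNA', 'uRNA',
                       'scRNA', 'snRNA', 'snoRNA']


def alternation(words):
    """Build a non-capturing regex group matching any one of `words`."""
    # Try longer words first so a shorter word never shadows a longer one.
    ordered = sorted(words, key=len, reverse=True)
    return "(?:%s)" % "|".join(re.escape(word) for word in ordered)


# An optional strandedness prefix followed by a residue type, e.g. "ss-RNA".
residue_pattern = re.compile(
    r"(?P<prefix>%s)?(?P<type>%s)$"
    % (alternation(valid_residue_prefixes), alternation(valid_residue_types))
)
division_pattern = re.compile(r"%s$" % alternation(valid_divisions))

match = residue_pattern.match("ss-RNA")
print(match.group("prefix"), match.group("type"))
print(residue_pattern.match("mRNA").group("type"))
print(division_pattern.match("XYZ"))
```

A string outside the lists, such as the made-up division "XYZ", simply fails to match, which is the behaviour one would want when validating the corresponding LOCUS line fields.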

version

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb7020b6c>                       

version_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb7020d8c>                       

Generated by Epydoc 2.1 on Thu Mar 31 20:15:51 2005 http://epydoc.sf.net