Package Bio :: Package Clustalw :: Class MultipleAlignCL

Class MultipleAlignCL

Represent a clustalw multiple alignment command line.

This is meant to make it easy to code the command line options you want to submit to clustalw.

Clustalw has many options and modes, but this class is set up to represent only a clustalw multiple alignment.

Warning: I don't use all of these options personally, so if you find one to be broken for any reason, please let us know!

Instance Methods
 
__init__(self, sequence_file, command='clustalw')
Initialize some general parameters that can be set as attributes.
 
__str__(self)
Write out the command line as a string.
 
set_output(self, output_file, output_type=None, output_order=None, change_case=None, add_seqnos=None)
Set the output parameters for the command line.
 
set_guide_tree(self, tree_file)
Provide a file to use as the guide tree for alignment.
 
set_new_guide_tree(self, tree_file)
Set the name of the guide tree file generated in the alignment.
 
set_protein_matrix(self, protein_matrix)
Set the type of protein matrix to use.
 
set_dna_matrix(self, dna_matrix)
Set the type of DNA matrix to use.
 
set_type(self, residue_type)
Set the type of residues within the file.
Class Variables
  OUTPUT_TYPES = ['GCG', 'GDE', 'PHYLIP', 'PIR', 'NEXUS', 'FASTA']
  OUTPUT_ORDER = ['INPUT', 'ALIGNED']
  OUTPUT_CASE = ['LOWER', 'UPPER']
  OUTPUT_SEQNOS = ['OFF', 'ON']
  RESIDUE_TYPES = ['PROTEIN', 'DNA']
  PROTEIN_MATRIX = ['BLOSUM', 'PAM', 'GONNET', 'ID']
  DNA_MATRIX = ['IUB', 'CLUSTALW']
Method Details

__init__(self, sequence_file, command='clustalw')
(Constructor)

Initialize some general parameters that can be set as attributes.

Arguments:
o sequence_file - The file to read the sequences for alignment from.
o command - The command used to run clustalw. This defaults to just 'clustalw' (i.e. it assumes you have it on your path somewhere).

General attributes that can be set:
o is_quick - if set as 1, will use a fast algorithm to create the alignment guide tree.
o allow_negative - allow negative values in the alignment matrix.

Multiple alignment attributes that can be set:
o gap_open_pen - Gap opening penalty.
o gap_ext_pen - Gap extension penalty.
o is_no_end_pen - A flag as to whether or not there should be a gap separation penalty for the ends.
o gap_sep_range - The gap separation penalty range.
o is_no_pgap - A flag to turn off residue specific gaps.
o is_no_hgap - A flag to turn off hydrophilic gaps.
o h_gap_residues - A list of residues to count as hydrophilic.
o max_div - A percent identity to use for delay (? - I don't understand this!).
o trans_weight - The weight to use for transitions.
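To illustrate how attributes like these could end up on the command line, here is a minimal stand-in sketch. It is not the Biopython source: the CommandLineSketch class and the input file name are hypothetical, only a few of the attributes are covered, and the flag spellings (-infile=, -quicktree, -gapopen=, -gapext=, -nopgap) are assumed from ClustalW's command-line help, so check them against your clustalw version.

```python
class CommandLineSketch:
    """Hypothetical stand-in for MultipleAlignCL; not the Biopython code."""

    def __init__(self, sequence_file, command="clustalw"):
        self.sequence_file = sequence_file
        self.command = command
        # Attributes start unset; only those the caller sets are emitted.
        self.is_quick = None
        self.gap_open_pen = None
        self.gap_ext_pen = None
        self.is_no_pgap = None

    def __str__(self):
        # Flag names below are assumptions taken from ClustalW's help text.
        parts = [self.command, "-infile=" + self.sequence_file]
        if self.is_quick == 1:
            parts.append("-quicktree")
        if self.gap_open_pen is not None:
            parts.append("-gapopen=%s" % self.gap_open_pen)
        if self.gap_ext_pen is not None:
            parts.append("-gapext=%s" % self.gap_ext_pen)
        if self.is_no_pgap == 1:
            parts.append("-nopgap")
        return " ".join(parts)


cline = CommandLineSketch("opuntia.fasta")  # hypothetical input file name
cline.is_quick = 1
cline.gap_open_pen = 10.0
print(cline)
```

Leaving unset attributes as None keeps the generated command short and lets clustalw fall back to its own defaults for everything the caller did not touch.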

set_guide_tree(self, tree_file)

Provide a file to use as the guide tree for alignment.

Raises:
o IOError - If the tree_file doesn't exist.
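The existence check described above can be sketched as follows. This is an illustration under assumptions, not the method's actual body: check_guide_tree is a hypothetical helper, and the file name in the usage lines is made up.

```python
import os


def check_guide_tree(tree_file):
    """Return tree_file if it exists, otherwise raise IOError."""
    if not os.path.exists(tree_file):
        raise IOError("Guide tree file not found: %s" % tree_file)
    return tree_file


# Callers can catch the error before building a command line.
try:
    check_guide_tree("no_such_tree.dnd")  # hypothetical file name
except IOError as err:
    print("Cannot use guide tree:", err)
```

Failing at the time the file name is set, rather than when clustalw runs, reports the problem closer to the line that caused it.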

set_protein_matrix(self, protein_matrix)

Set the type of protein matrix to use.

Protein matrix can be either one of the defined types (blosum, pam, gonnet or id) or a file with your own defined matrix.

set_dna_matrix(self, dna_matrix)

Set the type of DNA matrix to use.

The dna_matrix can either be one of the defined types (iub or clustalw) or a file with the matrix to use.
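Both matrix setters accept either a defined type name or a file. One way that rule could be expressed is sketched below. The resolve_matrix helper is hypothetical, and raising ValueError for an unrecognised value is an assumption, since this page does not say what happens in that case. The two lists are copied from the class variables above.

```python
import os

PROTEIN_MATRIX = ["BLOSUM", "PAM", "GONNET", "ID"]
DNA_MATRIX = ["IUB", "CLUSTALW"]


def resolve_matrix(value, known_types):
    """Return a defined matrix name (upper-cased) or an existing file path."""
    if value.upper() in known_types:
        # Defined type names are matched case-insensitively.
        return value.upper()
    if os.path.exists(value):
        # Otherwise treat the value as a user-supplied matrix file.
        return value
    raise ValueError("Not a known matrix type or an existing file: %s" % value)


print(resolve_matrix("blosum", PROTEIN_MATRIX))
print(resolve_matrix("iub", DNA_MATRIX))
```

In this sketch a defined type name takes priority over a file of the same name, which is a design choice made here for illustration only.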

set_type(self, residue_type)

Set the type of residues within the file.

Clustal tries to guess whether the sequences are protein or DNA based on the number of GATCs, but the guess can be wrong if you are working with an unusual protein or DNA sequence, so this allows you to set the type explicitly.
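To see why a letter-counting guess can go wrong, consider the toy heuristic below. It is only a sketch and not Clustal's code: the guess_residue_type function is hypothetical, and both the 0.85 threshold and the set of letters counted as nucleotides are assumptions chosen for illustration.

```python
def guess_residue_type(sequence, threshold=0.85):
    """Toy guess: call it DNA if most letters are nucleotide codes."""
    letters = [c for c in sequence.upper() if c.isalpha()]
    if not letters:
        raise ValueError("sequence contains no letters")
    # Assumed nucleotide alphabet; real tools may count different letters.
    nucleotide_like = sum(1 for c in letters if c in "GATCUN")
    return "DNA" if nucleotide_like / len(letters) >= threshold else "PROTEIN"


print(guess_residue_type("GATTACAGATTACA"))
print(guess_residue_type("MKVLAAGIVGLL"))
# A peptide made only of Gly, Ala, Thr and Cys looks like DNA to this test.
print(guess_residue_type("GACTGCAT"))
```

A sequence like the last one is exactly the case where passing one of the RESIDUE_TYPES values to set_type avoids relying on the guess.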