These documents are for the HEAD of the CVS repository on July 19, 2007.

Modware::Feature::MRNA


Summary
   Modware::Feature::MRNA - Modware representation of an mRNA
Package variables
No package variables defined.
Included modules
Bio::SeqFeature::Gene::Exon
Bio::SeqFeature::Gene::Transcript
Modware::Feature::TRANSCRIPT
Modware::Protein_info
Inherit
Modware::Feature::TRANSCRIPT
Synopsis
  # NEVER INSTANTIATE THIS OBJECT DIRECTLY, USE Modware::Feature 

#USE CASE : print the cds stored in the database as a fasta file
my $feature = new Modware::Feature( -primary_id => 'DDB0233595' );
print $feature->sequence( -type => 'cds', -format => 'fasta' );

#USE CASE : print the translated cds
my $feature = new Modware::Feature( -primary_id => 'DDB0233595' );
print $feature->sequence( -type => 'protein', -format => 'fasta' );

#USE CASE: shift feature up 200 bases
my $feature = new Modware::Feature( -primary_id => 'DDB0233595' );
$feature->shift_feature( 200 );
$feature->update();

#USE CASE: Add a description, dbxref, and an exon
use Modware::Feature;
my $transcript = new Modware::Feature( -primary_id => 'DDB0233595' );

$transcript->description( 'Gene model derived from AU12345' );
$transcript->add_external_id( -source => 'GenBank Accession Number',
                              -id     => 'AU12345' );

my $bioperl = $transcript->bioperl();
# here, we are manipulating a Bio::SeqFeature::Gene::Transcript object

# shift the last exon back a little bit (to lose stop codon)
[$bioperl->exons()]->[2]->start( 281050 );

# create a new exon and add it to the feature
my $exon = Bio::SeqFeature::Gene::Exon->new( -start  => 280921,
                                             -end    => 280959,
                                             -strand => -1 );
$exon->is_coding(1);
$bioperl->add_exon($exon);

# update writes everything to the database
$transcript->update();
Description
  The features here have 3 sequences: the cds, the protein (translation of the cds), and the genomic sequence.
Currently, features store any available sequences in the display_seq table. This may change, but
for now it means that there are two ways to get the cds. One is to read it from the display_seq table;
the other is to splice the sequence out of the chromosome that this feature is associated
with.

If you wish to retrieve a particular format, use the sequence method.
It takes two named arguments:
-type : one of the sequence types listed below
-format : any format that Bio::SeqIO can write, OR leave this argument
out for a plain string of DNA or AAs.

Available display_seq_types for mRNA features:
'cds'
'genomic'
'protein'

genomic returns the unspliced cds plus 1000 bases upstream and 1000 bases downstream. When fewer than
1000 bases are available, it returns the maximum number of bases available. The Fasta header informs you
of how many up/downstream bases are reported.
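
The clamping of the flanking region can be illustrated with a small sketch. This is not the Modware implementation; the helper name flanked_window, its arguments, and the returned keys are assumptions made for illustration, and coordinates are taken to be 1-based and inclusive.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Modware): extend a feature by up to
# $flank bases on each side, clamped to the chromosome bounds, and report
# how many flanking bases are actually available on each side.
sub flanked_window {
    my ( $start, $end, $chr_length, $flank ) = @_;
    $flank = 1000 unless defined $flank;

    my $win_start = $start - $flank;
    $win_start = 1 if $win_start < 1;

    my $win_end = $end + $flank;
    $win_end = $chr_length if $win_end > $chr_length;

    return {
        start       => $win_start,
        end         => $win_end,
        left_flank  => $start - $win_start,
        right_flank => $win_end - $end,
    };
}

# A feature at 401..2500 on a 3000 bp chromosome has only 400 bases to its
# left and 500 to its right.
my $w = flanked_window( 401, 2500, 3000 );
printf "window %d..%d (%d left, %d right)\n",
    @{$w}{qw(start end left_flank right_flank)};
# prints "window 1..3000 (400 left, 500 right)"
```

The two flank counts correspond to the numbers that the Fasta header is described as reporting.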

The bioperl object is a Bio::SeqFeature::Gene::Transcript. This is a SeqFeature subclass, meaning
that it is modeled as a set of coordinates on a Seq object. The Seq object that this feature
is 'on' is a chromosome. The transcript consists of exons, each of which has its own location. The cds
method splices out the cds based on these exons.
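
As a rough illustration of what splicing a cds out of a chromosome involves, here is a self-contained sketch that does not use BioPerl. The function names splice_cds and revcom, the hash-based exon records, and the toy chromosome string are all assumptions; the real work is done by the Transcript object's cds method.

```perl
use strict;
use warnings;

# Reverse-complement a DNA string.
sub revcom {
    my ($dna) = @_;
    ( my $rc = reverse $dna ) =~ tr/ACGTacgt/TGCAtgca/;
    return $rc;
}

# Hypothetical helper: join exons (1-based, inclusive coordinates) cut out
# of a chromosome string. Exons are joined in ascending coordinate order;
# for a minus-strand feature the joined sequence is reverse-complemented.
sub splice_cds {
    my ( $chromosome, $strand, @exons ) = @_;
    my $cds = join '',
        map  { substr( $chromosome, $_->{start} - 1, $_->{end} - $_->{start} + 1 ) }
        sort { $a->{start} <=> $b->{start} } @exons;
    return $strand < 0 ? revcom($cds) : $cds;
}

my $chromosome = 'AAATGCCCGGGTTTTAAGGG';
my @exons = ( { start => 3, end => 8 }, { start => 12, end => 17 } );

print splice_cds( $chromosome,  1, @exons ), "\n";   # prints "ATGCCCTTTTAA"
print splice_cds( $chromosome, -1, @exons ), "\n";   # prints "TTAAAAGGGCAT"
```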
Methods
_get_bioperl
_get_cached_sequences
_get_protein_info
_init
_update_cached_sequence()
calculate_cds_seq
calculate_protein_seq
insert
is_partial
phase
protein_info
shift_feature
update

Methods description

_get_bioperl
 Title    : _get_bioperl
Note : creates a bioperl object representing this CDS (Bio::SeqFeature::Gene::Transcript)
Usage : called internally by lazy evaluated 'bioperl' method
Function : creates a bioperl object with a location on the chromosome's bioperl object.
Returns : nothing
Args : none
_get_cached_sequences
 Title    : _get_cached_sequences
Usage : $feature->_get_cached_sequences();
Function : gets the cached_sequences object for this feature (called from cached_sequences)
Returns : nothing
Args : none
_get_protein_info
 Title    : _get_protein_info
Usage : $feature->_get_protein_info();
Function : gets the protein_info object for this feature (called from protein_info)
Returns : nothing
Args : none
_init
 Title    : _init
Note : sets attributes specific to mRNA features
Usage : called internally by new
Function :
Returns : nothing
Args : none
calculate_cds_seq
 Title    : calculate_cds_seq
Function : returns coding sequence as calculated from the chromosome sequence
Returns : dna string
Args : none
calculate_protein_seq
 Title    : calculate_protein_seq
Function : returns translation of coding sequence as calculated from the chromosome sequence
Returns : aa string
Args : none
insert
 Title    : insert
Function : calls SUPER::insert but then takes care of MRNA specific things
Returns : nothing
Args : none
is_partial
 Title    : is_partial
Function : returns true if mRNA is missing start or stop codon
Returns : 0 or 1
Args : none
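
The 'Partial' qualifiers that is_partial looks for are derived from the translated sequence. The following sketch restates that rule as a standalone function; the name partial_qualifiers and its argument list are assumptions for illustration, while the two qualifier strings are the ones used by this class.

```perl
use strict;
use warnings;

# Hypothetical helper: a translation is flagged as 5' partial when it does
# not start with M or when the phase is non-zero, and as 3' partial when it
# does not end with a stop ('*').
sub partial_qualifiers {
    my ( $protein, $phase ) = @_;
    my @qualifiers;
    push @qualifiers, "Partial, 5' missing" if $protein !~ /^M/ || $phase > 0;
    push @qualifiers, "Partial, 3' missing" if $protein !~ /\*$/;
    return @qualifiers;
}

for my $case ( [ 'MKV*', 0 ], [ 'KV*', 0 ], [ 'MKV', 1 ] ) {
    my @q = partial_qualifiers( @$case );
    printf "%-5s phase %d: %s\n", @$case, @q ? join( '; ', @q ) : 'complete';
}
```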
phase
 Title    : phase
Function : gets/sets the coding phase ( start codon - 1 ) of the feature
Returns : number
Args : optional number
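
To make the relationship between the stored start position and the phase concrete, here is a minimal sketch. The function names phase_from_translation_start and codons are assumptions; only the arithmetic (phase is the 1-based translation start minus 1, restricted to 0, 1 or 2) comes from this class.

```perl
use strict;
use warnings;

# The translation start is 1-based; the phase is 0-based (0, 1 or 2).
# A missing translation start is treated as phase 0.
sub phase_from_translation_start {
    my ($translation_start) = @_;
    return $translation_start ? $translation_start - 1 : 0;
}

# Hypothetical helper: split a coding sequence into complete codons after
# skipping $phase leading bases. Trailing incomplete codons are dropped.
sub codons {
    my ( $cds, $phase ) = @_;
    die "phase must be 0,1,2\n" if $phase < 0 || $phase > 2;
    return substr( $cds, $phase ) =~ /(...)/g;
}

my $phase = phase_from_translation_start(2);
print join( ' ', codons( 'GATGCCCTAA', $phase ) ), "\n";   # prints "ATG CCC TAA"
```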
protein_info
 Title    : protein_info
Note : Fetches the protein_info object associated with this feature
: May make feature and loci singletons, then use circular refs and 'weaken'
Usage : To print the protein_info name of the protein_info that this feature belongs to
: print $self->protein_info->protein_info_name();
Function : gets/sets the protein_info attribute of the feature
Returns : protein_info object
Args : optional: protein_info object
shift_feature
 Title    : shift_feature
Note :
Usage : To move a feature upstream by 125 bases:
: $feature->shift_feature( 125 );
Function : moves a feature by a specified amount
Returns : nothing
Args : integer ( + or - )
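
The effect of a shift on the transcript and its exons can be sketched with plain data structures. The function name shift_coordinates and the hash layout are assumptions for illustration; the real method operates on the BioPerl object.

```perl
use strict;
use warnings;

# Hypothetical helper: add $offset to the start and end of a transcript and
# of each of its exons. A negative offset moves the feature the other way.
sub shift_coordinates {
    my ( $transcript, $offset ) = @_;
    for my $part ( $transcript, @{ $transcript->{exons} } ) {
        $part->{start} += $offset;
        $part->{end}   += $offset;
    }
    return $transcript;
}

my $transcript = {
    start => 100,
    end   => 400,
    exons => [ { start => 100, end => 200 }, { start => 300, end => 400 } ],
};

shift_coordinates( $transcript, 125 );
printf "transcript %d..%d, first exon %d..%d\n",
    @{$transcript}{qw(start end)}, @{ $transcript->{exons}[0] }{qw(start end)};
# prints "transcript 225..525, first exon 225..325"
```

As in the Synopsis, a shift only changes the in-memory coordinates; a call to update is still needed to write them to the database.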
update
 Title    : update
Function : calls SUPER::update but then takes care of MRNA specific things
Returns : nothing
Args : none

Methods code

_get_bioperl
sub _get_bioperl {
   my ($self, @args) = @_;

   my $bioperl  = new Bio::SeqFeature::Gene::Transcript();

   my @subfeatures = $self->subfeatures();

   my @exons       = grep{ $_->type_id->name eq $self->_exon_type() } @subfeatures;

   my @bp_exons;

   foreach my $exon ( @exons ) {
      my $locs     = $exon->featureloc_feature_id();
      my $location = $locs->next();
      $self->throw("more than one location for feature_id: ".$exon->feature_id ) if $locs->next();

      # chado uses interbase coordinates, so add 1 to the start of each exon
      my $bp_exon = Bio::SeqFeature::Gene::Exon->new (
         -start  => $location->fmin + 1,
         -end    => $location->fmax,
         -strand => $location->strand()
      );

      $bp_exon->add_tag_value('feature_id', $exon->feature_id() );
      $bp_exon->is_coding(1);
      push @bp_exons, $bp_exon;
   }

   # strand of this feature's location (read the same way as in _init)
   my $strand = $self->_featureloc()->strand();

   #
   # sort the exons by start ( order is reversed based on strand )
   #
   {  # was getting annoying warning, could not figure it out
      no warnings qw (uninitialized );
      @bp_exons =
         map  { $_->[1] }
         sort { $strand * $a->[0] <=> $strand * $b->[0] }
         map  { [ $_->start(), $_ ] } @bp_exons;
   }

   # and add them to the Transcript object
   map { $bioperl->add_exon( $_ ) } @bp_exons;

   $self->bioperl( $bioperl );
   $self->bioperl( $self->reference_feature() ) if ( $self->{reference_feature} );
}
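
Two ideas in the listing above can be tried in isolation: converting interbase (0-based, half-open) coordinates to 1-based inclusive coordinates, and ordering exons by strand with a map/sort/map chain. The function names and the hash-based exon records below are assumptions for illustration only.

```perl
use strict;
use warnings;

# Interbase fmin/fmax to 1-based inclusive start/end: only the start moves.
sub interbase_to_one_based {
    my ( $fmin, $fmax ) = @_;
    return ( $fmin + 1, $fmax );
}

# Hypothetical helper: order exons by start, ascending on the plus strand
# (strand 1) and descending on the minus strand (strand -1). Multiplying
# the sort key by the strand flips the order without a second code path.
sub sort_exons_by_strand {
    my ( $strand, @exons ) = @_;
    return map  { $_->[1] }
           sort { $strand * $a->[0] <=> $strand * $b->[0] }
           map  { [ $_->{start}, $_ ] } @exons;
}

my @exons = ( { start => 10 }, { start => 300 }, { start => 150 } );

print join( ',', map { $_->{start} } sort_exons_by_strand(  1, @exons ) ), "\n";   # prints "10,150,300"
print join( ',', map { $_->{start} } sort_exons_by_strand( -1, @exons ) ), "\n";   # prints "300,150,10"
```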
_get_cached_sequences
sub _get_cached_sequences {
   my ($self) = @_;

   my $seq_hash = {};

   my @seqtypes = ( 'genomic', 'cds', 'protein' );
   foreach my $seqtype (@seqtypes) {
      my $methodname = "calculate_".$seqtype."_seq";
      $seq_hash->{$seqtype} = $self->$methodname;
   }
 
   $self->cached_sequences( $seq_hash );
}
_get_protein_info
sub _get_protein_info {
   my ($self) = @_;

   my $protein_info  = Modware::Protein_info->new(\$ self->calculate_protein_seq() );

   $self->protein_info( $protein_info ) if $protein_info;
}
_init
sub _init {
   my ($self, @args) = @_;

   $self->SUPER::_init( @args );

   $self->_exon_type( 'exon' );

   $self->phase( 0 );

   #
   # if it's coming from the database, set certain properties from the database
   #
   if ( $self->_database_object ) {
      my $trans_start = $self->_get_featureprop( 'translation_start' );
      $self->phase( $trans_start ? $trans_start - 1 : 0 );

      my $location = $self->_featureloc();
      $self->strand( $location->strand() );
   }
}
_update_cached_sequence()
sub _update_cached_sequence() {
   my ($self,@args) = @_;


   #
   # set the genomic sequence which is up to 1000 bp up and down of the cds
   #
   my $protein_seq = $self->calculate_protein_seq();

   $self->qualifiers ( [grep { !/partial/i } @{ $self->qualifiers() } ] );

   ( $protein_seq !~ /^M/ || $self->phase > 0 )
      ? $self->add_qualifier( "Partial, 5' missing" )
      : $self->remove_qualifier( "Partial, 5' missing" );

   $protein_seq !~ /\*$/
      ? $self->add_qualifier( "Partial, 3' missing" )
      : $self->remove_qualifier( "Partial, 3' missing" );

   warn $self->feature_id." has 5' partial\n " if DEBUG && $self->has_qualifier( "Partial, 5' missing" );

   #
   # warn if sequence is bad
   #
   if ( $protein_seq =~ /\*./ ) {
      $self->warn("There are internal stop codon(s) in the coding sequence");
   }

   delete $self->{cached_sequences};
}
calculate_cds_seq
sub calculate_cds_seq {
    my ($self) = @_;

   #
   # make sure that bioperl object is attached to Bio::Seq representing chromosome
   #
   $self->bioperl( $self->reference_feature->bioperl ) if !$self->bioperl() && $self->reference_feature();

   $self->throw("Cant get calculate cds") if !$self->bioperl();

   return $self->bioperl();
}
calculate_protein_seq
sub calculate_protein_seq {
    my ($self) = @_;

   #
   # make sure that bioperl object is attached to Bio::Seq representing chromosome
   #
   $self->bioperl( $self->reference_feature->bioperl ) if !$self->bioperl() && $self->reference_feature();

   return $self->bioperl( undef, undef, $self->phase() )->seq();
}
insert
sub insert {
   my ($self,  @args) = @_;

   $self->SUPER::insert();

   $self->_update_cached_sequence();
   $self->_insert_or_update_featureprop( 'translation_start', $self->phase() + 1  );

   $self->_update_qualifiers();  # need to update tags here, because calculating the sequence can add tags
}
is_partial
sub is_partial {
    my ($self) = @_;

   return ( grep { /partial/i } @{ $self->qualifiers() } ) ? 1 : 0;
}
phase
sub phase {
   my ($self, $obj) = @_;
   
   if ( scalar @_ > 1 && ( $obj < 0 || $obj > 2 ) ) {
      $self->throw("phase must be 0,1,2");
   }

   if(scalar @_ > 1) {
      $self->{phase} = $obj;
   }
   return $self->{phase};
}
protein_info
sub protein_info {
   my ($self, $obj) = @_;

   #
   # fetches protein_info from database (_get_protein_info) if protein_info is not yet defined
   # and the user is not attempting to set the protein_info
   #
   exists $self->{protein_info} || scalar @_ > 1 || $self->_get_protein_info();

   if(scalar @_ > 1) {
      $self->{protein_info} = $obj;
   }
   return $self->{protein_info};
}
shift_feature
sub shift_feature {
   my ($self, $offset) = @_;

   my $bioperl = $self->bioperl;
   $bioperl->start(  $bioperl->start() + $offset );
   $bioperl->end  (  $bioperl->end() + $offset   );

   foreach my $exon ( $bioperl->exons ) {
      $exon->start(  $exon->start() + $offset );
      $exon->end  (  $exon->end()   + $offset );

   }
}
update
sub update {
   my ($self,  @args) = @_;

   $self->SUPER::update();
   
   $self->_update_cached_sequence() if $self->{'bioperl'};
   $self->_insert_or_update_featureprop( 'translation_start', $self->phase() + 1  );
  
   $self->_update_qualifiers();  # need to update tags here, because calculating the sequence can add tags
}

General documentation

AUTHOR - Eric Just
   Eric Just e-just@northwestern.edu
APPENDIX
   The rest of the documentation details each of the object
methods. Internal methods are usually preceded with a _
_update_cached_sequence
 Title    : _update_cached_sequence
Function : Flags with 'Partial' qualifiers
: (Modware does not store precomputed sequences like dictyBase does)
Returns : nothing
Args : none