Tuesday, September 07, 2010   
      

What is SIMAP?

   
  
 About SIMAP    

From the SIMAP Website:

What is SIMAP?
SIMAP is a database of protein similarities and protein domains. It contains nearly all currently published protein sequences and is continuously updated. Protein similarities are computed using the FASTA algorithm, which provides optimal speed and sensitivity. Protein domains are calculated using the InterPro methods and databases. SIMAP is, to our knowledge, the only project that combines comprehensive coverage of all known proteins with incremental update capabilities.

What is SIMAP used for?
Because of the huge number of known protein sequences in public databases, it has become clear that most of them will not be experimentally characterized in the near future. Nevertheless, proteins that have evolved from a common ancestor (so-called orthologs) often share the same function. It is therefore possible to infer the function of an uncharacterized protein from an ortholog with known function. A well-known example is the investigation of mouse genes and proteins: in many cases the results also hold true for the orthologous human genes and proteins. Protein similarities provide information about the relationships between proteins and are necessary for the prediction of orthologs.
Protein domains (often called functional domains) are the structural building blocks of proteins. They are responsible for the activities of a given protein, e.g. binding small molecules, catalyzing reactions or binding other proteins in large complexes. Knowledge about protein domains is stored in huge repositories such as the InterPro databases. The prediction of domains in newly sequenced proteins is based on those databases and provides a fully automatic functional annotation of these proteins. We therefore calculate protein domains for all proteins in SIMAP, thus providing the largest system for protein function prediction worldwide.
There are many more bioinformatics methods that rely on protein similarities and domains. Our protein similarity database provides pre-computed similarity and domain data and represents the known protein space. This opens completely new perspectives compared to the common practice of repeatedly re-calculating this kind of data. SIMAP is regularly updated: the similarity matrix is simply extended incrementally whenever new sequences appear. The use of SIMAP is completely free for education and public research.
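
The idea of extending a similarity matrix incrementally, rather than recomputing it, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not SIMAP's actual implementation: the class `SimilarityMatrix`, the helper `toy_similarity` and the example sequences are all hypothetical, and a simple shared 3-mer Jaccard score stands in for the FASTA comparison that SIMAP really uses.

```python
def kmer_set(seq, k=3):
    """Return the set of overlapping k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def toy_similarity(a, b, k=3):
    """Hypothetical stand-in for a real aligner such as FASTA:
    Jaccard index of the two sequences' k-mer sets."""
    ka, kb = kmer_set(a, k), kmer_set(b, k)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)


class SimilarityMatrix:
    """Stores pairwise scores and only compares new sequences
    against what is already known."""

    def __init__(self):
        self.sequences = {}    # sequence id -> sequence
        self.scores = {}       # frozenset({id_a, id_b}) -> score
        self.comparisons = 0   # number of pairwise comparisons performed

    def add(self, new_sequences):
        """Extend the matrix; existing pairs are never recomputed."""
        for new_id, new_seq in new_sequences.items():
            if new_id in self.sequences:
                continue  # already covered, nothing to do
            for old_id, old_seq in self.sequences.items():
                pair = frozenset((new_id, old_id))
                self.scores[pair] = toy_similarity(new_seq, old_seq)
                self.comparisons += 1
            self.sequences[new_id] = new_seq

    def score(self, a, b):
        """Look up a pre-computed score (order of a and b is irrelevant)."""
        return self.scores[frozenset((a, b))]


matrix = SimilarityMatrix()
matrix.add({"P1": "MKTAYIAKQR", "P2": "MKTAYIAKQL"})  # made-up sequences
matrix.add({"P3": "GGGGGGGGGG"})  # only P3-vs-P1 and P3-vs-P2 are computed
print(matrix.comparisons)         # prints 3
print(matrix.score("P2", "P1"))
```

Adding one sequence to a store of n sequences costs n comparisons in this sketch, whereas rebuilding the whole matrix would cost n(n+1)/2, which is the advantage the paragraph above describes.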

   
  