Protein family classification and functional annotation
- PMID: 12798038
- DOI: 10.1016/s1476-9271(02)00098-1
Abstract
With the accelerated accumulation of genomic sequence data, there is a pressing need to develop computational methods and advanced bioinformatics infrastructure for reliable and large-scale protein annotation and biological knowledge discovery. The Protein Information Resource (PIR) provides an integrated public resource of protein informatics to support genomic and proteomic research. PIR produces the Protein Sequence Database of functionally annotated protein sequences. The annotation problems are addressed by a classification-driven and rule-based method with evidence attribution, coupled with an integrated knowledge base system being developed. The approach allows sensitive identification, consistent and rich annotation, and systematic detection of annotation errors, as well as distinction of experimentally verified and computationally predicted features. The knowledge base consists of two new databases, sequence analysis tools, and graphical interfaces. PIR-NREF, a non-redundant reference database, provides a timely and comprehensive collection of all protein sequences, totaling more than 1,000,000 entries. iProClass, an integrated database of protein family, function, and structure information, provides extensive value-added features for about 830,000 proteins with rich links to over 50 molecular databases. This paper describes our approach to protein functional annotation with case studies and examines common identification errors. It also illustrates that data integration in PIR supports exploration of protein relationships and may reveal protein functional associations beyond sequence homology.
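The rule-based annotation with evidence attribution described in the abstract can be illustrated with a minimal sketch. It is not PIR's implementation. Every name in it (`Annotation`, `Protein`, `SiteRule`, `apply_rules`, the family label `FAM_X`, the accessions and sequences) is hypothetical. It also assumes that a rule checks a fixed 1-based sequence position, whereas a real system would more plausibly map positions through an alignment to a family model.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Annotation:
    feature: str
    evidence: str  # "experimental" or "predicted"
    source: str    # where the annotation came from


@dataclass
class Protein:
    accession: str
    family: str
    sequence: str
    annotations: list = field(default_factory=list)


@dataclass(frozen=True)
class SiteRule:
    """Hypothetical rule: family members with `residue` at `position` get `feature`."""
    family: str
    feature: str
    position: int  # 1-based; assumes positions are directly comparable across members
    residue: str


def apply_rules(protein, rules):
    """Attach rule-derived features, always tagged as predicted, and return them."""
    added = []
    for rule in rules:
        if rule.family != protein.family:
            continue  # classification-driven: a rule only fires inside its own family
        idx = rule.position - 1
        if idx >= len(protein.sequence) or protein.sequence[idx] != rule.residue:
            continue  # required residue is absent, so the feature is not propagated
        if any(a.feature == rule.feature for a in protein.annotations):
            continue  # keep the existing annotation and its evidence tag
        ann = Annotation(rule.feature, "predicted", f"rule:{rule.family}")
        protein.annotations.append(ann)
        added.append(ann)
    return added


# Illustrative data only; none of this comes from PIR.
rules = [SiteRule("FAM_X", "active site", 3, "H")]
p_match = Protein("P1", "FAM_X", "MKHLL")
p_miss = Protein("P2", "FAM_X", "MKALL")
p_known = Protein(
    "P3", "FAM_X", "MKHLL",
    [Annotation("active site", "experimental", "literature")],
)

added_match = apply_rules(p_match, rules)
added_miss = apply_rules(p_miss, rules)
added_known = apply_rules(p_known, rules)
```

Storing the evidence tag on each annotation is what keeps experimentally verified and computationally predicted features distinguishable after rules have been applied. Checking the required residue before propagating a feature is one simple way to avoid transferring an annotation to a family member that lacks the relevant site.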