In silico characterization of proteins: UniProt, InterPro and Integr8
- PMID: 18219596
- DOI: 10.1007/s12033-007-9003-x
Abstract
Nucleic acid sequences from genome sequencing projects are submitted as raw data, from which biologists attempt to elucidate the function of the predicted gene products. The protein sequences are stored in public databases, such as the UniProt Knowledgebase (UniProtKB), where curators try to add predicted and experimental functional information. Protein function prediction can be done using sequence similarity searches, but an alternative approach is to use protein signatures, which classify proteins into families and domains. The major protein signature databases are available through the integrated InterPro database, which provides a classification of UniProtKB sequences. As well as characterization of proteins through protein families, many researchers are interested in analyzing the complete set of proteins from a genome (i.e. the proteome), and there are databases and resources that provide non-redundant proteome sets and analyses of proteins from organisms with completely sequenced genomes. This article reviews the tools and resources available on the web for single and large-scale protein characterization and whole proteome analysis.
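As a rough illustration of the signature-based approach mentioned in the abstract, the sketch below scans a sequence for a short motif written in PROSITE-style pattern syntax. PROSITE is one of the signature databases integrated in InterPro, and N-{P}-[ST]-{P} is its N-glycosylation site pattern. The function names `prosite_to_regex` and `scan`, the signature dictionary and the example sequence are all made up for this illustration. Real signature databases also rely on profiles and hidden Markov models, which this toy example does not cover, so it should not be read as how InterPro itself performs classification.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression.

    Handles only the basic syntax: '-' separators, x (any residue),
    [..] allowed residues, {..} forbidden residues, (n) or (n,m)
    repetition, and the '<' / '>' terminal anchors.
    """
    parts = []
    for element in pattern.rstrip(".").split("-"):
        element = element.replace("{", "[^").replace("}", "]")  # {P} -> [^P]
        element = element.replace("(", "{").replace(")", "}")   # x(2,4) -> x{2,4}
        element = element.replace("x", ".")                     # x -> any residue
        element = element.replace("<", "^").replace(">", "$")   # terminal anchors
        parts.append(element)
    return "".join(parts)


def scan(sequence: str, signatures: dict) -> list:
    """Return (signature name, 1-based start, matched residues) for every hit."""
    hits = []
    for name, pattern in signatures.items():
        # A lookahead with a capture group reports overlapping matches too.
        regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
        for match in regex.finditer(sequence):
            hits.append((name, match.start() + 1, match.group(1)))
    return hits


# Hypothetical inputs: one pattern and an invented 15-residue sequence.
SIGNATURES = {"N-glycosylation site": "N-{P}-[ST]-{P}"}
TOY_SEQUENCE = "MKNGSAANPTQNLTK"

if __name__ == "__main__":
    for name, start, residues in scan(TOY_SEQUENCE, SIGNATURES):
        print(name, start, residues)
```

In the invented sequence, the asparagine at position 8 is followed by a proline, so the pattern rejects it, while the asparagines at positions 3 and 12 are reported. Short patterns like this one match frequently by chance, which is one reason signature hits are normally treated as evidence to be combined with other annotation rather than as a final functional assignment.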