Review
Genome Biol. 2009 Feb 2;10(2):207.
doi: 10.1186/gb-2009-10-2-207.

Protein function annotation by homology-based inference

Yaniv Loewenstein et al. Genome Biol.

Abstract

With many genomes now sequenced, computational annotation methods to characterize genes and proteins from their sequence are increasingly important. The BioSapiens Network has developed tools to address all stages of this process, and here we review progress in the automated prediction of protein function based on protein sequence and structure.

Figures

Figure 1
Automated strategy for assigning function to proteins. The various approaches to protein function prediction are described in the text. Both protein sequences and structures can provide information for family classification and functional inference. Sequence-based methods either group proteins into families using different strategies (for example, sequence tree construction based on clustering of all-against-all sequence comparisons) or compare the target sequence with precompiled databases of families. When a structure is available, the whole structure can be scanned against precompiled sets of functional sites. Alternatively, fragments of the target protein can be used to identify structural similarities to the conformations of proteins of known structure, which may be related to a molecular function. Both sequences and structures, together with protein-protein interaction data, can be used to infer interactions, which can provide functional clues. Ideally, an independent set should be used to assess the reliability of the various methods.
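The sequence-based branch of this strategy (all-against-all comparison, grouping into families, then functional inference from annotated family members) can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not the BioSapiens tooling or any published method: the k-mer Jaccard score stands in for a real alignment score, the 0.3 threshold is arbitrary, and the sequences, names, and annotations are invented.

```python
from itertools import combinations


def kmers(seq, k=3):
    """Return the set of overlapping k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def similarity(a, b, k=3):
    """Jaccard similarity of k-mer sets (a crude stand-in for an alignment score)."""
    ka, kb = kmers(a, k), kmers(b, k)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)


def cluster_families(seqs, threshold=0.3):
    """Single-linkage clustering from all-against-all comparisons (union-find)."""
    parent = {name: name for name in seqs}

    def find(x):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if similarity(seqs[a], seqs[b]) >= threshold:
            parent[find(a)] = find(b)

    families = {}
    for name in seqs:
        families.setdefault(find(name), set()).add(name)
    return list(families.values())


def infer_function(target, families, known):
    """Collect the annotations of the target's family members, excluding the target."""
    for family in families:
        if target in family:
            return {known[m] for m in family if m in known and m != target}
    return set()


# Hypothetical toy data: names, sequences, and annotations are made up.
seqs = {
    "query": "MKTAYIAKQRQISFVK",
    "kinase_A": "MKTAYIAKQRQLSFVK",
    "transporter_B": "GGSLLPWWNDCEHHTT",
}
known = {"kinase_A": "protein kinase", "transporter_B": "membrane transporter"}

families = cluster_families(seqs)
print(infer_function("query", families, known))
```

In this sketch the query differs from `kinase_A` at a single residue, so the two fall into one family and the query inherits that annotation, while `transporter_B` shares no 3-mers with either and stays in its own family. As the caption notes, any real pipeline would need an independent set to assess how reliable such transfers are.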
Box 1
Glossary of terms
