Review
Curr Med Chem. 2004 Aug;11(16):2135-42.
doi: 10.2174/0929867043364702.

Prediction of protein function in the absence of significant sequence similarity

Paul D Dobson et al. Curr Med Chem. 2004 Aug.

Abstract

Tremendous progress in DNA sequencing has yielded the genomes of a host of important organisms. The utilisation of these resources requires understanding of the function of each gene. Standard methods of functional assignment involve sequence alignment to a gene of known function; however such methods often fail to find any significant matches. Here we discuss a number of recent alternative methods that may be of use when sequence alignment fails. Function can be defined in a number of ways including E.C. number and MIPS and KEGG functional classes. Phylogenetic profiles show the pattern of presence or absence of a protein between genomes. Protein-protein interactions can be identified by searching for interacting pairs of proteins that are fused to a single protein chain in another organism. The gene neighbour method uses the observation that if the genes that encode two proteins are close on a chromosome, the proteins tend to be functionally related. More general methods use sequence properties such as amino acid composition, mean hydrophobicity, predicted secondary structure and post-translational modification sites. Data mining methods devise rules in the form of IF... THEN statements that make predictions of function using sequence based attributes, predicted secondary structure and sequence similarity. Finally, structural features can be used, after modelling the structure of a protein from its sequence or solving its structure. Protein fold class can be strongly indicative of function, while other structural features, such as secondary structure content, cleft size and 3D structural motifs are also useful.
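
As an illustration of the phylogenetic profile idea mentioned in the abstract, the following is a minimal sketch, not the authors' method. It assumes each protein is represented by a tuple of presence (1) or absence (0) values across a fixed set of genomes, and it treats proteins whose profiles differ in few genomes as candidates for a functional link. The protein names, the profile values, the use of Hamming distance and the `max_distance` threshold are all hypothetical choices made for this example.

```python
from itertools import combinations

# Hypothetical presence (1) / absence (0) of each protein across five genomes.
# Protein names and values are invented for illustration only.
PROFILES = {
    "protA": (1, 0, 1, 1, 0),
    "protB": (1, 0, 1, 1, 0),
    "protC": (0, 1, 0, 0, 1),
    "protD": (1, 0, 1, 0, 0),
}


def hamming(p, q):
    """Count the genomes in which two profiles disagree."""
    if len(p) != len(q):
        raise ValueError("profiles must cover the same genomes")
    return sum(a != b for a, b in zip(p, q))


def similar_pairs(profiles, max_distance=1):
    """Return (name_a, name_b, distance) for pairs within max_distance."""
    pairs = []
    # Sort by protein name so the output order is deterministic.
    for (name_a, prof_a), (name_b, prof_b) in combinations(sorted(profiles.items()), 2):
        d = hamming(prof_a, prof_b)
        if d <= max_distance:
            pairs.append((name_a, name_b, d))
    return pairs


if __name__ == "__main__":
    for name_a, name_b, d in similar_pairs(PROFILES):
        print(f"{name_a} ~ {name_b} (distance {d})")
```

With the toy data above, protA and protB share an identical profile and protD differs from each of them in a single genome, so those three pairs are reported, while protC, whose pattern is nearly the opposite, is not. Hamming distance is used here only for simplicity; other similarity measures could be substituted.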
