Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
Comparative Study
. 2005 Mar-Apr;18(2):139-49.
doi: 10.1002/jmr.721.

Efficient recognition of protein fold at low sequence identity by conservative application of Psi-BLAST: validation

Affiliations
Comparative Study

Efficient recognition of protein fold at low sequence identity by conservative application of Psi-BLAST: validation

F J Stevens. J Mol Recognit. 2005 Mar-Apr.

Abstract

A substantial fraction of protein sequences derived from genomic analyses is currently classified as representing 'hypothetical proteins of unknown function'. In part, this reflects the limitations of methods for comparison of sequences with very low identity. We evaluated the effectiveness of a Psi-BLAST search strategy to identify proteins of similar fold at low sequence identity. Psi-BLAST searches for structurally characterized low-sequence-identity matches were carried out on a set of over 300 proteins of known structure. Searches were conducted in NCBI's non-redundant database and were limited to three rounds. Some 614 potential homologs with 25% or lower sequence identity to 166 members of the search set were obtained. Disregarding the expect value, level of sequence identity and span of alignment, correspondence of fold between the target and potential homolog was found in more than 95% of the Psi-BLAST matches. Restrictions on expect value or span of alignment improved the false positive rate at the expense of eliminating many true homologs. Approximately three-quarters of the putative homologs obtained by three rounds of Psi-BLAST revealed no significant sequence similarity to the target protein upon direct sequence comparison by BLAST, and therefore could not be found by a conventional search. Although three rounds of Psi-BLAST identified many more homologs than a standard BLAST search, most homologs were undetected. It appears that more than 80% of all homologs to a target protein may be characterized by a lack of significant sequence similarity. We suggest that conservative use of Psi-BLAST has the potential to propose experimentally testable functions for the majority of proteins currently annotated as 'hypothetical proteins of unknown function'.

PubMed Disclaimer

Similar articles

Cited by

Publication types

LinkOut - more resources