Comparative Study
Proc Natl Acad Sci U S A. 1987 Jul;84(13):4355-8.
doi: 10.1073/pnas.84.13.4355.

Profile analysis: detection of distantly related proteins

M Gribskov et al. Proc Natl Acad Sci U S A. 1987 Jul.

Abstract

Profile analysis is a method for detecting distantly related proteins by sequence comparison. The basis for comparison is not only the customary Dayhoff mutational-distance matrix but also the results of structural studies and information implicit in the alignments of the sequences of families of similar proteins. This information is expressed in a position-specific scoring table (profile), which is created from a group of sequences previously aligned by structural or sequence similarity. The similarity of any other sequence (target) to the group of aligned sequences (probe) can be tested by comparing the target to the profile using dynamic programming algorithms. The profile method differs in two major respects from methods of sequence comparison in common use: (i) Any number of known sequences can be used to construct the profile, allowing more information to be used in the testing of the target than is possible with pairwise alignment methods. (ii) The profile includes the penalties for insertion or deletion at each position, which allow one to include the probe secondary structure in the testing scheme. Tests with globin and immunoglobulin sequences show that profile analysis can distinguish all members of these families from all other sequences in a database containing 3800 protein sequences.
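The two steps described in the abstract, building a position-specific scoring table from aligned probe sequences and then comparing a target to it by dynamic programming, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_similarity` is a hypothetical stand-in for the Dayhoff matrix, the gap-penalty rule (cheaper gaps in columns where the probe already contains a gap) is an invented simplification of the position-specific penalties, and the comparison is a simple local alignment with linear gap costs.

```python
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"


def toy_similarity(a, b):
    """Hypothetical stand-in for a Dayhoff-style matrix: +2 match, -1 mismatch."""
    return 2.0 if a == b else -1.0


def build_profile(aligned, gap_penalty=3.0, gap_scale=0.5):
    """Build one profile row per alignment column.

    Each row is (scores, penalty): scores maps every residue to its average
    similarity to the residues observed in that column, and penalty is the
    position-specific cost of an insertion or deletion at that column.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("probe sequences must be aligned to equal length")
    profile = []
    for col in range(length):
        residues = [seq[col] for seq in aligned]
        counts = Counter(r for r in residues if r != "-")
        total = sum(counts.values())
        scores = {}
        for a in ALPHABET:
            weighted = sum(c * toy_similarity(a, b) for b, c in counts.items())
            scores[a] = weighted / total if total else 0.0
        # Assumption: columns where the probe already has a gap are cheaper to gap.
        penalty = gap_penalty * (gap_scale if "-" in residues else 1.0)
        profile.append((scores, penalty))
    return profile


def profile_score(profile, target):
    """Best local alignment score of target against the profile."""
    rows, cols = len(profile), len(target)
    H = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    best = 0.0
    for i in range(1, rows + 1):
        scores, penalty = profile[i - 1]
        for j in range(1, cols + 1):
            diag = H[i - 1][j - 1] + scores.get(target[j - 1], 0.0)
            up = H[i - 1][j] - penalty    # profile position i has no target residue
            left = H[i][j - 1] - penalty  # target residue j is an insertion
            H[i][j] = max(0.0, diag, up, left)
            best = max(best, H[i][j])
    return best


probe = ["ACDE", "ACDE", "AC-E"]  # toy aligned probe sequences
profile = build_profile(probe)
related = profile_score(profile, "ACDE")
unrelated = profile_score(profile, "WWWW")
```

With this toy probe, a target matching the family scores well above an unrelated one, and a target lacking the third residue pays only the reduced gap penalty at that column. A realistic version would replace `toy_similarity` with a real mutational-distance matrix and derive the gap penalties from structural information, as the abstract describes.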


