Biophys J. 1997 Nov;73(5):2393-403.
doi: 10.1016/S0006-3495(97)78268-7.

How are model protein structures distributed in sequence space?

E Bornberg-Bauer. Biophys J. 1997 Nov.

Abstract

The sequence-to-structure map for all uniquely folding sequences of short hydrophobic-polar (HP) model proteins on a square lattice is analyzed to investigate aspects considered relevant to evolution. Ranking structures by their frequencies reveals a few very frequent structures and many rare ones. The distribution can be empirically described by a generalized Zipf's law. All structures are relatively compact, yet the most compact ones are rare. Most sequences folding to the same structure belong to "neutral nets." These graphs in sequence space are connected by point mutations and centered around prototype sequences, which tolerate the largest number (up to 55%) of neutral mutations. Profiles have been derived from these homologous sequences. Frequent structures conserve only their hydrophobic cores, while rare ones are sensitive to surface mutations as well. Shape space covering, i.e., the ability to transform any structure into most others with few point mutations, is very unlikely. It is concluded that many characteristic features of the sequence-to-structure map of real proteins, such as the dominance of few folds, can be explained by the simple HP model. In analogy to protein families, nets are dense and well separated in sequence space. Potential implications for better understanding the evolution of proteins and applications to improving database searches are discussed.
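
The kind of exhaustive enumeration the abstract describes can be illustrated with a minimal sketch. This is not the author's code. It assumes the common HP convention of an energy of -1 per non-bonded H-H contact, which the abstract does not spell out, and it treats a sequence as uniquely folding when exactly one conformation reaches the minimum energy. The function names (`enumerate_conformations`, `contacts`, `unique_fold_counts`) are hypothetical, and chain-reversal symmetry is not handled.

```python
from collections import Counter
from itertools import product

MOVES = [(1, 0), (0, 1), (-1, 0), (0, -1)]


def enumerate_conformations(n):
    """Self-avoiding walks of n >= 2 monomers on a square lattice,
    one representative per rotation/reflection class."""
    results = []

    def extend(path, turned):
        if len(path) == n:
            results.append(tuple(path))
            return
        x, y = path[-1]
        for dx, dy in MOVES:
            # Until the chain leaves the x-axis, allow only "right" or "up";
            # this discards rotated and mirrored copies of the same shape.
            if not turned and (dx, dy) not in ((1, 0), (0, 1)):
                continue
            nxt = (x + dx, y + dy)
            if nxt in path:  # self-avoidance
                continue
            path.append(nxt)
            extend(path, turned or dy != 0)
            path.pop()

    extend([(0, 0), (1, 0)], False)  # first bond fixed along +x
    return results


def contacts(conf):
    """Pairs (i, j) of non-bonded monomers on adjacent lattice sites."""
    pairs = []
    for i in range(len(conf)):
        for j in range(i + 3, len(conf)):
            (xi, yi), (xj, yj) = conf[i], conf[j]
            if abs(xi - xj) + abs(yi - yj) == 1:
                pairs.append((i, j))
    return pairs


def unique_fold_counts(n):
    """Map conformation index -> number of HP sequences whose unique
    minimum-energy conformation it is (assumed energy: -1 per H-H contact)."""
    contact_lists = [contacts(c) for c in enumerate_conformations(n)]
    counts = Counter()
    for seq in product("HP", repeat=n):
        energies = [
            -sum(seq[i] == "H" and seq[j] == "H" for i, j in pairs)
            for pairs in contact_lists
        ]
        best = min(energies)
        if energies.count(best) == 1:  # exactly one ground state
            counts[energies.index(best)] += 1
    return counts


if __name__ == "__main__":
    # Rank structures by how many sequences fold uniquely into them.
    ranked = sorted(unique_fold_counts(8).values(), reverse=True)
    for rank, freq in enumerate(ranked, start=1):
        print(rank, freq)
```

Plotting the printed frequency against rank on log-log axes is one way to inspect the skewed, Zipf-like distribution the abstract reports. The chain length of 8 is chosen only to keep the run fast and is not taken from the paper.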
