Proc Natl Acad Sci U S A. 1999 Mar 16;96(6):2591-5.
doi: 10.1073/pnas.96.6.2591.

A hierarchical approach to protein molecular evolution

L D Bogarad et al. Proc Natl Acad Sci U S A. 1999.

Abstract

Biological diversity has evolved despite the essentially infinite complexity of protein sequence space. We present a hierarchical approach to the efficient searching of this space and quantify the evolutionary potential of our approach with Monte Carlo simulations. These simulations demonstrate that nonhomologous juxtaposition of encoded structure is the rate-limiting step in the production of new tertiary protein folds. Nonhomologous "swapping" of low-energy secondary structures increased the binding constant of a simulated protein by a factor of approximately 10^7 relative to base substitution alone. Applications of our approach include the generation of new protein folds and modeling the molecular evolution of disease.

Figures

Figure 1
Schematic diagram of the simulated molecular evolution protocols. (a) Simulation of molecular evolution by means of base substitution (substitutions are represented by orange dots). (b) Simulated DNA shuffling showing the optimal fragmentation length of two subdomains. (c) The hierarchical optimization of local space searching: the 250 different sequences in each of the five pools (e.g., helices, strands, turns, loops, and others) are schematically represented by different shades of the same color. (d) The multipool swapping model for searching vast regions of tertiary fold space is essentially the same as in c, except that sequences from all five structural pools can now be swapped into any subdomain. Multipool swapping allows for the formation of new tertiary structures by changing the type of secondary structure at any position along the protein.
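The difference between the moves in panels c and d can be illustrated with a toy sketch. This is not the authors' simulation: it has no energy function or selection step, and the data layout, function names, and the choice of 10 subdomains (borrowed from the count of secondary structures assumed in the Figure 2 caption) are assumptions made for illustration only. The pool types and pool size follow the caption.

```python
import random

POOL_TYPES = ["helix", "strand", "turn", "loop", "other"]  # five pools, per the caption
POOL_SIZE = 250      # sequences per pool, per the caption
N_SUBDOMAINS = 10    # assumption: one subdomain per secondary structure


def random_protein(rng):
    """Hypothetical representation: a list of (pool_type, sequence_index) subdomains."""
    return [(rng.choice(POOL_TYPES), rng.randrange(POOL_SIZE))
            for _ in range(N_SUBDOMAINS)]


def swap_same_pool(protein, rng):
    """Panel c-style move: replace one subdomain with a sequence from its own pool."""
    child = list(protein)
    pos = rng.randrange(len(child))
    pool, _ = child[pos]
    child[pos] = (pool, rng.randrange(POOL_SIZE))
    return child


def swap_multipool(protein, rng):
    """Panel d-style move: replace one subdomain with a sequence from any pool."""
    child = list(protein)
    pos = rng.randrange(len(child))
    child[pos] = (rng.choice(POOL_TYPES), rng.randrange(POOL_SIZE))
    return child


rng = random.Random(0)
parent = random_protein(rng)
same = swap_same_pool(parent, rng)
multi = swap_multipool(parent, rng)
```

In this sketch a same-pool swap never changes the sequence of secondary structure types along the protein, whereas a multipool swap can, which is the property the caption associates with reaching new tertiary structures.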
Figure 2
Schematic diagram representing a portion of the high-dimensional protein composition space. The three-dimensional energy landscape of Protein Fold 1 (green) is shown in cutaway. The arcs with arrowheads represent the ability of a given molecular evolution process to change the composition and so to traverse the increasingly large barriers in the energy function. The smallest arc (light yellow) represents the ability to evolve improved fold function by means of point mutation. Then in increasing order: DNA shuffling (dark yellow), swapping (orange), and mixing (red). Finally, our multipool swapping model allows an evolving system to move (purple arc) to a different energy landscape representing a new tertiary fold (bottom). With this model, functional tertiary fold space has a large yet manageable number of dimensions. That is, in 100 amino acids we assume 10 secondary structures of 5 types (we balance rare forms with the predominance of strands and helices) roughly yielding the potential for ≈10^7 basic tertiary folds in Nature. Clearly, organization into secondary structural classes represents a dramatic reduction in the realized complexity of sequence space (e.g., versus 300 bases of open reading frame DNA, ≈10^170, or 100 amino acids with a 20-letter or 5-letter genetic code, ≈10^130 or ≈10^70, respectively).
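The amino acid and fold counts quoted in this caption can be checked with a short calculation, treating each count as alphabet size raised to the number of positions. The helper name below is hypothetical. The DNA figure is not recomputed, because the caption does not state how the open reading frame constraint was counted.

```python
import math


def log10_count(alphabet_size: int, length: int) -> float:
    """Return log10 of alphabet_size ** length (number of distinct strings)."""
    return length * math.log10(alphabet_size)


folds = log10_count(5, 10)    # 10 secondary structures, 5 types each
aa20 = log10_count(20, 100)   # 100 residues, 20-letter code
aa5 = log10_count(5, 100)     # 100 residues, 5-letter code

print(f"5^10   is about 10^{folds:.1f}")
print(f"20^100 is about 10^{aa20:.1f}")
print(f"5^100  is about 10^{aa5:.1f}")
```

Rounded to whole exponents, these agree with the caption's ≈10^7, ≈10^130, and ≈10^70.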
