J Mol Evol. 2011 Feb;72(2):193-203.
doi: 10.1007/s00239-010-9415-2. Epub 2010 Dec 4.

Exploiting models of molecular evolution to efficiently direct protein engineering

Megan F Cole et al. J Mol Evol. 2011 Feb.

Abstract

Directed evolution and protein engineering approaches used to generate novel or enhanced biomolecular function often use the evolutionary sequence diversity of protein homologs to rationally guide library design. To fully capture this sequence diversity, however, libraries containing millions of variants are often necessary. Screening libraries of this size is often undesirable due to inaccuracies of high-throughput assays, costs, and time constraints. The ability to effectively cull sequence diversity while still generating the functional diversity within a library thus holds considerable value. This is particularly relevant when high-throughput assays are not amenable to select/screen for certain biomolecular properties. Here, we summarize our recent attempts to develop an evolution-guided approach, Reconstructing Evolutionary Adaptive Paths (REAP), for directed evolution and protein engineering that exploits phylogenetic and sequence analyses to identify amino acid substitutions that are likely to alter or enhance function of a protein. To demonstrate the utility of this technique, we highlight our previous work with DNA polymerases in which a REAP-designed small library was used to identify a DNA polymerase capable of accepting non-standard nucleosides. We anticipate that the REAP approach will be used in the future to facilitate the engineering of biopolymers with expanded functions and will thus have a significant impact on the developing field of 'evolutionary synthetic biology'.

Figures

Figure 1
Signatures of evolutionary functional divergence. Schematics of site-specific type-I and type-II functional divergence between subfamilies of sequences. Left, type-I functional divergence, in which a specific site is occupied by a conserved aspartate residue (D) in one lineage but by many different residues at the homologous site in the other lineage. Middle, type-II functional divergence, in which a specific site is again occupied by a conserved aspartate residue (D) in one lineage, while the homologous site in the other lineage is also conserved but occupied by a different residue (histidine, H). Right, no functional divergence associated with the replacements of aspartates and histidines. Functional inferences drawn from these patterns require phylogenetic analysis; otherwise the patterns are indistinguishable from historical contingency (i.e., common ancestry).
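The three patterns in this caption can be illustrated with a small sketch that labels one alignment column, split by subfamily. This is only an illustration of the patterns, not the REAP method: the function names and the 0.9 conservation threshold are assumptions, and, as the caption notes, a real inference needs phylogenetic analysis rather than raw column counts.

```python
from collections import Counter


def is_conserved(column, threshold=0.9):
    """Return (conserved?, most common residue) for one alignment column.

    The threshold is an arbitrary illustrative cutoff, not a value from the paper.
    """
    residue, count = Counter(column).most_common(1)[0]
    return count / len(column) >= threshold, residue


def classify_site(col_a, col_b, threshold=0.9):
    """Label a homologous site by the pattern seen in two subfamilies.

    col_a and col_b are strings of one-letter residues, one per sequence.
    This ignores the phylogeny entirely, so it only names the pattern.
    """
    cons_a, res_a = is_conserved(col_a, threshold)
    cons_b, res_b = is_conserved(col_b, threshold)
    if cons_a and cons_b:
        # Both lineages conserved: divergent only if the residues differ.
        return "type-II" if res_a != res_b else "none"
    if cons_a != cons_b:
        # Conserved in one lineage, variable in the other.
        return "type-I"
    # Variable in both lineages.
    return "none"


print(classify_site("DDDDD", "AHKDE"))  # conserved D vs. many residues
print(classify_site("DDDDD", "HHHHH"))  # conserved D vs. conserved H
print(classify_site("DHDHD", "HDHDH"))  # D and H mixed in both lineages
```

Splitting the conservation check from the classification keeps the threshold in one place, so a different conservation measure could be swapped in without touching the pattern logic.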
Figure 2
The use of phylogenetics for directed evolution and protein engineering. This schematic shows how variation contained within homologous sequences is captured by different directed evolution approaches and how this relates to sampling of sequence/function space. Branch lengths are not to scale. A, Site-directed mutagenesis approach randomly inserts mutations into the parent sequence. B, Standard DNA shuffling approach builds libraries that incorporate homologous sequence information from all branches of a phylogeny because the approach uses only extant (modern) sequence information. C, REAP approach builds libraries that incorporate homologous sequence information from only those branches of the phylogeny inferred to have undergone functional adaptation and divergence.
Figure 3
Flowchart for the REAP approach. The implementation of the REAP approach begins by collecting and aligning sequences from a protein family. A phylogenetic tree must then be constructed to capture the evolutionary relationship and distance between homologs. From this information, molecular models are used to detect functional divergence along branches of the phylogeny. The computational reconstruction of ancestral states of the protein along these branches is then used to identify residues and amino acids associated with the functional divergence. This will result in a list of candidate residues/mutations that may affect the function of the protein. This list of candidates can be further reduced if needed by incorporating known structural or biochemical information about the protein. For example, residues may be selected based on their proximity to the protein's active site. The final candidate residue list is then used to design the variant library to screen for the desired function. In this manner, the REAP approach results in a small number of residues/mutations to vary in the sequence library.
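The last steps of the flowchart (shortlisting candidates with structural information, then sizing the library) can be sketched as follows. Everything here is hypothetical: the `Candidate` fields, the distance cutoff, and the three example residues are made up for illustration, and the sketch assumes a fully combinatorial library in which each position carries either the parent residue or one of its candidate substitutions.

```python
from dataclasses import dataclass
from math import prod


@dataclass
class Candidate:
    position: int                   # residue number in the parent sequence
    substitutions: tuple            # amino acids inferred along a divergent branch
    distance_to_active_site: float  # in angstroms; illustrative values only


def filter_by_proximity(candidates, max_distance):
    """Keep only candidates within max_distance of the active site."""
    return [c for c in candidates if c.distance_to_active_site <= max_distance]


def library_size(candidates):
    """Number of variants if every position is varied independently.

    Each position contributes its substitutions plus the parent residue.
    """
    return prod(len(c.substitutions) + 1 for c in candidates)


# Made-up candidate list standing in for the output of the earlier steps.
example = [
    Candidate(position=10, substitutions=("H",), distance_to_active_site=4.2),
    Candidate(position=57, substitutions=("A", "S"), distance_to_active_site=8.9),
    Candidate(position=203, substitutions=("K",), distance_to_active_site=25.0),
]

shortlist = filter_by_proximity(example, max_distance=10.0)
print(len(shortlist), library_size(shortlist))
```

Because library size is a product over positions, dropping even a few candidate sites shrinks the library multiplicatively, which is the motivation the caption gives for reducing the list before screening.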
Figure 4
Phylogeny of Family A DNA polymerases. The viral and non-viral clades used for REAP analysis are highlighted. Scale bar represents amino acid replacements/site/unit evolutionary time. Examples of patterns of types I & II functional divergence are also shown.
Figure 5
Distribution of functional divergence sites mapped onto the structure of Taq polymerase. The 35 sites in Taq polymerase identified by the REAP analysis are mapped onto the polymerase structure (PDB accession 5KTQ) and colored magenta. The incoming nucleoside triphosphate substrate is shown in cyan.
