Review
Annu Rev Biophys. 2017 May 22;46:247-269.
doi: 10.1146/annurev-biophys-070816-033631. Epub 2017 Mar 15.

Reconstructing Ancient Proteins to Understand the Causes of Structure and Function

Georg K A Hochberg et al. Annu Rev Biophys. 2017.

Abstract

A central goal in biochemistry is to explain the causes of protein sequence, structure, and function. Mainstream approaches seek to rationalize sequence and structure in terms of their effects on function and to identify function's underlying determinants by comparing related proteins to each other. Although productive, both strategies suffer from intrinsic limitations that have left important aspects of many proteins unexplained. These limits can be overcome by reconstructing ancient proteins, experimentally characterizing their properties, and retracing their evolution through time. This approach has proven to be a powerful means for discovering how historical changes in sequence produced the functions, structures, and other physical/chemical characteristics of modern proteins. It has also illuminated whether protein features evolved because of functional optimization, historical constraint, or blind chance. Here we review recent studies employing ancestral protein reconstruction and show how they have produced new knowledge not only of molecular evolutionary processes but also of the underlying determinants of modern proteins' physical, chemical, and biological properties.

Keywords: ancestral reconstruction; epistasis; evolutionary biochemistry; historical contingency; vertical analysis.

Figures

Figure 1
Horizontal and vertical analysis of sequence–function relations. To identify the sequence differences that confer different functions (green or blue) between paralogous proteins X and Y, a horizontal comparison (arrow) would include all sequence changes that occurred on branches A, B, and C (rectangles, colored by their functional and epistatic effects). Permissive substitutions in isolation do not affect function but allow the protein to tolerate function-switching changes; horizontally swapping function-switching residues from Y into X would yield a nonfunctional protein, because it lacks the permissive substitution. Restrictive substitutions make the ancestral state at the function-switching sites deleterious; swapping these residues from X into Y would also yield a nonfunctional protein. Vertical analysis would determine the function of ancestral nodes (circles, colored by their functions) and isolate the change in function to branch B, reducing the number of changes to consider and minimizing the confounding effect of epistasis.
Figure 2
Workflow for vertical analysis of the genetic and structural causes of functional differences between related proteins, shown for a hypothetical family of enzymes. (a) Two paralogous enzymes catalyze similar reactions on different substrates, yielding different products (colors). (b) Sequences of both paralogs (green and blue) are collected and aligned from many species, including outgroups (black). (c) The alignment is used to computationally infer the best-fit evolutionary model and a phylogeny. Ancestral sequences are inferred by maximum likelihood at nodes representing the last common ancestor of each paralog group (Anc2, Anc3) and at the gene duplication ancestral to both groups. (d) DNA sequences coding for ancestral proteins are synthesized and cloned; ancestral proteins are expressed and their functions experimentally characterized. This allows the branch on which a new function evolved (red) to be identified. (e) The substitutions that conferred the derived (blue) function must be among the differences between Anc1 and Anc3 (boxed sites). To identify causal substitutions, amino acid states from Anc3 (red states in blue sequence) are introduced into Anc1 and the resulting proteins tested experimentally (bottom). In the example, an arginine to glutamate substitution (red box) recapitulates the switch in specificity. (f) Structures or homology models of ancestral proteins are determined to infer the mechanism by which causal substitutions conferred the new function. In this case, the derived glutamate of Anc3 satisfied the hydrogen bonding potential of the amine group unique to the derived ligand.
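Step (e) of this workflow can be illustrated with a minimal Python sketch. It assumes the two ancestral sequences are already aligned to equal length; the function names `differing_sites` and `single_mutants` and the toy sequences `anc1` and `anc3` are hypothetical and chosen only for illustration, not taken from the review. The sketch lists the sites at which the two ancestors differ and builds the single-substitution variants of the older ancestor that would then be tested experimentally.

```python
from typing import Dict, List, Tuple

Substitution = Tuple[int, str, str]  # (1-based position, ancestral state, derived state)


def differing_sites(anc_from: str, anc_to: str) -> List[Substitution]:
    """List substitutions separating two aligned ancestral sequences."""
    if len(anc_from) != len(anc_to):
        raise ValueError("sequences must be aligned to equal length")
    return [
        (i + 1, a, b)
        for i, (a, b) in enumerate(zip(anc_from, anc_to))
        # Skip identical sites and alignment gaps; only substitutions are kept.
        if a != b and a != "-" and b != "-"
    ]


def single_mutants(background: str, subs: List[Substitution]) -> Dict[str, str]:
    """Introduce each derived state, one at a time, into the background sequence."""
    mutants = {}
    for pos, anc_state, derived_state in subs:
        label = f"{anc_state}{pos}{derived_state}"
        mutants[label] = background[: pos - 1] + derived_state + background[pos:]
    return mutants


# Toy sequences (hypothetical, not real ancestral proteins).
anc1 = "MKTAYRAKQL"
anc3 = "MKSAYEAKQV"

subs = differing_sites(anc1, anc3)
mutants = single_mutants(anc1, subs)

for label, seq in mutants.items():
    print(label, seq)
```

Single mutants are only a starting point. Because permissive and restrictive substitutions interact epistatically with function-switching ones, combinations of candidate substitutions usually need to be tested as well.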
Figure 3
Vertical analysis has revealed genetic and structural mechanisms for the evolution of new functions. (a) Evolution of sensitivity to the inhibitor Gleevec in two related kinases (78). (i) Superposition of the active sites of Abl (blue), a kinase sensitive to the inhibitor Gleevec (orange spheres), and of Gleevec-insensitive kinase Src (green). In Abl, a loop folds over Gleevec. (ii) Vertical analysis isolated the origin of Gleevec sensitivity to the branch between two reconstructed ancestors (circles, colored by sensitivity). Fifteen substitutions on this branch (blue rectangle) were sufficient to confer sensitivity when introduced into the deepest ancestor. (iii) Position of causal residues in Src (top; PDB 2OIQ) and Abl (bottom; PDB 1OPJ). (b) Evolution of emission wavelength in red, green, and cyan fluorescent proteins of Faviina corals (46, 74). (i) In RFP (cylinder), the imidazole group from a histidine residue unique to RFP is covalently incorporated into the chromophore (yellow) during maturation of the protein, causing it to emit red light. (ii) Vertical analysis showed that RFPs evolved from a green-emitting ancestor and pointed to 38 potential causal substitutions along that lineage; 12 of these, including the derived histidine, were sufficient to recapitulate the evolution of red fluorescence when introduced into the common ancestor (green circle). (iii) Structural location of the causal substitutions, plotted on the structure of the reconstructed common ancestor (green cartoon; PDB 4DXN). Yellow, chromophore; blue, incorporated imidazole ring from histidine residue; gray, other causal residues, most of which are far from the chromophore and are thought to allow a conformational rearrangement necessary for imidazole incorporation (46). Abbreviations: CFP, cyan fluorescent protein; diffs, amino acid differences; GFP, green fluorescent protein; His, histidine side chain; RFP, red fluorescent protein; subs, substitutions.
Figure 4
Vertical analysis has illuminated mechanisms of epistasis and functional change in protein evolution. (a) Evolution of hormone specificity in glucocorticoid and mineralocorticoid receptors. (i, left) Position of key helices in the crystal structure of AncCR, the ancestor of all MRs and GRs (olive; 2Q1H) with aldosterone bound (sticks); (right) position of helices in AncGR2, the ancestor of cortisol-specific GRs (green; 3GN8), with cortisol bound (59). Sites that change specificity are shown as light blue sticks. The Ser–Pro substitution (rectangle) repositions one helix, allowing the Leu–Gln substitution (circle) to form a cortisol-specific hydrogen bond (dashed red line). Restrictive substitutions (hexagons) introduce residues into AncGR2 that would clash in the ancestral helix conformation (13). (ii) Phylogeny showing the inferred historical order of the substitutions between AncCR and AncGR. Horizontally swapping key residues between paralogs (arrows) yields nonfunctional proteins. (b) Evolution of substrate specificity in apicomplexan malate and lactate dehydrogenases (11). (i) Active-site geometry of extant DHs, with the key side chain and loop insertion (+Δ) highlighted in green and orange. Substrates (black lines, with oxygen atoms in red and methyl group in gray) are labeled; functional groups unique to each substrate are boxed. (ii) Phylogeny and ancestral reconstruction showed that LDH function (orange) evolved from an MDH-like ancestor (green). Introducing the derived loop and lysine residue into the deepest ancestor confers pyruvate specificity. Horizontal swaps of these features (arrows) failed to confer on either protein its paralog’s functional specificity.
Abbreviations: AncCR, ancestral corticoid receptor; AncGR, ancestral glucocorticoid receptor; Arg, arginine; DH, dehydrogenase; Gln, glutamine; Gly, glycine; GR, glucocorticoid receptors; LDH, lactate dehydrogenase; Leu, leucine; Lys, lysine; MDH, malate dehydrogenase; Met, methionine; MR, mineralocorticoid receptors; Pro, proline; Ser, serine.
Figure 5
Knowledge of ancestral states clarifies structure-function mechanisms. (a) Simplified example of the implications of vertical analysis. Homologs with distinct functions X and Y can be generated by partitioning functions from a multifunctional ancestral protein (top) or by a discrete change in function from a specific ancestor (bottom). Different trajectories imply different effects of key sequence differences (A–D). (b) Mechanism of evolution of serine protease specificity. (i) Specialized tight binding pockets of the extant serine proteases cathepsin (1CGH) and chymase (2RDL). (ii) Their reconstructed last common ancestor had both activities and a wide binding pocket (79). Lower- and upper-case letters show ancestral and derived amino acid states for key residues, using the single-letter code. Ancestral states that confer the promiscuous wide pocket are highlighted in red. (c) Evolution of DNA specificity in steroid hormone receptors. Estrogen and ketosteroid receptors bind different DNA sequences (ERE and SRE). (i) Schematic of the receptors’ recognition helices bound to the DNA major groove. Residues at variable sites are labeled. kSRs (green) make fewer specific interactions than ERs (blue). (ii) Vertical analysis showed that ERs and kSRs evolved from an ER-like ancestor (57). Specificity-switching substitutions (ancestral ega to derived GSV in single-letter code) and permissive substitutions (11P) are labeled. (iii) Interactions that characterize ER/ERE (blue) and kSR/SRE (green) complexes are shown, with favorable interactions as arrows and exclusionary interactions as horizontal lines. Permissive substitutions enhanced dimer formation and cooperativity of binding (red arrows) in kSRs.
Abbreviations: Ala, alanine; Asn, asparagine; Cys, cysteine; DBD, DNA-binding domain; ER, estrogen receptors; ERE, estrogen response element; Glu, glutamic acid; Gly, glycine; kSR, ketosteroid receptors; Phe, phenylalanine; RH, recognition helix; Ser, serine; SRE, steroid response element; Tyr, tyrosine; Val, valine.
Figure 6
A neutral increase in complexity in a molecular machine (29). (a) Structure of the transmembrane ring of the vacuolar ATPase of fungi (top) and animals (bottom). The fungal ring contains three unique obligate subunits, which occupy specific relative positions. The animal ring contains only two subunits. (b) Evolution of paralogous subunits in yeast. (Bottom) Two subunits in the fungal ring (blue and yellow) are paralogs duplicated from one ancestral subunit (green). The reconstructed ancestral subunit can form all required interfaces and reconstitute a functional ring in extant fungi (when expressed with the pink subunit). The duplicated subunits became required because they lost specific interfaces to other subunits in a complementary fashion (red arrows). They thus could occupy only a subset of the ancestral positions relative to other subunits.

References

    1. Abascal F, Zardoya R, Posada D. 2005. ProtTest: selection of best-fit models of protein evolution. Bioinformatics 21:2104–5
    2. Agafonov RV, Wilson C, Otten R, Buosi V, Kern D. 2014. Energetic dissection of Gleevec’s selectivity toward human tyrosine kinases. Nat. Struct. Mol. Biol. 21:848–53
    3. Aharoni A, Gaidukov L, Khersonsky O, McQ Gould S, Roodveldt C, Tawfik DS. 2005. The ‘evolvability’ of promiscuous protein functions. Nat. Genet. 37:73–76
    4. Ahnert SE, Marsh JA, Hernandez H, Robinson CV, Teichmann SA. 2015. Principles of assembly reveal a periodic table of protein complexes. Science 350:aaa2245
    5. Anderson DP, Whitney DS, Hanson-Smith V, Woznica A, Campodonico-Burnett W, et al. 2016. Evolution of an ancient protein function involved in organized multicellularity in animals. eLife 5:e10147
