Review
Genetics. 2014 Oct;198(2):461-71.
doi: 10.1534/genetics.114.168351.

Viewing protein fitness landscapes through a next-gen lens

Jeffrey I Boucher et al. Genetics. 2014 Oct.

Abstract

High-throughput sequencing has enabled many powerful approaches in biological research. Here, we review sequencing approaches to measure frequency changes within engineered mutational libraries subject to selection. These analyses can provide direct estimates of biochemical and fitness effects for all individual mutations across entire genes (and likely compact genomes in the near future) in genetically tractable systems such as microbes, viruses, and mammalian cells. The effects of mutations on experimental fitness can be assessed using sequencing to monitor time-dependent changes in mutant frequency during bulk competitions. The impact of mutations on biochemical functions can be determined using reporters or other means of separating variants based on individual activities (e.g., binding affinity for a partner molecule can be interrogated using surface display of libraries of mutant proteins and isolation of bound and unbound populations). The comprehensive investigation of mutant effects on both biochemical function and experimental fitness provides promising new avenues to investigate the connections between biochemistry, cell physiology, and evolution. We summarize recent findings from systematic mutational analyses; describe how they relate to a field rich in both theory and experimentation; and highlight how they may contribute to ongoing and future research into protein structure-function relationships, systems-level descriptions of cell physiology, and population-genetic inferences on the relative contributions of selection and drift.

Keywords: mutant effects; physiology; protein function; systems biology.

Figures

Figure 1
Conceptual depictions and interpretations of mutational landscapes. (A) Rendition of a landscape after those envisioned by Wright (1932) to conceptualize vast possible combinations of different alleles. Allele space is depicted on the image plane and contour lines indicate relative fitness. (B) Modern molecular analyses of mutational steps between ancestral and derived sequences have provided insights into available pathways on mutational landscapes. (C) Peaks of fitness or function under defined conditions have been explored using approaches, including directed evolution, that can identify a small set of highly functional variants from stochastically generated mutations. (D) Systematic maps of local regions of mutational space (e.g., all single-step mutations from the parental sequence of a gene) can be generated using current sequencing-based approaches. These depictions are intended to provide a conceptual outline, but they are inaccurate in detail (e.g., the vastness of possible mutational space is not accurately represented, and the smoothness of the surfaces does not accurately represent the observation that single-step mutations can lead to dramatic cliff-like changes in fitness).
Figure 2
Sequencing-based approach to quantify the frequency of mutations in bulk competitions. (A) Engineered mutations in bulk libraries are subjected to selection pressure. Sequencing of samples before and after selection provides a direct estimate of the frequency change of each mutation, which is a measure of function or fitness under the conditions of the experiment. (B) Contributions of read depth to estimates of mutational frequencies. The fraction of a mutation (Fi) is defined by the number of copies of that mutation (Ni) relative to total variants in the sample (Ntotal). In sequencing approaches, the number of reads of a mutation (Ri) relative to total reads (Rtotal) provides an estimate of the frequency of mutations (Ei). These estimated mutational frequencies contain experimental noise. Many factors contribute to experimental noise, including sequencing errors and read depth. Read depth is frequently a dominant source of noise. Based solely on sampling, the standard error (Sp) describes the expected noise in estimating the frequency of a mutation based on read depth. The probability of an estimate falling within two standard errors of the true value is approximately 95%. The graph illustrates how read depth (x-axis) impacts the noise in estimating mutational frequencies (y-axis). This graph is for a mutation at 0.1% frequency (for mutations at frequencies <1%, the relationship is similar). (C) Measuring multiple points during selection (e.g., at multiple time points in a growth competition) can reduce the impact of read noise for any individual point on fitness estimates. This graph shows observations of the effects of mutations in Hsp90 (N588P, codon CCC, in green; a silent N588 mutation, codon AAC, in gray; and a nonsense N588* mutation, codon TGA, in red) on yeast growth in an elevated saline environment (Hietpas et al. 2013b). (D) The frequency of mutations can be estimated by directly sequencing regions containing mutations. (E) Alternatively, a barcode can be introduced outside of the ORF and associated with mutations in the ORF using tiled paired-end reads. The barcode can then be efficiently sequenced to infer the frequency of the associated ORF mutations. Barcode strategies require additional setup, but have two important advantages: they enable analyses of mutations that are separated by large distances (e.g., greater than can be spanned in a single sequencing reaction), and they enable error correction for indexes that differ by more than one base from all others.
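The sampling argument in panel B and the time-course idea in panel C can be sketched in Python. This is a minimal illustration, not the authors' analysis pipeline. It assumes the standard binomial standard error, sqrt(p(1 - p)/Rtotal), for Sp, and it assumes that a fitness effect is summarized as the least-squares slope of ln(mutant reads / reference reads) against time. All function names and the use of a reference variant are hypothetical.

```python
import math


def estimate_frequency(reads_mutant, reads_total):
    """Return (Ei, Sp): the estimated frequency Ri/Rtotal and its
    sampling standard error, assuming binomial sampling of reads."""
    if reads_total <= 0:
        raise ValueError("reads_total must be positive")
    e_i = reads_mutant / reads_total
    s_p = math.sqrt(e_i * (1.0 - e_i) / reads_total)
    return e_i, s_p


def relative_noise(true_frequency, reads_total):
    """Sampling standard error expressed as a fraction of the true frequency."""
    s_p = math.sqrt(true_frequency * (1.0 - true_frequency) / reads_total)
    return s_p / true_frequency


def log_ratio_slope(times, mutant_reads, reference_reads):
    """Least-squares slope of ln(mutant/reference) versus time.

    Fitting several time points (assumed summary statistic) averages out
    the read noise that affects any single point."""
    ys = [math.log(m / r) for m, r in zip(mutant_reads, reference_reads)]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(ys) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, ys))
    den = sum((t - t_mean) ** 2 for t in times)
    return num / den


if __name__ == "__main__":
    # Noise for a mutation at 0.1% frequency across a range of read depths.
    for depth in (10**4, 10**5, 10**6, 10**7):
        print(depth, round(relative_noise(0.001, depth), 4))
```

Under these assumptions the relative noise falls with the square root of read depth, so a tenfold increase in reads reduces the sampling error by a factor of about 3.2.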
Figure 3
Local mutational landscapes mapped to structure. (A) Heat map representation of a functional landscape for ubiquitin where all individual amino acid substitutions were analyzed (Roscoe and Bolon 2014). For each amino acid position, the average impact of substitutions (bottom row) is a representation of mutational sensitivity. (B) Mapping the effects of ubiquitin mutations to structure (Peschard et al. 2007) indicates that surfaces that directly contact binding partners (the purple ribbon illustrates a binding partner) are sensitive, while distal surfaces are very tolerant of amino acid substitutions. (C) Similar patterns of mutational sensitivity at binding interfaces have also been observed in systematic analyses of an RRM domain (Melamed et al. 2013) and a WW domain (Fowler et al. 2010). In B and C, sensitive positions where the average effect of a mutation was strongly deleterious are colored blue, while tolerant positions are colored yellow, intermediate positions, light blue, and binding partners, purple.
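The per-position summary described in panel A (the average impact of substitutions at each position) can be sketched as follows. This is a hypothetical illustration: the score scale (0 for null-like, 1 for wild-type-like), the example values, and the classification thresholds are assumptions and are not taken from the ubiquitin data set.

```python
from statistics import mean


def positional_sensitivity(effects):
    """Average the effect scores of all substitutions at each position.

    `effects` maps position -> {amino acid: score}; scores are assumed
    to run from 0 (null-like) to 1 (wild-type-like)."""
    return {pos: mean(subs.values()) for pos, subs in effects.items()}


def classify_position(average_score, sensitive_below=0.3, tolerant_above=0.7):
    """Bin a position by its average score; both thresholds are hypothetical."""
    if average_score < sensitive_below:
        return "sensitive"
    if average_score > tolerant_above:
        return "tolerant"
    return "intermediate"


if __name__ == "__main__":
    # Made-up scores for two positions, for illustration only.
    example = {
        10: {"A": 1.0, "G": 0.8, "K": 0.9},
        44: {"A": 0.1, "G": 0.0, "K": 0.2},
    }
    for pos, avg in positional_sensitivity(example).items():
        print(pos, round(avg, 2), classify_position(avg))
```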
Figure 4
Comparison between conservation patterns in nature and experimental fitness measurements. (A) Heat map representation of both amino acids observed for a region of the Hsp90 protein from diverse eukaryotes and experimental fitness effects observed for the same region of yeast Hsp90. The function of Hsp90 is strongly conserved in eukaryotes (Picard et al. 1990), suggesting that selection acting on Hsp90 may be predominantly purifying in nature. (B) Illustration of inferences that can be made from sequence divergence patterns for genes subject to predominantly purifying selection compared to experimental measurements of fitness effects. Slightly deleterious (e.g., 1% defects) and strongly deleterious (e.g., null) mutations should both be efficiently purged from large natural populations over evolutionary time scales. The breadth of the near-neutral window of fitness effects in nature is theoretically proportional to the inverse of effective population size (Ohta 1973). Experimental approaches can monitor the effects of mutations across the full breadth of fitness (null to beneficial), but the resolution is not sufficient to distinguish the window that would be near neutral in large natural populations over long time scales. Studies of experimental fitness and patterns of divergence in nature provide distinct information that, analyzed together, can be more powerful than either approach alone.
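The contrast in panel B between the near-neutral window in nature and experimental resolution can be made concrete with a small sketch. It assumes the near-neutral bound is exactly 1/Ne (the caption states only proportionality, so the `scale` factor is an assumption), and the effective population size and experimental resolution used in the example are illustrative values, not measurements from the review.

```python
def near_neutral_bound(effective_population_size, scale=1.0):
    """Approximate |s| below which drift dominates selection.

    Assumes the bound is scale / Ne; the proportionality constant
    is hypothetical."""
    if effective_population_size <= 0:
        raise ValueError("effective_population_size must be positive")
    return scale / effective_population_size


def classify_effect(selection_coefficient, effective_population_size,
                    experimental_resolution):
    """Place a fitness effect relative to both thresholds."""
    magnitude = abs(selection_coefficient)
    if magnitude >= experimental_resolution:
        return "measurable in experiment"
    if magnitude > near_neutral_bound(effective_population_size):
        return "below experimental resolution, yet visible to selection in nature"
    return "near neutral in nature"


if __name__ == "__main__":
    ne = 10**7          # illustrative effective population size
    resolution = 0.01   # illustrative experimental resolution (1%)
    for s in (-0.05, -0.001, -1e-9):
        print(s, classify_effect(s, ne, resolution))
```

With these illustrative numbers, an effect of 0.1% falls in the gap the caption describes: too small to resolve experimentally, but large enough to be purged from a large natural population over long time scales.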
Figure 5
Relationships between biochemical and physiological function. This illustration shows a map of a protein–protein interaction network where the colored circles or nodes represent distinct proteins and the lines or edges connecting them represent binding interactions. Systematic analyses of the effects of mutations on both specific biochemical functions (e.g., strength of binding between two proteins) and overall contributions to physiological function provide new opportunities to interrogate complex biological networks that mediate many critical processes.

References

    Acevedo A., Brodsky L., Andino R., 2014. Mutational and fitness landscapes of an RNA virus revealed through population sequencing. Nature 505: 686–690
    Araya C. L., Fowler D. M., Chen W., Muniez I., Kelly J. W., et al., 2012. A fundamental protein property, thermodynamic stability, revealed solely from large-scale measurements of protein function. Proc. Natl. Acad. Sci. USA 109: 16858–16863
    Baase W. A., Liu L., Tronrud D. E., Matthews B. W., 1997. Lessons from the lysozyme of phage T4. Protein Sci. 19: 631–641
    Baker T. A., Sauer R. T., 2012. ClpXP, an ATP-powered unfolding and protein-degradation machine. Biochim. Biophys. Acta 1823: 15–28
    Bank C., Hietpas R. T., Wong A., Bolon D. N., Jensen J. D., 2014. A Bayesian MCMC approach to assess the complete distribution of fitness effects of new mutations: uncovering the potential for adaptive walks in challenging environments. Genetics 196: 841–852
