PLoS Genet. 2016 Mar 30;12(3):e1005960.
doi: 10.1371/journal.pgen.1005960. eCollection 2016 Mar.

Strong Selection Significantly Increases Epistatic Interactions in the Long-Term Evolution of a Protein

Aditi Gupta et al. PLoS Genet. 2016.

Abstract

Epistatic interactions between residues determine a protein's adaptability and shape its evolutionary trajectory. When a protein experiences a changed environment, it is under strong selection to find a peak in the new fitness landscape. It has been shown that strong selection increases epistatic interactions as well as the ruggedness of the fitness landscape, but little is known about how the epistatic interactions change under selection in the long-term evolution of a protein. Here we analyze the evolution of epistasis in the protease of the human immunodeficiency virus type 1 (HIV-1) using protease sequences collected for almost a decade from both treated and untreated patients, to understand how epistasis changes and how those changes impact the long-term evolvability of a protein. We use an information-theoretic proxy for epistasis that quantifies the co-variation between sites, and show that positive information is a necessary (but not sufficient) condition that detects epistasis in most cases. We analyze the "fossils" of the evolutionary trajectories of the protein contained in the sequence data, and show that epistasis continues to enrich under strong selection, but not for proteins whose environment is unchanged. The increase in epistasis compensates for the information loss due to sequence variability brought about by treatment, and facilitates adaptation in the increasingly rugged fitness landscape of treatment. While epistasis is thought to enhance evolvability via valley-crossing early on in adaptation, it can hinder adaptation later, once the landscape has turned rugged. However, we find no evidence that the HIV-1 protease has reached its potential for evolution after 9 years of adapting to a drug environment that itself is constantly changing. We suggest that the mechanism of encoding new information into pairwise interactions is central to protein evolution not just in HIV-1 protease, but for any protein adapting to a changing environment.
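
As a rough illustration of the information-theoretic proxy described in the abstract, the sketch below computes per-site Shannon entropy and pairwise mutual information (in bits) between two columns of a sequence alignment. It is a minimal plug-in estimator written for this summary, not the authors' pipeline: the function names, the toy two-residue "alignments", and the absence of any resampling or bias correction are all assumptions.

```python
import math
from collections import Counter


def site_entropy(column):
    """Shannon entropy (bits) of the symbols observed in one alignment column."""
    n = len(column)
    counts = Counter(column)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mutual_information(col_i, col_j):
    """Plug-in estimate of I(i;j) = H(i) + H(j) - H(i,j), in bits."""
    joint = list(zip(col_i, col_j))  # paired residues, one tuple per sequence
    return site_entropy(col_i) + site_entropy(col_j) - site_entropy(joint)


# Hypothetical toy data: two sites per sequence, four sequences each.
covarying = ["AL", "AL", "VM", "VM"]    # site 2 always changes together with site 1
independent = ["AL", "AM", "VL", "VM"]  # every combination occurs equally often

for name, seqs in [("covarying", covarying), ("independent", independent)]:
    col_1, col_2 = zip(*seqs)  # transpose sequences into columns
    print(name, site_entropy(col_1), mutual_information(col_1, col_2))
```

In the covarying toy set the two sites share all of their variability, so the mutual information equals the per-site entropy; in the independent set it is zero even though each site is just as variable. Real data would need many more sequences per estimate, since plug-in estimates of entropy and mutual information are biased for small samples.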

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Average per-site entropies at every position of the HIV-1 protease.
Untreated (top panel) and treated (bottom panel) datasets at the earliest (year 1998, red) and latest (year 2006, blue) time point of our analysis. 300 sequences are resampled from data for each year and average entropy for each position is calculated from the entropies in 10 resampled datasets. Site-specific variation generally increased across the protein following treatment. Entropy (variability) also increased from 1998 to 2006 for several positions. Error bars denote ±1 SE.
Fig 2. Changes in protease entropy and physico-chemical properties.
Changes in per-site entropies (top panel), residue isoelectric point (middle panel), and residue weights (bottom panel) due to treatment. The property difference at each site is obtained by subtracting the property value (entropy, pI, or residue weight) of the untreated data from that of the treated data. Average values are obtained by sampling sequence data from all years (1998–2006, 10 subsamples/year of 300 sequences each). Error bars represent ±1 SE. Red dots represent positions known to be primary drug resistance loci, while black dots mark positions of compensatory or accessory mutations [52]. Resistance loci are shaded in red and accessory loci are shaded in black.
Fig 3. Estimates of the information content of the HIV-1 protease.
Filled black circles represent data from untreated subjects and blue triangles represent data from treated individuals. I1 [see Eq (12)] is consistently low in treated sequence data over the years, indicating high sequence variability in the drug environment (top panel). The middle panel shows that the sum of pairwise mutual information significantly increases upon treatment (p ≤ 0.001). On adding the sum of pairwise mutual information to I1, we obtain a comprehensive measure of information that considers pairwise interactions between residues [I2, Eq (13)]. I2 for both the treated and untreated data is comparable and unchanging over the years. We use data only for positions 15–90, as residues 1–14 as well as 91–99 have missing sequence data leading to error-prone estimates of entropy, as evidenced in S2 Fig. Error bars represent ±1 SD.
Fig 4. Increase in epistasis in HIV-1 protease over time.
Pairwise interactions in the HIV-1 protease are shown for years 1998, 2002, and 2006 in the drug-free (top row) and drug environment (bottom row). Each heatmap shows the mutual information for each pair of residues. Pairwise information (and thus epistatic effects) is fairly constant in the drug-free environment, but gradually increases in the treated group.
Fig 5. Epistatic interactions mapped onto the protease structure.
Epistatic interactions in the protease sequences in treated data from the year 1998 (red on the left chain) and 2006 (blue, right chain). The interacting residues are numbered. Only those interactions are shown where information is greater than 0.1 bits, indicating strong epistasis. Figure generated using Chimera [60].
Fig 6. Rates of mutation between four different genotypes.
The types are denoted as 0 = AA, 1 = Aa, 2 = aA, and 3 = aa.
Fig 7. Allele frequencies as fitness for type 3 (aa) is varied.
In this valley-crossing landscape, w0 is always 1 and w1 = w2 = 0. Plot shows allele frequencies p_i at mutation rate μ = 0.1 as a function of w3. The intermediate types aA and Aa occur only at the rate of mutation as they have zero fitness.
Fig 8. Two-loci two-allele model.
The left panel shows the fitness landscapes and epistasis given by Eq (9) in the first and second half of the simulation (updates 0–499: w0 = 1 and w1 = w2 = w3 = 10^-5 ≈ 0; updates 500–1000: w0 = w3 = 1 and w1 = w2 = 10^-5 ≈ 0). The xy-plane shows the four genotypes while the z-axis shows genotype fitness. The middle panel shows the genotype probabilities while the right panel shows the mutual information during the course of the simulation. Note that the increase in epistasis at the 500th update is reflected in the increase in mutual information. The mutation rate was 0.1 and starting population frequencies were p0 = 1 and p1 = p2 = p3 = 0.
Fig 9. Correlation between epistasis and information.
Each point corresponds to information and absolute value of epistasis calculated for one of the 10,000 combinations of w0, w1, w2, and w3. Here, w0 is always 1, and other fitness values are uniformly randomly assigned between 0 and 1. The inset shows the percentage of points with negligible information (<0.001 bits) as a function of epistasis.
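
The two-locus, two-allele setting of Figs 6–8 can be sketched as a small deterministic simulation. The genotype coding (0 = AA, 1 = Aa, 2 = aA, 3 = aa), the fitness schedule, the mutation rate of 0.1, and the starting frequencies follow the captions; everything else is an assumption made for illustration, in particular that each locus mutates independently with probability μ per update, that selection acts before mutation, and that frequencies are tracked in an infinite population. The exact transition rates of Fig 6 are not reproduced on this page, so the numbers produced here need not match the published panels.

```python
import math


def select(p, w):
    """Reweight genotype frequencies by fitness and renormalize."""
    mean_w = sum(pi * wi for pi, wi in zip(p, w))
    return [pi * wi / mean_w for pi, wi in zip(p, w)]


def mutate(p, mu):
    """Assumed mutation model: each of the two loci flips independently with probability mu."""
    q = [0.0] * 4
    for g in range(4):
        for h in range(4):
            flips = bin(g ^ h).count("1")  # number of loci at which g and h differ
            q[h] += p[g] * mu ** flips * (1 - mu) ** (2 - flips)
    return q


def entropy(dist):
    """Shannon entropy in bits, skipping zero-probability entries."""
    return -sum(x * math.log2(x) for x in dist if x > 0)


def mutual_information(p):
    """Mutual information (bits) between the two loci, from genotype frequencies."""
    first = [p[0] + p[1], p[2] + p[3]]   # allele A vs a at the first locus
    second = [p[0] + p[2], p[1] + p[3]]  # allele A vs a at the second locus
    return entropy(first) + entropy(second) - entropy(p)


def simulate(mu=0.1, eps=1e-5, updates=1000, switch=500):
    """Single fitness peak at AA until `switch`, then equal peaks at AA and aa."""
    p = [1.0, 0.0, 0.0, 0.0]  # population starts fixed at genotype 0 (AA)
    info = []
    for t in range(updates):
        w = [1.0, eps, eps, eps] if t < switch else [1.0, eps, eps, 1.0]
        p = mutate(select(p, w), mu)
        info.append(mutual_information(p))
    return p, info


final_p, info = simulate()
print("information just before the landscape change:", info[499])
print("information at the end of the run:", info[-1])
```

Under these assumptions the information between the two loci stays close to zero while only AA is fit, because mutation scatters alleles independently at each locus, and rises once aa becomes a second peak separated from AA by low-fitness intermediates. This mirrors the qualitative point of Fig 8 that an increase in epistasis shows up as an increase in mutual information.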

References

    1. de Visser JAGM, Cooper TF, Elena SF. The causes of epistasis. Proc Roy Soc B. 2011;278:3617–24. doi: 10.1098/rspb.2011.1537
    2. Ortlund EA, Bridgham JT, Redinbo MR, Thornton JW. Crystal structure of an ancient protein: evolution by conformational epistasis. Science. 2007;317:1544–8. doi: 10.1126/science.1142819
    3. Tokuriki N, Stricher F, Serrano L, Tawfik DS. How protein stability and new functions trade off. PLoS Comput Biol. 2008;4:e1000002. doi: 10.1371/journal.pcbi.1000002
    4. Soskine M, Tawfik DS. Mutational effects and the evolution of new protein functions. Nat Rev Genet. 2010;11:572–82. doi: 10.1038/nrg2808
    5. Bershtein S, Segal M, Bekerman R, Tokuriki N, Tawfik DS. Robustness-epistasis link shapes the fitness landscape of a randomly drifting protein. Nature. 2006;444:929–32. doi: 10.1038/nature05385
