PLoS Comput Biol. 2012;8(9):e1002675.
doi: 10.1371/journal.pcbi.1002675. Epub 2012 Sep 6.

Correlated electrostatic mutations provide a reservoir of stability in HIV protease

Omar Haq et al. PLoS Comput Biol. 2012.

Abstract

HIV protease, an aspartyl protease crucial to the life cycle of HIV, is the target of many drug development programs. Though many protease inhibitors are on the market, the protease eventually evades these drugs by mutating at a rapid pace and building drug resistance. The drug resistance mutations, called primary mutations, are often destabilizing to the enzyme, and this loss of stability has to be compensated for. Using a coarse-grained biophysical energy model together with statistical inference methods, we observe that accessory mutations of charged residues increase protein stability, playing a key role in compensating for destabilizing primary drug resistance mutations. Increased stability is intimately related to correlations between electrostatic mutations: uncorrelated mutations would strongly destabilize the enzyme. Additionally, statistical modeling indicates that the network of correlated electrostatic mutations has a simple topology and has evolved to minimize frustrated interactions. The model's statistical coupling parameters reflect this lack of frustration and strongly distinguish like-charge electrostatic interactions from unlike-charge interactions for [Formula: see text] of the most significantly correlated double mutants. Finally, we demonstrate that our model has considerable predictive power and can be used to predict complex mutation patterns that have not yet been observed due to finite sample size effects but are likely to exist within the larger patient population whose virus has not yet been sequenced.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. Structure of HIV protease subtype B.
The backbone structure of HIV protease subtype B (PDB ID: 1NH0) is depicted in ribbon format. The 18 electrostatically active residues are highlighted. Residue positions which have a predominantly negatively charged non-neutral residue in the sequence database are depicted in red. Residues which have a predominantly positively charged non-neutral residue in the database are depicted in blue.
Figure 2. Average electrostatic free energy of folding as a function of the number of electrostatic mutations.
Each point on a curve corresponds to formula image, where formula image is the number of sequences with formula image electrostatic mutations, formula image is the probability of the formula imageth sequence under a given model conditional upon the number of mutations, and formula image is its electrostatic folding energy (Equation (1); see Methods). All points are plotted relative to formula image, the average formula image of observed sequences with one electrostatic mutation. The black curve shows the average energies of observed sequences formula image, the red curve represents the average energies of sequences under a model in which each charge state occurs with equal frequency, the blue curve shows the average energies of sequences under a model in which each charge state occurs with frequencies observed in the data, and the green curve represents the average energy of sequences under a pair correlation model which preserves observed pair frequencies. The error bars on the black curve are the standard errors of the mean of observed sequences. Note that formula image for formula image.
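The averaging described in this caption can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the tuple layout, and the toy numbers are all assumptions. The "probability conditional upon the number of mutations" is taken here to mean that model weights are renormalized within each mutation-count group.

```python
from collections import defaultdict


def average_energy_by_mutation_count(sequences):
    """Probability-weighted mean folding energy per mutation count.

    `sequences` is an iterable of (n_mutations, weight, energy) tuples
    (a hypothetical layout). `weight` is the unnormalized probability of
    the sequence under some model; it is renormalized within each
    mutation-count group, i.e. made conditional on n_mutations.
    """
    total_weight = defaultdict(float)
    weighted_energy = defaultdict(float)
    for n_mutations, weight, energy in sequences:
        total_weight[n_mutations] += weight
        weighted_energy[n_mutations] += weight * energy
    return {
        n: weighted_energy[n] / total_weight[n]
        for n in total_weight
        if total_weight[n] > 0
    }


def relative_to_reference(averages, observed_reference):
    """Shift every average by a reference energy, as the caption does with
    the mean energy of observed sequences carrying one mutation."""
    return {n: energy - observed_reference for n, energy in averages.items()}


# Toy data only (not from the paper): (n_mutations, weight, energy).
toy_sequences = [
    (1, 2.0, -1.0),
    (1, 2.0, -3.0),
    (2, 1.0, 0.5),
    (2, 3.0, -1.5),
]
averages = average_energy_by_mutation_count(toy_sequences)
shifted = relative_to_reference(averages, averages[1])
```

Swapping in a different set of weights (equal charge-state frequencies, observed single-site frequencies, or pair-model probabilities) while keeping the same energies would produce the different curves the caption compares.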
Figure 3. Comparison of the sequence probabilities in the tail of the Lee database and the pair correlation model with the sequence probabilities in the Stanford database.
The probabilities of sequences under the pair correlation model, formula image, predicted using the Bethe approximation, are plotted as a function of the sequence probabilities from the Lee database, formula image. Sequences with a probability of 0 in the Lee database, i.e. unobserved sequences, are plotted to the left of the abscissa break. Every sequence is shaded using a color gradient corresponding to formula image, which represents the number of times the sequence occurs in the Stanford database, relative to its probability in the Lee database. Sequences that occur frequently in the Stanford database as compared to the Lee database have a higher ratio and are shaded red, while the sequences that do not occur as frequently in the Stanford database as compared to the Lee database have a lower ratio and are shaded blue. Sequences that are shaded green have equal probabilities in both databases. Sequences unobserved in the Lee database (leftmost row in the graph), but observed in the Stanford database have a ratio that is artificially set to equal 4, which corresponds to the color red. Unobserved Lee sequences that are also unobserved in the Stanford database are shaded green because formula image. Sequences with probabilities formula image are shaded according to the average value of formula image for a window of 10 sequences around the sequence of interest. Sequences with probabilities formula image or formula image are not shown. The indices (0), (1), (2), etc mark the locations of sequences observed zero, once, twice (etc) in the Lee database. Each dot corresponds to a unique sequence.
Figure 4. Distance between like and unlike-charge pairs as a function of the statistical coupling parameter.
The statistical coupling parameter formula image is a fitting parameter that describes the statistical interaction energy between pairs of states. Since the Bethe mean field pair correlation model is a good approximation for this data, a negative formula image indicates that a pair of states is enhanced (positively correlated), while a positive formula image indicates that a pair of states is suppressed (negatively correlated). Using simple electrostatics, we observe that like-charge patterns (blue) are mostly suppressed while unlike-charge patterns (red) are enhanced. The sign of formula image is able to correctly predict the charge patterns for formula image of the top 35 most significantly correlated charge pairs out of a total of 135 pairs. The p-value for the statistical significance of this result is formula image. The reason there are 135 pairs is as follows: For each pair of residues, there are 4 possible sets of like and unlike charge combinations, resulting in a total of 612 like/unlike charge pair combinations. However, not all pairs exist in the database or are significantly correlated. Filtering results in 135 pairs with probability greater than 0.001%.
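The pair count and the sign rule in this caption can be checked with a short sketch. The total of 612 is consistent with the 18 electrostatically active residues of Figure 1: 153 residue pairs times 4 charge patterns. Reading the four patterns as (+,+), (-,-), (+,-) and (-,+) is an assumption, as are the function names and the toy coupling values.

```python
from math import comb

N_RESIDUES = 18        # electrostatically active residues (Figure 1)
PATTERNS_PER_PAIR = 4  # assumed: (+,+), (-,-), (+,-), (-,+)

# 153 residue pairs, each with 4 like/unlike charge patterns.
total_combinations = comb(N_RESIDUES, 2) * PATTERNS_PER_PAIR


def is_like_charge(charge_a, charge_b):
    """True when both charges have the same sign (charges given as +1 or -1)."""
    return charge_a * charge_b > 0


def sign_agrees_with_electrostatics(coupling, charge_a, charge_b):
    """Check a coupling sign against the simple electrostatic expectation.

    Per the caption, a positive coupling means the pair is suppressed and a
    negative coupling means it is enhanced. Simple electrostatics expects
    like-charge pairs to be suppressed and unlike-charge pairs enhanced.
    """
    if coupling == 0:
        return False  # no sign, so no prediction
    suppressed = coupling > 0
    return suppressed == is_like_charge(charge_a, charge_b)


# Toy values only (not from the paper): (coupling, charge_a, charge_b).
toy_pairs = [(0.8, +1, +1), (-0.5, +1, -1), (-0.3, -1, -1)]
agreement = [sign_agrees_with_electrostatics(*pair) for pair in toy_pairs]
```

Here the third toy pair is a like-charge pair with a negative (enhancing) coupling, so it counts as a disagreement.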
