Comparative Study
PLoS One. 2007 Aug 29;2(8):e814.
doi: 10.1371/journal.pone.0000814.

Distinguishing functional amino acid covariation from background linkage disequilibrium in HIV protease and reverse transcriptase

Qi Wang et al. PLoS One. 2007.

Abstract

Correlated amino acid mutation analysis has been widely used to infer functional interactions between different sites in a protein. However, this analysis can be confounded by important phylogenetic effects broadly classifiable as background linkage disequilibrium (BLD). We have systematically separated the covariation induced by selective interactions between amino acids from background LD, using synonymous (S) vs. amino acid (A) mutations. Covariation between two amino acid mutations, (A,A), can be affected by selective interactions between amino acids, whereas covariation within (A,S) pairs or (S,S) pairs cannot. Our analysis of the pol gene--including the protease and the reverse transcriptase genes--in HIV reveals that (A,A) covariation levels are enormously higher than for either (A,S) or (S,S), and thus cannot be attributed to phylogenetic effects. The magnitude of these effects suggests that a large portion of (A,A) covariation in the HIV pol gene results from selective interactions. Inspection of the most prominent (A,A) interactions in the HIV pol gene showed that they are known sites of independently identified drug resistance mutations, and physically cluster around the drug binding site. Moreover, the specific set of (A,A) interaction pairs was reproducible in different drug treatment studies, and vanished in untreated HIV samples. The (S,S) covariation curves measured a low but detectable level of background LD in HIV.
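The figure captions report covariation as D′, the normalized linkage disequilibrium coefficient. As a minimal sketch of how D′ could be computed for one pair of mutations, assume each mutation is encoded as a 0/1 indicator per sequence (1 = mutation present). The function name `d_prime` and this encoding are illustrative assumptions; the abstract does not specify the authors' exact procedure.

```python
def d_prime(x, y):
    """Lewontin's D' for two binary mutation indicators over the same sequences.

    x[i] and y[i] are 1 if sequence i carries the mutation, else 0.
    This is the textbook definition, not necessarily the authors' exact code.
    """
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("x and y must be non-empty and of equal length")

    p_x = sum(x) / n                                      # frequency of mutation x
    p_y = sum(y) / n                                      # frequency of mutation y
    p_xy = sum(1 for a, b in zip(x, y) if a and b) / n    # joint frequency

    d = p_xy - p_x * p_y                                  # raw disequilibrium D
    if d >= 0:
        d_max = min(p_x * (1 - p_y), (1 - p_x) * p_y)
    else:
        d_max = min(p_x * p_y, (1 - p_x) * (1 - p_y))

    if d_max == 0:        # at least one site is monomorphic; D' is undefined
        return 0.0        # assumption: report 0 rather than raising
    return d / d_max


# Hypothetical toy alignment of four sequences.
print(d_prime([1, 1, 0, 0], [1, 1, 0, 0]))   # mutations always co-occur
print(d_prime([1, 1, 0, 0], [1, 0, 1, 0]))   # mutations occur independently
```

Under this scheme, the same function would be applied to every pair of mutations, and the resulting values grouped by pair type, (A,A), (A,S) or (S,S), before comparison.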


Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. Schema of Separating Selective Interactions from Background Linkage Disequilibrium (BLD).
(A) Mutation covariation due to BLD. Covariation of mutations A and R (shown in the multiple sequence alignment, right) is caused by co-inheritance of the two mutations from a common ancestor (shown in the phylogenetic tree, left). (B) Mutation covariation due to selective interactions. Relative fitness models for mutations x and y, the double mutant (xy), and wildtype (0). Two models are contrasted: top, independent (additive) fitness effects do not cause amino acid mutation covariation; bottom, selective interactions cause covariation of x and y. (C) Distinguishing BLD vs. fitness using pairs of amino acid mutations (A) and synonymous (S) mutations.
Figure 2. (A,A) Covariation Is Dramatically Higher Than (A,S) and (S,S) Covariation in the Specialty Dataset.
(A) Sliding window results of average D′. All mutation pairs, black; silent mutation pairs (S,S) only, green. Each sliding window contains 4% of the data points in the set. (B) Sliding window results of average D′. Amino acid mutation pairs (A,A), red; amino acid mutations to silent mutations (A,S), blue; silent mutation pairs (S,S), green. Each sliding window contains 2% of the data points in the set. (C–E) Plots of D′ against the physical distance (base) within the mutation pair for C) (A,A), D) (A,S) and E) (S,S).
Figure 3. (A,A) Covariation Is Dramatically Higher than (A,S) and (S,S) Covariation in the Stanford-Treated Dataset but not the Stanford-Untreated Dataset.
Sliding window results of average D′ in A) Stanford-Treated Dataset and B) Stanford-Untreated Dataset. Amino acid mutation pairs (A,A), red; amino acid mutations to silent mutations (A,S), blue; silent mutation pairs (S,S), green. Each sliding window contains 4% of the data points in the set.
Figure 4. Amino Acid Mutation Pairs that Show Strong Covariation Are Close to Active Sites in RT.
HIV-1 reverse transcriptase (RT) structure (PDB accession number 3HVTA) is shown using Protein Explorer (www.proteinexplorer.org). RT 41, 43 and 44, red; RT 67 and 70, green; RT 208, 210, 218 and 219, yellow; active sites 110, 185 and 186, magenta. The grey sphere cluster is the non-nucleoside reverse transcriptase inhibitor nevirapine.
Figure 5. The Covariation Maps of Three Different Types of Mutation Pairs in HIV Protease.
The covariation maps of A) amino acid mutation pairs (A,A), B) amino acid mutations to silent mutations (A,S) and C) silent mutation pairs (S,S). The X and Y axes represent the codon positions in protease. Each cell represents the strongest covariation value (θ; see Materials and Methods) measured for any mutation pair of the designated type between the two positions. The strength of the covariation is depicted on a color scale, with yellow indicating covariation score (θ) larger than 1 and varying shades up to blue indicating covariation score (θ) larger than 5 (the covariation of two mutations is at least five times greater than random). White indicates no evidence of covariation.
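The sliding-window averages of D′ described for Figures 2 and 3 can be sketched roughly as follows. This is an illustration under stated assumptions: mutation pairs are ordered by the physical distance between the two sites, each window holds a fixed fraction of the pairs (4% or 2% in the captions), and the window advances one pair at a time. The function name `sliding_window_mean`, the ordering variable and the step size are assumptions, since the captions do not state them.

```python
def sliding_window_mean(points, fraction=0.04):
    """Average D' over sliding windows that each hold `fraction` of the points.

    points: list of (distance, d_prime) tuples, one per mutation pair.
    Returns a list of (mean_distance, mean_d_prime), one entry per window.
    """
    if not points:
        return []

    # Assumption: windows slide along pairs sorted by physical distance.
    ordered = sorted(points, key=lambda p: p[0])
    size = max(1, round(len(ordered) * fraction))   # points per window

    averaged = []
    for start in range(len(ordered) - size + 1):    # step of one point (assumed)
        window = ordered[start:start + size]
        mean_distance = sum(dist for dist, _ in window) / size
        mean_d_prime = sum(value for _, value in window) / size
        averaged.append((mean_distance, mean_d_prime))
    return averaged


# Hypothetical data: 50 pairs at distances 1..50 with alternating D' of 1 and 0.
toy = [(i, float(i % 2)) for i in range(1, 51)]
curve = sliding_window_mean(toy, fraction=0.04)     # 2 points per window
print(len(curve), curve[0])
```

Running the same function separately on the (A,A), (A,S) and (S,S) pairs would give one curve per pair type, comparable to the red, blue and green curves the captions describe.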
