Clin Infect Dis. 2011 Feb 15;52(4):532-9.
doi: 10.1093/cid/ciq164. Epub 2011 Jan 10.

Ambiguous nucleotide calls from population-based sequencing of HIV-1 are a marker for viral diversity and the age of infection

Roger D Kouyos et al. Clin Infect Dis. 2011.

Abstract

Background: The time elapsed since a human immunodeficiency virus (HIV)-infected individual acquired the infection (the age of infection) is an important but often poorly known quantity. We assessed whether the fraction of ambiguous nucleotides obtained from bulk sequencing, as done for genotypic resistance testing, can serve as a proxy for this parameter.

Methods: We correlated the age of infection and the fraction of ambiguous nucleotides in partial pol sequences of HIV-1 sampled before initiation of antiretroviral therapy (ART). Three groups of Swiss HIV Cohort Study participants were analyzed, for whom the age of infection was estimated on the basis of Bayesian back calculation (n = 3,307), seroconversion (n = 366), or diagnoses of primary HIV infection (n = 130). In addition, we studied 124 patients for whom longitudinal genotypic resistance testing was performed while they were still ART-naïve.
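As an illustration of the quantity being correlated, the following minimal sketch computes the fraction of ambiguous nucleotides in a single bulk sequence. It assumes that ambiguity is encoded with the standard IUPAC multi-base codes and that N counts as ambiguous; the abstract does not state how the study treated N or gaps, so both choices, and the function name `ambiguous_fraction`, are illustrative assumptions rather than the authors' procedure.

```python
# IUPAC nucleotide codes that stand for more than one possible base.
# Assumption: N is counted as ambiguous; the abstract does not specify this.
AMBIGUOUS_CODES = set("RYSWKMBDHVN")
UNAMBIGUOUS_CODES = set("ACGT")


def ambiguous_fraction(sequence):
    """Return the share of called positions that carry an ambiguity code.

    Characters that are not nucleotide codes (for example alignment gaps)
    are ignored, which is also an assumption of this sketch.
    """
    called = [
        base for base in sequence.upper()
        if base in AMBIGUOUS_CODES or base in UNAMBIGUOUS_CODES
    ]
    if not called:
        raise ValueError("sequence contains no nucleotide codes")
    n_ambiguous = sum(base in AMBIGUOUS_CODES for base in called)
    return n_ambiguous / len(called)
```

For a toy sequence such as "ACGTRYAC-A", nine positions are called and two of them (R and Y) are ambiguous, so the function returns 2/9.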

Results: We found that the fraction of ambiguous nucleotides increased with the age of infection at a rate of 0.2% per year within the first 8 years, and at a decreasing rate thereafter. We show that this pattern is consistent with population-genetic models for realistic parameters. Finally, we show that, in this highly representative population, a fraction of ambiguous nucleotides of >0.5% provides strong evidence against a recent infection event <1 year prior to sampling (negative predictive value, 98.7%).
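To make the negative predictive value concrete: here a "negative" result means a fraction of ambiguous nucleotides above the threshold, read as "not a recent infection". The sketch below shows only the arithmetic. The function name and the counts in the usage note are invented for illustration and are not the study's data.

```python
def negative_predictive_value(true_negatives, false_negatives):
    """Share of 'negative' calls that are correct.

    true_negatives:  samples above the threshold whose infection was
                     indeed not recent.
    false_negatives: samples above the threshold whose infection was
                     in fact recent.
    """
    total_negative_calls = true_negatives + false_negatives
    if total_negative_calls == 0:
        raise ValueError("no negative calls to evaluate")
    return true_negatives / total_negative_calls
```

With hypothetical counts of 90 correct and 10 incorrect "not recent" calls, `negative_predictive_value(90, 10)` gives 0.9.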

Conclusions: These findings show that the fraction of ambiguous nucleotides is a useful marker for the age of infection.

Figures

Figure 1.
Relationship between the year of infection and the fraction of ambiguous nucleotides (f). A, Mean of f as a function of the age of infection, where data points have been binned according to the age of infection in years (n = 3,307 patients). The shaded area corresponds to the 95% confidence interval of the means of f. These confidence intervals have been determined by bootstrap (with 1,000 samples). The red point corresponds to the mean of f for the Zurich Primary HIV Infection Study data set (n = 130), for which all sequences stem from the first few months after the infection. The associated red line gives the 95% confidence interval of this mean. B, Quadratic and linear fit of the full data set. Note that the linear fit is restricted to ages of infection of ≤8 years. C, Linear fit of the different data sets. Only sequences obtained within the first 8 years after infection were considered. D, Distribution of the age of infection (in years) for different fractions of ambiguous nucleotides. The left plot depicts the density plot of the age of infection for the 5 quintiles of f. The right plot depicts, for each of the 5 quintiles of f, the 25%–75% percentiles (green lines) and 5%–95% percentiles (black lines) of the year of infection.
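The caption states that the confidence intervals of the binned means were obtained by bootstrap with 1,000 samples, but not which bootstrap variant was used. The sketch below assumes the simple percentile bootstrap; the function name, the seed, and the index arithmetic for the percentiles are choices made for this illustration.

```python
import random
import statistics


def bootstrap_mean_ci(values, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the mean of `values`.

    Assumption: percentile method; the source only says 'bootstrap'.
    """
    if not values:
        raise ValueError("need at least one value")
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(values, k=n))
        for _ in range(n_resamples)
    )
    # Pick the empirical alpha/2 and 1 - alpha/2 quantiles.
    lower_index = int((alpha / 2) * n_resamples)
    upper_index = max(lower_index, int((1 - alpha / 2) * n_resamples) - 1)
    return means[lower_index], means[upper_index]
```

Applied to the values of f within one age-of-infection bin, this would yield one vertical slice of the shaded band in panel A.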
Figure 2.
Temporal increase of the fraction of ambiguous nucleotides in the Wright-Fisher model (WFM) for a population size of 500 and a mutation rate of 3 × 10^-5 mutations per generation (solid green line) and at 4-fold degenerate third-codon positions in the full data set (dashed black line). The curve for the WFM has been obtained by averaging over 10^4 runs of the model. N and m denote the effective population size and the mutation rate, respectively. The WFM describes discrete and nonoverlapping generations in a population with fixed size N. Every generation, each of the N genomes undergoes mutation with probability m. Then the N genomes for the next generation are determined from the gene pool by drawing every offspring genome with uniform probability from the N parental genomes. Note that the WFM assumes selective neutrality.
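The two steps of the WFM described in the caption (mutation, then uniform resampling of N offspring from N parents) can be sketched as follows. Several details are assumptions of this sketch and are not given in the caption: each genome is modeled as a list of two-allele sites, m is applied per site, and a site is called ambiguous when its minor allele reaches a detection threshold of 20% of the population. The function and parameter names are likewise hypothetical.

```python
import random


def simulate_wfm(pop_size, mutation_rate, n_sites, n_generations,
                 detection_threshold=0.2, seed=0):
    """Neutral Wright-Fisher model; returns the fraction of ambiguous
    sites after each generation.

    Assumptions (not from the source): biallelic sites, per-site mutation
    rate, and ambiguity defined by a minor-allele frequency threshold.
    """
    rng = random.Random(seed)
    # Start from a homogeneous founder population (allele 0 everywhere).
    population = [[0] * n_sites for _ in range(pop_size)]
    trajectory = []
    for _ in range(n_generations):
        # Mutation: every site of every genome flips with probability m.
        for genome in population:
            for site in range(n_sites):
                if rng.random() < mutation_rate:
                    genome[site] ^= 1
        # Resampling: each offspring copies a uniformly chosen parent.
        population = [list(rng.choice(population)) for _ in range(pop_size)]
        # Count sites whose minor allele is frequent enough to be called.
        n_ambiguous = 0
        for site in range(n_sites):
            mutants = sum(genome[site] for genome in population)
            minor_freq = min(mutants, pop_size - mutants) / pop_size
            if minor_freq >= detection_threshold:
                n_ambiguous += 1
        trajectory.append(n_ambiguous / n_sites)
    return trajectory
```

The caption's setting corresponds to `pop_size=500` and `mutation_rate=3e-5`, with the curve averaged over many independent runs (different seeds). In pure Python that is slow, so small values are more practical for experimenting.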

