Proc Natl Acad Sci U S A. 2020 Jun 2;117(22):12222-12229.
doi: 10.1073/pnas.1913682117. Epub 2020 May 19.

A near full-length HIV-1 genome from 1966 recovered from formalin-fixed paraffin-embedded tissue

Sophie Gryseels et al. Proc Natl Acad Sci U S A.

Abstract

With very little direct biological data of HIV-1 from before the 1980s, far-reaching evolutionary and epidemiological inferences regarding the long prediscovery phase of this pandemic are based on extrapolations by phylodynamic models of HIV-1 genomic sequences gathered mostly over recent decades. Here, using a very sensitive multiplex RT-PCR assay, we screened 1,645 formalin-fixed paraffin-embedded tissue specimens collected for pathology diagnostics in Central Africa between 1958 and 1966. We report the near-complete viral genome in one HIV-1 positive specimen from Kinshasa, Democratic Republic of Congo (DRC), from 1966 ("DRC66"), a nonrecombinant sister lineage to subtype C that constitutes the oldest HIV-1 near full-length genome recovered to date. Root-to-tip plots showed that the DRC66 sequence is not an outlier, as would be expected if dating estimates from more recent genomes were systematically biased, and inclusion of the DRC66 sequence in tip-dated BEAST analyses did not significantly alter root and internal node age estimates based on post-1978 HIV-1 sequences. There was larger variation in divergence time estimates among datasets that were subsamples of the available HIV-1 genomes from 1978 to 2014, showing the inherent phylogenetic stochasticity across subsets of the real HIV-1 diversity. Our phylogenetic analyses date the origin of the pandemic lineage of HIV-1 to a time period around the turn of the 20th century (1881 to 1918). In conclusion, this unique archival HIV-1 sequence provides direct genomic insight into HIV-1 in 1960s DRC, and, as an ancient-DNA calibrator, it validates our understanding of HIV-1 evolutionary history.

Keywords: HIV-1; evolution; phylogeny; virus.

Conflict of interest statement

The authors declare no competing interest.

Figures

Fig. 1.
(A) Unrooted maximum-likelihood tree of the complete dataset of 830 HIV-1 group M genomes. (B) Evolutionary distances between the root and all tips of a rooted version of the tree shown in A plotted against the year the sequence was sampled. The root location was determined in TempEst (30), minimizing the sum of the squared residuals from this exploratory regression. Root-to-tip plots based on ML trees of subsampled datasets are displayed in SI Appendix, Fig. S1. (C) Midpoint rooted ML tree of a 1,799-nt pol alignment that includes subsampled dataset A, sequences that cluster with the DRC66 lineage, plus multiple divergent subtype C-like sequences as summarized in ref. , some of which are derived from intersubtype recombinant genomes (e.g., “CU”) or of which only partial sequence for this alignment is available. Tips of the subtype C-related sequences, including DRC66, are labeled by subtype (marked with * if determined based on partial sequence only, e.g., “C*”), sampling year, sampling country, and GenBank accession number. For sampling country, COD = Democratic Republic of the Congo, BWA = Botswana, SWE = Sweden, ZAF = South Africa. In all three figures subtypes are color coded according to the color legend. The DRC66 sequence is indicated with a red star.
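As a rough illustration of the exploratory regression behind panel B, the sketch below fits root-to-tip distance against sampling year by ordinary least squares for one fixed rooting. The slope is a crude rate estimate, the x-intercept a crude root date, and a tip with a large residual would be an outlier. The function name and the input values are hypothetical and are not data from the paper. Software such as TempEst additionally searches over root positions to minimize the residual sum of squares, which this sketch does not attempt.

```python
from statistics import mean

def root_to_tip_regression(years, distances):
    """Least-squares fit of root-to-tip distance on sampling year (fixed root)."""
    x_bar, y_bar = mean(years), mean(distances)
    sxx = sum((x - x_bar) ** 2 for x in years)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(years, distances))
    slope = sxy / sxx                    # crude rate: substitutions/site/year
    intercept = y_bar - slope * x_bar
    root_year = -intercept / slope       # year at which the fitted distance is zero
    residuals = [y - (slope * x + intercept) for x, y in zip(years, distances)]
    return slope, intercept, root_year, residuals

# Made-up illustrative values (NOT from the paper): a perfectly clock-like
# tree rooted in 1900 and evolving at 0.002 substitutions/site/year.
years = [1966, 1983, 1990, 2000, 2010]
distances = [0.002 * (y - 1900) for y in years]

slope, intercept, root_year, residuals = root_to_tip_regression(years, distances)
```

On real data the tips scatter around the line, and the question asked of an old sample is whether its residual is unusually large compared with those of the recent samples.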
Fig. 2.
Time-scaled phylogenetic BEAST tree of subsampled dataset A estimated under a model that includes the sampling date of DRC66. Branches are color coded by the log of the estimated evolutionary rate for that branch (-lnRate), drawn from a log-normal distribution using the uncorrelated relaxed clock model (48). Node labels are coded on a gray scale by the posterior probability of clade support values. The DRC66 sample is marked with a red star. Trees with tip labels for all five subsampled datasets are displayed in SI Appendix, Fig. S2.
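The idea of an uncorrelated log-normal relaxed clock can be sketched in a few lines: each branch receives its own rate, drawn independently from one shared log-normal distribution. This is only a forward simulation of that prior under assumed parameter values (the mean rate, `sigma`, and branch count below are hypothetical), not the inference that BEAST performs.

```python
import math
import random

def draw_branch_rates(n_branches, mean_rate, sigma, seed=None):
    """Draw independent per-branch rates from a shared log-normal distribution.

    `mean_rate` is the mean in real space; `sigma` is the standard deviation
    of the underlying normal distribution in log space.
    """
    rng = random.Random(seed)
    # Choose mu so that the log-normal has the requested real-space mean:
    # E[rate] = exp(mu + sigma^2 / 2).
    mu = math.log(mean_rate) - sigma ** 2 / 2
    return [rng.lognormvariate(mu, sigma) for _ in range(n_branches)]

# Hypothetical values for illustration only.
rates = draw_branch_rates(n_branches=8, mean_rate=2e-3, sigma=0.5, seed=1)

# The quantity used for branch coloring in the caption above: -ln(rate).
neg_ln_rates = [-math.log(r) for r in rates]
```

Because the draws are independent, a fast branch can sit next to a slow one, which is what "uncorrelated" refers to.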
Fig. 3.
Mean node age and mean evolutionary rate estimates and their 95% HPD intervals for time-scaled phylogenies of the five different subsampled datasets (A–E), which were each analyzed in BEAST in three different ways: including DRC66 and its tip age (estimates represented by squares), including DRC66 but with its sampling date unknown and to be estimated (estimates represented by circles), and excluding the DRC66 sequence (estimates represented by triangles). See also SI Appendix, Table S1. (A) Age estimates of the root of the tree (open characters) and of the node representing the common ancestor of conventional subtype C (filled characters) for each of the three BEAST analyses. (B) Age estimates of the clade that encompasses both conventional subtype C and DRC66 (squares and circles with crosses) for the two BEAST analyses that included DRC66 and the estimated sample ages of DRC66 for those analyses in which this was left to be estimated (stars). (C) Estimates of the evolutionary rate along the terminal branch leading to DRC66 in BEAST analyses that included DRC66’s sampling date (stars) and mean evolutionary rates across entire phylogenies for each of the three BEAST analyses.
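For readers unfamiliar with the 95% HPD intervals reported here, one common way to approximate a highest posterior density interval from posterior samples is to take the narrowest window that contains 95% of the sorted samples. The sketch below shows that approach; the function name and the sample values are hypothetical, and this is not necessarily the exact procedure used by the authors' software.

```python
import math

def hpd_interval(samples, mass=0.95):
    """Narrowest interval containing at least `mass` of the samples."""
    xs = sorted(samples)
    n = len(xs)
    # Number of samples the interval must cover (small guard against
    # floating-point noise in mass * n).
    k = max(1, math.ceil(mass * n - 1e-9))
    # Slide a window of k consecutive sorted samples and keep the narrowest.
    best = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[best], xs[best + k - 1]

# Made-up posterior samples of a node age (NOT from the paper), with one
# extreme value that the narrowest 90% window should exclude.
node_age_samples = [1901, 1902, 1903, 1904, 1905, 1906, 1907, 1908, 1909, 1850]
lower, upper = hpd_interval(node_age_samples, mass=0.9)
```

Unlike an equal-tailed interval, the narrowest-window interval shifts toward the dense part of a skewed posterior.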

References

    1. World Health Organization, The 2018 Update, Global Health Workforce Statistics (World Health Organization, Geneva, 2018).
    2. Sharp P. M., Hahn B. H., Origins of HIV and the AIDS pandemic. Cold Spring Harb. Perspect. Med. 1, a006841 (2011).
    3. Aiewsakun P., Katzourakis A., Time-dependent rate phenomenon in viruses. J. Virol. 90, 7184–7195 (2016).
    4. Ho S. Y. W., et al., Time-dependent rates of molecular evolution. Mol. Ecol. 20, 3087–3101 (2011).
    5. Gifford R. J., Viral evolution in deep time: Lentiviruses and mammals. Trends Genet. 28, 89–100 (2012).
