PLoS One. 2018 Feb 1;13(2):e0192081.
doi: 10.1371/journal.pone.0192081. eCollection 2018.

A high HIV-1 strain variability in London, UK, revealed by full-genome analysis: Results from the ICONIC project


Gonzalo Yebra et al. PLoS One.

Abstract

Background & methods: The ICONIC project has developed an automated high-throughput pipeline to generate nearly full-length HIV genomes (NFLG, i.e. from gag to nef) from next-generation sequencing (NGS) data. The pipeline was applied to 420 HIV samples collected at University College London Hospitals NHS Trust and Barts Health NHS Trust (London) and sequenced using an Illumina MiSeq at the Wellcome Trust Sanger Institute (Cambridge). Consensus genomes were generated and subtyped using COMET, and unique recombinants were studied with jpHMM and SimPlot. Maximum-likelihood phylogenetic trees were constructed using RAxML, and transmission networks were identified in them using the Cluster Picker.

Results: The pipeline generated sequences at least 1 kb in length (median = 7.46 kb, IQR = 4.01 kb) for 375 of the 420 samples (89%), with 174 (46.4%) being NFLG. A total of 365 sequences (169 of them NFLG) corresponded to unique subjects and were included in the downstream analyses. The most frequent HIV subtypes were B (n = 149, 40.8%) and C (n = 77, 21.1%), followed by the circulating recombinant form CRF02_AG (n = 32, 8.8%). We found 14 different CRFs (n = 66, 18.1%) and multiple URFs (n = 32, 8.8%) that involved recombination between 12 different subtypes/CRFs. The most frequent URFs were B/CRF01_AE (4 cases) and A1/D, B/C, and B/CRF02_AG (3 cases each). Most URFs (19/26, 73%) lacked breakpoints in the PR+RT pol region, rendering them undetectable if only that region was sequenced. Twelve (37.5%) of the URFs could have emerged within the UK, whereas the rest were probably imported from sub-Saharan Africa, South East Asia and South America. For 2 URFs we found highly similar pol sequences circulating in the UK. We detected 31 phylogenetic clusters using the full dataset: 25 pairs (mostly subtypes B and C), 4 triplets and 2 quadruplets. Some of these were not consistent across different genes due to inter- and intra-subtype recombination. Clusters involved 70 sequences, 19.2% of the dataset.
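The proportions and cluster totals reported above can be reproduced from the stated counts. The following is a minimal arithmetic sketch, not part of the ICONIC pipeline; the helper `pct` and all variable names are illustrative assumptions, and the two URF denominators (26 and 32) are taken as given in the text.

```python
# Arithmetic check of the figures quoted in the Results paragraph.
# All names here are illustrative; only the counts come from the abstract.

def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

samples = 420          # samples submitted to the pipeline
sequenced = 375        # samples yielding a sequence of at least 1 kb
unique_subjects = 365  # sequences kept for downstream analyses

proportions = {
    "sequenced / samples": pct(sequenced, samples),
    "NFLG / sequenced": pct(174, sequenced),
    "subtype B": pct(149, unique_subjects),
    "subtype C": pct(77, unique_subjects),
    "CRF02_AG": pct(32, unique_subjects),
    "all CRFs": pct(66, unique_subjects),
    "URFs": pct(32, unique_subjects),
    "URFs without PR+RT breakpoints": pct(19, 26),
    "URFs possibly emerged in UK": pct(12, 32),
}

# Cluster sizes mapped to the number of clusters of that size.
clusters = {2: 25, 3: 4, 4: 2}
n_clusters = sum(clusters.values())
n_clustered = sum(size * count for size, count in clusters.items())
clustered_share = pct(n_clustered, unique_subjects)

for label, value in proportions.items():
    print(f"{label}: {value}%")
print(f"clusters: {n_clusters}, sequences in clusters: {n_clustered} "
      f"({clustered_share}%)")
```

Under these counts, the 25 pairs, 4 triplets and 2 quadruplets add up to the 31 clusters and 70 clustered sequences quoted in the text.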

Conclusions: The initial analysis of genome sequences detected substantial hidden variability in the London HIV epidemic. Analysing full genome sequences, as opposed to only PR+RT, identified previously undetected recombinants. It also provided a more reliable description of CRFs (which would otherwise be misclassified) and of transmission clusters.


Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Fig 1
A) Recombination pattern (according to jpHMM) of the two URF_50B (accession numbers MF109354 and MF109527) discovered in the ICONIC dataset and the GenBank full-length sequence JN417236, prototype of the recombinant CRF50_A1D. B) Bootscanning analysis of the URF_50B performed using SimPlot and diagram showing the definitive recombination pattern.
Fig 2
A) Recombination pattern (according to jpHMM) of the URF_A1B (accession number MF109371) discovered in the ICONIC dataset, the URF_50B full-length sequence (accession number KT022394) described by Foster and colleagues and a member of the A1/D Canadian cluster described by Brenner and colleagues (accession number GU592197). These three sequences presented high similarity in the PR+RT pol region. B) Maximum-likelihood tree of the PR+RT pol region (1,000 bp) of the A1/B cluster of sequences with high similarity to the ICONIC URF_A1B. Tips corresponding to the UKHDRD are in orange. Tip names show, for the GenBank sequences: subtype, sampling country, accession number and sampling year (not available for the Canadian sequences); for UKHDRD sequences: country of birth, ID and sampling year. Branches corresponding to the A1/B cluster are shown in blue. The ICONIC URF_A1B sequence is highlighted with an arrow. The tree is rooted to subtype A1 reference sequences. The diagram at the right shows the recombination pattern shown by the sequences in the cluster, according to jpHMM.
Fig 3
A) Recombination pattern (according to jpHMM) of the URF_A1D (accession number MF109466) discovered in the ICONIC dataset and the GenBank full-length A1/D sequence KT022394, which presented high similarity in the PR+RT pol region. B) Maximum-likelihood tree of the PR+RT pol region (1,000 bp) of the A1/D cluster of sequences with high similarity to the ICONIC URF_A1D. Tips corresponding to the UKHDRD are in orange. Tip names show, for the GenBank sequences: subtype, sampling country, accession number and sampling year; for UKHDRD sequences: country of birth, ID and sampling year. Branches corresponding to the A1/D cluster are shown in blue. The ICONIC URF_A1D and the GenBank sequence KT022394 are highlighted with arrows. The tree is rooted to a subtype B sequence. The diagram at the right shows the recombination pattern shown by the sequences in the cluster, according to jpHMM.
Fig 4
A) Recombination pattern (according to jpHMM) of the URF_0622 (accession number MF109682) discovered in the ICONIC dataset and the sequences AF064699 and EU743963, prototypes of the recombinants CRF06_cpx and CRF22_01A1, respectively. B) Bootscanning analysis of the URF_0106G performed using SimPlot and diagram showing the definitive recombination pattern.
Fig 5
A) Recombination pattern (according to jpHMM) of the URF_0206 (15228_1_49) discovered in the ICONIC dataset and the sequences L39106 and AF064699, prototypes of the recombinants CRF02_AG and CRF06_cpx, respectively. B) Bootscanning analysis of the URF_0206 performed using SimPlot and diagram showing the definitive recombination pattern.


