Virus Evol. 2021 Jul 9;7(2):veab065.
doi: 10.1093/ve/veab065. eCollection 2021.

Using host genetics to infer the global spread and evolutionary history of HCV subtype 3a

Shang-Kuan Lin et al. Virus Evol. 2021.

Abstract

Studies have shown that hepatitis C virus subtype 3a (HCV-3a) is likely to have been circulating in South Asia before its global spread. However, the time and route of this dissemination remain unclear. For the first time, we generated host and virus genome-wide data for more than 500 patients infected with HCV-3a from the UK, North America, Australia, and New Zealand. We used the host genomic data to infer the ancestry of the patients and used this information to investigate the epidemic history of HCV-3a. We observed that viruses from hosts of South Asian ancestry clustered together near the root of the tree, irrespective of the sampling country, and that they were more diverse than viruses from hosts of other ancestries. We hypothesized that South Asian hosts are more likely to have been infected in South Asia and used the inferred host ancestries to distinguish between the location where the infection was acquired and the location where the sample was taken. Next, we inferred that three independent transmission events resulted in the spread of the virus from South Asia to the UK, North America, and Oceania. This initial spread happened during or soon after the end of World War II. It was followed by many independent transmissions between the UK, North America, and Oceania. Using both host and virus genomic information can be highly informative in studying viral epidemic history, especially in the context of chronic infections where migration histories need to be accounted for.
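
The comparison of viral diversity between host ancestry groups can be illustrated with a minimal sketch. It assumes a simple p-distance (the proportion of differing sites between two aligned sequences), which is not necessarily the distance measure used in the study; the sequences, sample names, and group labels below are toy values invented for illustration.

```python
from itertools import combinations
from statistics import mean


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    Assumption: plain p-distance, with no substitution-model correction.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "ACGT" and y in "ACGT":
            compared += 1
            if x != y:
                differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def mean_pairwise_distance_by_group(seqs: dict, groups: dict) -> dict:
    """Mean within-group pairwise p-distance for each host group."""
    by_group = {}
    for name, seq in seqs.items():
        by_group.setdefault(groups[name], []).append(seq)
    result = {}
    for group, members in by_group.items():
        dists = [p_distance(a, b) for a, b in combinations(members, 2)]
        if dists:  # groups with a single sequence have no pairs
            result[group] = mean(dists)
    return result


# Toy alignment (hypothetical data, not from the study).
toy_seqs = {
    "s1": "ACGTACGT",
    "s2": "ACGTTCGA",
    "s3": "ACGAACGT",
    "s4": "ACGTACGT",
    "s5": "ACGTACGA",
}
toy_groups = {
    "s1": "South Asian",
    "s2": "South Asian",
    "s3": "South Asian",
    "s4": "European",
    "s5": "European",
}

diversity = mean_pairwise_distance_by_group(toy_seqs, toy_groups)
for group, value in diversity.items():
    print(f"{group}: mean pairwise p-distance = {value:.3f}")
```

In this toy input the first group has the larger mean pairwise distance, which is the kind of pattern the abstract describes for viruses from hosts of South Asian ancestry. A real analysis would use whole-genome alignments and typically a model-based distance.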

Keywords: HCV; evolution; host–virus genetics; phylogenetics; phylogeography.

Conflict of interest statement

None declared.

Figures

Figure 1.
Projection of the host genetic principal components (PCs) of the BOSON cohort onto the 1000 Genomes Project PCs to detect and validate self-reported ancestries. Each dot is an individual, and the colours indicate host ancestries. The 1000 Genomes Project data were used to adjust the self-reported ethnicities in the BOSON cohort.
Figure 2.
HCV-3a phylogeny and its association with host ancestry and demography. (A) ML tree with the terminal branches coloured according to the sampling location and the tip marker indicating the host ancestry as inferred from host genetic data. (B) Distribution of pairwise genetic distances between HCV whole genomes in different host groups. (C) Age distribution in different host groups (age data only available for hosts from the BOSON Cohort).
Figure 3.
Detection of the location of infection for South Asian hosts. (A) A structured coalescent phylogeographic analysis applied to a time-calibrated ML tree and the host ancestries to infer the location of infection for South Asian hosts. The branches are coloured by the most likely host ancestral state. Lines on the tips of the trees indicate (1) South Asian hosts sampled in a Western country and inferred to have been infected in South Asia, (2) sampling locations, (3) host ancestries, and (4) absence of host genetic information. The two bar charts represent the location distribution for (B) sampling locations and (C) inferred locations of infection among South Asian hosts living in the West.
Figure 4.
Phylogeographic analysis of the global spread of HCV-3a. (A) Maximum Clade Credibility (MCC) tree with branches colour-coded by the most likely infection location. The three nodes that correspond to the earliest introductions from South Asia to the UK (EU), North America (NA), and Oceania (AU) are indicated using blue diamonds. Other significant transmission events between continents are indicated using red circles. (B) The 95 per cent HPD of the time of the highlighted nodes in (A). The estimated time points are colour-coded by the geographical source of transmission. The grey block corresponds to the time period of World War II (1939–45).
