Review
Cold Spring Harb Perspect Med. 2018 Dec 3;8(12):a031690.
doi: 10.1101/cshperspect.a031690.

Evolutionary Origins of Enteric Hepatitis Viruses

Anna-Lena Sander et al. Cold Spring Harb Perspect Med.

Abstract

The enterically transmitted hepatitis A (HAV) and hepatitis E viruses (HEV) are the leading causes of acute viral hepatitis in humans. Despite the discovery of HAV and HEV 40-50 years ago, their evolutionary origins remain unclear. Recent discoveries of numerous nonprimate hepatoviruses and hepeviruses allow revisiting the evolutionary history of these viruses. In this review, we provide detailed phylogenomic analyses of primate and nonprimate hepatoviruses and hepeviruses. We identify conserved and divergent genomic properties and corroborate historical interspecies transmissions by phylogenetic comparisons and recombination analyses. We discuss the likely non-recent origins of human HAV and HEV precursors carried by mammals other than primates, and detail current zoonotic HEV infections. The novel nonprimate hepatoviruses and hepeviruses offer exciting new possibilities for future research focusing on host range and the unique biological properties of HAV and HEV.

Figures

Figure 1.
Genetic diversity of hepatoviruses and hepeviruses and evidence for zoonotic transmission of hepeviruses to humans. (A) Bayesian phylogeny of the full genome of hepatoviruses. A GenBank search with the term “Hepatovirus” was performed on August 3, 2017, and all sequences longer than 6000 nucleotides were selected. Duplicates, cell culture-adapted strains, or viruses isolated from experimentally infected animals were excluded from the dataset, resulting in 124 final hepatitis A virus (HAV) sequences. Complete polyprotein coding sequences were translationally aligned, then nonhomologous regions (px/3A) were deleted. HAV genotype (gt)IV and gtVI are defined only by partial sequence information and were, thus, not included in A (Nainan et al. 1991; Robertson et al. 1992). (B) Bayesian phylogeny of the full genome of hepeviruses. A GenBank search using the term “Hepeviridae” was performed on August 4, 2017, and all sequences exceeding 6000 nucleotides were selected. Duplicates, cell culture-adapted strains, or viruses isolated from experimentally infected animals were excluded from the dataset unless they were described as reference sequences according to Smith et al. (2016), resulting in 317 final hepatitis E virus (HEV) sequences. Complete open reading frame (ORF)1 and 2 sequences were concatenated and translationally aligned, and the nonhomologous hypervariable region within ORF1 was deleted. Unique hepeviruses from fox and mink were not included in B because they are only partially sequenced (Bodewes et al. 2013; Krog et al. 2013). Orthohepe, Orthohepevirus. (C) Bayesian phylogenies of the complete ORF1 and 2 of representative human and nonhuman hepeviruses showing zoonotic origins of human HEV. The moose HEV was used as an outgroup (KF951328). All Bayesian phylogenies were generated at the nucleotide level from translation alignments excluding all ambiguous data or gaps using MrBayes V3.1 (Ronquist and Huelsenbeck 2003). 
A general time-reversible (GTR) model with a γ distribution (G) across sites and a proportion of invariant sites (I) (GTR + G + I) was used as the substitution model. Trees were run for two million generations, sampled every 100 steps. After an exclusion of 5000 of the total 20,000 trees as burn-in, final trees were annotated with TreeAnnotator from the BEAST package (Drummond and Rambaut 2007) and visualized with FigTree. Bayesian posterior probability support above 0.9 at nodes is highlighted by filled circles. The scale bar indicates genetic distance. HEV genotypes (gt) are indicated by arabic numerals and subtypes by letters; HAV genotypes are indicated by roman numerals. H. sap., Homo sapiens; S. scr., Sus scrofa; C. nip., Cervus nippon; O. cun., Oryctolagus cuniculus; C. dro., Camelus dromedarius.
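The sampling figures in this legend follow from simple arithmetic, which the short Python sketch below reproduces. It is an illustration only: the function name `mcmc_sample_summary` is hypothetical, and the calculation assumes one tree is stored per sampling interval and ignores any initial (generation 0) sample, which matches the total of 20,000 trees stated in the legend.

```python
def mcmc_sample_summary(generations, sample_every, burn_in_trees):
    """Return (total sampled trees, retained trees, burn-in fraction).

    Hypothetical helper: assumes exactly one tree is stored every
    `sample_every` generations and that the starting state is not counted.
    """
    total = generations // sample_every
    if not 0 <= burn_in_trees <= total:
        raise ValueError("burn-in must lie between 0 and the number of sampled trees")
    retained = total - burn_in_trees
    return total, retained, burn_in_trees / total


# Values taken from the legend: two million generations, sampled every
# 100 steps, with 5000 trees discarded as burn-in.
total, retained, fraction = mcmc_sample_summary(2_000_000, 100, 5_000)
print(total, retained, fraction)  # 20000 15000 0.25
```

Under these assumptions, 15,000 trees remain after burn-in, corresponding to a burn-in fraction of 25%.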
Figure 2.
Genomic variability among hepatoviruses and hepeviruses. Genome organization showing conservation of putative functional domains within (A) hepatoviruses, and (B) hepeviruses. Conserved domains (green) are depicted above, and nonconserved domains (red) are depicted below graphs. (C) Relative CpG, UpA dinucleotide content and effective number of codons in hepatoviruses and hepeviruses calculated using SSE 1.3 software (Simmonds 2012). Median (bar) and quartiles (box and whiskers) are shown. (D) Phylogenetic relationships of boreoeutherian vertebrate orders, including an avian outgroup. (Phylogeny adapted from Foley et al. 2016.) Squares indicate vertebrate orders in which hepatoviruses or orthohepeviruses were found. (E) Amino acid sequence identities within hepatoviruses (top) and hepeviruses (bottom) of different host orders. Generally, representative viruses from each host order were tagged and sequence identities within families were plotted using a fragment length of 400 and a step size of 200 amino acid residues. Alignment gaps were excluded from the analysis. A schematic representation of the hepatitis A virus (HAV)/hepatitis E virus (HEV) genome organization is depicted at the top for orientation. For HAV, complete coding sequences of the polyproteins were translationally aligned. Accession numbers of representative sequences were, within Primates: AB020564, AY644676, AB279732, D00924; Chiroptera: KT452742, KT452730, KT452729, KT452714; Rodentia: KT452735, KT452685, KT229611, KT452644, KT452637; Eulipotyphla: KT452691, KT452658. For HEV, the complete open reading frame (ORF)1 and ORF2 were concatenated and translationally aligned. ORF3 is only shown for indication of its position. 
Accession numbers of representative sequences were, within Lagomorpha: FJ906895, KJ013415; Primates: M73218, M74506, AP003430, AB197673; Cetartiodactyla: AF082843, AB189071, AB573435, AB602441, KF951328, KJ496143, KX387865; Chiroptera: JQ001749, KJ562187, KX513953; Aves: KX589065, KU670940, AY535004. Hel, Helicase; HVR, hypervariable region; MT, methyltransferase; PCP, papain-like cysteine protease; RdRp, RNA-dependent RNA polymerase; TMD, transmembrane domain; UTR, untranslated region; X, X domain/ADP-ribose-binding module; Y, Y-like domain.
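The sliding-window identity plot described for panel E (fragment length 400, step size 200, alignment gaps excluded) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names `strip_gap_columns` and `windowed_identity` are hypothetical, gaps are assumed to be encoded as `-`, any column containing a gap in any sequence is dropped, and identity is averaged over all sequence pairs.

```python
from itertools import combinations

GAP = "-"  # assumed gap symbol in the alignment


def strip_gap_columns(seqs):
    """Drop every alignment column in which at least one sequence has a gap."""
    kept = [col for col in zip(*seqs) if GAP not in col]
    if not kept:
        return ["" for _ in seqs]
    return ["".join(chars) for chars in zip(*kept)]


def windowed_identity(seqs, window=400, step=200):
    """Mean pairwise identity per window over the gap-free alignment.

    Returns a list of (window_start, mean_identity) tuples, where
    window_start indexes the gap-stripped alignment.
    """
    if len(seqs) < 2:
        raise ValueError("need at least two aligned sequences")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("aligned sequences must have equal length")
    clean = strip_gap_columns(seqs)
    length = len(clean[0])
    pairs = list(combinations(clean, 2))
    results = []
    for start in range(0, length - window + 1, step):
        identities = []
        for a, b in pairs:
            wa, wb = a[start:start + window], b[start:start + window]
            matches = sum(x == y for x, y in zip(wa, wb))
            identities.append(matches / window)
        results.append((start, sum(identities) / len(identities)))
    return results


# Toy alignment (invented residues, for illustration only).
toy = ["MKTAYIAK", "MKTAYLAK", "MK-AYIAK"]
for start, identity in windowed_identity(toy, window=4, step=2):
    print(start, round(identity, 3))
```

With real data, the defaults `window=400` and `step=200` correspond to the values given in the legend; the toy call uses smaller values only so that the short example sequences yield more than one window.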
Figure 3.
Evidence of recombination in hepatoviruses. (A) Phylogenetic compatibility scan of the full polyprotein genes of hepatoviruses isolated from humans, rodents, tree shrew, and bats. GenBank Accession numbers of sequences used were: AB279735; KT452644; KT452685; KT452729; KT877158; KT452742; KT452735. The graph was created using SSE 1.3 (Simmonds 2012), a sliding window of 500 nt, and a step size of 50 nt, with a bootstrap cutoff of 70%. (B) The bootscan graph shows percent of bootstrap replicates (y axis) that support grouping of the query sequence with each of three test sequences in a 1000 nt window sliding over the genome at a 20 nt step (x axis, window center position); the plot was done using Simplot 3.5 (Lole et al. 1999) and the Kimura substitution model; dotted line shows the 70% reliable bootstrap support cutoff. As a result of alignment shifts, the genome plots do not precisely correspond to raw genome positions. (C) Bayesian phylogenies of hepatovirus domains P1, P2 (only 2C), and P3 (only 3CD) showing reliable evidence of several recombination events. Viruses are colored according to their host order. Phylogenies were generated at the amino acid level from translation alignments excluding all ambiguous data or gaps using MrBayes V3.1 (Huelsenbeck and Ronquist 2001) and a WAG amino acid substitution model. Trees were generated as described above and rooted by the avian encephalomyelitis virus (genus Tremovirus). Bayesian posterior probabilities above 0.9 are marked by filled circles at nodes. The scale bar indicates genetic distance. M. fas., Macaca fascicularis; C. aet., Chlorocebus aethiops; M. arv., Microtus arvalis; C. mig., Cricetulus migratorius; M. him., Marmota himalayana; S. mas., Sigmodon mascotensis; E. hel., Eidolon helvum; T. bel., Tupaia belangeri chinensis; E. eur., Erinaceus europaeus; C. afr., Coleura afra; R. lan., Rhinolophus landeri; P. vit., Phoca vitulina; M. man., Miniopterus cf. manavi; L. sik., Lophuromys sikapusi; S. ara., Sorex araneus.
Figure 4.
Evidence of recombination in hepeviruses. (A) Phylogenetic compatibility scan of concatenated open reading frame (ORF)1 + ORF2 alignment of hepeviruses isolated from human (X98292), tree shrew (KR905549), Falco tinnunculus (KU670940), Rattus rattus (AB847306), chicken (KF511397), Eptesicus serotinus (JQ001749), and Rhinolophus ferrumequinum (KJ562187) was generated as described in the legend to Figure 3A. (B) Bayesian phylogenies of complete ORF1 and 2 of orthohepeviruses. Viruses are colored according to their host order. Phylogenies were generated at the nucleotide level from translation alignments excluding all ambiguous data or gaps using MrBayes V3.1. A general time-reversible (GTR) model with a γ distribution (G) across sites and a proportion of invariant sites (I) (GTR + G + I) was used as the substitution model. Trees were generated as above. Trees were rooted by the sister genus Piscihepevirus. Bayesian posterior probabilities above 0.9 are marked by filled circles at nodes. The scale bar indicates genetic distance. C. bac., Camelus bactrianus; A. alc., Alces alces; R. nor., Rattus norvegicus; M. put., Mustela putorius furo; F. tin., Falco tinnunculus; E. gar., Egretta garzetta; E. ser., Eptesicus serotinus; G. gal., Gallus gallus; M. dav., Myotis davidii; R. fer., Rhinolophus ferrumequinum. (C) Evidence of recombination in the species Orthohepevirus A relative to species B–D. The scan was performed as described in the legend for Figure 3B. Window size was 800 nt, step size 20 nt. Accession numbers of sequences used: JQ013793; JN998606; AY535004; KJ562187. As a result of alignment shifts, the genome plots do not precisely correspond to raw genome positions. Orthohepe, Orthohepevirus. (D) Evidence of recombination in species Orthohepevirus A genotype (gt) 8 (C. bactrianus) relative to gt1 (Homo sapiens), gt7a (Camelus dromedarius), and the moose (A. alces) hepevirus. Window size was 2500 nt, step size 50 nt. 
Accession numbers of sequences used: KX387865; KJ496143; X98292; KF951328. ORF1–ORF3 were concatenated for bootscan analyses, as indicated by genomic representations above panels C and D. Hel, Helicase; HVR, hypervariable genome region; PCP, papain-like cysteine protease.
Figure 5.
Global distribution and phylogenetic relationships of bat hepatoviruses and hepeviruses. (A) Map showing the geographic origins of recently identified bat hepatoviruses (blue) and bat hepeviruses (orange). CHN, China; COD, Democratic Republic of the Congo; CRC, Costa Rica; ESP, Spain; GER, Germany; GHA, Ghana; LUX, Luxembourg; MAD, Madagascar; PAN, Panama; ROU, Romania; UKR, Ukraine. (B) Chiropteran phylogeny (phylogeny adapted from Foley et al. 2016) complemented manually by the family of Miniopteridae, which diverged around 43 million years ago (mya) from the Vespertilionidae (Miller-Butterworth et al. 2007). Bat families in which hepatitis A virus (HAV)- or hepatitis E virus (HEV)-related viruses have been found are tagged with blue or orange squares, respectively. (C) Bayesian phylogeny of an 863-nucleotide partial VP2/VP3 region of bat and representative nonchiropteran hepatoviruses. The analyzed region corresponds to positions 391–1253 in a prototype genotype (gt) Ia HAV strain (GenBank AB020564). M. gla., Myodes glareolus. See also the legend to Figure 3. (D) Bayesian phylogeny of a 324-nucleotide partial RNA-dependent RNA polymerase (RdRp) region of bat and representative nonchiropteran hepeviruses. The analyzed region corresponds to positions 4255–4577 in an HEV gt1 prototype strain (GenBank accession number M73218). Generally, Bayesian phylogenies were generated on translation alignments excluding all ambiguous data or gaps using MrBayes V3.1 (Huelsenbeck and Ronquist 2001) and a WAG amino acid substitution matrix. Trees were rooted by Tremovirus for HAV and Piscihepevirus for HEV, respectively. Bayesian posterior probabilities above 0.9 are marked by filled circles at nodes. The scale bar indicates genetic distance. Host trees were generated using complete cytochrome B coding sequences retrieved from GenBank and settings as described for virus phylogenies, including priors, to increase phylogenetic resolution above the level of host families.

References

    Agol VI, Gmyl AP. 2010. Viral security proteins: Counteracting host defences. Nat Rev Microbiol 8: 867–878.
    Anthony SJ, St Leger JA, Liang E, Hicks AL, Sanchez-Leon MD, Jain K, Lefkowitch JH, Navarrete-Macias I, Knowles N, Goldstein T, et al. 2015. Discovery of a novel hepatovirus (Phopivirus of seals) related to human hepatitis A virus. MBio 6: e01180.
    Balayan MS, Andzhaparidze AG, Savinskaya SS, Ketiladze ES, Braginsky DM, Savinov AP, Poleschuk VF. 1983. Evidence for a virus in non-A, non-B hepatitis transmitted via the fecal-oral route. Intervirology 20: 23–31.
    Batts W, Yun S, Hedrick R, Winton J. 2011. A novel member of the family Hepeviridae from cutthroat trout (Oncorhynchus clarkii). Virus Res 158: 116–123.
    Beard MR, Cohen L, Lemon SM, Martin A. 2001. Characterization of recombinant hepatitis A virus genomes containing exogenous sequences at the 2A/2B junction. J Virol 75: 1414–1426.