PLoS Pathog. 2021 Jul 13;17(7):e1009744.
doi: 10.1371/journal.ppat.1009744. eCollection 2021 Jul.

Genome-wide analyses of human noroviruses provide insights on evolutionary dynamics and evidence of coexisting viral populations evolving under recombination constraints

Kentaro Tohma et al. PLoS Pathog. 2021.

Abstract

Norovirus is a major cause of acute gastroenteritis worldwide. Over 30 different genotypes, mostly from genogroup I (GI) and II (GII), have been shown to infect humans. Despite three decades of genome sequencing, our understanding of the role of genomic diversification across continents and time is incomplete. To close the spatiotemporal gap in genomic information on human noroviruses, we conducted large-scale genome-wide analyses that included the nearly full-length sequencing of 281 archival viruses circulating since the 1970s in over 10 countries from four continents, with a major emphasis on norovirus genotypes that are currently underrepresented in public genome databases. We provided new genome information for 24 distinct genotypes, including the oldest genome information from 12 norovirus genotypes. Analyses of this new genomic information, together with publicly available data, showed that (i) noroviruses evolve at similar rates across genomic regions and genotypes; (ii) emerging viruses evolved from transiently circulating intermediate viruses; (iii) diversifying selection on the VP1 protein was recorded in genotypes with multiple variants; (iv) non-structural proteins showed a similar branching on their phylogenetic trees; and (v) contrary to the current understanding, there are restrictions on the ability to recombine different genomic regions, which results in co-circulating populations of viruses evolving independently in human communities. This study provides a comprehensive genetic analysis of diverse norovirus genotypes and the role of non-structural proteins in viral diversification, shedding new light on the mechanisms of norovirus evolution and transmission.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Spatiotemporal distribution of norovirus genomes analyzed in this study.
(A) Geographic map showing the countries with nearly full-length genome sequences highlighted in dark gray. The pie chart indicates the ratio of newly obtained sequences in this study (red) and those from the public database, GenBank (gray). The size of the pie chart shows the relative sample size for each country. Geographic map (1:10m Cultural Vector, Admin 0 –Countries) was obtained at https://www.naturalearthdata.com/downloads/10m-cultural-vectors/10m-admin-0-countries/ (accessed on June 25, 2021). (B) The line indicates the number of nearly full-length genome sequences by year and the bar graph indicates the ratio of newly obtained sequences (red) and those from the public database (gray). (C) Number of newly obtained sequences for major and minor genotypes. The pie chart indicates the ratio of newly obtained sequences in this study (red) and those from the public database, GenBank (gray).
Fig 2. Archival samples improved the estimates of the rate of evolution.
(A) Linear regression of the root-to-tip divergence of the VP1-encoding nucleotide sequences indicated that adding the archival samples to the analyses improved the goodness of fit (R²) (P<0.01, Wilcoxon matched-pairs signed rank test). The R² values were calculated using maximum-likelihood phylogenetic trees from each genotype or variant. *NA indicates not enough sequences to build phylogenetic trees. Genotypes or variants with >5 sequences and a time range >5 years were included in the analyses. (B) The mean and the 95% highest posterior density interval of the nucleotide substitution rate (substitutions/site/year) were estimated for each genotype and variant within a Bayesian framework. Only genotypes or variants with >10 sequences and a time range >5 years were included in the analyses.
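The root-to-tip regression described for panel A can be illustrated with a minimal sketch. It assumes that each virus is summarized by a sampling year and a root-to-tip distance already read off a phylogenetic tree; the function name and the input values in the usage note are hypothetical and are not data from the study. The slope gives a rough substitution rate (substitutions/site/year) and R² measures how clock-like the data are.

```python
def root_to_tip_regression(years, distances):
    """Ordinary least-squares fit of root-to-tip distance against sampling year.

    Returns (slope, intercept, r_squared). Both inputs are illustrative;
    in practice the distances come from a rooted phylogenetic tree.
    """
    n = len(years)
    if n < 2 or n != len(distances):
        raise ValueError("need at least two (year, distance) pairs of equal length")

    mean_x = sum(years) / n
    mean_y = sum(distances) / n

    # Sums of squares and cross-products around the means.
    sxx = sum((x - mean_x) ** 2 for x in years)
    syy = sum((y - mean_y) ** 2 for y in distances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, distances))

    if sxx == 0:
        raise ValueError("all sampling years are identical; no temporal signal")

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For simple linear regression, R^2 is the squared Pearson correlation.
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0
    return slope, intercept, r_squared
```

Calling `root_to_tip_regression([1975, 1990, 2005, 2020], [0.0, 0.06, 0.12, 0.18])` on these made-up values returns a slope close to 0.004 and an R² close to 1. Widening the sampled time range, as the archival samples do, typically makes the regression better constrained.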
Fig 3. Diversifying selection on the major capsid protein (VP1) results in multiple intra-genotype variants.
Episodic diversifying selection was comprehensively estimated for GI, GII, and GIX genotypes presenting ≥20 sequences using MEME (Mixed Effects Model of Evolution), which modeled branch-by-branch episodic pressure, and FUBAR (Fast, Unconstrained Bayesian AppRoximation), which assumed constant pressure across the entire phylogenetic tree. (A) Statistically significant positively selected sites (P<0.05 with empirical Bayes Factor on internal branches>100 in MEME and Bayes Factor >0.9 in FUBAR) were counted in each genotype and summarized as a bar plot. (B) The excessive number of sites identified as being under episodic diversifying selection on non-GII.4 viruses was correlated with the number of variants but not with the number of sequences or time span. The linear regression line and the 95% confidence band were plotted in gray. Circles represent non-GII.4 (filled) and GII.4 (empty) genotypes.
Fig 4. Limited accumulation of amino acid mutations on the major capsid protein of non-GII.4 noroviruses.
Amino acid distance was calculated from the oldest viruses for each given genotype from (A) GI, (B) GII, and (C) GIX viruses. Only genotypes with data from samples with ≥5 sequences were analyzed. Variants within each genotype were separately analyzed and are shown with different colors. Lines represent the linear regression for amino acid mutations occurring during a given time span for each genotype or variant.
Fig 5. Genome-wide analysis indicates that NS1/2, NS4, major (VP1), and minor (VP2) capsid proteins are the most variable regions of human noroviruses.
Genetic variability of the regions encoding nonstructural (NS1/2–7) and capsid (VP1 and VP2) proteins was calculated using Shannon entropy for GI and GII noroviruses. Bars represent the mean value calculated from individual residue values. Standard errors are shown for each bar.
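As an illustration of the per-residue Shannon entropy summarized in this figure, the sketch below computes the entropy of each alignment column and then the mean and standard error across columns. The caption does not state the logarithm base or how gaps were handled, so the natural logarithm and the skipping of gap characters are assumptions here, and all function names are hypothetical.

```python
import math
from collections import Counter


def column_entropy(column, gap_chars="-"):
    """Shannon entropy (natural log) of one alignment column, ignoring gaps."""
    residues = [c for c in column if c not in gap_chars]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    return -sum((n / total) * math.log(n / total) for n in counts.values())


def alignment_entropy(sequences):
    """Per-position entropy for a list of aligned, equal-length sequences."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    # zip(*sequences) walks the alignment column by column.
    return [column_entropy(col) for col in zip(*sequences)]


def mean_and_standard_error(values):
    """Mean and standard error of the mean (sample standard deviation / sqrt(n))."""
    n = len(values)
    mean = sum(values) / n
    if n < 2:
        return mean, 0.0
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(variance / n)
```

For a toy alignment such as `["MKV", "MRV", "MKI", "MRI"]`, the first column is invariant (entropy 0) and the other two columns are split evenly between two residues (entropy ln 2); `mean_and_standard_error` then yields the kind of bar height and error bar the caption describes for each protein.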
Fig 6. Noroviruses evolve at similar rates across genomic regions and genotypes.
The mean and the 95% highest posterior density interval of nucleotide substitution rate (substitutions/site/year) were estimated from sequences encoding (A) non-structural and (B) VP2 proteins. Only genotypes or variants presenting >10 sequences were included in the analyses.
Fig 7. Large-scale phylogenetic analyses revealed intermediate or minor viruses.
Large-scale phylogenetic trees from (A) GII.17 VP1 and NS7, (B) GIX.1 VP1, and (C) GII NS7 nucleotide sequences revealed atypical viruses embedded in their evolutionary history. Newly described viruses and intermediate viruses are highlighted in yellow. Genotypes and variants in clusters were summarized and collapsed for visualization.
Fig 8. Phylogenetic pairing of capsid and polymerase types suggests restriction on recombination.
Genotypes were determined for all the viruses in the nearly full-length dataset using the Norovirus Genotyping Tool [102]. Viruses were grouped phylogenetically and the number of viruses with a given capsid and polymerase type was recorded in each cell. Phylogenetic trees were calculated using a subsampled dataset: a maximum of two viruses from each combination of genotype and polymerase type. The colored boxes in the matrix indicate the recombination groups associated with the phylogenetic clustering on the NS7-encoding nucleotide sequences. The crossing lines on the phylogenetic tree of the VP1 indicate the disruption of VP1 clustering and recombination groups.
Fig 9. Recombination is restricted by sequence variation at the 3’ end of the ORF1 region.
(A) Genetic variability of the ORF1/2 junction region (400 nt) was measured using Shannon entropy. Stem-loop sequences on the predicted sub-genomic RNA promoter and sub-genomic RNA were determined based on analyses from Simmonds et al. [60]. Nucleotide position was recorded based on Norwalk virus 8FIIa (accession number M87661). (B) Multidimensional Scaling Analysis (MDS) revealed the association between recombination pattern and nucleotide sequence variation at the 3’ end of the ORF1 region. MDS (two-dimensional) was conducted with four non-overlapping windows in the ORF1/2 junction (400 nt): 1–100 nt and 101–200 nt in the 3’ end of ORF1, and 201–300 nt and 301–400 nt in the 5’ end of the ORF2 region (top panels). MDS was also performed on different regions defined by their genetic functions: 3’ end of NS7, predicted stem-loop, linker sequence, and 5’ end of sub-genomic RNA (bottom panels). Each dot represents a virus sequence color-coded based on the respective recombination group as defined in Fig 8. The MDS was conducted using nucleotide differences among viruses, and the two dimensions in the MDS maps are represented by the x- and y-axes, with their directions arbitrarily determined.
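The windowed MDS in panel B starts from pairwise nucleotide differences within each non-overlapping window. A minimal sketch of that input step follows, assuming aligned sequences of equal length and a simple count of mismatched positions as the distance; the function names are hypothetical, and the MDS step itself (which would take each matrix as a precomputed dissimilarity) is left to a statistics library.

```python
def nucleotide_differences(a, b):
    """Number of positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def window_distance_matrices(sequences, window=100):
    """Pairwise difference matrices for consecutive non-overlapping windows.

    Returns a dict keyed by 1-based inclusive (start, end) positions,
    e.g. (1, 100), (101, 200), ... for a 400-nt alignment.
    """
    if not sequences:
        raise ValueError("no sequences given")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")

    matrices = {}
    for start in range(0, length, window):
        end = min(start + window, length)
        chunk = [s[start:end] for s in sequences]
        matrices[(start + 1, end)] = [
            [nucleotide_differences(a, b) for b in chunk] for a in chunk
        ]
    return matrices
```

With a 400-nt junction alignment and the default `window=100`, the returned keys correspond to the four windows named in the caption. Each matrix is symmetric with a zero diagonal, which is the form a precomputed-dissimilarity MDS routine generally expects.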
Fig 10. Mixed-infections with multiple norovirus genotypes suggest limited recombination events.
(A) Three cross-sectional cases and (B and C) two prolonged shedding cases for cohort children (NV066X and PX127, respectively) of mixed infection with two different norovirus genotypes. Pie charts represent the ratio of the read counts from each genotype, with the total number of reads shown at the bottom of the pie charts. The Sankey diagrams represent the assembled contigs (clones) colored by corresponding genotypes. The x axis shows the norovirus genomic region, while the height of each clone indicates the depth of coverage at each given genome position.
Fig 11. Limited number of recombination events in the evolutionary history of major norovirus genotypes.
The maximum-clade credibility trees of GII.2, GII.3, GII.4, and GII.6 noroviruses were estimated and annotated with their corresponding polymerase types. Branches are color-coded by the polymerase types on the external tips and their ancestral nodes. The color for each polymerase type was determined from the phylogeny of the NS7-encoding nucleotide sequences. The direction and number of changes of polymerase types, i.e. recombination events, were estimated using Markov jump counting along the branches. Events with mean frequency >0.5 were summarized as bar graphs (the mean and the 95% highest posterior density interval).

References

    1. Pires SM, Fischer-Walker CL, Lanata CF, Devleesschauwer B, Hall AJ, Kirk MD, et al. Aetiology-Specific Estimates of the Global and Regional Incidence and Mortality of Diarrhoeal Diseases Commonly Transmitted through Food. PLoS One. 2015;10(12):e0142927. doi: 10.1371/journal.pone.0142927
    2. Patel MM, Widdowson MA, Glass RI, Akazawa K, Vinje J, Parashar UD. Systematic literature review of role of noroviruses in sporadic gastroenteritis. Emerg Infect Dis. 2008;14(8):1224–31. doi: 10.3201/eid1408.071114
    3. de Graaf M, van Beek J, Koopmans MP. Human norovirus transmission and evolution in a changing world. Nat Rev Microbiol. 2016;14(7):421–33. doi: 10.1038/nrmicro.2016.48
    4. Kapikian AZ, Wyatt RG, Dolin R, Thornhill TS, Kalica AR, Chanock RM. Visualization by immune electron microscopy of a 27-nm particle associated with acute infectious nonbacterial gastroenteritis. J Virol. 1972;10(5):1075–81. doi: 10.1128/JVI.10.5.1075-1081.1972
    5. Prasad BV, Hardy ME, Dokland T, Bella J, Rossmann MG, Estes MK. X-ray crystallographic structure of the Norwalk virus capsid. Science. 1999;286(5438):287–90. doi: 10.1126/science.286.5438.287
