Inference of population genetic parameters from an irregular time series of seasonal influenza virus sequences

Genetics. 2021 Feb 9;217(2):iyaa039. doi: 10.1093/genetics/iyaa039.

Myriam Croze¹, Yuseob Kim¹,²

Affiliations

¹ Division of EcoScience, Ewha Womans University, Seoul 03760, Korea.
² Department of Life Science, Ewha Womans University, Seoul 03760, Korea.

PMID: 33724414
PMCID: PMC8045704
DOI: 10.1093/genetics/iyaa039

Myriam Croze et al. Genetics. 2021.

Abstract

Basic summary statistics that quantify the population genetic structure of influenza virus are important for understanding and inferring the evolutionary and epidemiological processes. However, the sampling dates of global virus sequences in the last several decades are scattered nonuniformly throughout the calendar. Such temporal structure of samples and the small effective size of the viral population hamper the use of conventional methods to calculate summary statistics. Here, we define statistics that overcome this problem by correcting for the sampling-time difference when quantifying a pairwise sequence difference. A simple linear regression method jointly estimates the mutation rate and the level of sequence polymorphism, thus providing an estimate of the effective population size. It also leads to a definition of Wright's $F_{ST}$ for arbitrary time-series data. Furthermore, as an alternative to Tajima's D statistic or the site-frequency spectrum, a mismatch distribution corrected for sampling-time differences can be obtained and compared between actual and simulated data. Applying these methods to seasonal influenza A/H3N2 viruses sampled between 1980 and 2017, and to sequences simulated under a model of recurrent positive selection with metapopulation dynamics, allowed us to estimate the synonymous mutation rate and to find parameter values for selection and demographic structure that fit the observations. We found that the mutation rates of the HA and PB1 segments before 2007 were particularly high and that including recurrent positive selection in our model was essential for reproducing the genealogical structure of the HA segment. The methods developed here can be generally applied to population genetic inferences using serially sampled genetic data.

Keywords: influenza virus; mismatch distribution; serial sample; summary statistics.

Figures

**Figure 1**
Coalescent tree of two viral sequences sampled at times that differ by $τ$. Assuming a constant rate $μ$ of neutral (synonymous) mutation along a lineage, the expected neutral sequence difference is $(2E[T] + τ)μ$, where $T$ is the time to coalescence of two contemporaneous sequences. The expected synonymous sequence difference therefore exceeds the scaled mutation rate, $2E[T]μ = 2N_eμ$, by $τμ$.
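
The relation in this caption can be rearranged to show why a linear regression of pairwise difference on sampling-time difference yields both parameters at once. The following is a short worked derivation under the caption's assumptions (constant $μ$, $E[T] = N_e$); the symbol $π$ for the intercept and the numerical values are illustrative choices, not figures from the paper.

$$
E[d \mid τ] = (2E[T] + τ)\,μ = \underbrace{2N_eμ}_{π} + μ\,τ
$$

$$
\text{slope} = μ, \qquad \text{intercept} = π = 2N_eμ, \qquad N_e = \frac{π}{2μ}
$$

For example, a hypothetical fitted slope of $2 \times 10^{-5}$ per site per day and an intercept of $0.01$ per site would give $N_e = 0.01 / (2 \times 2 \times 10^{-5}) = 250$, expressed in the same time units as $τ$.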

**Figure 2**
Pairwise nucleotide difference ($d$) per site of segments HA, NA, PB1, PB2, PA, and NP plotted against sampling-time difference ($τ$, in days) for the H3N2 sequence data. Data points are from the 27-year data (1980–2006; black dots) and from the 10-year data (2007–2017; gray dots). Regression lines for the 27- and 10-year data are shown in red and blue, respectively. The proportions of bootstrap samples in the tests for the statistical difference between the 27- and 10-year periods of $\hat{μ}$, $\breve{π}$, and $\hat{N}_e$ are shown below each regression plot.
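
The regression shown in this figure can be sketched in a few lines. The code below is a minimal illustration, not the authors' implementation: it assumes sequences are pre-aligned strings of equal length, counts all nucleotide differences rather than only synonymous ones, and uses ordinary least squares. The function names `pairwise_points`, `fit_line`, and `estimate_parameters` are hypothetical.

```python
from itertools import combinations


def pairwise_points(samples):
    """Build (tau, d) points from (sampling_day, sequence) pairs.

    tau is the absolute sampling-time difference in days and d is the
    per-site proportion of differing positions. Assumption: sequences
    are aligned and of equal length.
    """
    points = []
    for (t1, s1), (t2, s2) in combinations(samples, 2):
        tau = abs(t1 - t2)
        d = sum(a != b for a, b in zip(s1, s2)) / len(s1)
        points.append((tau, d))
    return points


def fit_line(points):
    """Ordinary least squares fit of d = intercept + slope * tau."""
    n = len(points)
    mean_tau = sum(t for t, _ in points) / n
    mean_d = sum(d for _, d in points) / n
    sxx = sum((t - mean_tau) ** 2 for t, _ in points)
    sxy = sum((t - mean_tau) * (d - mean_d) for t, d in points)
    slope = sxy / sxx
    intercept = mean_d - slope * mean_tau
    return slope, intercept


def estimate_parameters(points):
    """Return (mu_hat, pi_hat, ne_hat) from the regression line.

    The slope estimates the mutation rate, the intercept estimates the
    polymorphism level 2 * N_e * mu, and N_e follows as pi / (2 * mu),
    in the same time units as tau.
    """
    mu_hat, pi_hat = fit_line(points)
    ne_hat = pi_hat / (2 * mu_hat)
    return mu_hat, pi_hat, ne_hat
```

Because pairs that share a sequence are not independent, the standard errors of such a fit are not the usual regression ones, which is presumably why the caption refers to bootstrap samples for comparing the two periods.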

**Figure 3**
The time-corrected mismatch distributions (TCMDs) of six influenza virus segments in the 27-year (A) and 10-year (B) H3N2 data sets. To obtain $\breve{d}$ and make histograms, $τ_{\max} = 300$ and bin size $w = 0.002$ were used.
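
A mismatch distribution corrected for sampling-time differences can be sketched as follows. This is an illustration under stated assumptions, not the paper's definition: the corrected difference is taken to be $d - \hat{μ}τ$, only pairs with $τ \le τ_{\max}$ are kept, and values are binned with width $w$. The function names are hypothetical; the defaults mirror the $τ_{\max} = 300$ and $w = 0.002$ quoted in the caption.

```python
import math
from collections import Counter


def time_corrected_differences(points, mu_hat, tau_max=300):
    """Subtract the expected extra divergence mu_hat * tau from each d.

    points is a list of (tau, d) pairs. Pairs sampled more than tau_max
    apart are dropped. Assumption: the correction is d - mu_hat * tau.
    """
    return [d - mu_hat * tau for tau, d in points if tau <= tau_max]


def mismatch_histogram(values, w=0.002):
    """Return {bin_index: relative frequency}.

    Bin k covers the half-open interval [k * w, (k + 1) * w).
    """
    counts = Counter(math.floor(v / w) for v in values)
    total = len(values)
    return {k: c / total for k, c in sorted(counts.items())}
```

A usage example with made-up numbers: `mismatch_histogram(time_corrected_differences([(0, 0.011), (100, 0.015)], 2e-5))` places the two corrected values (0.011 and about 0.013) in bins 5 and 6.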

**Figure 4**
The average TCMD (black curve) for simulated data under neutrality ($s = 0$) with $m = 0.004$ and $K_{\max} = 110$ that produce the best-fitting $F_{ST}$ and $\breve{π}$ values to the observed data (red curve; the TCMD of HA segments in the 10-year data set). TCMDs of individual simulation replicates are shown as gray curves. To obtain $\breve{d}$ and make histograms, $τ_{\max} = 300$ and $w = 0.002$ were used.

**Figure 5**
The average TCMD (black curve) for simulated data under positive selection ($s = 0.1$ and $ε = 10$) with $m = 0.00025$ and $K_{\max} = 6700$, which is congruent to the TCMD (red curve) of the HA segment in the 10-year H3N2 data (KS test, $p = 0.058$). TCMDs of individual simulation replicates are shown as gray curves. To obtain $\breve{d}$ and make histograms, $τ_{\max} = 300$ and $w = 0.002$ were used.
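
The caption reports a Kolmogorov–Smirnov (KS) test but does not say how it was applied. As a rough sketch, assuming the test compares the observed and simulated sets of time-corrected differences as two samples, the KS statistic is the largest gap between their empirical cumulative distribution functions. The function name `ks_statistic` is hypothetical, and the sketch computes only the statistic, not a p-value.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: max |F_a(x) - F_b(x)| over all x.

    F_a and F_b are the empirical cumulative distribution functions
    of the two samples.
    """
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    d_max = 0.0
    while i < len(a) and j < len(b):
        # Advance both pointers past the next smallest value, so that
        # ties within and between samples are handled together.
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d_max = max(d_max, abs(i / len(a) - j / len(b)))
    return d_max
```

Identical samples give a statistic of 0 and completely separated samples give 1; converting the statistic to a p-value requires the KS distribution, which statistical libraries provide.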

Cited by

Effects of host and pathogenicity on mutation rates in avian influenza A viruses.
Kim G, Shin HM, Kim HR, Kim Y. Virus Evol. 2022 Feb 21;8(1):veac013. doi: 10.1093/ve/veac013. PMID: 35295747.
