Improving the identification of antigenic sites in the H1N1 influenza virus through accounting for the experimental structure in a sparse hierarchical Bayesian model

J R Stat Soc Ser C Appl Stat. 2019 Aug;68(4):859-885. doi: 10.1111/rssc.12338. Epub 2019 Feb 3.

Vinny Davies¹, William T Harvey¹, Richard Reeve¹, Dirk Husmeier¹

PMID: 31598013
PMCID: PMC6774336
DOI: 10.1111/rssc.12338

Vinny Davies et al. J R Stat Soc Ser C Appl Stat. 2019 Aug.

Affiliation

¹ University of Glasgow UK.

Abstract

Understanding how genetic changes allow emerging virus strains to escape the protection afforded by vaccination is vital for the maintenance of effective vaccines. We use structural and phylogenetic differences between pairs of virus strains to identify important antigenic sites on the surface of the influenza A(H1N1) virus through the prediction of haemagglutination inhibition (HI) titre: pairwise measures of the antigenic similarity of virus strains. We propose a sparse hierarchical Bayesian model that can deal with the pairwise structure and inherent experimental variability in the H1N1 data through the introduction of latent variables. The latent variables represent the underlying HI titre measurement of any given pair of virus strains and help to account for the fact that, for any HI titre measurement between the same pair of virus strains, the difference in the viral sequence remains the same. Through accurately representing the structure of the H1N1 data, the model can select virus sites which are antigenic, while its latent structure achieves the computational efficiency that is required to deal with large virus sequence data, as typically available for the influenza virus. In addition to the latent variable model, we also propose a new method, the block-integrated widely applicable information criterion biWAIC, for selecting between competing models. We show how this enables us to select the random effects effectively when used with the model proposed and we apply both methods to an A(H1N1) data set.

Keywords: Antigenic variability; Bayesian hierarchical models; Influenza virus; Latent variable models; Markov chain Monte Carlo sampling; Mixed effects models; Spike‐and‐slab prior; Widely applicable information criterion.
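
The point made in the abstract, that repeated HI measurements of the same pair of strains share one set of sequence differences, can be illustrated with a small sketch. This is not the authors' model: it has no spike-and-slab prior, no random effects and no MCMC. It is a plain Gaussian stand-in, and the records, site indicators, coefficients and function names are all made up for illustration. It shows that the squared-error term of a Gaussian likelihood can be computed from one summary per pair (count, mean, within-pair sum of squares) rather than from every measurement, which is the kind of saving a pair-level latent structure makes possible.

```python
from collections import defaultdict

# Toy, made-up records: (virus, antiserum, log2 HI titre). Not real data.
measurements = [
    ("A", "B", 7.0), ("A", "B", 6.0), ("A", "B", 6.5),
    ("A", "C", 4.0), ("A", "C", 5.0),
    ("B", "C", 8.0),
]

# Hypothetical 0/1 indicators of a substitution at three candidate sites,
# stored once per pair because they do not change between repeat measurements.
site_diffs = {
    ("A", "B"): [1, 0, 0],
    ("A", "C"): [1, 1, 0],
    ("B", "C"): [0, 0, 1],
}


def summarise_by_pair(records):
    """Return {pair: (count, mean titre, within-pair sum of squares)}."""
    grouped = defaultdict(list)
    for virus, serum, titre in records:
        grouped[(virus, serum)].append(titre)
    summary = {}
    for pair, titres in grouped.items():
        n = len(titres)
        mean = sum(titres) / n
        within = sum((t - mean) ** 2 for t in titres)
        summary[pair] = (n, mean, within)
    return summary


def predict(pair, intercept, weights):
    """Linear predictor built from the pair's site-difference indicators."""
    return intercept + sum(w * x for w, x in zip(weights, site_diffs[pair]))


def sse_per_measurement(records, intercept, weights):
    """Sum of squared errors computed over every individual measurement."""
    return sum((t - predict((v, s), intercept, weights)) ** 2
               for v, s, t in records)


def sse_per_pair(summary, intercept, weights):
    """The same quantity computed from one summary row per pair."""
    return sum(n * (mean - predict(pair, intercept, weights)) ** 2 + within
               for pair, (n, mean, within) in summary.items())


summary = summarise_by_pair(measurements)
# Arbitrary illustrative coefficients, not estimates from any data set.
full = sse_per_measurement(measurements, 8.0, [-1.5, -2.0, 0.0])
compact = sse_per_pair(summary, 8.0, [-1.5, -2.0, 0.0])
print(summary[("A", "B")][:2])      # (3, 6.5)
print(abs(full - compact) < 1e-9)   # True
```

The two sums agree for any choice of coefficients, because the within-pair scatter does not depend on them. In a data set where each pair is measured many times, the per-pair form touches far fewer rows.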

Figures

**Figure 1**
Three‐dimensional structure of the influenza A(H1N1) haemagglutinin protein coloured by antigenic status: haemagglutinin is exposed on the virus surface and is composed of two regions, HA1 and HA2; HA1 is responsible for binding to host cells and is the primary target for the host immune system; known antigenic sites and the receptor binding site where changes are also expected to cause variation in the HI assay are shown in dark grey (proven regions); plausible antigenic regions in the head domain of haemagglutinin are shown in light grey; implausible antigenic regions in the stalk domain are shown in black, as are surface‐exposed areas of the HA2 part of the protein which was not included in our analysis; this model representation of the surface of haemagglutinin is based on the resolved structure of influenza A(H1N1) strain A/Puerto Rico/8/34 (Gamblin *et al*., 2004)

**Figure 2**
Compact representation of eSABRE as a probabilistic graphical model: , the data and fixed (higher order) hyperparameters: ∘, parameters and hyperparameters that are inferred

**Figure 3**
Boxplots showing the effect of non‐IID Gaussian noise on a model assuming IID Gaussian noise (the boxplots show the probability that an irrelevant variable is included in a model for data with IID Gaussian noise against the probabilities for a model with a noise structure based on the H1N1 data set; the results show that the probability that the irrelevant variable is included in the model decreases as the number of observations increases for the data with IID Gaussian noise; conversely, the probability of its inclusion increases with the number of observations when there is a noise structure based on the H1N1 data set): (a) 500 observations; (b) 1000 observations; (c) 2000 observations

**Figure 4**
Bar plot of F1‐scores given in Table 3: the bar plot compares the F1‐scores for nWAIC, biWAIC and Bayesian tenfold ICV in terms of correctly selecting random‐effect components for the data set described in Section 6.1.3

**Figure 5**
Plot of sensitivities and 1 minus specificities for the results given in Table 3: the plot compares nWAIC (∘), biWAIC (×) and Bayesian tenfold iCV (▵) in terms of correctly selecting random‐effect components for the data set described in Section 6.1.3; the figure takes the results from Table 3 and plots the sensitivities against the complementary specificities (i.e. 1 minus specificities), i.e. as single points from a receiver operating characteristic curve

**Figure 6**
Three‐dimensional structure of the influenza A(H1N1) haemagglutinin protein showing the positions of proven and plausible antigenic residues identified by using eSABRE: (a) proven residues (black) selected by eSABRE; (b) labelled plausible residues (black) where the biologically proven sites from Fig. 1 are shown in dark grey; the representation of the surface of haemagglutinin is based on the resolved structure of influenza A(H1N1) strain A/Puerto Rico/8/34 (Gamblin *et al*., 2004)
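
The quantities plotted in Figures 4 and 5 (F1-score, sensitivity and 1 minus specificity) can be computed as in the sketch below. It uses the standard textbook definitions, which are assumed here to be the ones behind the figures. The function name and the component labels `c1` to `c5` are hypothetical and do not come from the paper.

```python
def selection_metrics(candidates, truly_relevant, selected):
    """Score a selected set of components against a known true set."""
    candidates = set(candidates)
    truth = set(truly_relevant)
    chosen = set(selected)

    tp = len(truth & chosen)               # relevant and selected
    fp = len(chosen - truth)               # selected but irrelevant
    fn = len(truth - chosen)               # relevant but missed
    tn = len(candidates - truth - chosen)  # irrelevant and left out

    nan = float("nan")
    sensitivity = tp / (tp + fn) if (tp + fn) else nan
    specificity = tn / (tn + fp) if (tn + fp) else nan
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else nan
    return {
        "sensitivity": sensitivity,
        "one_minus_specificity": 1 - specificity,
        "f1": f1,
    }


# Hypothetical example: five candidate random-effect components,
# two of which are truly relevant, and a method that selects two.
candidates = ["c1", "c2", "c3", "c4", "c5"]
result = selection_metrics(candidates,
                           truly_relevant=["c1", "c3"],
                           selected=["c1", "c4"])
print(result)
```

Each selection method contributes one (1 minus specificity, sensitivity) pair per simulated data set, which is why Figure 5 shows single points rather than a full receiver operating characteristic curve.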

Cited by

A Bayesian approach to incorporate structural data into the mapping of genotype to antigenic phenotype of influenza A(H3N2) viruses.
Harvey WT, Davies V, Daniels RS, Whittaker L, Gregory V, Hay AJ, Husmeier D, McCauley JW, Reeve R. PLoS Comput Biol. 2023 Mar 27;19(3):e1010885. doi: 10.1371/journal.pcbi.1010885. eCollection 2023 Mar. PMID: 36972311. Free PMC article.

Antigenic characterization of influenza and SARS-CoV-2 viruses.
Wang Y, Tang CY, Wan XF. Anal Bioanal Chem. 2022 Apr;414(9):2841-2881. doi: 10.1007/s00216-021-03806-6. Epub 2021 Dec 14. PMID: 34905077. Free PMC article. Review.

References

1. Andrieu, C. and Doucet, A. (1999) Joint Bayesian model selection and estimation of noisy sinusoids via reversible jump MCMC. IEEE Trans. Signl Process., 47, 2667–2676.
2. Barr, I. G., Russell, C., Besselaar, T. G., Cox, N. J., Daniels, R. S., Donis, R., Engelhardt, O. G., Grohmann, G., Itamura, S., Kelso, A., McCauley, J., Odagiri, T., Schultz‐Cherry, S., Shu, Y., Smith, D., Tashiro, M., Wang, D., Webby, R., Xu, X., Ye, Z. and Zhang, W. (2014) WHO recommendations for the viruses used in the 2013‐2014 Northern Hemisphere influenza vaccine: epidemiology, antigenic and genetic characteristics of influenza A(H1N1)pdm09, A(H3N2) and B influenza viruses collected from October 2012 to January 2013. Vaccine, 32, 4713–4725.
3. Caton, A. J., Brownlee, G. G., Yewdell, J. W. and Gerhard, W. (1982) The antigenic structure of the influenza virus A/PR/8/34 hemagglutinin (H1 subtype). Cell, 31, part 1, 417–427.
4. Davies, V. (2016) Sparse hierarchical Bayesian models for detecting relevant antigenic sites in virus evolution. PhD Thesis. University of Glasgow, Glasgow.
5. Davies, V., Reeve, R., Harvey, W. T. and Husmeier, D. (2016) Selecting random effect components in a sparse hierarchical Bayesian model for identifying antigenic variability. In Computational Intelligence Methods for Bioinformatics and Biostatistics (eds Angelini C., Rancoita P. M. V. and Rovetta S.), pp. 14–27. Cham: Springer.
