J R Stat Soc Ser C Appl Stat. 2019 Aug;68(4):859-885.
doi: 10.1111/rssc.12338. Epub 2019 Feb 3.

Improving the identification of antigenic sites in the H1N1 influenza virus through accounting for the experimental structure in a sparse hierarchical Bayesian model

Vinny Davies et al. J R Stat Soc Ser C Appl Stat. 2019 Aug.

Abstract

Understanding how genetic changes allow emerging virus strains to escape the protection afforded by vaccination is vital for the maintenance of effective vaccines. We use structural and phylogenetic differences between pairs of virus strains to identify important antigenic sites on the surface of the influenza A(H1N1) virus through the prediction of haemagglutination inhibition (HI) titre: pairwise measures of the antigenic similarity of virus strains. We propose a sparse hierarchical Bayesian model that can deal with the pairwise structure and inherent experimental variability in the H1N1 data through the introduction of latent variables. The latent variables represent the underlying HI titre measurement of any given pair of virus strains and help to account for the fact that, for any HI titre measurement between the same pair of virus strains, the difference in the viral sequence remains the same. Through accurately representing the structure of the H1N1 data, the model can select virus sites which are antigenic, while its latent structure achieves the computational efficiency that is required to deal with large virus sequence data, as typically available for the influenza virus. In addition to the latent variable model, we also propose a new method, the block-integrated widely applicable information criterion biWAIC, for selecting between competing models. We show how this enables us to select the random effects effectively when used with the model proposed and we apply both methods to an A(H1N1) data set.

Keywords: Antigenic variability; Bayesian hierarchical models; Influenza virus; Latent variable models; Markov chain Monte Carlo sampling; Mixed effects models; Spike‐and‐slab prior; Widely applicable information criterion.
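As a rough illustration of the kind of quantity involved in information-criterion model selection, the sketch below computes the standard widely applicable information criterion (WAIC) from posterior samples of pointwise log-likelihoods. It is not the block-integrated variant (biWAIC) proposed in the paper, whose definition is not given in the abstract. The function name `waic`, the samples-by-observations array layout, and the deviance scaling by -2 are assumptions made for this example.

```python
import numpy as np


def waic(log_lik):
    """Standard WAIC on the deviance scale (illustrative, not biWAIC).

    log_lik: array of shape (n_samples, n_obs) holding the log-likelihood
    of each observation under each posterior sample (assumed layout).
    Returns (waic, lppd, p_waic).
    """
    log_lik = np.asarray(log_lik, dtype=float)
    if log_lik.ndim != 2 or log_lik.shape[0] < 2:
        raise ValueError("log_lik must be 2-D with at least two samples")

    # Log pointwise predictive density: log of the posterior-mean likelihood
    # per observation, computed stably by factoring out the column maximum.
    col_max = log_lik.max(axis=0)
    lppd_i = col_max + np.log(np.exp(log_lik - col_max).mean(axis=0))

    # Effective number of parameters: posterior variance of the
    # log-likelihood, summed over observations.
    p_waic_i = log_lik.var(axis=0, ddof=1)

    lppd = lppd_i.sum()
    p_waic = p_waic_i.sum()
    return -2.0 * (lppd - p_waic), lppd, p_waic


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical log-likelihood draws for two competing models.
    model_a = rng.normal(-1.0, 0.1, size=(1000, 50))
    model_b = rng.normal(-1.2, 0.1, size=(1000, 50))
    print("WAIC A:", waic(model_a)[0])
    print("WAIC B:", waic(model_b)[0])
```

On this scale a lower value indicates better estimated predictive performance, so competing models would be compared by their WAIC values.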

Figures

Figure 1
Three‐dimensional structure of the influenza A(H1N1) haemagglutinin protein coloured by antigenic status: haemagglutinin is exposed on the virus surface and is composed of two regions, HA1 and HA2; HA1 is responsible for binding to host cells and is the primary target for the host immune system; known antigenic sites and the receptor binding site where changes are also expected to cause variation in the HI assay are shown in dark grey (proven regions); plausible antigenic regions in the head domain of haemagglutinin are shown in light grey; implausible antigenic regions in the stalk domain are shown in black, as are surface‐exposed areas of the HA2 part of the protein which was not included in our analysis; this model representation of the surface of haemagglutinin is based on the resolved structure of influenza A(H1N1) strain A/Puerto Rico/8/34 (Gamblin et al., 2004)
Figure 2
Compact representation of eSABRE as a probabilistic graphical model: [symbol not reproduced], the data and fixed (higher order) hyperparameters; ∘, parameters and hyperparameters that are inferred
Figure 3
Boxplots showing the effect of non‐IID Gaussian noise on a model assuming IID Gaussian noise: the boxplots compare the probability that an irrelevant variable is included in a model for data with IID Gaussian noise against the corresponding probabilities for data with a noise structure based on the H1N1 data set (the legend symbols distinguishing the two cases are not reproduced here); with IID Gaussian noise, the probability that the irrelevant variable is included decreases as the number of observations increases, whereas with the H1N1‐based noise structure it increases: (a) 500 observations; (b) 1000 observations; (c) 2000 observations
Figure 4
Bar plot of the F1‐scores given in Table 3: the bar plot compares the F1‐scores for nWAIC, biWAIC and Bayesian tenfold ICV in terms of correctly selecting random‐effect components for the data set described in Section 6.1.3 (the legend symbols are not reproduced here)
Figure 5
Plot of sensitivities and 1 minus specificities for the results given in Table 3: the plot compares nWAIC (∘), biWAIC (×) and Bayesian tenfold iCV (▵) in terms of correctly selecting random‐effect components for the data set described in Section 6.1.3; the sensitivities are plotted against the complementary specificities (1 minus specificity), so each method appears as a single point on a receiver operating characteristic curve
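The quantities compared in Figs 4 and 5 (F1‐score, sensitivity and 1 minus specificity) can be computed from a set of true and selected components as in the minimal sketch below. The function name `selection_metrics` and the 0/1 indicator encoding are assumptions made for illustration; the values shown are made up and are not taken from Table 3.

```python
def selection_metrics(truth, selected):
    """Compare selected components against the truth.

    truth, selected: equal-length sequences of 0/1 indicators
    (1 = component is relevant / was selected). Hypothetical encoding.
    Returns a dict with sensitivity, specificity, the false positive
    rate (1 minus specificity) and the F1-score.
    """
    if len(truth) != len(selected):
        raise ValueError("truth and selected must have the same length")

    tp = sum(1 for t, s in zip(truth, selected) if t and s)
    fp = sum(1 for t, s in zip(truth, selected) if not t and s)
    fn = sum(1 for t, s in zip(truth, selected) if t and not s)
    tn = sum(1 for t, s in zip(truth, selected) if not t and not s)

    def ratio(num, den):
        # Return 0.0 when a rate is undefined (empty denominator).
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)   # true positive rate (recall)
    specificity = ratio(tn, tn + fp)   # true negative rate
    precision = ratio(tp, tp + fp)
    f1 = ratio(2 * precision * sensitivity, precision + sensitivity)

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "false_positive_rate": 1.0 - specificity,
        "f1": f1,
    }


if __name__ == "__main__":
    # Made-up example: six candidate components, three truly relevant.
    truth = [1, 1, 0, 0, 1, 0]
    selected = [1, 0, 0, 1, 1, 0]
    print(selection_metrics(truth, selected))
```

Plotting `sensitivity` against `false_positive_rate` for each method gives one point per method in receiver operating characteristic space, which is the layout described for Fig. 5.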
Figure 6
Three‐dimensional structure of the influenza A(H1N1) haemagglutinin protein showing the positions of proven and plausible antigenic residues identified by using eSABRE: (a) proven residues (black) selected by eSABRE; (b) labelled plausible residues (black) where the biologically proven sites from Fig. 1 are shown in dark grey; the representation of the surface of haemagglutinin is based on the resolved structure of influenza A(H1N1) strain A/Puerto Rico/8/34 (Gamblin et al., 2004)

References

    1. Andrieu, C. and Doucet, A. (1999) Joint Bayesian model selection and estimation of noisy sinusoids via reversible jump MCMC. IEEE Trans. Signl Process., 47, 2667–2676.
    2. Barr, I. G., Russell, C., Besselaar, T. G., Cox, N. J., Daniels, R. S., Donis, R., Engelhardt, O. G., Grohmann, G., Itamura, S., Kelso, A., McCauley, J., Odagiri, T., Schultz‐Cherry, S., Shu, Y., Smith, D., Tashiro, M., Wang, D., Webby, R., Xu, X., Ye, Z. and Zhang, W. (2014) WHO recommendations for the viruses used in the 2013‐2014 Northern Hemisphere influenza vaccine: epidemiology, antigenic and genetic characteristics of influenza A(H1N1)pdm09, A(H3N2) and B influenza viruses collected from October 2012 to January 2013. Vaccine, 32, 4713–4725.
    3. Caton, A. J., Brownlee, G. G., Yewdell, J. W. and Gerhard, W. (1982) The antigenic structure of the influenza virus A/PR/8/34 hemagglutinin (H1 subtype). Cell, 31, part 1, 417–427.
    4. Davies, V. (2016) Sparse hierarchical Bayesian models for detecting relevant antigenic sites in virus evolution. PhD Thesis. University of Glasgow, Glasgow.
    5. Davies, V., Reeve, R., Harvey, W. T. and Husmeier, D. (2016) Selecting random effect components in a sparse hierarchical Bayesian model for identifying antigenic variability. In Computational Intelligence Methods for Bioinformatics and Biostatistics (eds Angelini C., Rancoita P. M. V. and Rovetta S.), pp. 14–27. Cham: Springer.