PLoS Comput Biol. 2023 Aug 14;19(8):e1011384.
doi: 10.1371/journal.pcbi.1011384. eCollection 2023 Aug.

serosim: An R package for simulating serological data arising from vaccination, epidemiological and antibody kinetics processes

Arthur Menezes et al. PLoS Comput Biol. 2023.

Abstract

serosim is an open-source R package designed to aid inference from serological studies by simulating data arising from user-specified vaccine and antibody kinetics processes using a random effects model. Serological data are used to assess population immunity by directly measuring individuals' antibody titers. They uncover locations and/or populations that are susceptible and provide evidence of past infection or vaccination to help inform public health measures and surveillance. Both serological data and the new analytical techniques used to interpret them are increasingly widespread. This creates a need for tools to simulate serological studies and the processes underlying observed titer values, as this will enable researchers to identify best practices for serological study design and provide a standardized framework to evaluate the performance of different inference methods. serosim allows users to specify and adjust model inputs representing the underlying processes responsible for generating the observed titer values, such as time-varying patterns of infection and vaccination, population demography, immunity and antibody kinetics, and serological sampling design, in order to best represent the population and disease system(s) of interest. This package will be useful for planning the sampling design of future serological studies, understanding determinants of observed serological data, and validating the accuracy and power of new statistical methods.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Directed acyclic graph representation of the full serosim model.
Each model level is shown within a box with stochastic dependencies depicted by a solid arrow and deterministic dependencies by a dashed arrow. Parameters/latent states of interest are depicted within the blue circles while the red square represents the observed state. The unobserved processes level (latent states) contains the epidemiological model (exposure model and immunity model) and the antibody model while the observed processes level contains the observation model. The probability of a successful exposure event x for individual i at time t (ϕ_{i,t,x}) is determined by the user-specified exposure and immunity models (1,2,3) while Z_{i,t,x} is the binary state indicating whether individual i was infected or vaccinated by exposure event x at time t as determined by a Bernoulli trial (4). The antibody model (5) specifies how the true quantity of biomarker b for individual i at time t (A_{i,t,b}) is generated. Lastly, the observation model (6) specifies how the observed quantity of biomarker b for individual i at time t (Y_{i,t,b}) is generated as a probabilistic function of the true, latent biomarker quantity A. Created with BioRender.com.
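The hierarchy in Fig 1 can be illustrated with a minimal Python sketch for a single individual, exposure type and biomarker. This is not the serosim R API: the function name `simulate_individual`, its parameters, and every functional form (a constant force of exposure, exponential waning with a fixed boost, additive Gaussian noise truncated at zero) are assumptions chosen only to make the four levels concrete.

```python
import math
import random


def simulate_individual(n_times, foe, boost, wane, obs_sd, rng):
    """Simulate exposure states, latent biomarker levels and observations.

    All functional forms are illustrative assumptions, not serosim's models.
    """
    z = []  # Z_{i,t,x}: 1 if a successful exposure occurred at time t
    a = []  # A_{i,t,b}: true (latent) biomarker quantity
    y = []  # Y_{i,t,b}: observed biomarker quantity
    level = 0.0
    for _ in range(n_times):
        # Exposure and immunity models (assumed): a constant probability
        # of successful exposure derived from a force of exposure.
        phi = 1.0 - math.exp(-foe)
        # Bernoulli trial deciding the exposure state.
        exposed = 1 if rng.random() < phi else 0
        # Antibody model (assumed): exponential waning plus a fixed boost.
        level = level * math.exp(-wane) + boost * exposed
        # Observation model (assumed): Gaussian noise, truncated at zero.
        observed = max(0.0, rng.gauss(level, obs_sd))
        z.append(exposed)
        a.append(level)
        y.append(observed)
    return z, a, y


rng = random.Random(2023)
z, a, y = simulate_individual(
    n_times=120, foe=0.02, boost=4.0, wane=0.01, obs_sd=0.5, rng=rng
)
print(sum(z), "successful exposures over", len(z), "time steps")
```

In a real analysis only `y`, and usually only a few time points of it, would be available; `z` and `a` correspond to the latent states that inference methods try to recover.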
Fig 2. Exposure to biomarker mapping example scenarios.
The biomarker map specifies the exposure events x and biomarkers b of interest for the user’s simulation. Exposure events x are defined as any event that leads to biomarker production. Biomarkers b can represent antibodies against the entire virus, a specific epitope or a specific antibody class, depending on the user’s preference and assay characteristics. Here, we provide various examples of biomarker maps to illustrate the flexibility over the data-generating process provided to the user. Created with BioRender.com.
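One way to picture a biomarker map is as a list of (exposure event, biomarker) pairs. The Python sketch below is a hypothetical illustration and does not reproduce serosim's own input format; the exposure and biomarker labels and the helper `biomarkers_by_exposure` are invented for the example.

```python
from collections import defaultdict

# Hypothetical scenario: infection boosts two biomarkers,
# while vaccination boosts only one of them.
biomarker_map = [
    ("infection", "IgG_whole_virus"),
    ("infection", "IgG_epitope_A"),
    ("vaccination", "IgG_epitope_A"),
]


def biomarkers_by_exposure(pairs):
    """Group biomarkers by the exposure event that produces them."""
    grouped = defaultdict(list)
    for exposure, biomarker in pairs:
        grouped[exposure].append(biomarker)
    return dict(grouped)


for exposure, biomarkers in biomarkers_by_exposure(biomarker_map).items():
    print(exposure, "->", biomarkers)
```

A map where one biomarker is shared between exposure types, as here, is the kind of scenario in which observed titers alone cannot distinguish infection from vaccination.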
Fig 3. Required inputs, models and subsequent outputs from the main serosim function, runserosim.
To build a simulation with serosim, users must follow the 7 steps outlined here and in the methods section to specify the required inputs and models for runserosim. Steps 1–3 specify initial simulation inputs while steps 4–7 specify the bulk of the unobserved and observed processes. For steps 4–7, we outline the user-specified inputs in the left column, which are used for the user-specified models as depicted by the sampling statements in the middle column. Lastly, the outputs generated once the simulation is complete are depicted in the right column. Created with BioRender.com.
Fig 4. Individual exposure probabilities and individual immune history plots from a simulation with two exposure types.
The left-hand plot displays the probability of successful (immunity-boosting) exposure for a simulation with 120 time steps, two exposure events and 100 individuals. The right-hand plot displays the immune histories for the same simulation. Exposure event one (top row) represents an infection event and exposure event two (bottom row) represents a vaccination event. NA indicates that an individual was not available to be exposed in that time period, usually because they had not yet been born or had not yet entered the study population.
Fig 5. True and observed biomarker quantity plots.
The left-hand plot displays the true antibody kinetics for a simulation with 120 time steps, one biomarker and 100 individuals. This plot displays true biomarker quantities for all 100 individuals at all time steps. The right-hand plot displays the observed biomarker quantity at the observation time (t = 120) given the specified observation model, similar to a cross-sectional survey. In this example, all individuals alive at the endpoint (t = 120) had their biomarker quantity, in this case antibody titer, measured with a continuous assay with user-specified noise, sensitivity and specificity. The left-hand plot represents the unobserved process level within serosim generated by the exposure, immunity and antibody models while the right-hand plot represents the observed data generated by the observation model (Fig 1). The true antibody kinetics for each individual (left-hand plot) are not known in real-world settings, where researchers only have cross-sectional antibody titers (right-hand plot).
Fig 6. runserosim run times for simulations of varying complexities.
We ran the runserosim function 100 times and report the run times under various simulation settings (number of individuals and time steps). Both parallelization and pre-computation within runserosim were turned on and 8 cores were specified. All four of these example cases are included in serosim. Each case study varies in complexity (S9 Table). The increase in run time between case study 1 and case study 2 is due to a computationally complex exposure model which modifies each individual’s force of exposure based on their age and nutritional status. Case study 2 was not run with 5000 individuals and 500 time steps due to the time required to run the simulations.
Fig 7. Sensitivity and specificity for varying thresholds of seropositivity in simulation-recovery experiments.
We simulated case study one 100 times and stored individual exposure history and observed biomarker quantities. We then calculated the number of true positives, true negatives, false positives and false negatives for identifying infections using various titer thresholds for seropositivity ranging from 100 mIU/mL to 350 mIU/mL. Here, we plotted the sensitivity and specificity achieved at each of those titer thresholds.
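The threshold calculation described in this caption can be sketched in Python as follows. The titers and infection labels are toy values invented for illustration, not results from the paper, and treating a titer at or above the threshold as seropositive is an assumption about the cutoff convention.

```python
def sensitivity_specificity(titers, infected, threshold):
    """Return (sensitivity, specificity) for one seropositivity threshold.

    A titer at or above the threshold counts as seropositive (assumed).
    """
    tp = fn = tn = fp = 0
    for titer, was_infected in zip(titers, infected):
        seropositive = titer >= threshold
        if was_infected and seropositive:
            tp += 1
        elif was_infected:
            fn += 1
        elif seropositive:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy observed titers (mIU/mL) and true infection status, made up for the demo.
titers = [50, 120, 200, 400, 90, 310]
infected = [False, False, True, True, True, False]

for threshold in range(100, 351, 50):
    sens, spec = sensitivity_specificity(titers, infected, threshold)
    print(f"threshold={threshold}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Raising the threshold trades sensitivity for specificity, which is the pattern a plot of this kind is meant to expose.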
