PLoS Comput Biol. 2023 Aug 14;19(8):e1011384.

doi: 10.1371/journal.pcbi.1011384. eCollection 2023 Aug.

serosim: An R package for simulating serological data arising from vaccination, epidemiological and antibody kinetics processes

Arthur Menezes¹, Saki Takahashi², Isobel Routledge³, C Jessica E Metcalf¹,⁴, Andrea L Graham¹,⁵, James A Hay⁶,⁷

Affiliations

¹ Department of Ecology and Evolutionary Biology, Princeton University, Princeton, New Jersey, United States of America.
² Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, United States of America.
³ Department of Medicine, University of California San Francisco, San Francisco, California, United States of America.
⁴ Princeton School of Public and International Affairs, Princeton University, Princeton, New Jersey, United States of America.
⁵ Santa Fe Institute, Santa Fe, New Mexico, United States of America.
⁶ Big Data Institute, Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom.
⁷ Center for Communicable Disease Dynamics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, United States of America.

PMID: 37578985
PMCID: PMC10449138
DOI: 10.1371/journal.pcbi.1011384

Arthur Menezes et al. PLoS Comput Biol. 2023.

Abstract

serosim is an open-source R package designed to aid inference from serological studies, by simulating data arising from user-specified vaccine and antibody kinetics processes using a random effects model. Serological data are used to assess population immunity by directly measuring individuals' antibody titers. They uncover locations and/or populations which are susceptible and provide evidence of past infection or vaccination to help inform public health measures and surveillance. Both serological data and new analytical techniques used to interpret them are increasingly widespread. This creates a need for tools to simulate serological studies and the processes underlying observed titer values, as this will enable researchers to identify best practices for serological study design, and provide a standardized framework to evaluate the performance of different inference methods. serosim allows users to specify and adjust model inputs representing underlying processes responsible for generating the observed titer values like time-varying patterns of infection and vaccination, population demography, immunity and antibody kinetics, and serological sampling design in order to best represent the population and disease system(s) of interest. This package will be useful for planning sampling design of future serological studies, understanding determinants of observed serological data, and validating the accuracy and power of new statistical methods.

Copyright: © 2023 Menezes et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

**Fig 1. Directed acyclic graph representation of the full *serosim* model.**
Each model level is shown within a box with stochastic dependencies depicted by a solid arrow and deterministic dependencies by a dashed arrow. Parameters/latent states of interest are depicted within the blue circles while the red square represents the observed state. The unobserved processes level (latent states) contains the epidemiological model (exposure model and immunity model) and the antibody model while the observed processes level contains the observation model. The probability of a successful exposure event $x$ for individual $i$ at time $t$ ($\phi_{i,t,x}$) is determined by the user-specified exposure and immunity models (1,2,3) while $Z_{i,t,x}$ is the binary state indicating whether individual $i$ was infected or vaccinated by exposure event $x$ at time $t$ as determined by a Bernoulli trial (4). The antibody model (5) specifies how the true quantity of biomarker $b$ for individual $i$ at time $t$ ($A_{i,t,b}$) is generated. Lastly, the observation model (6) specifies how the observed quantity of biomarker $b$ for individual $i$ at time $t$ ($Y_{i,t,b}$) is generated as a probabilistic function of the true, latent biomarker quantity $A$. Created with BioRender.com.
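
The four model levels in this caption can be sketched for a single individual, one exposure type and one biomarker. This is a minimal Python illustration, not the *serosim* API (which is an R package): the function name `simulate_individual`, the parameters `foe`, `boost`, `wane` and `obs_sd`, and every functional form (constant force of exposure, complete immunity after one event, exponential waning, Gaussian noise) are assumptions made for the example.

```python
import math
import random


def simulate_individual(n_times, foe, boost, wane, obs_sd, rng):
    """Simulate one individual, one exposure type and one biomarker.

    Every functional form below is an illustrative assumption.
    Returns three lists indexed by time: Z (exposure states),
    A (true biomarker quantity) and Y (observed biomarker quantity).
    """
    immune = False
    a = 0.0
    Z, A, Y = [], [], []
    for _ in range(n_times):
        # Exposure and immunity models: probability of a successful exposure.
        # Assumed: constant force of exposure, complete immunity after one event.
        phi = 0.0 if immune else 1.0 - math.exp(-foe)
        # Bernoulli trial giving the binary exposure state.
        z = 1 if rng.random() < phi else 0
        immune = immune or z == 1
        # Antibody model: assumed exponential waning plus a fixed boost.
        a = a * math.exp(-wane) + boost * z
        # Observation model: assumed Gaussian noise, truncated at zero.
        y = max(0.0, rng.gauss(a, obs_sd))
        Z.append(z)
        A.append(a)
        Y.append(y)
    return Z, A, Y


rng = random.Random(1)
Z, A, Y = simulate_individual(
    n_times=120, foe=0.02, boost=4.0, wane=0.01, obs_sd=0.25, rng=rng
)
print(sum(Z), round(A[-1], 2), round(Y[-1], 2))
```

Only `Y` would be available in a real study; `Z` and `A` are the latent states that inference methods try to recover.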

**Fig 2. Exposure to biomarker mapping example scenarios.**
The biomarker map specifies the exposure events x and biomarkers b of interest for the user’s simulation. Exposure events x are defined as any event which leads to biomarker production. Biomarkers b can represent antibodies against the entire virus, a specific epitope or a specific antibody class depending on the user’s preference and assay characteristic. Here, we provide various examples of biomarker maps to illustrate the flexibility over the data-generating process provided to the user. Created with BioRender.com.
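
One way to picture a biomarker map is as a list of (exposure event, biomarker) pairs. The Python sketch below is illustrative only: the labels (`"infection"`, `"vaccination"`, `"IgG"`, `"epitope_1"` and so on), the scenario names and the two helper functions are hypothetical and do not reproduce the scenarios drawn in the figure or the package's own data format.

```python
# Each pair links one exposure event to one biomarker it stimulates.
# All labels are hypothetical placeholders.
one_to_one = [("infection", "IgG")]
two_to_one = [("infection", "IgG"), ("vaccination", "IgG")]
one_to_many = [("infection", "IgG"), ("infection", "IgA")]
mixed = [
    ("infection", "epitope_1"),
    ("infection", "epitope_2"),
    ("vaccination", "epitope_2"),
]


def biomarkers_boosted_by(biomarker_map, exposure):
    """Return the sorted biomarkers produced after a given exposure event."""
    return sorted({b for x, b in biomarker_map if x == exposure})


def exposures_boosting(biomarker_map, biomarker):
    """Return the sorted exposure events that produce a given biomarker."""
    return sorted({x for x, b in biomarker_map if b == biomarker})


print(biomarkers_boosted_by(mixed, "infection"))
print(exposures_boosting(mixed, "epitope_2"))
```

In the `mixed` scenario a measurement of `epitope_1` alone would separate infection from vaccination, while `epitope_2` could not.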

**Fig 3. Required inputs, models and subsequent outputs from the main *serosim* function, runserosim.**
To build a simulation with *serosim*, users must follow the 7 steps outlined here and in the methods section to specify the required inputs and models for **runserosim**. Steps 1–3 specify initial simulation inputs, while steps 4–7 specify the bulk of the unobserved and observed processes. For steps 4–7, the left column outlines the user-specified inputs, which feed the user-specified models depicted by the sampling statements in the middle column. The right column shows the outputs generated once the simulation is complete. Created with BioRender.com.

**Fig 4. Individual exposure probabilities and individual immune history plots from a simulation with two exposure types.**
The left-hand plot displays the probability of successful (immunity-boosting) exposure for a simulation with 120 time steps, two exposure events and 100 individuals. The right-hand plot displays the immune histories for the same simulation. Exposure event one (top row) represents an infection event and exposure event two (bottom row) represents a vaccination event. NA indicates that an individual was not available to be exposed in that time period, usually because they had not yet been born or had not yet entered the study population.

**Fig 5. True and observed biomarker quantity plots.**
The left-hand plot displays the true antibody kinetics for a simulation with 120 time steps, one biomarker and 100 individuals, showing true biomarker quantities for all 100 individuals at all time steps. The right-hand plot displays the observed biomarker quantity at the observation time (t = 120) given the specified observation model, similar to a cross-sectional survey. In this example, all individuals alive at the endpoint (t = 120) had their biomarker quantity, here antibody titer, measured with a continuous assay with user-specified noise, sensitivity and specificity. The left-hand plot represents the unobserved process level within *serosim*, generated by the exposure, immunity and antibody models, while the right-hand plot represents the observed data generated by the observation model (Fig 1). The true antibody kinetics for each individual (left-hand plot) are not known in real-world settings, where researchers only have cross-sectional antibody titers (right-hand plot).

**Fig 6. runserosim run times for simulations of varying complexities.**
We ran the **runserosim** function 100 times and report the run times under various simulation settings (number of individuals and time steps). Both parallelization and pre-computation within **runserosim** were turned on and 8 cores were specified. All four of these example cases are included in *serosim*. Each case study varies in complexity (S9 Table). The increase in run time between case study 1 and case study 2 is due to a computationally complex exposure model which modifies each individual’s force of exposure based on their age and nutritional status. Case study 2 was not run for 5000 individuals for 500 time steps due to the time required to run the simulations.

**Fig 7. Sensitivity and specificity for varying thresholds of seropositivity in simulation-recovery experiments.**
We simulated case study one 100 times and stored individual exposure history and observed biomarker quantities. We then calculated the number of true positives, true negatives, false positives and false negatives for identifying infections using various titer thresholds for seropositivity ranging from 100 mIU/mL to 350 mIU/mL. Here, we plotted the sensitivity and specificity achieved at each of those titer thresholds.
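
The threshold calculation described in this caption can be sketched as follows. This is a hedged Python illustration rather than the authors' code: the function name `sensitivity_specificity`, the toy `infected` and `titers` values, the 50 mIU/mL step between thresholds and the use of `>=` to define seropositivity are all assumptions.

```python
def sensitivity_specificity(infected, titers, threshold):
    """Classify each titer against a threshold and score it against the truth."""
    tp = fp = tn = fn = 0
    for was_infected, titer in zip(infected, titers):
        seropositive = titer >= threshold  # assumed definition of seropositivity
        if was_infected and seropositive:
            tp += 1
        elif was_infected:
            fn += 1
        elif seropositive:
            fp += 1
        else:
            tn += 1
    # Guard against empty classes so the function never divides by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy data (made up for illustration): true infection status and observed
# titers in mIU/mL for six individuals.
infected = [True, True, True, False, False, False]
titers = [400.0, 250.0, 120.0, 90.0, 150.0, 300.0]

for threshold in range(100, 351, 50):
    sens, spec = sensitivity_specificity(infected, titers, threshold)
    print(f"{threshold} mIU/mL: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Raising the threshold trades sensitivity for specificity, which is the pattern the figure plots across the simulated data sets.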

Cited by

- Hay JA, Routledge I, Takahashi S. Serodynamics: A primer and synthetic review of methods for epidemiological inference using serological data. Epidemics. 2024 Dec;49:100806. doi: 10.1016/j.epidem.2024.100806. Epub 2024 Nov 30. PMID: 39647462. Free PMC article. Review.
- Hodgson D, Hay J, Jarju S, Jobe D, Wenlock R, de Silva TI, Kucharski AJ. serojump: A Bayesian tool for inferring infection timing and antibody kinetics from longitudinal serological data. medRxiv [Preprint]. 2025 Mar 5:2025.03.04.25323335. doi: 10.1101/2025.03.04.25323335. PMID: 40093253. Free PMC article. Preprint.
- Hozé N, Pons-Salort M, Metcalf CJE, White M, Salje H, Cauchemez S. RSero: A user-friendly R package to reconstruct pathogen circulation history from seroprevalence studies. PLoS Comput Biol. 2025 Feb 3;21(2):e1012777. doi: 10.1371/journal.pcbi.1012777. eCollection 2025 Feb. PMID: 39899643. Free PMC article.
- Lai KW, Orwa C, Seidman JC, Garrett DO, Saha SK, Tamrakar D, Qamar FN, Charles RC, Andrews JR, Aiemjoy K, Morrison DE. serocalculator, an R package for estimating seroincidence from cross-sectional serological data. medRxiv [Preprint]. 2025 Aug 4:2025.06.04.25328941. doi: 10.1101/2025.06.04.25328941. PMID: 40502584. Free PMC article. Preprint.
- Hamins-Puértolas M, Buddhari D, Salje H, Huang AT, Hunsawong T, Cummings DAT, Fernandez S, Farmer A, Kaewhiran S, Khampaen D, Srikiatkhachorn A, Iamsirithaworn S, Waickman A, Thomas SJ, Endy T, Rothman AL, Anderson KB, Rodriguez-Barraquer I. Linking multiple serological assays to infer dengue virus infections from paired samples using mixture models. medRxiv [Preprint]. 2024 Dec 10:2024.12.08.24318683. doi: 10.1101/2024.12.08.24318683. PMID: 39711706. Free PMC article. Preprint.

Grants and funding

WT_/Wellcome Trust/United Kingdom
