[Preprint]. 2025 May 2:2025.04.30.25326766.
doi: 10.1101/2025.04.30.25326766.

Calculating epidemiological outcomes from simulated longitudinal data

Selina Pi et al. medRxiv. 2025.

Abstract

Microsimulation models generate individual life trajectories that must be summarized as population-level outcomes for model calibration and validation. While there are established formulas to calculate outcomes such as prevalence, incidence, and lifetime risk from cross-sectional and short-term longitudinal studies, limited guidance exists to calculate these outcomes using long-term longitudinal data due to the rarity of large-scale studies covering events across the human lifespan. This technical report presents various methods to calculate epidemiological outcomes from simulated longitudinal data, from replicating a real-world study design to fully incorporating longitudinal disease and exposure durations. We provide an open-source code base with functions in R to calculate the prevalence, incidence, age-conditional risk, lifetime risk, and disease-specific mortality of a condition from individual-level time-to-event data. In addition, we provide guidance and code for calculating cancer-related outcomes from individual-level data, such as the stage distribution at diagnosis, the distribution of precancerous lesion multiplicity, and the mean dwell and sojourn time. Given the various possible formulations for certain outcomes, we call for increased transparency in reporting how summary outcomes are derived from microsimulation model outputs, and we anticipate that this report will facilitate the calculation of epidemiological outcomes in both simulated and real-world data.

Keywords: epidemiology; incidence; microsimulation; prevalence.
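As an illustration of how two of the outcomes named in the abstract can be derived from individual-level time-to-event data, the following is a minimal sketch in Python. It is not the report's R code base: the `Person` record, the function names, and the four example trajectories are hypothetical, and the sketch assumes a chronic condition with no recovery, with each person followed from birth to death. Point prevalence is taken as the share of people alive at a given age who already have the condition, and the incidence rate as new cases per person-year at risk within an age window. Other formulations are possible, which is the point the abstract makes about reporting how outcomes are derived.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Person:
    onset_age: Optional[float]  # age at disease onset; None if never diseased
    death_age: float            # age at death (end of the simulated trajectory)


def point_prevalence(people: Sequence[Person], age: float) -> float:
    """Share of people alive at `age` who already have the condition."""
    alive = [p for p in people if p.death_age > age]
    if not alive:
        return float("nan")
    cases = sum(1 for p in alive if p.onset_age is not None and p.onset_age <= age)
    return cases / len(alive)


def incidence_rate(people: Sequence[Person], start_age: float, end_age: float) -> float:
    """New cases per person-year at risk between start_age and end_age."""
    events = 0
    person_years = 0.0
    for p in people:
        # Time at risk ends at onset, death, or the end of the window.
        limit = min(p.death_age, end_age)
        has_event = p.onset_age is not None and p.onset_age < limit
        exit_age = p.onset_age if has_event else limit
        if exit_age <= start_age:
            continue  # already diseased or dead when the window opens
        person_years += exit_age - start_age
        events += int(has_event)
    return events / person_years if person_years > 0 else float("nan")


# Hypothetical trajectories, for illustration only.
people = [
    Person(onset_age=45.0, death_age=70.0),
    Person(onset_age=None, death_age=82.0),
    Person(onset_age=62.0, death_age=90.0),
    Person(onset_age=None, death_age=55.0),
]

prevalence_at_65 = point_prevalence(people, 65.0)        # 2 of 3 people alive at 65
incidence_50_to_70 = incidence_rate(people, 50.0, 70.0)  # 1 event over 37 person-years
```

In this sketch, a person whose onset precedes the window contributes no person-time to the incidence denominator, whereas a cross-sectional study design might handle such cases differently.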

Figures

Figure 1: Cumulative distribution function for Weibull distribution with shape 4 and scale 60
Figure 2: Comparison of formulations for aggregate prevalence from age 30 to 80 (left) and prevalence in 10-year intervals from age 30 to 80 (right), with standard deviation of the estimates shown as error bars
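For readers without access to the figures, the curve described in the Figure 1 caption can be reproduced from the standard Weibull cumulative distribution function, F(t) = 1 - exp(-(t / scale)^shape). The sketch below is a hypothetical helper in Python, not part of the report's code. The caption does not say what quantity the distribution models, so no interpretation of t is assumed here.

```python
import math


def weibull_cdf(t: float, shape: float = 4.0, scale: float = 60.0) -> float:
    """P(T <= t) for a Weibull distribution; defaults follow the Figure 1 caption."""
    if t <= 0:
        return 0.0
    return 1.0 - math.exp(-((t / scale) ** shape))


# At t equal to the scale parameter the CDF is 1 - exp(-1), about 0.632,
# regardless of the shape parameter.
cdf_at_scale = weibull_cdf(60.0)
```

With shape 4 the curve stays close to zero at small t and rises steeply around the scale value of 60.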

