Review
Stat Methods Med Res. 2022 Sep;31(9):1641-1655.
doi: 10.1177/09622802211023955. Epub 2021 Dec 21.

Estimating a time-to-event distribution from right-truncated data in an epidemic: A review of methods

Shaun R Seaman et al. Stat Methods Med Res. 2022 Sep.

Abstract

Time-to-event data are right-truncated if only individuals who have experienced the event by a certain time can be included in the sample. For example, we may be interested in estimating the distribution of time from onset of disease symptoms to death and only have data on individuals who have died. This may be the case, for example, at the beginning of an epidemic. Right truncation causes the distribution of times to event in the sample to be biased towards shorter times compared to the population distribution, and appropriate statistical methods should be used to account for this bias. This article is a review of such methods, particularly in the context of an infectious disease epidemic, like COVID-19. We consider methods for estimating the marginal time-to-event distribution, and compare their efficiencies. (Non-)identifiability of the distribution is an important issue with right-truncated data, particularly at the beginning of an epidemic, and this is discussed in detail. We also review methods for estimating the effects of covariates on the time to event. An illustration of the application of many of these methods is provided, using data on individuals who had died with coronavirus disease by 5 April 2020.
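The bias described above can be illustrated with a small simulation. The sketch below is not taken from the article: it assumes, purely for illustration, that symptom onsets are uniform on [0, tau], that the delay from onset to event is exponential, and that an individual is sampled only if the event has occurred by time tau. It compares the naive sample mean of the observed delays with an estimate obtained by maximising a likelihood that conditions on each individual's onset time, i.e. the density f(t) divided by F(tau - onset). All function and variable names are hypothetical.

```python
import math
import random


def simulate_truncated(n, rate, tau, seed=0):
    """Draw (onset, delay) pairs; keep only those with onset + delay <= tau."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        onset = rng.uniform(0.0, tau)   # assumption: onsets uniform on [0, tau]
        delay = rng.expovariate(rate)   # assumption: exponential delay
        if onset + delay <= tau:        # right truncation at calendar time tau
            sample.append((onset, delay))
    return sample


def neg_log_lik(rate, sample, tau):
    """Negative log-likelihood conditional on onset: sum of -log[f(t) / F(tau - onset)]."""
    total = 0.0
    for onset, delay in sample:
        window = tau - onset
        log_f = math.log(rate) - rate * delay
        log_F = math.log(-math.expm1(-rate * window))  # log(1 - exp(-rate * window))
        total -= log_f - log_F
    return total


def golden_section_min(f, lo, hi, tol=1e-8):
    """Minimise a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return (a + b) / 2.0


TRUE_RATE = 0.1   # true mean delay is 1 / 0.1 = 10 days (illustrative value)
TAU = 30.0        # truncation time in days (illustrative value)

sample = simulate_truncated(20000, TRUE_RATE, TAU)
naive_mean = sum(delay for _, delay in sample) / len(sample)
rate_hat = golden_section_min(lambda r: neg_log_lik(r, sample, TAU), 1e-4, 5.0)
corrected_mean = 1.0 / rate_hat

print(f"observed {len(sample)} of 20000 individuals")
print(f"naive mean delay:     {naive_mean:.2f}")
print(f"corrected mean delay: {corrected_mean:.2f}")
```

Under these assumptions the naive mean should fall well below the true mean of 10 days, while the conditional-likelihood estimate should lie close to it. The exponential model keeps the log-likelihood concave in the rate, which is why a one-dimensional search suffices here; other delay distributions (such as the gamma or log normal) would need a multi-parameter optimiser.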

Keywords: Coronavirus disease; Cox regression; failure time; identifiability; relative efficiency; right-truncation; survival analysis.

Conflict of interest statement

Declaration of conflicting interests: The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Figures

Figure 1.
For each of five values of λ and four values of θ1, the graph shows the asymptotic relative efficiency (ARE) of the estimator of E(T) based on (i) L1kwn (solid line), (ii) L3 (broken line) and (iii) L4 (dot-dash line), all compared to (iv) L1est. The x-axis of each graph is τ and the y-axis is the ARE.
Figure 2.
Distribution of symptom onset times in the sample of 304 individuals.
Figure 3.
Estimated survival curves from the gamma model (left) and log normal model (right), obtained using likelihoods L1est, L1kwn, L3 and L4. Dotted lines represent 95% confidence intervals. (Estimates using L1kwn and L4 are so close that they may be hard to distinguish.)
Figure 4.
Comparison of non-parametric estimate (step function) of survival conditional on delay being less than 31 days with corresponding estimates from the gamma model (left) and log normal model (right). Dotted lines represent 95% confidence intervals for the non-parametric estimate. Estimates using L1kwn and L4 are so close that they are shown by a single line, and estimates using L1est and L3 are so close that they may be difficult to distinguish.
Figure 5.
Estimate of log hazard ratio associated with sex = female as a function of P(T ≤ τ | sex = male). Dotted lines indicate 95% confidence limits calculated by bootstrap; these may be unreliable when P(T ≤ τ | sex = male) is large (see text).

