Review
Cancer Epidemiol Biomarkers Prev. 2011 Aug;20(8):1571-9.
doi: 10.1158/1055-9965.EPI-10-1311. Epub 2011 Jul 12.

The handling of missing data in molecular epidemiology studies

Manisha Desai et al. Cancer Epidemiol Biomarkers Prev. 2011 Aug.

Abstract

Molecular epidemiology studies face a missing data problem, as biospecimen or imaging data are often collected on only a proportion of the subjects eligible for study. We investigated all molecular epidemiology studies published as Research Articles, Short Communications, or Null Results in Brief in Cancer Epidemiology, Biomarkers & Prevention from January 1, 2009, to March 31, 2010, to characterize the extent to which missing data were present and to elucidate how the issue was addressed. Of 278 molecular epidemiology studies assessed, most (95%) had missing data on a key variable (66%) and/or used availability of data (often, but not always, the biomarker data) as an inclusion criterion for study entry (45%). Despite this, only 10% compared subjects included in the analysis with those excluded from it, and 88% of the studies with missing data conducted a complete-case analysis, a method known to yield biased and inefficient estimates when the data are not missing completely at random. Our findings provide evidence that missing data methods are underutilized in molecular epidemiology studies, which may deleteriously affect the interpretation of results. We provide practical guidelines for the analysis and interpretation of molecular epidemiology studies with missing data.
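
The concern about complete-case analysis can be illustrated with a small simulation. The sketch below is not taken from the article: the cohort, the variable names (`age`, `biomarker`), the linear relationship between them, and the availability probabilities are all hypothetical assumptions chosen for illustration. It shows how an estimate restricted to subjects with biomarker data can drift from the full-cohort value when availability depends on an observed covariate, and how comparing included with excluded subjects on that covariate exposes the imbalance.

```python
import random
import statistics


def simulate_cohort(n=20000, seed=0):
    """Return (age, biomarker, observed) tuples for a hypothetical cohort."""
    rng = random.Random(seed)
    cohort = []
    for _ in range(n):
        age = rng.gauss(60, 10)
        # Assumed relationship: biomarker level rises with age, plus noise.
        biomarker = 0.5 * age + rng.gauss(0, 5)
        # Assumed missingness mechanism: specimens are more often available
        # for older subjects, so the data are NOT missing completely at random.
        p_available = 0.8 if age >= 60 else 0.2
        observed = rng.random() < p_available
        cohort.append((age, biomarker, observed))
    return cohort


cohort = simulate_cohort()
included = [row for row in cohort if row[2]]      # complete cases
excluded = [row for row in cohort if not row[2]]  # biomarker missing

# Mean biomarker in the full cohort (unknowable in a real study)
# versus the mean among complete cases only.
full_mean = statistics.mean([b for _, b, _ in cohort])
complete_case_mean = statistics.mean([b for _, b, _ in included])

# Comparing included and excluded subjects on an observed covariate.
age_included = statistics.mean([a for a, _, _ in included])
age_excluded = statistics.mean([a for a, _, _ in excluded])

print(f"full-cohort biomarker mean:   {full_mean:.2f}")
print(f"complete-case biomarker mean: {complete_case_mean:.2f}")
print(f"mean age, included subjects:  {age_included:.2f}")
print(f"mean age, excluded subjects:  {age_excluded:.2f}")
```

Under these assumptions the complete-case mean overstates the full-cohort mean because older subjects are overrepresented among those with data, and the age gap between included and excluded subjects is the kind of signal an included-versus-excluded comparison is meant to surface.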

Conflict of interest statement

Disclosure of Potential Conflicts of Interest

No potential conflicts of interest were disclosed.

Figures

Figure 1. Articles considered for inclusion in assessment.
Figure 2. Presence of and techniques used to address missing data for studies assessed.
