Imputation-based strategies for clinical trial longitudinal data with nonignorable missing values

Xiaowei Yang¹, Jinhui Li, Steven Shoptaw

PMID: 18205247
PMCID: PMC3032542
DOI: 10.1002/sim.3111

Xiaowei Yang et al. Stat Med. 2008 Jul 10;27(15):2826-49.

Affiliation

¹ Division of Biostatistics, School of Medicine, University of California, Med Sci 1-C, Suite 200, Davis, CA 95616, USA. XDYang@UCDavis.edu

Abstract

Biomedical research is plagued by missing data, especially in clinical trials of medical and behavioral therapies that adopt a longitudinal design. After a literature review on modeling incomplete longitudinal data based on full-likelihood functions, this paper proposes a set of imputation-based strategies for implementing selection, pattern-mixture, and shared-parameter models for handling intermittent missing values and dropouts that are potentially nonignorable according to various criteria. Within the framework of multiple partial imputation, intermittent missing values are first imputed several times; then, each partially imputed data set is analyzed to deal with dropouts, with or without further imputation. Depending on the choice of imputation model or measurement model, various strategies can be jointly applied to the same set of data to study the effect of treatment or intervention from multiple perspectives. For illustration, the strategies were applied to a data set with continuous repeated measures from a smoking cessation clinical trial.
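The first stage of multiple partial imputation, as the abstract describes it, can be illustrated with a minimal sketch. It fills only intermittent gaps and leaves everything after dropout missing, producing several partially imputed data sets. The imputation model used here (linear interpolation between neighbouring observed visits plus Gaussian noise) is a stand-in chosen for brevity and is an assumption, not the model used in the paper; the function names, `noise_sd`, and the toy data are likewise hypothetical.

```python
import math
import random

def last_observed_index(series):
    """Index of the last non-missing visit, or -1 if nothing was observed."""
    for t in range(len(series) - 1, -1, -1):
        if not math.isnan(series[t]):
            return t
    return -1

def partially_impute(series, rng, noise_sd=0.1):
    """Fill intermittent gaps only; values after dropout stay missing.

    Stand-in imputation model (an assumption, not the paper's): linear
    interpolation between neighbouring observed visits plus Gaussian noise.
    Gaps before the first observed visit are left missing.
    """
    out = list(series)
    last = last_observed_index(series)
    observed = [t for t in range(last + 1) if not math.isnan(series[t])]
    for left, right in zip(observed, observed[1:]):
        for t in range(left + 1, right):
            frac = (t - left) / (right - left)
            mean = series[left] + frac * (series[right] - series[left])
            out[t] = mean + rng.gauss(0.0, noise_sd)
    return out

def multiple_partial_imputation(data, m=5, seed=0):
    """Return m partially imputed copies of the data set."""
    rng = random.Random(seed)
    return [[partially_impute(s, rng) for s in data] for _ in range(m)]

nan = math.nan
data = [
    [2.0, nan, 2.4, 2.1, nan, nan],  # one intermittent gap, dropout after visit 4
    [1.5, 1.7, nan, nan, 1.9, 2.0],  # two intermittent gaps, completer
]
imputed_sets = multiple_partial_imputation(data, m=3)
```

Each element of `imputed_sets` would then be passed to a dropout analysis of the analyst's choice (for example a selection, pattern-mixture, or shared-parameter model), which this sketch does not attempt to implement.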

Figures

**Figure 1**
The average and SD curves for the log-scaled carbon monoxide levels. The plot shows the four mean curves of the log-scaled carbon monoxide levels, with the corresponding pointwise standard errors, for the four treatment conditions: Control, RP-only, CM-only, and RP+CM (RP = relapse prevention, CM = contingency management). Vertical bars indicate the estimated standard errors of the average carbon monoxide levels. Stars (‘*’) above the x-axis mark the time points (i.e. visit numbers) at which the carbon monoxide levels differ significantly, as indicated by a pointwise ANOVA (p-value < 0.001). The y-axis shows carbon monoxide levels after a log(1 + x) transform. The x-axis shows the clinic visit number for study participants (1, …, 36; three visits per week).
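The per-visit summaries described in this caption could be computed roughly as follows. This is a minimal sketch, assuming raw carbon monoxide readings are stored one list per subject with `math.nan` marking missing visits; the function name `pointwise_mean_se` and the toy readings are hypothetical, and the pointwise ANOVA is not reproduced here.

```python
import math
import statistics

def pointwise_mean_se(levels_by_subject):
    """Per-visit mean and standard error of log(1 + x) levels.

    Missing visits (NaN) are skipped; a visit with fewer than two
    observed values gets (None, None) because no SE can be estimated.
    """
    n_visits = len(levels_by_subject[0])
    summary = []
    for t in range(n_visits):
        values = [math.log1p(s[t]) for s in levels_by_subject
                  if not math.isnan(s[t])]
        if len(values) < 2:
            summary.append((None, None))
            continue
        mean = statistics.fmean(values)
        se = statistics.stdev(values) / math.sqrt(len(values))
        summary.append((mean, se))
    return summary

# Toy readings for one treatment arm (illustrative values only).
arm = [
    [0.0, 3.0, math.nan],
    [math.e - 1.0, 5.0, math.nan],
    [math.nan, 4.0, 2.0],
]
curve = pointwise_mean_se(arm)
```

Running the function once per treatment condition would give the four mean curves and their error bars.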

**Figure 2**
Missingness patterns for the carbon monoxide levels across treatment conditions. For each treatment condition, an image depicts the missingness indicators of carbon monoxide levels for each smoker at each research visit. Dark areas indicate that the corresponding carbon monoxide levels were observed, while white areas indicate that the corresponding data were missing, either intermittently or after dropout. The four treatment conditions are Control, RP-only, CM-only, and RP+CM (RP = relapse prevention, CM = contingency management).
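The distinction between intermittent missingness and dropout shown in this figure can be made concrete with a small sketch. It assumes the usual convention that a missing value is intermittent when a later visit was observed and belongs to dropout otherwise; the function name and the example series are hypothetical.

```python
import math

def missingness_labels(series):
    """Label each visit as 'observed', 'intermittent', or 'dropout'.

    A missing visit is 'intermittent' if some later visit was observed,
    and 'dropout' if no later visit was observed.
    """
    last = -1
    for t, y in enumerate(series):
        if not math.isnan(y):
            last = t
    labels = []
    for t, y in enumerate(series):
        if not math.isnan(y):
            labels.append("observed")
        elif t < last:
            labels.append("intermittent")
        else:
            labels.append("dropout")
    return labels

nan = math.nan
labels = missingness_labels([2.0, nan, 2.4, nan, nan])
```

Stacking these labels for every smoker, one row per subject, gives the kind of indicator image the caption describes.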

**Figure 3**
Mean carbon monoxide levels for completers and early terminators. The 174 smokers are divided into two groups, completers (n₁ = 112) and early terminators (n₂ = 62); within each group, the mean curves of carbon monoxide levels are shown separately for subjects receiving CM (contingency management) and for subjects receiving no CM.

**Figure 4**
Plate 1. Pattern-dependent distribution of carbon monoxide levels. Using the software package ‘MPI 2.0’, profiles and mean curves of carbon monoxide levels are drawn within each of the five groups determined by the dropout times: dropout at or before week 5, 7, 9, 11, and 12. In the plots, green curves show the mean carbon monoxide levels of subjects who received CM (contingency management), red curves show the mean curves of subjects who did not receive CM, and gray dashed lines depict the profiles of all subjects within each group. The bottom-right plot depicts all the mean profiles corresponding to the five dropout patterns.
