Clin Chem. 2012 Aug;58(8):1242-51.
doi: 10.1373/clinchem.2012.186007. Epub 2012 Jun 22.

Biases introduced by choosing controls to match risk factors of cases in biomarker research

Margaret Sullivan Pepe et al. Clin Chem. 2012 Aug.

Abstract

Background: Selecting controls that match cases on risk factors for the outcome is a pervasive practice in biomarker research studies. Such matching, however, biases estimates of biomarker prediction performance. The magnitudes of these biases are unknown.

Methods: We examined the prediction performance of biomarkers and the improvement in prediction gained by adding biomarkers to risk factor information. We used data simulated from bivariate normal statistical models and data from a study to identify critically ill patients. We compared true performance with that estimated from case-control studies that do or do not use matching. ROC curves were used to quantify performance. We propose a new statistical method to estimate prediction performance from matched studies for which data on the matching factors are available for subjects in the population.
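
The simulation design described in Methods can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code. It assumes unit variances in both cases and controls, the same mean shift `mu` for the clinical predictor (CP) and the marker, and idealized exact matching in which each matched control shares its case's CP value. The target values ROC(0.2) = 0.64 and correlation 0.5 come from the Figure 1 caption; the function names `simulate` and `roc_at` are hypothetical.

```python
from statistics import NormalDist

import numpy as np


def simulate(n, mu, rho, rng):
    """Return (cases, unmatched controls, matched controls).

    Each array has two columns: clinical predictor (CP) and marker.
    """
    cov = [[1.0, rho], [rho, 1.0]]
    cases = rng.multivariate_normal([mu, mu], cov, size=n)
    controls = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    # Assumption: exact matching. A matched control has the same CP value as
    # its case, and its marker is drawn from the control distribution of the
    # marker given CP, which is N(rho * cp, 1 - rho**2) in this model.
    cp = cases[:, 0]
    marker = rho * cp + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)
    matched = np.column_stack([cp, marker])
    return cases, controls, matched


def roc_at(case_scores, control_scores, fpr):
    """Empirical true-positive rate at the given false-positive rate."""
    threshold = np.quantile(control_scores, 1.0 - fpr)
    return float(np.mean(case_scores > threshold))


# Equal-variance binormal model: ROC(t) = Phi(mu + Phi^{-1}(t)),
# so ROC(0.2) = 0.64 fixes the mean shift mu.
std = NormalDist()
mu = std.inv_cdf(0.64) - std.inv_cdf(0.2)

rng = np.random.default_rng(0)
cases, controls, matched = simulate(2000, mu, 0.5, rng)

roc_unmatched = roc_at(cases[:, 1], controls[:, 1], 0.2)
roc_matched = roc_at(cases[:, 1], matched[:, 1], 0.2)
print(f"marker ROC(0.2), unmatched controls: {roc_unmatched:.2f}")
print(f"marker ROC(0.2), matched controls:   {roc_matched:.2f}")
```

Under these assumptions the unmatched estimate should land near the population value of 0.64, while the matched estimate is pulled toward the diagonal (roughly 0.40 in theory), because matched controls inherit the cases' CP distribution and the marker is correlated with CP.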

Results: Performance estimated with standard analyses can be grossly biased by matching, especially when biomarkers are highly correlated with the matching risk factors. In our studies, the performance of the biomarker alone was underestimated, whereas the improvement in performance gained by adding the marker to risk factors was overestimated 2- to 10-fold. We found examples for which the relative ranking of 2 biomarkers for prediction was inappropriately reversed by use of a matched design. The new approach to estimation corrected for bias in matched studies.
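
The overestimation of incremental value can be illustrated with a small Python sketch. It is not the analysis used in the paper: it summarizes performance with the area under the ROC curve (AUC) rather than a point on the curve, combines CP and marker with a Fisher linear discriminant fitted to the study data, and assumes an equal-variance bivariate normal model with mean shift 1.2, correlation 0.5, and exact matching on CP. The helper names `auc`, `lda_scores`, and `auc_gain` are hypothetical.

```python
import numpy as np


def auc(case_scores, control_scores):
    """Mann-Whitney estimate of the AUC; ties count as one half."""
    ctrl = np.sort(control_scores)
    below = np.searchsorted(ctrl, case_scores, side="left")
    at_or_below = np.searchsorted(ctrl, case_scores, side="right")
    return float(np.mean((below + at_or_below) / 2.0) / len(ctrl))


def lda_scores(cases, controls):
    """Scores from a Fisher linear discriminant fitted to both groups."""
    pooled = (np.cov(cases, rowvar=False) + np.cov(controls, rowvar=False)) / 2.0
    weights = np.linalg.solve(pooled, cases.mean(axis=0) - controls.mean(axis=0))
    return cases @ weights, controls @ weights


def auc_gain(cases, controls):
    """AUC of CP + marker minus AUC of CP alone (column 0 is CP)."""
    combined = auc(*lda_scores(cases, controls))
    cp_alone = auc(cases[:, 0], controls[:, 0])
    return combined - cp_alone


# Assumed scenario: unit variances, mean shift 1.2 in cases, correlation 0.5.
n, mu, rho = 2000, 1.2, 0.5
rng = np.random.default_rng(1)
cov = [[1.0, rho], [rho, 1.0]]
cases = rng.multivariate_normal([mu, mu], cov, size=n)
controls = rng.multivariate_normal([0.0, 0.0], cov, size=n)

# Exact matching on CP: the control's marker follows the control
# distribution of the marker given CP, N(rho * cp, 1 - rho**2).
cp = cases[:, 0]
marker = rho * cp + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)
matched = np.column_stack([cp, marker])

gain_unmatched = auc_gain(cases, controls)
gain_matched = auc_gain(cases, matched)
print(f"AUC gain from adding the marker, unmatched: {gain_unmatched:.3f}")
print(f"AUC gain from adding the marker, matched:   {gain_matched:.3f}")
```

In the matched data CP cannot discriminate at all (its AUC is 0.5 by construction), so the whole performance of the combined score is credited to the marker. In this assumed scenario the theoretical gain is roughly 0.03 in the population and roughly 0.19 in the matched study, which falls within the 2- to 10-fold range reported above.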

Conclusions: To properly gauge prediction performance in the population or the improvement gained by adding a biomarker to known risk factors, matched case-control studies must be supplemented with risk factor information from the population and must be analyzed with nonstandard statistical methods.

Figures

Figure 1
ROC curves calculated with data from cases and unmatched controls and from a study in which controls are selected to match cases with regard to clinical risk factor predictors (CP). Data for n = 2000 cases and controls were simulated from bivariate binormal models corresponding to the scenario in Table 1 with ROCX(0.2) = ROCY(0.2) = 0.64 and correlation = 0.5.
Figure 2
ROC curves calculated with data from cases and unmatched controls and from a study in which controls are selected to match cases with regard to clinical risk factor predictors (CP). Data for n = 100 cases and controls were simulated from bivariate binormal models corresponding to the scenario in Figure 1.
Figure 3
ROC curves for critical illness calculated using the clinical predictors alone (CP), the marker alone, or the marker and clinical predictors combined (CP+marker).
Figure 4
Comparison of markers in unmatched and matched studies. Left panels show the performance of each marker alone. Right panels show incremental value, i.e., the performance of each marker combined with clinical predictors versus clinical predictors alone. Marker A is the better marker in each scenario, but it appears worse than marker B in a matched study.
