Comparative Study
BMC Med Res Methodol. 2015 May 12:15:45.
doi: 10.1186/s12874-015-0036-8.

A holistic comparative analysis of diagnostic tests for urothelial carcinoma: a study of Cxbladder Detect, UroVysion® FISH, NMP22® and cytology based on imputation of multiple datasets

Vivienne Breen et al. BMC Med Res Methodol.

Abstract

Background: Comparing the relative utility of diagnostic tests is challenging when available datasets are small, partial or incomplete. The analytical leverage associated with a large sample size can be gained by integrating several small datasets to enable effective and accurate across-dataset comparisons. Accordingly, we propose a methodology for a holistic comparative analysis and ranking of cancer diagnostic tests through dataset integration and imputation of missing values, using urothelial carcinoma (UC) as a case study.

Methods: Five datasets comprising samples from 939 subjects, including 89 with UC, in which up to four diagnostic tests (cytology, NMP22®, UroVysion® Fluorescence In-Situ Hybridization (FISH) and Cxbladder Detect) had been performed, were integrated into a single dataset containing all measured records and missing values. First, the tests were ranked using three criteria: sensitivity, specificity and a standard variable (feature) ranking method known as the signal-to-noise ratio (SNR) index, derived from the mean values for all subjects clinically known to have UC versus healthy subjects. Second, step-wise unsupervised and supervised imputation (the latter accounting for the 'clinical truth' as determined by cystoscopy) was performed using personalized modelling, k-nearest-neighbour methods, multiple logistic regression and multilayer perceptron neural networks. All imputation models were cross-validated by comparing their post-imputation predictive accuracy for UC with their pre-imputation accuracy. Finally, the post-imputation tests were re-ranked using the same three criteria.
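The three ranking criteria can be illustrated with a minimal Python sketch. It assumes a commonly used form of the SNR index, |mean(UC) − mean(healthy)| / (sd(UC) + sd(healthy)); the abstract does not state the exact formula, so this form, the function names and the use of `None` for an unmeasured test are all illustrative assumptions rather than the authors' implementation.

```python
import math
from statistics import mean, pstdev


def snr_index(values, labels):
    """Assumed SNR form: |mean_uc - mean_healthy| / (sd_uc + sd_healthy).

    values: one test's results per subject, None where the test was not run.
    labels: 1 for subjects with UC, 0 for healthy subjects.
    """
    uc = [v for v, y in zip(values, labels) if v is not None and y == 1]
    healthy = [v for v, y in zip(values, labels) if v is not None and y == 0]
    spread = pstdev(uc) + pstdev(healthy)
    gap = abs(mean(uc) - mean(healthy))
    if spread == 0:
        # Degenerate case: no within-class variation at all.
        return math.inf if gap > 0 else 0.0
    return gap / spread


def sensitivity_specificity(calls, labels):
    """Sensitivity and specificity of binary calls, ignoring missing calls."""
    tp = fn = tn = fp = 0
    for call, truth in zip(calls, labels):
        if call is None:
            continue  # test not measured for this subject
        if truth == 1:
            tp += call == 1
            fn += call == 0
        else:
            tn += call == 0
            fp += call == 1
    sens = tp / (tp + fn) if (tp + fn) else math.nan
    spec = tn / (tn + fp) if (tn + fp) else math.nan
    return sens, spec
```

Ranking the tests under this sketch amounts to computing each criterion per test on the integrated dataset and sorting the tests by the resulting value.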

Results: In both measured and imputed datasets, Cxbladder Detect ranked higher for sensitivity, and urine cytology ranked higher for specificity, than the other UC tests. Cxbladder Detect consistently ranked higher than FISH and all other tests when SNR analyses were performed on measured, unsupervised and supervised imputed datasets. Supervised imputation resulted in a smaller cross-validation error. Cxbladder Detect was robust to imputation, showing a 2% difference in its predictive versus clinical accuracy, and outperformed FISH, NMP22 and cytology.
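The difference between unsupervised and supervised imputation can be sketched for the k-nearest-neighbour case. The version below is an assumption made for illustration only: the supervised variant simply restricts donor records to subjects with the same cystoscopy label, and distances are computed over the tests that both records have measured. The abstract does not specify how the authors incorporated the clinical truth, and the function name `knn_impute` is hypothetical.

```python
import math


def knn_impute(rows, k=3, labels=None):
    """Fill None entries with the mean of the k nearest donors' values.

    rows:   list of per-subject test results (None = not measured).
    labels: optional clinical truth per subject; when given, donors must
            share the subject's label (assumed supervised variant).
    Returns a new list; the input rows are left unchanged.
    """
    def distance(a, b):
        # Root-mean-square difference over tests measured in both records.
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    filled = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            donors = []
            for m, other in enumerate(rows):
                if m == i or other[j] is None:
                    continue
                if labels is not None and labels[m] != labels[i]:
                    continue
                d = distance(row, other)
                if d < math.inf:
                    donors.append((d, other[j]))
            donors.sort(key=lambda pair: pair[0])
            nearest = donors[:k]
            if nearest:
                filled[i][j] = sum(v for _, v in nearest) / len(nearest)
    return filled


# Toy records with two tests; the second subject lacks the second test.
rows = [[0.9, 0.8], [0.8, None], [0.1, 0.2], [0.2, 0.1], [0.75, 0.3]]
labels = [1, 1, 0, 0, 0]
unsupervised = knn_impute(rows, k=1)
supervised = knn_impute(rows, k=1, labels=labels)
```

In this toy data the nearest unlabelled neighbour of the incomplete record is a healthy subject, so the two variants fill in different values, which is the kind of discrepancy a cross-validation comparison would pick up.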

Conclusion: All data analysed, pre- and post-imputation, showed that Cxbladder Detect had a higher SNR and outperformed all other comparator tests, including FISH. The methodology developed and validated for comparative ranking of the diagnostic tests for detecting UC may be further applied to other cancer diagnostic datasets across population groups and multiple datasets.

Figures

Fig. 1. Ranking of tests in a univariate mode using SNR on the integrated dataset before imputation
Fig. 2. Rankings of tests for the integrated dataset after (a) supervised and (b) unsupervised imputation
Fig. 3. Comparisons after (a) supervised and (b) unsupervised imputation in two-dimensional contour plots of sensitivity and specificity

