Semiparametric estimation of the relationship between ROC operating points and the test-result scale: application to the proper binormal model

doi:10.1016/j.acra.2011.08.003

Acad Radiol. 2011 Dec;18(12):1537-48.

Lorenzo L Pesce¹, Karla Horsch, Karen Drukker, Charles E Metz

PMID: 22055797
PMCID: PMC3368704
DOI: 10.1016/j.acra.2011.08.003

Lorenzo L Pesce et al. Acad Radiol. 2011 Dec.

Affiliation

¹ Department of Radiology, MC 2026, The University of Chicago Medical Center, 5841 S Maryland Avenue, Chicago, IL 60637-1470, USA.

Abstract

Rationale and objectives: Semiparametric methods provide smooth and continuous receiver operating characteristic (ROC) curve fits to ordinal test results and require only that the data follow some unknown monotonic transformation of the model's assumed distributions. The quantitative relationship between cutoff settings or individual test-result values on the data scale and points on the estimated ROC curve is lost in this procedure, however. To recover that relationship in a principled way, we propose a new algorithm for "proper" ROC curves and illustrate it by use of the proper binormal model.
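
The invariance to monotonic transformations mentioned above can be illustrated with a minimal sketch. It uses the empirical (Mann-Whitney) AUC rather than the semiparametric fit described in this article, and the data values and the helper name `empirical_auc` are made up for illustration. The point is only that rank-based ROC quantities do not change when the same increasing transformation is applied to every test result, which is also why the link to the original data scale is lost.

```python
import math

def empirical_auc(neg, pos):
    """Mann-Whitney estimate: P(pos > neg) + 0.5 * P(pos == neg)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical test results (illustrative values only).
neg = [0.2, 0.9, 1.4, 2.1, 2.5]
pos = [1.1, 2.3, 2.8, 3.0, 3.7]

auc_raw = empirical_auc(neg, pos)
# Apply the same strictly increasing transformation to every result.
auc_log = empirical_auc([math.log(x) for x in neg], [math.log(x) for x in pos])
auc_cube = empirical_auc([x ** 3 for x in neg], [x ** 3 for x in pos])

print(auc_raw, auc_log, auc_cube)  # all three values are identical
```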

Materials and methods: Several authors have proposed the use of multinomial distributions to fit semiparametric ROC curves by maximum-likelihood estimation. The resulting approach requires nuisance parameters that specify interval probabilities associated with the data, which are used subsequently as a basis for estimating values of the curve parameters of primary interest. In the method described here, we employ those "nuisance" parameters to recover the relationship between any ordinal test-result scale and true-positive fraction, false-positive fraction, and likelihood ratio. Computer simulations based on the proper binormal model were used to evaluate our approach in estimating those relationships and to assess the coverage of its confidence intervals for realistically sized datasets.
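
As a rough sketch of how interval probabilities map an ordinal scale to ROC quantities, the example below takes per-category probabilities for actually negative and actually positive cases as given and derives FPF, TPF, and likelihood ratio from them. This is not the authors' algorithm: in the article the probabilities come from a maximum-likelihood fit of the proper binormal model, whereas here the numbers and the function name `operating_points` are hypothetical.

```python
from itertools import accumulate

def operating_points(p_neg, p_pos):
    """Map ordered test-result categories to FPF, TPF, and likelihood ratio.

    p_neg[i] and p_pos[i] are the probabilities that an actually negative or
    actually positive case falls in category i, with categories ordered from
    least to most suspicious. Assumes every p_neg[i] is nonzero.
    """
    # Calling a case positive when its result lies in category i or higher
    # makes FPF and TPF the upper-tail sums of the interval probabilities.
    fpf = list(accumulate(reversed(p_neg)))[::-1]
    tpf = list(accumulate(reversed(p_pos)))[::-1]
    # Likelihood ratio of an individual test result in category i.
    lr = [pp / pn for pp, pn in zip(p_pos, p_neg)]
    return fpf, tpf, lr

# Hypothetical interval probabilities for a four-category scale.
p_neg = [0.50, 0.25, 0.15, 0.10]
p_pos = [0.10, 0.15, 0.25, 0.50]
fpf, tpf, lr = operating_points(p_neg, p_pos)

for i, (f, t, r) in enumerate(zip(fpf, tpf, lr)):
    print(f"cutoff at category {i}: FPF={f:.2f} TPF={t:.2f} LR={r:.2f}")
```

The first cutoff always yields FPF = TPF = 1, because every case is called positive when the threshold sits below the lowest category.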

Results: In our simulations, the method reliably estimated simple relationships between test-result values and the several ROC quantities.

Conclusion: The proposed approach provides an effective and reliable semiparametric method with which to estimate the relationship between cutoff settings or individual test-result values and corresponding points on the ROC curve.

Figures

**Figure 1**
The receiver operating characteristic (ROC) curves, binormal parameters, and areas under the ROC curve (AUCs) for the three populations used in the simulation studies.

**Figure 2**
The average behavior of the estimation algorithm (area under the receiver operating characteristic curve = 0.85 and no skew). **(a)** Histograms of simulated test-result values (5000 samples of 200 positive and 200 negative cases each) are shown. The negative, positive, and mixed histograms are those of the actually negative cases, the actually positive cases, and all cases taken together, respectively. **(b,c,d)** The results for the average log likelihood ratio, the average true-positive fraction, and the average false-positive fraction, respectively. These averages were taken over bins of test results. Also shown are 95% confidence intervals constructed from the average estimated errors and the empirical errors (standard deviation of the distribution of the estimates). “Estimated value” here indicates the mean of the samples that belong to that bin.

**Figure 3**
The average behavior of the estimation algorithm (area under the receiver operating characteristic curve = 0.85 and high skew). **(a)** Histograms of simulated test-result values (5000 samples of 200 positive and 200 negative cases each) are shown. The negative, positive, and mixed histograms are those of the actually negative cases, the actually positive cases, and all cases taken together, respectively. **(b,c,d)** The results for the average log likelihood ratio, the average true-positive fraction, and the average false-positive fraction, respectively. These averages were taken over bins of test results. Also shown are 95% confidence intervals constructed from the average estimated errors and the empirical errors (standard deviation of the distribution of the estimates).

**Figure 4**
Results of the regression analysis. Averages were taken over 5000 samples of 100, 200, 400, 600, 800, and 1000 cases each. Also shown are 95% confidence intervals constructed from the standard errors. **(a,c,e)** The average estimated slope for the no-, medium-, and high-skew ROC curves, respectively. Similarly, **(b,d,f)** depict the average estimated intercept. Results based on ordinary and weighted least squares are labeled LSQ and WLSQ, respectively. Note that although the ranges of the vertical axes of panels b, d, and f differ, each spans 1.5 units.

**Figure 5**
Coverage of the 95% CIs for true-positive fraction (TPF), false-positive fraction (FPF), and log-likelihood ratio (LLR) bins for the no-skew receiver operating characteristic curve shown in Figure 1. The coverage results from our simulation studies involving 200 cases per sample are shown in panels **(a)**, **(b)** and **(c)**, respectively, whereas those of simulation studies involving 2000 cases per sample are shown in panels **(d)**, **(e)** and **(f)**.

**Figure 6**
Coverage of the 95% CIs for true-positive fraction (TPF), false-positive fraction (FPF), and log-likelihood ratio (LLR) bins for the high-skew receiver operating characteristic curve shown in Figure 1. The coverage results from the simulation studies involving 200 cases per sample are shown in **(a,b,c)**, respectively, whereas those of simulation studies involving 2000 cases per sample are shown in **(d,e,f)**.

**Figure 7**
Output of the automated classifier for benign and malignant cases in ultrasonography, with both model-based and empirical estimates of the thresholds corresponding to true-positive fraction = 0.90 and false-positive fraction = 0.10. Also shown are 95% confidence intervals for the empirical estimates of threshold.

**Figure 8**
The classifier's model-based-fit and empirical receiver operating characteristic (ROC) curves together with model-based and empirical 95% confidence intervals for true-positive fraction (TPF) and false-positive fraction (FPF) at thresholds corresponding to a model-based TPF = 0.90 or a model-based FPF = 0.10. Sampling errors are larger near “the center” of the ROC curve because FPF and TPF are proportions, whose sampling variability is greatest at intermediate values; hence, larger residuals from ROC curve-fitting algorithms must be expected there.
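
The remark about proportions in the Figure 8 caption follows from the standard binomial result, written out here as a short aside. The symbols $p$ (true FPF or TPF) and $n$ (number of actually negative or actually positive cases) are notation introduced for this note, and the numbers are illustrative rather than taken from the study:

$$
\mathrm{SE}(\hat{p}) = \sqrt{\frac{p\,(1-p)}{n}}, \qquad \text{which is largest at } p = 0.5 .
$$

For example, with $n = 100$ the standard error is $0.05$ at $p = 0.5$ but only $0.03$ at $p = 0.9$.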

Cited by

An additive selection of markers to improve diagnostic accuracy based on a discriminatory measure.
Tang LL, Kang L, Liu C, Schisterman EF, Liu A. Acad Radiol. 2013 Jul;20(7):854-62. doi: 10.1016/j.acra.2013.02.009. Epub 2013 Apr 20. PMID: 23611438. Free PMC article.

Multivariate normally distributed biomarkers subject to limits of detection and receiver operating characteristic curve inference.
Perkins NJ, Schisterman EF, Vexler A. Acad Radiol. 2013 Jul;20(7):838-46. doi: 10.1016/j.acra.2013.04.001. PMID: 23747152. Free PMC article.

Verification of modified receiver-operating characteristic software using simulated rating data.
Shiraishi J, Fukuoka D, Iha R, Inada H, Tanaka R, Hara T. Radiol Phys Technol. 2018 Dec;11(4):406-414. doi: 10.1007/s12194-018-0479-9. Epub 2018 Sep 22. PMID: 30244314.

Calibration of medical diagnostic classifier scores to the probability of disease.
Chen W, Sahiner B, Samuelson F, Pezeshk A, Petrick N. Stat Methods Med Res. 2018 May;27(5):1394-1409. doi: 10.1177/0962280216661371. Epub 2016 Aug 8. PMID: 27507287. Free PMC article.

Tree-structured subgroup analysis of receiver operating characteristic curves for diagnostic tests.
Li C, Glüer CC, Eastell R, Felsenberg D, Reid DM, Roux C, Lu Y. Acad Radiol. 2012 Dec;19(12):1529-36. doi: 10.1016/j.acra.2012.09.007. PMID: 23122572. Free PMC article.
