Open Heart. 2018 Jan 20;5(1):e000663.

doi: 10.1136/openhrt-2017-000663. eCollection 2018.

Effects of disease severity distribution on the performance of quantitative diagnostic methods and proposal of a novel 'V-plot' methodology to display accuracy values

Ricardo Petraco¹, Hakim-Moulay Dehbi¹, James P Howard¹, Matthew J Shun-Shin¹, Sayan Sen¹, Sukhjinder S Nijjer¹, Jamil Mayet¹, Justin E Davies¹, Darrel P Francis¹

Affiliation

¹ International Centre for Circulatory Health, National Heart and Lung Institute, Imperial College London and Imperial College Healthcare NHS Trust, London, UK.

PMID: 29387424
PMCID: PMC5786922
DOI: 10.1136/openhrt-2017-000663

Ricardo Petraco et al. Open Heart. 2018.

Abstract

Background: Diagnostic accuracy is widely accepted by researchers and clinicians as an optimal expression of a test's performance. The aim of this study was to evaluate the effects of disease severity distribution on values of diagnostic accuracy and to propose a sample-independent methodology to calculate and display the accuracy of diagnostic tests.

Methods and findings: We evaluated the diagnostic relationship between two hypothetical methods to measure serum cholesterol (Chol_rapid and Chol_gold) by generating samples with statistical software and (1) keeping the numerical relationship between methods unchanged and (2) changing the distribution of cholesterol values. Metrics of categorical agreement were calculated (accuracy, sensitivity and specificity). Finally, a novel methodology to display and calculate accuracy values was presented (the V-plot of accuracies).

Conclusion: No single value of diagnostic accuracy can be used to describe the relationship between tests, as accuracy is a metric heavily affected by the underlying sample distribution. Our proposed methodology, the V-plot of accuracies, can be used as a sample-independent measure of a test's performance against a reference gold standard.
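The effect described in the abstract can be reproduced with a small simulation. The sketch below is not the authors' code: the cut-off (5.0 mmol/L), the Gaussian measurement error (SD 0.4 mmol/L) and the two cholesterol distributions are all assumed values chosen only for illustration. The numerical relationship between Chol_rapid and Chol_gold is identical in both samples; only the distribution of Chol_gold values differs.

```python
import random

CUTOFF = 5.0     # assumed classification cut-off (mmol/L), illustrative only
NOISE_SD = 0.4   # assumed Chol_rapid measurement error, identical in both samples


def simulate(gold_values, rng):
    """Pair each Chol_gold value with a noisy Chol_rapid reading."""
    return [(g, g + rng.gauss(0.0, NOISE_SD)) for g in gold_values]


def agreement_metrics(pairs, cutoff=CUTOFF):
    """Accuracy, sensitivity and specificity of Chol_rapid against Chol_gold."""
    tp = sum(1 for g, r in pairs if g >= cutoff and r >= cutoff)
    tn = sum(1 for g, r in pairs if g < cutoff and r < cutoff)
    fp = sum(1 for g, r in pairs if g < cutoff and r >= cutoff)
    fn = sum(1 for g, r in pairs if g >= cutoff and r < cutoff)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


rng = random.Random(0)

# Hypothetical "validation" sample: wide range of cholesterol values.
wide = simulate([rng.uniform(2.0, 8.0) for _ in range(20000)], rng)

# Hypothetical "primary care" sample: values clustered around the cut-off.
narrow = simulate([rng.gauss(5.0, 0.5) for _ in range(20000)], rng)

wide_metrics = agreement_metrics(wide)
narrow_metrics = agreement_metrics(narrow)
print("wide sample:  ", wide_metrics)
print("narrow sample:", narrow_metrics)
```

Under these assumptions the sample clustered around the cut-off yields clearly lower accuracy, even though the measurement error of Chol_rapid was never changed.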

Keywords: diagnostic accuracy; diagnostic tests; study sample.

Conflict of interest statement

Competing interests: None declared.

Figures

**Figure 1**
Disease severity and classification agreement between methods: schematic representation of the principle that classification agreement between two methods of measurement (or diagnostic accuracy, if one is seen as a reference gold standard) varies across the range of disease severity. At the extremes of disease and health, agreement is 100%. Close to the classification cut-off, around the intermediate range of disease severity, agreement falls, reaching a nadir close to 50%.

**Figure 2**
Diagnostic performance of the new cholesterol test. The performance of the new cholesterol test (Chol_rapid) changed significantly between the two studies. The overall accuracy of Chol_rapid to diagnose hypercholesterolaemia fell in the primary care retrospective cohort (B) when compared with the initial validation study (A). Values of area under the ROC curve, sensitivity, specificity and predictive values also differed markedly. ROC, receiver operating characteristic.

**Figure 3**
Numerical agreement between Chol_rapid and Chol_gold is equal between studies. Despite different magnitudes of classification agreement (diagnostic accuracy) between Chol_rapid and Chol_gold in the two studies, the raw measurement disagreement between the two methods remained unchanged. This can be appreciated from measures of the vertical scatter such as the SE of the estimate (plot A) and from Bland-Altman plots (B). It can be inferred that the observed drop in Chol_rapid performance in the primary care study cannot be explained by a change in its true measurement performance. LOA, limits of agreement.

**Figure 4**
Histograms of cholesterol values from both studies. While the validation study included patients with a wide range of cholesterol values, the primary care cohort was formed predominantly of patients with intermediate values of cholesterol. This difference was responsible for the significant drop in Chol_rapid accuracy reported in the primary care study.

**Figure 5**
The V-plot of accuracies of Chol_rapid. The V-plot permits a visual demonstration that the classification agreement between Chol_rapid and Chol_gold is equal in the two studies in each quantile of disease severity. The *overall* classification agreement (diagnostic accuracy of Chol_rapid) could change between studies, depending on the proportion of patients in each quantile. The V-plot consistently identifies the range of cholesterol values within which the agreement between tests is lower than 90% (dashed lines).

**Figure 6**
Methodology for the calculation of per-range agreement and V-plot display.

**Figure 7**
Calculating the overall accuracy in different samples using the V-plot. The V-plot agreement between Chol_rapid and Chol_gold can be derived from any study that compared the two methods (top panel). It can be used as a fingerprint of classification agreement to calculate the overall agreement between Chol_rapid and Chol_gold in any sample in which the distribution of cholesterol values is known (samples A, B and C).
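The per-range agreement and its use as a "fingerprint" can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure shown in Figure 6: it bins by fixed-width ranges of the Chol_gold value (0.25 mmol/L), and the cut-off, measurement error and sample distributions are hypothetical. The function names `v_plot` and `predicted_accuracy` are invented for this example.

```python
import random

CUTOFF = 5.0     # assumed classification cut-off (mmol/L), illustrative only
NOISE_SD = 0.4   # assumed Chol_rapid measurement error
EDGES = [2.0 + 0.25 * i for i in range(25)]  # 24 ranges covering 2.0 to 8.0


def measure(gold_values, rng):
    """Pair each Chol_gold value with a noisy Chol_rapid reading."""
    return [(g, g + rng.gauss(0.0, NOISE_SD)) for g in gold_values]


def agrees(gold, rapid):
    """True when both methods fall on the same side of the cut-off."""
    return (gold >= CUTOFF) == (rapid >= CUTOFF)


def bin_index(value):
    """Index of the range containing value, or None if outside EDGES."""
    for i, (lo, hi) in enumerate(zip(EDGES, EDGES[1:])):
        if lo <= value < hi:
            return i
    return None


def v_plot(pairs):
    """Per-range agreement: share of concordant classifications in each range."""
    counts = [0] * (len(EDGES) - 1)
    concordant = [0] * (len(EDGES) - 1)
    for gold, rapid in pairs:
        i = bin_index(gold)
        if i is not None:
            counts[i] += 1
            concordant[i] += agrees(gold, rapid)
    return [c / n if n else None for c, n in zip(concordant, counts)]


def predicted_accuracy(v, gold_values):
    """Weight each range's agreement by the share of the sample falling in it."""
    counts = [0] * len(v)
    for g in gold_values:
        i = bin_index(g)
        if i is not None and v[i] is not None:
            counts[i] += 1
    total = sum(counts)
    return sum(a * n for a, n in zip(v, counts) if a is not None) / total


rng = random.Random(2)

# Derive the V-plot from one hypothetical study with a wide range of values.
reference_study = measure([rng.uniform(2.0, 8.0) for _ in range(30000)], rng)
fingerprint = v_plot(reference_study)

# Predict overall accuracy in a different sample from its distribution alone,
# then compare with the accuracy obtained by simulating that sample directly.
new_gold = [rng.gauss(5.0, 0.5) for _ in range(20000)]
predicted = predicted_accuracy(fingerprint, new_gold)
new_pairs = measure(new_gold, rng)
observed = sum(agrees(g, r) for g, r in new_pairs) / len(new_pairs)
print(f"predicted {predicted:.3f}, observed {observed:.3f}")
```

Plotting `fingerprint` against the range midpoints gives the V shape: agreement near 100% at the extremes and a nadir close to the cut-off. The prediction is only as fine-grained as the chosen ranges, so narrower ranges (with enough observations in each) should track the observed accuracy more closely.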

Cited by

Reliability of Instantaneous Wave-Free Ratio (iFR) for the Evaluation of Left Main Coronary Artery Lesions.
De Rosa S, Polimeni A, De Velli G, Conte M, Sorrentino S, Spaccarotella C, Mongiardo A, Sabatino J, Contarini M, Indolfi C. J Clin Med. 2019 Jul 31;8(8):1143. doi: 10.3390/jcm8081143. PMID: 31370353. Free PMC article.
