Open Heart. 2018 Jan 20;5(1):e000663.
doi: 10.1136/openhrt-2017-000663. eCollection 2018.

Effects of disease severity distribution on the performance of quantitative diagnostic methods and proposal of a novel 'V-plot' methodology to display accuracy values

Ricardo Petraco et al. Open Heart.

Abstract

Background: Diagnostic accuracy is widely accepted by researchers and clinicians as an optimal expression of a test's performance. The aim of this study was to evaluate the effects of disease severity distribution on values of diagnostic accuracy as well as propose a sample-independent methodology to calculate and display accuracy of diagnostic tests.

Methods and findings: We evaluated the diagnostic relationship between two hypothetical methods to measure serum cholesterol (Cholrapid and Cholgold) by generating samples with statistical software and (1) keeping the numerical relationship between methods unchanged and (2) changing the distribution of cholesterol values. Metrics of categorical agreement were calculated (accuracy, sensitivity and specificity). Finally, a novel methodology to display and calculate accuracy values was presented (the V-plot of accuracies).

Conclusion: No single value of diagnostic accuracy can be used to describe the relationship between tests, as accuracy is a metric heavily affected by the underlying sample distribution. Our proposed methodology, the V-plot of accuracies, can be used as a sample-independent measure of a test's performance against a reference gold standard.

Keywords: diagnostic accuracy; diagnostic tests; study sample.
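
The simulation outlined in the Methods can be illustrated with a minimal sketch. It is not the authors' code: the cut-off (`CUTOFF`), the error spread (`NOISE_SD`), the sample sizes and both cholesterol distributions are assumptions chosen only for illustration. The measurement relationship between the two methods is held fixed while the distribution of Cholgold values changes.

```python
import random

CUTOFF = 5.0    # hypothetical classification cut-off (mmol/L), not from the paper
NOISE_SD = 0.4  # hypothetical SD of the Cholrapid error around Cholgold


def simulate(gold_values, rng):
    """Pair each Cholgold value with a noisy Cholrapid reading."""
    return [(g, g + rng.gauss(0.0, NOISE_SD)) for g in gold_values]


def metrics(pairs, cutoff=CUTOFF):
    """Accuracy, sensitivity and specificity of Cholrapid against Cholgold."""
    tp = fp = tn = fn = 0
    for gold, rapid in pairs:
        diseased, flagged = gold >= cutoff, rapid >= cutoff
        if diseased and flagged:
            tp += 1
        elif diseased:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


rng = random.Random(0)
# Sample with a wide spread of cholesterol values (assumed uniform).
wide = [rng.uniform(2.0, 8.0) for _ in range(5000)]
# Sample clustered around the cut-off (assumed normal).
narrow = [rng.gauss(CUTOFF, 0.5) for _ in range(5000)]

wide_metrics = metrics(simulate(wide, rng))
narrow_metrics = metrics(simulate(narrow, rng))
print("wide:  ", wide_metrics)
print("narrow:", narrow_metrics)
```

Under these assumptions the same measurement error yields a lower accuracy in the clustered sample, because more values lie close to the cut-off.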

Conflict of interest statement

Competing interests: None declared.

Figures

Figure 1
Disease severity and classification agreement between methods: schematic representation of the principle that classification agreement between two methods of measurement (or diagnostic accuracy, if one is seen as a reference gold standard) varies across the range of disease severity. At the extremes of disease and health, agreement is 100%. Close to the classification cut-off, around the intermediate range of disease severity, agreement falls, reaching a nadir close to 50%.
Figure 2
Diagnostic performance of the new cholesterol test. The performance of the new cholesterol test (Cholrapid) changed significantly between the two studies. The overall accuracy of Cholrapid to diagnose hypercholesterolaemia fell in the primary care retrospective cohort (B) compared with the initial validation study (A). Values of area under the ROC curve, sensitivity, specificity and predictive values also differed substantially. ROC, receiver operating characteristic.
Figure 3
Numerical agreement between Cholrapid and Cholgold is equal between studies. Despite different magnitudes of classification agreement (diagnostic accuracy) between Cholrapid and Cholgold in the two studies, the raw measurement disagreement between the two methods remained unchanged. This can be appreciated from measures of the vertical scatter such as the SE of the estimate (plot A) and from Bland-Altman plots (B). It can be inferred that the observed drop in Cholrapid performance in the primary care study cannot be explained by a change in its true measurement performance. LOA, limits of agreement.
Figure 4
Histograms of cholesterol values from both studies. While the validation study included patients with a wide range of cholesterol values, the primary care cohort was formed predominantly of patients with intermediate values of cholesterol. This difference was responsible for the significant drop in Cholrapid accuracy reported in the primary care study.
Figure 5
The V-plot of accuracies of Cholrapid. The V-plot permits a visual demonstration that the classification agreement between Cholrapid and Cholgold is equal in the two studies in each quantile of disease severity. The overall classification agreement (diagnostic accuracy of Cholrapid) could change between studies, depending on the proportion of patients in each quantile. The V-plot consistently identifies the range of cholesterol values within which the agreement between tests is lower than 90% (dashed lines).
Figure 6
Methodology for the calculation of per-range agreement and V-plot display.
Figure 7
Calculating the overall accuracy in different samples using the V-plot. The V-plot agreement between Cholrapid and Cholgold can be derived from any study that compared the two methods (top panel). It can be used as a fingerprint of classification agreement to calculate the overall agreement between Cholrapid and Cholgold in any sample in which the distribution of cholesterol values is known (samples A, B and C).
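
The per-range agreement behind the V-plot, and its reuse to estimate overall accuracy in a sample with a known cholesterol distribution, can be sketched as follows. This is an illustrative reading of the captions above, not the published procedure: the bin edges (`EDGES`), the cut-off, the error spread, the choice to bin on the Cholgold value, and the two sets of bin proportions are all assumptions.

```python
import random

CUTOFF = 5.0    # hypothetical classification cut-off (mmol/L)
NOISE_SD = 0.4  # hypothetical SD of the Cholrapid error around Cholgold
EDGES = [2.0, 3.0, 4.0, 4.5, 5.0, 5.5, 6.0, 7.0, 8.0]  # hypothetical bin edges


def bin_index(value, edges=EDGES):
    """Index of the bin containing value, or None if it lies outside the edges."""
    for i in range(len(edges) - 1):
        if edges[i] <= value < edges[i + 1]:
            return i
    return None


def v_plot(pairs, cutoff=CUTOFF, edges=EDGES):
    """Per-bin classification agreement between Cholgold and Cholrapid."""
    agree = [0] * (len(edges) - 1)
    total = [0] * (len(edges) - 1)
    for gold, rapid in pairs:
        i = bin_index(gold, edges)  # assumption: bins are defined on Cholgold
        if i is None:
            continue
        total[i] += 1
        agree[i] += int((gold >= cutoff) == (rapid >= cutoff))
    # Empty bins are reported as None rather than as a made-up value.
    return [a / t if t else None for a, t in zip(agree, total)]


def overall_accuracy(agreement, proportions):
    """Weighted mean of per-bin agreement, weighted by each bin's share of the sample."""
    return sum(a * p for a, p in zip(agreement, proportions))


rng = random.Random(1)
gold = [rng.uniform(2.0, 8.0) for _ in range(20000)]
pairs = [(g, g + rng.gauss(0.0, NOISE_SD)) for g in gold]
agreement = v_plot(pairs)

# Hypothetical bin proportions for two samples; each list sums to 1.
spread_sample = [1/6, 1/6, 1/12, 1/12, 1/12, 1/12, 1/6, 1/6]
clustered_sample = [0.0, 0.05, 0.2, 0.25, 0.25, 0.2, 0.05, 0.0]

acc_spread = overall_accuracy(agreement, spread_sample)
acc_clustered = overall_accuracy(agreement, clustered_sample)
print([round(a, 3) for a in agreement])
print(round(acc_spread, 3), round(acc_clustered, 3))
```

In this sketch the per-bin agreement is lowest in the bins adjacent to the cut-off and close to 1 at the extremes, and the same agreement profile gives a lower overall accuracy for the sample concentrated near the cut-off. `overall_accuracy` assumes every bin with a non-zero proportion has a measured agreement value.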
