Sci Rep. 2022 Aug 15;12(1):13810.
doi: 10.1038/s41598-022-14826-2.

Diagnostic accuracy of symptoms for an underlying disease: a simulation study


Yi-Sheng Chao et al. Sci Rep. 2022.

Abstract

Symptoms have been used to diagnose conditions such as frailty and mental illnesses. However, the diagnostic accuracy of the numbers of symptoms has not been well studied. This study uses equations and simulations to demonstrate how the factors that determine symptom incidence influence the accuracy of symptoms for disease diagnosis. Assuming a disease that caused symptoms and was correlated with another disease in 10,000 simulated subjects, 40 symptoms occurred based on three epidemiological measures: proportions diseased, baseline symptom incidence (among those not diseased), and risk ratios. Symptoms occurred with similar correlation coefficients. The sensitivities and specificities of single symptoms for disease diagnosis were expressed as equations using the three epidemiological measures and approximated using linear regression in simulated populations. The areas under the curves (AUCs) of the receiver operating characteristic (ROC) curves were used to measure the diagnostic accuracy of multiple symptoms, derived by using 2 to 40 symptoms for disease diagnosis. For each AUC, the best set of sensitivity and specificity was chosen: the one for which the sum of sensitivity and specificity differed most from 1 in absolute value. The results showed that the sensitivities and specificities of single symptoms for disease diagnosis were fully explained by the three epidemiological measures in simulated subjects. The AUCs increased or decreased with more symptoms used for disease diagnosis when the risk ratios were greater or less than 1, respectively. Based on the AUCs, when risk ratios were close to 1, symptoms did not provide diagnostic value. When risk ratios were greater or less than 1, maximal or minimal AUCs could usually be reached with fewer than 30 symptoms. The maximal AUCs and their best sets of sensitivities and specificities could be well approximated with the three epidemiological measures and interaction terms (adjusted R-squared ≥ 0.69).
However, the observed overall symptom correlations, overall symptom incidence, and numbers of symptoms explained only a small fraction of the AUC variances (adjusted R-squared ≤ 0.03). In conclusion, the sensitivities and specificities of single symptoms for disease diagnosis can be explained fully by the at-risk incidence and 1 minus the baseline incidence, respectively. The epidemiological measures and baseline symptom correlations can explain large fractions of the variances of the maximal AUCs and the best sets of sensitivities and specificities. These findings are important for researchers who want to assess the diagnostic accuracy of composite diagnostic criteria.
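
The single-symptom result stated in the conclusion can be illustrated with a minimal Python sketch. It is not the authors' code: it assumes independent symptoms and a single disease (the study also modeled symptom correlations and a second, correlated disease), and the function names, the seed, and the use of NumPy are illustrative choices. The parameter values (proportion diseased 0.05, baseline incidence 0.1, risk ratio 2) are those listed in the Figure 3 caption. Under these assumptions, sensitivity should approach the at-risk incidence (baseline incidence times risk ratio) and specificity should approach 1 minus the baseline incidence.

```python
import numpy as np


def simulate_symptoms(n_subjects=10_000, n_symptoms=40, d=0.05, ir=0.1, rr=2.0, seed=0):
    """Simulate one binary disease and independent binary symptoms.

    d  : proportion diseased
    ir : baseline symptom incidence among those not diseased
    rr : risk ratio, so incidence among the diseased is ir * rr
    Independence between symptoms is an assumption of this sketch.
    """
    rng = np.random.default_rng(seed)
    diseased = rng.random(n_subjects) < d
    # Symptom probability per subject: ir * rr if diseased, ir otherwise.
    p = np.where(diseased, min(ir * rr, 1.0), ir)
    symptoms = rng.random((n_subjects, n_symptoms)) < p[:, None]
    return diseased, symptoms


def single_symptom_accuracy(diseased, symptom):
    """Sensitivity and specificity of one symptom used as a diagnostic test."""
    sensitivity = symptom[diseased].mean()        # P(symptom | diseased)
    specificity = (~symptom[~diseased]).mean()    # P(no symptom | not diseased)
    return float(sensitivity), float(specificity)


diseased, symptoms = simulate_symptoms()
sens, spec = single_symptom_accuracy(diseased, symptoms[:, 0])

# Averages over all 40 symptoms, for comparison with ir * rr and 1 - ir.
mean_sens = float(symptoms[diseased].mean())
mean_spec = float((~symptoms[~diseased]).mean())
print(f"symptom 1: sensitivity={sens:.3f}, specificity={spec:.3f}")
print(f"all symptoms: sensitivity={mean_sens:.3f}, specificity={mean_spec:.3f}")
```

Changing `rr` to a value below 1 in this sketch lowers the sensitivity below the baseline incidence, which matches the abstract's distinction between risk ratios above and below 1.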

Conflict of interest statement

YSC is currently employed by the Canadian Agency for Drugs and Technologies in Health. The other authors declare that they have no competing interests.

Figures

Figure 1
Elements of the simulations in this study. d, proportions diseased; ir, incidence rate; rr, risk ratio.
Figure 2
Symptom incidence depending on the baseline incidence, proportions diseased, and risk ratios. RR, risk ratios.
Figure 3
Receiver operating characteristic (ROC) curves for disease diagnosis based on the numbers of symptoms. Red dots mark, on each ROC curve, the set of sensitivity and specificity with the largest absolute difference between 1 and the sum of sensitivity and specificity. For each number of symptoms used for disease diagnosis, one red dot (a best set of sensitivity and specificity) was selected. With a maximum of 40 symptoms used for disease diagnosis, the ROC curves in this figure were created assuming a risk ratio of 2, a baseline symptom incidence of 0.1, a proportion diseased of 0.05, no correlations between diseases, and no correlations between symptoms.
Figure 4
Areas under the receiver operating characteristic curves for disease diagnosis by numbers of symptoms, baseline symptom incidence, and symptom risk ratios. AUC, area under the curve; CI, confidence interval; RR, risk ratio; incidence, baseline symptom incidence among those not diseased. Gray dots are the AUCs whose 95% CIs overlapped with the 95% CIs of the maximal AUC identified using a maximum of 40 symptoms for disease diagnosis. The lines were added to show the AUCs assuming the same epidemiological measures. All AUCs assuming correlations of 0.8 between symptoms among those not diseased are illustrated.
Figure 5
Sensitivities for disease diagnosis by numbers of symptoms, baseline symptom incidence, and symptom risk ratios. AUC, area under the curve; CI, confidence interval; RR, risk ratio; incidence, baseline symptom incidence among those not diseased. Gray dots are the AUCs whose 95% CIs overlapped with the 95% CIs of the maximal AUC identified using a maximum of 40 symptoms for disease diagnosis. The lines were added to show the AUCs assuming the same epidemiological measures. All AUCs assuming correlations of 0.8 between symptoms among those not diseased are illustrated.
Figure 6
Specificities for disease diagnosis by numbers of symptoms, baseline symptom incidence, and symptom risk ratios. AUC, area under the curve; CI, confidence interval; RR, risk ratio; incidence, baseline symptom incidence among those not diseased. Gray dots are the AUCs whose 95% CIs overlapped with the 95% CIs of the maximal AUC identified using a maximum of 40 symptoms for disease diagnosis. The lines were added to show the AUCs assuming the same epidemiological measures. All AUCs assuming correlations of 0.8 between symptoms among those not diseased are illustrated.
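
The ROC construction described in the Figure 3 caption can be sketched in Python as follows. This is an illustrative sketch, not the authors' implementation: it assumes independent symptoms and no second disease, treats "at least t of the first k symptoms present" as the positive test, and computes the AUC with the trapezoidal rule, which is a choice made here rather than a method confirmed by the source. All function names and the seed are hypothetical. The "best set" is the ROC point with the largest absolute difference between 1 and the sum of sensitivity and specificity, as stated in the caption.

```python
import numpy as np


def simulate(n_subjects=10_000, n_symptoms=40, d=0.05, ir=0.1, rr=2.0, seed=1):
    """Independent symptoms (an assumption); values from the Figure 3 caption."""
    rng = np.random.default_rng(seed)
    diseased = rng.random(n_subjects) < d
    p = np.where(diseased, min(ir * rr, 1.0), ir)
    symptoms = rng.random((n_subjects, n_symptoms)) < p[:, None]
    return diseased, symptoms


def roc_points(diseased, symptoms, k):
    """Sensitivity and specificity for each threshold on the count of k symptoms."""
    counts = symptoms[:, :k].sum(axis=1)
    # t = 0 classifies everyone as positive; t = k + 1 classifies no one as positive.
    thresholds = np.arange(k + 2)
    sens = np.array([(counts[diseased] >= t).mean() for t in thresholds])
    spec = np.array([(counts[~diseased] < t).mean() for t in thresholds])
    return thresholds, sens, spec


def auc_trapezoid(sens, spec):
    """Area under the ROC curve by the trapezoidal rule."""
    fpr = (1.0 - spec)[::-1]   # reversed so the false positive rate increases
    tpr = sens[::-1]
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))


def best_point(thresholds, sens, spec):
    """ROC point maximizing |sensitivity + specificity - 1|."""
    i = int(np.argmax(np.abs(sens + spec - 1.0)))
    return int(thresholds[i]), float(sens[i]), float(spec[i])


diseased, symptoms = simulate()
aucs = {}
for k in (2, 10, 40):
    thresholds, sens, spec = roc_points(diseased, symptoms, k)
    aucs[k] = auc_trapezoid(sens, spec)
    t_best, s_best, p_best = best_point(thresholds, sens, spec)
    print(f"k={k}: AUC={aucs[k]:.3f}, best threshold={t_best}, "
          f"sensitivity={s_best:.3f}, specificity={p_best:.3f}")
```

With a risk ratio above 1, the AUC in this sketch grows as more symptoms are counted, which is the qualitative pattern the abstract reports; setting `rr` below 1 should push the AUC below 0.5 instead.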
