Spectrum bias in algorithms derived by artificial intelligence: a case study in detecting aortic stenosis using electrocardiograms
- PMID: 36713099
- PMCID: PMC9707965
- DOI: 10.1093/ehjdh/ztab061
Abstract
Aims: Spectrum bias can arise when a diagnostic test is derived from study populations with different disease spectra than the target population, resulting in poor generalizability. We used a real-world artificial intelligence (AI)-derived algorithm to detect severe aortic stenosis (AS) to experimentally assess the effect of spectrum bias on test performance.
Methods and results: All adult patients at the Mayo Clinic between 1 January 1989 and 30 September 2019 who had a transthoracic echocardiogram within 180 days after an electrocardiogram (ECG) were identified. Two models were developed from two distinct patient cohorts: a whole-spectrum cohort comparing severe AS to any non-severe AS, and an extreme-spectrum cohort comparing severe AS to no AS at all. Model performance was assessed. Overall, 258 607 patients had valid ECG and echocardiogram pairs. The area under the receiver operating characteristic curve was 0.87 and 0.91 for the whole-spectrum and extreme-spectrum models, respectively. Sensitivity and specificity were 80% and 81%, respectively, for the whole-spectrum model and 84% and 84%, respectively, for the extreme-spectrum model. When the AI-ECG model derived from the extreme-spectrum cohort was applied to patients in the whole-spectrum cohort, the sensitivity, specificity, and area under the curve dropped to 83%, 73%, and 0.86, respectively.
Conclusion: While the algorithm performed robustly in identifying severe AS, this study shows that limiting datasets to clearly positive or clearly negative labels leads to overestimation of test performance for an AI algorithm that classifies severe AS from ECG data. While the effect of the bias may be modest in this example, clinicians should be aware that such a bias exists in AI-derived algorithms.
Keywords: Aortic stenosis; Artificial intelligence; Electrocardiogram; Spectrum bias.
© The Author(s) 2021. Published by Oxford University Press on behalf of the European Society of Cardiology.
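The mechanism behind the specificity drop reported in the abstract can be illustrated with a small simulation. The sketch below is not the study's model or data: the three Gaussian score distributions, the group sizes, the threshold, and the names `simulate`, `sensitivity_specificity`, and `auc` are all assumptions chosen for illustration. It scores the same simulated severe-AS cases against two sets of negatives, one with no AS only (extreme spectrum) and one that also includes mild-to-moderate AS (whole spectrum).

```python
import random


def sensitivity_specificity(pos_scores, neg_scores, threshold):
    """Fraction of positives at or above the threshold, and of negatives below it."""
    true_pos = sum(s >= threshold for s in pos_scores)
    true_neg = sum(s < threshold for s in neg_scores)
    return true_pos / len(pos_scores), true_neg / len(neg_scores)


def auc(pos_scores, neg_scores):
    """AUC as the probability that a random positive outscores a random negative."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def simulate(seed=0, n_per_group=500, threshold=1.0):
    rng = random.Random(seed)
    # ASSUMED score distributions (illustrative only, not taken from the study):
    # intermediate disease produces scores between no AS and severe AS.
    no_as = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
    mild_moderate = [rng.gauss(1.2, 1.0) for _ in range(n_per_group)]
    severe = [rng.gauss(2.0, 1.0) for _ in range(n_per_group)]

    # Extreme spectrum: negatives have no AS at all.
    # Whole spectrum: negatives are anyone without severe AS.
    cohorts = {"extreme": no_as, "whole": no_as + mild_moderate}
    results = {}
    for name, negatives in cohorts.items():
        sens, spec = sensitivity_specificity(severe, negatives, threshold)
        results[name] = {
            "sensitivity": sens,
            "specificity": spec,
            "auc": auc(severe, negatives),
        }
    return results


results = simulate()
for name, metrics in results.items():
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

In this sketch the positives and the threshold are identical in both evaluations, so sensitivity is unchanged by construction, while specificity and AUC fall once intermediate cases join the negative group. The study's reported sensitivity differs slightly between evaluations (84% versus 83%), which this simplified setup does not attempt to reproduce.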