Lancet Digit Health. 2021 Sep;3(9):e543-e554.
doi: 10.1016/S2589-7500(21)00116-3.

Tuberculosis detection from chest x-rays for triaging in a high tuberculosis-burden setting: an evaluation of five artificial intelligence algorithms

Free article

Zhi Zhen Qin et al. Lancet Digit Health. 2021 Sep.
Abstract

Background: Artificial intelligence (AI) algorithms can be trained to recognise tuberculosis-related abnormalities on chest radiographs. Various AI algorithms are available commercially, yet there is little impartial evidence on how their performance compares with each other and with radiologists. We aimed to evaluate five commercial AI algorithms for triaging tuberculosis using a large dataset that had not previously been used to train any AI algorithms.

Methods: Individuals aged 15 years or older presenting or referred to three tuberculosis screening centres in Dhaka, Bangladesh, between May 15, 2014, and Oct 4, 2016, were recruited consecutively. Every participant was verbally screened for symptoms and received a digital posterior-anterior chest x-ray and an Xpert MTB/RIF (Xpert) test. All chest x-rays were read independently by a group of three registered radiologists and five commercial AI algorithms: CAD4TB (version 7), InferRead DR (version 2), Lunit INSIGHT CXR (version 4.9.0), JF CXR-1 (version 2), and qXR (version 3). We compared the performance of the AI algorithms with each other, with the radiologists, and with the WHO's Target Product Profile (TPP) of triage tests (≥90% sensitivity and ≥70% specificity). We used a new evaluation framework that simultaneously evaluates sensitivity, proportion of Xpert tests avoided, and number needed to test to inform implementers' choice of software and selection of threshold abnormality scores.
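The three quantities in the evaluation framework can be computed from a confusion matrix at a chosen threshold abnormality score. The following is a minimal sketch, not the authors' code. It assumes that everyone scoring at or above the threshold is referred for Xpert, that "Xpert tests avoided" is the share of participants scoring below the threshold, and that "number needed to test" is the number of Xpert tests run per Xpert-positive case found. The names `triage_metrics` and `TriageMetrics` are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass
class TriageMetrics:
    sensitivity: float
    xpert_avoided: float
    number_needed_to_test: float


def triage_metrics(
    scores: Sequence[float],
    xpert_positive: Sequence[bool],
    threshold: float,
) -> TriageMetrics:
    """Summarise one AI threshold against the Xpert reference standard.

    Assumption: a score >= threshold means the person is referred for Xpert.
    """
    tp = fp = fn = tn = 0
    for score, positive in zip(scores, xpert_positive):
        flagged = score >= threshold
        if flagged and positive:
            tp += 1
        elif flagged:
            fp += 1
        elif positive:
            fn += 1
        else:
            tn += 1

    total = tp + fp + fn + tn
    referred = tp + fp  # Xpert tests still performed after triage
    return TriageMetrics(
        # Share of Xpert-positive people the algorithm flags.
        sensitivity=tp / (tp + fn) if (tp + fn) else math.nan,
        # Share of people not referred, compared with testing everyone.
        xpert_avoided=(fn + tn) / total if total else math.nan,
        # Xpert tests run per positive case found (assumed definition).
        number_needed_to_test=referred / tp if tp else math.inf,
    )
```

Sweeping `threshold` over the range of scores and tabulating the three values side by side shows the trade-off an implementer faces: a higher threshold avoids more Xpert tests and lowers the number needed to test, at the cost of sensitivity.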

Findings: Chest x-rays from 23 954 individuals were included in the analysis. All five AI algorithms significantly outperformed the radiologists. The areas under the receiver operating characteristic curve were 90·81% (95% CI 90·33-91·29) for qXR, 90·34% (89·81-90·87) for CAD4TB, 88·61% (88·03-89·20) for Lunit INSIGHT CXR, 84·90% (84·27-85·54) for InferRead DR, and 84·89% (84·26-85·53) for JF CXR-1. Only qXR (74·3% specificity [95% CI 73·3-74·9]) and CAD4TB (72·9% specificity [72·3-73·5]) met the TPP at 90% sensitivity. All five AI algorithms reduced the number of Xpert tests required by 50% while maintaining a sensitivity above 90%. All AI algorithms performed worse among older age groups (>60 years) and people with a history of tuberculosis.
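The comparison against the TPP fixes sensitivity at 90% and asks whether specificity reaches 70%. A minimal sketch of that check follows, assuming the operating point is the highest observed score threshold that still reaches the target sensitivity; the abstract does not state how thresholds were selected, so this rule and the function names `operating_point` and `meets_tpp` are illustrative assumptions.

```python
from typing import Sequence, Tuple


def operating_point(
    scores: Sequence[float],
    xpert_positive: Sequence[bool],
    target_sensitivity: float = 0.90,
) -> Tuple[float, float, float]:
    """Return (threshold, sensitivity, specificity) at the target sensitivity.

    Assumption: the threshold is the highest observed positive-case score
    for which sensitivity is still >= target_sensitivity.
    """
    if not 0 < target_sensitivity <= 1:
        raise ValueError("target_sensitivity must be in (0, 1]")
    pos = sorted((s for s, p in zip(scores, xpert_positive) if p), reverse=True)
    neg = [s for s, p in zip(scores, xpert_positive) if not p]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    # Smallest number of positives that must be flagged to hit the target.
    for k in range(1, len(pos) + 1):
        if k / len(pos) >= target_sensitivity:
            threshold = pos[k - 1]
            break

    sensitivity = sum(s >= threshold for s in pos) / len(pos)
    specificity = sum(s < threshold for s in neg) / len(neg)
    return threshold, sensitivity, specificity


def meets_tpp(
    sensitivity: float,
    specificity: float,
    min_sensitivity: float = 0.90,  # WHO TPP value quoted in the abstract
    min_specificity: float = 0.70,  # WHO TPP value quoted in the abstract
) -> bool:
    """Check a (sensitivity, specificity) pair against the triage-test TPP."""
    return sensitivity >= min_sensitivity and specificity >= min_specificity
```

Under this check, the reported specificities for qXR and CAD4TB at 90% sensitivity (74·3% and 72·9%) pass, while any algorithm whose specificity falls below 70% at the same sensitivity does not.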

Interpretation: AI algorithms can be highly accurate and useful triage tools for tuberculosis detection in high-burden regions, and outperform human readers.

Funding: Government of Canada.

Conflict of interest statement

Declaration of interests: We declare no competing interests.
