[Preprint]. 2024 Jun 19:2024.06.19.24309061.
doi: 10.1101/2024.06.19.24309061.

An independent, multi-country head-to-head accuracy comparison of automated chest x-ray algorithms for the triage of pulmonary tuberculosis

William Worodria et al. medRxiv.

Abstract

Background: Computer-aided detection (CAD) algorithms for automated chest X-ray (CXR) reading have been endorsed by the World Health Organization for tuberculosis (TB) triage, but independent, multi-country assessment and comparison of current products are needed to guide implementation.

Methods: We conducted a head-to-head evaluation of five CAD algorithms for TB triage across seven countries. We included CXRs from adults who presented to outpatient facilities with at least two weeks of cough in India, Madagascar, the Philippines, South Africa, Tanzania, Uganda, and Vietnam. The participants completed a standard evaluation for pulmonary TB, including sputum collection for Xpert MTB/RIF Ultra and culture. Against a microbiological reference standard, we calculated and compared accuracy overall, by country, and by key subgroup for the five CAD algorithms: CAD4TB (Delft Imaging), INSIGHT CXR (Lunit), DrAid (Vinbrain), Genki (Deeptek), and qXR (qure.AI). We determined the area under the receiver operating characteristic (ROC) curve (AUC) and assessed whether any CAD product could achieve the minimum target accuracy for a TB triage test (≥90% sensitivity and ≥70% specificity). We then applied country- and population-specific thresholds and recalculated accuracy to assess any improvement in performance.
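The accuracy measures named in the Methods can be illustrated with a minimal sketch. This is not the authors' analysis code: the function names, the toy scores, and the convention that a CXR is called positive when its CAD score is at or above the threshold are all assumptions made for illustration. The sketch computes the AUC as the probability that a TB-positive CXR scores higher than a TB-negative one, finds the highest threshold that still reaches 90% sensitivity, and checks the result against the ≥90% sensitivity and ≥70% specificity target.

```python
import math

def auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def threshold_for_sensitivity(scores, labels, target=0.90):
    """Highest observed score t such that calling score >= t positive reaches the target sensitivity."""
    pos = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    # Number of reference-positive cases that must be flagged (epsilon guards float rounding).
    k = max(1, math.ceil(target * len(pos) - 1e-9))
    return pos[k - 1]

def sensitivity_specificity(scores, labels, threshold):
    """Sensitivity and specificity when score >= threshold is read as CAD-positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return tp / n_pos, tn / n_neg

def meets_triage_target(sens, spec, min_sens=0.90, min_spec=0.70):
    """Minimum target accuracy for a TB triage test as stated in the Methods."""
    return sens >= min_sens and spec >= min_spec

# Hypothetical toy data: 10 microbiologically positive and 10 negative participants.
pos_scores = [0.95, 0.90, 0.85, 0.80, 0.75, 0.70, 0.65, 0.60, 0.55, 0.20]
neg_scores = [0.10, 0.15, 0.25, 0.30, 0.35, 0.40, 0.45, 0.50, 0.58, 0.72]
scores = pos_scores + neg_scores
labels = [1] * len(pos_scores) + [0] * len(neg_scores)

threshold = threshold_for_sensitivity(scores, labels, target=0.90)
sens, spec = sensitivity_specificity(scores, labels, threshold)
print(f"AUC={auc(scores, labels):.3f} threshold={threshold} "
      f"sensitivity={sens:.2f} specificity={spec:.2f} "
      f"meets target={meets_triage_target(sens, spec)}")
```

Fixing sensitivity first and then reading off specificity mirrors how the abstract reports results ("specificity at 90% sensitivity"), which puts products with different score scales on a common footing.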

Results: Of the 3,927 individuals included, the median age was 41 years (IQR 29-54), 12.9% were people living with HIV (PLWH), 8.2% were living with diabetes, and 21.2% had a prior history of TB. The overall AUC ranged from 0.774 to 0.819, and specificity at 90% sensitivity ranged from 64.8% to 73.8%. CAD4TB had the highest overall accuracy (73.8% specificity, 95% CI 72.2-75.4, at 90% sensitivity), although qXR and INSIGHT CXR also achieved the target 70% specificity. Accuracy was heterogeneous across countries; females and PLWH had lower sensitivity, while males and people with a history of TB had lower specificity. Performance remained stable regardless of diabetes status. When country- and population-specific thresholds were applied, at least one CAD product could achieve or approach the target accuracy for each country and sub-group, except for PLWH and those with a history of TB.
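The effect of setting-specific thresholds can be sketched in the same spirit. The example below is hypothetical throughout: the two sites "A" and "B", their scores, and the helper names are invented for illustration and do not come from the study. It assumes that a CXR is CAD-positive when its score is at or above the threshold, and that each site-specific threshold is chosen to give 90% sensitivity within that site.

```python
import math
from collections import defaultdict

def threshold_at_sensitivity(pos_scores, target=0.90):
    """Highest observed score that still flags at least `target` of the positives."""
    ranked = sorted(pos_scores, reverse=True)
    k = max(1, math.ceil(target * len(ranked) - 1e-9))
    return ranked[k - 1]

def accuracy(pos_scores, neg_scores, threshold):
    """(sensitivity, specificity) when score >= threshold is read as CAD-positive."""
    sens = sum(s >= threshold for s in pos_scores) / len(pos_scores)
    spec = sum(s < threshold for s in neg_scores) / len(neg_scores)
    return sens, spec

# Hypothetical records: (site, CAD score, reference result), 1 = TB confirmed.
records = (
    [("A", s, 1) for s in (0.90, 0.88, 0.86, 0.84, 0.82, 0.80, 0.78, 0.76, 0.74, 0.50)]
    + [("A", s, 0) for s in (0.30, 0.40, 0.50, 0.60, 0.70)]
    + [("B", s, 1) for s in (0.60, 0.58, 0.56, 0.54, 0.52, 0.50, 0.48, 0.46, 0.44, 0.10)]
    + [("B", s, 0) for s in (0.10, 0.20, 0.30, 0.40, 0.50)]
)

# Split scores by site and reference result.
pos, neg = defaultdict(list), defaultdict(list)
for site, score, label in records:
    (pos if label == 1 else neg)[site].append(score)

# One universal threshold from the pooled positives, and one threshold per site.
universal = threshold_at_sensitivity([s for v in pos.values() for s in v])
site_thresholds = {site: threshold_at_sensitivity(v) for site, v in pos.items()}

universal_results = {site: accuracy(pos[site], neg[site], universal) for site in pos}
site_results = {site: accuracy(pos[site], neg[site], site_thresholds[site]) for site in pos}

for site in sorted(pos):
    print(site, "universal:", universal_results[site], "site-specific:", site_results[site])
```

In this toy data the universal threshold reaches 90% sensitivity only on the pooled sample: site A ends up with low specificity and site B with sensitivity below target, while the site-specific thresholds restore 90% sensitivity at both sites.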

Conclusions: Multiple CAD algorithms can achieve or exceed the minimum target accuracy for a TB triage test, with improvement when using setting- or population-specific thresholds. Further efforts are needed to integrate CAD into routine TB case detection programs in high-burden communities.

Conflict of interest statement

The authors declare no conflicts of interest. The installation and use of the different CAD software evaluated in this manuscript were provided free of charge to FIND by all CAD vendors. CAD vendors had no role in the study design, data collection, analysis, decision to publish, or preparation of the manuscript.

Figures

Figure 1. Flowchart of participants.
Figure 2. Receiver operating characteristic curve of each CAD algorithm.
Each ROC curve represents a CAD algorithm as indicated in the legend, with reported area under the curve (AUC). The red horizontal and vertical lines indicate minimum target sensitivity and specificity for a TB triage test at 90% and 70%, respectively.
Figure 3. Forest plot of the sensitivity and specificity of CAD4TB by country and subgroup using a universal threshold.
(A) The sensitivity and specificity by country, with 95% CIs; (B) The sensitivity and specificity by subgroup, with 95% CIs. The overall accuracy of the CAD algorithm is listed at the bottom with a vertical dashed red line, in order to compare the overall estimate to the country and subgroup estimates.
Figure 4. Forest plot of the sensitivity and specificity of CAD4TB by country and subgroup using country- and population-specific thresholds.
(A) The sensitivity and specificity by country, with 95% CIs; and (B) The sensitivity and specificity by subgroup, with 95% CIs. Of note, the threshold selected is based on a 90% sensitivity. The overall accuracy of the CAD algorithm is listed at the bottom with a vertical dashed red line, in order to compare the overall estimate to the country and subgroup estimates.

