Meta-Analysis
JAMA Netw Open. 2023 Mar 1;6(3):e233391.
doi: 10.1001/jamanetworkopen.2023.3391.

Artificial Intelligence for Hip Fracture Detection and Outcome Prediction: A Systematic Review and Meta-analysis


Johnathan R Lex et al. JAMA Netw Open.

Abstract

Importance: Artificial intelligence (AI) enables powerful models for establishment of clinical diagnostic and prognostic tools for hip fractures; however, the performance and potential impact of these newly developed algorithms are currently unknown.

Objective: To evaluate the performance of AI algorithms designed to diagnose hip fractures on radiographs and predict postoperative clinical outcomes following hip fracture surgery relative to current practices.

Data sources: A systematic review of the literature was performed using the MEDLINE, Embase, and Cochrane Library databases for all articles published from database inception to January 23, 2023. A manual reference search of included articles was also undertaken to identify any additional relevant articles.

Study selection: Studies developing machine learning (ML) models for the diagnosis of hip fractures from hip or pelvic radiographs or to predict any postoperative patient outcome following hip fracture surgery were included.

Data extraction and synthesis: This study followed the Preferred Reporting Items for Systematic Reviews and Meta-analyses and was registered with PROSPERO. Eligible full-text articles were evaluated and relevant data extracted independently using a template data extraction form. For studies that predicted postoperative outcomes, the performance of traditional predictive statistical models, either multivariable logistic or linear regression, was recorded and compared with the performance of the best ML model on the same out-of-sample data set.
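
The comparison described above, a traditional regression model against the best ML model on the same out-of-sample data set, can be illustrated with a minimal sketch. It is not the authors' code: the labels and predicted risks below are hypothetical, and the area under the curve is computed with the pairwise (Mann-Whitney) definition rather than any particular library.

```python
def auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) definition.

    For every (positive, negative) pair, count 1 if the positive case has
    the higher score and 0.5 if the scores tie, then average over all pairs.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical held-out outcomes (1 = event occurred) and predicted risks
# from two models scored on the SAME patients.
labels = [1, 0, 1, 0, 1, 0]
lr_scores = [0.80, 0.60, 0.55, 0.30, 0.70, 0.65]  # assumed logistic regression output
ml_scores = [0.90, 0.40, 0.60, 0.20, 0.80, 0.65]  # assumed ML model output

auc_lr = auc(labels, lr_scores)
auc_ml = auc(labels, ml_scores)
print(f"AUC, regression: {auc_lr:.3f}; AUC, ML: {auc_ml:.3f}")
```

Scoring both models on identical held-out cases keeps the comparison paired, so a difference in area under the curve reflects the models rather than the patient sample.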

Main outcomes and measures: Diagnostic accuracy of AI models was compared with the diagnostic accuracy of expert clinicians using odds ratios (ORs) with 95% CIs. Areas under the curve for postoperative outcome prediction between traditional statistical models (multivariable linear or logistic regression) and ML models were compared.
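
As an illustration of the odds ratio measure named above, the following sketch computes an OR for diagnostic error with a 95% CI from a single 2x2 table, using the standard Wald interval on the log scale. The counts are hypothetical, the function name is an assumption, and the pooling across studies performed in the meta-analysis is not reproduced here.

```python
import math


def odds_ratio_ci(errors_a, correct_a, errors_b, correct_b, z=1.96):
    """Odds ratio of error for group A vs group B with a Wald CI.

    The interval is built on the log-odds-ratio scale and then
    exponentiated; z = 1.96 gives an approximate 95% CI.
    """
    odds_ratio = (errors_a * correct_b) / (correct_a * errors_b)
    se_log_or = math.sqrt(
        1 / errors_a + 1 / correct_a + 1 / errors_b + 1 / correct_b
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: the model misreads 10 of 100 radiographs,
# clinicians misread 20 of 100 from the same test set.
or_value, ci_low, ci_high = odds_ratio_ci(10, 90, 20, 80)
print(f"OR = {or_value:.2f} (95% CI, {ci_low:.2f}-{ci_high:.2f})")
```

An OR below 1 favors group A (fewer errors); when the interval includes 1, as with the pooled estimate reported in the Results, the difference is not statistically significant.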

Results: Of 39 studies that met all criteria and were included in this analysis, 18 (46.2%) used AI models to diagnose hip fractures on plain radiographs and 21 (53.8%) used AI models to predict patient outcomes following hip fracture surgery. A total of 39 598 plain radiographs and 714 939 hip fractures were used for training, validating, and testing ML models specific to diagnosis and postoperative outcome prediction, respectively. Mortality and length of hospital stay were the most predicted outcomes. On pooled data analysis, compared with clinicians, the OR for diagnostic error of ML models was 0.79 (95% CI, 0.48-1.31; P = .36; I² = 60%) for hip fracture radiographs. For the ML models, the mean (SD) sensitivity was 89.3% (8.5%), specificity was 87.5% (9.9%), and F1 score was 0.90 (0.06). The mean area under the curve for mortality prediction was 0.84 with ML models compared with 0.79 for alternative controls (P = .09).
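
For readers less familiar with the metrics reported above, the sketch below shows how sensitivity, specificity, and F1 score follow from one study's confusion matrix, and how the I² heterogeneity statistic follows from Cochran's Q. All inputs are hypothetical and the function names are assumptions; the figures in the Results are means across studies and are not derived from these numbers.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, and F1 score from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fractures correctly flagged
    specificity = tn / (tn + fp)  # normal radiographs correctly cleared
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return sensitivity, specificity, f1


def i_squared(q, df):
    """I² heterogeneity: share of variation across studies beyond chance.

    q is Cochran's Q and df is the number of studies minus 1;
    negative values are truncated to 0.
    """
    return max(0.0, (q - df) / q)


# Hypothetical single-study test set: 100 fractures, 100 normal radiographs.
sens, spec, f1 = diagnostic_metrics(tp=90, fp=12, tn=88, fn=10)
print(f"sensitivity = {sens:.3f}, specificity = {spec:.3f}, F1 = {f1:.3f}")

# Hypothetical Q statistic from 11 pooled studies.
print(f"I² = {i_squared(q=25.0, df=10):.0%}")
```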

Conclusions and relevance: The findings of this systematic review and meta-analysis suggest that the potential applications of AI to aid with diagnosis from hip radiographs are promising. The performance of AI in diagnosing hip fractures was comparable with that of expert radiologists and surgeons. However, current implementations of AI for outcome prediction do not seem to provide substantial benefit over traditional multivariable predictive statistics.


Conflict of interest statement

Conflict of Interest Disclosures: Dr Lex reported receiving grants from Arthrex Inc outside the submitted work and serving on the Resident Advisory Board for PrecisionOS Technologies. No other disclosures were reported.

Figures

Figure 1. Forest Plot Demonstrating the Accuracy of Artificial Intelligence Models Compared With Clinicians in Diagnosing Hip Fractures
ML indicates machine learning; OR, odds ratio.
Figure 2. Sensitivity and Specificity of Artificial Intelligence (AI) Models Used for Diagnosing Hip Fractures on Radiographs and of Clinicians Who Used the Same Test Data Set
