A practical guide to the implementation of AI in orthopaedic research, Part 6: How to evaluate the performance of AI research?
- PMID: 38826500
- PMCID: PMC11141501
- DOI: 10.1002/jeo2.12039
Abstract
The accelerating progress of artificial intelligence (AI) demands rigorous evaluation standards to ensure its safe and effective integration into high-stakes healthcare decisions. As AI increasingly provides prediction, analysis and judgement capabilities relevant to medicine, proper evaluation and interpretation are indispensable. Erroneous AI could endanger patients; developing, validating and deploying medical AI therefore demands adherence to strict, transparent standards centred on safety, ethics and responsible oversight. Core considerations include assessing performance on diverse real-world data, collaborating with domain experts, confirming model reliability and limitations, and advancing interpretability. Selecting evaluation metrics suited to the clinical context and testing on diverse data sets that represent different populations improve generalisability. Partnerships among software engineers, data scientists and medical practitioners ground assessment in real needs. Journals must uphold reporting standards commensurate with AI's societal impact. With rigorous, holistic evaluation frameworks, AI can progress towards expanding healthcare access and quality.
Level of evidence: Level V.
Keywords: AI; ML; digitalization; healthcare; performance metrics.
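To illustrate the abstract's point about choosing metrics that fit the clinical context and checking them on different populations, the following is a minimal Python sketch. It is not taken from the article: the function names, the 0/1 label coding, the subgroup labels "A" and "B" and the toy data are all hypothetical. It computes sensitivity, specificity, positive predictive value and accuracy for a binary classifier, overall and per subgroup.

```python
from collections import defaultdict


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels coded as 0 (negative) or 1 (positive)."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 1:
            fp += 1
        elif t == 0 and p == 0:
            tn += 1
        else:  # t == 1 and p == 0
            fn += 1
    return tp, fp, tn, fn


def safe_div(num, den):
    """Return num / den, or None when the denominator is zero (metric undefined)."""
    return num / den if den else None


def metrics(y_true, y_pred):
    """Compute a small set of common binary classification metrics."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "sensitivity": safe_div(tp, tp + fn),  # share of true positives detected
        "specificity": safe_div(tn, tn + fp),  # share of true negatives detected
        "ppv": safe_div(tp, tp + fp),          # positive predictive value
        "accuracy": safe_div(tp + tn, tp + fp + tn + fn),
    }


def metrics_by_group(y_true, y_pred, groups):
    """Compute the same metrics separately for each subgroup label."""
    buckets = defaultdict(lambda: ([], []))
    for t, p, g in zip(y_true, y_pred, groups):
        buckets[g][0].append(t)
        buckets[g][1].append(p)
    return {g: metrics(ts, ps) for g, (ts, ps) in buckets.items()}


# Hypothetical toy data: reference labels, model predictions, subgroup per case.
y_true = [1, 1, 0, 0, 1, 0, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

print("overall:", metrics(y_true, y_pred))
for name, values in metrics_by_group(y_true, y_pred, groups).items():
    print(name, values)
```

In this toy data the accuracy is 0.75 overall and in both subgroups, yet sensitivity is 0.5 in subgroup A and 1.0 in subgroup B. A single aggregate figure can therefore hide a clinically relevant difference between populations. Which metric matters most depends on the clinical question, for example whether missed cases or false alarms are more costly.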
© 2024 The Author(s). Journal of Experimental Orthopaedics published by John Wiley & Sons Ltd on behalf of European Society of Sports Traumatology, Knee Surgery and Arthroscopy.
Conflict of interest statement
The authors declare no conflict of interest.