Development and Validation of a Convolutional Neural Network Model to Predict a Pathologic Fracture in the Proximal Femur Using Abdomen and Pelvis CT Images of Patients With Advanced Cancer
- PMID: 37615504
- PMCID: PMC10566917
- DOI: 10.1097/CORR.0000000000002771
Development and Validation of a Convolutional Neural Network Model to Predict a Pathologic Fracture in the Proximal Femur Using Abdomen and Pelvis CT Images of Patients With Advanced Cancer
Abstract
Background: Improvement in survival in patients with advanced cancer is accompanied by an increased probability of bone metastasis and related pathologic fractures (especially in the proximal femur). The few systems proposed and used to diagnose impending fractures owing to metastasis and to ultimately prevent future fractures have practical limitations; thus, novel screening tools are essential. A CT scan of the abdomen and pelvis is a standard modality for staging and follow-up in patients with cancer, and radiologic assessments of the proximal femur are possible with CT-based digitally reconstructed radiographs. Deep-learning models, such as convolutional neural networks (CNNs), may be able to predict pathologic fractures from digitally reconstructed radiographs, but to our knowledge, they have not been tested for this application.
Questions/purposes: (1) How accurate is a CNN model for predicting a pathologic fracture in a proximal femur with metastasis using digitally reconstructed radiographs of the abdomen and pelvis CT images in patients with advanced cancer? (2) Do CNN models perform better than clinicians with varying backgrounds and experience levels in predicting a pathologic fracture on abdomen and pelvis CT images without any knowledge of the patients' histories, except for metastasis in the proximal femur?
Methods: A total of 392 patients received radiation treatment of the proximal femur at three hospitals from January 2011 to December 2021. The patients had 2945 CT scans of the abdomen and pelvis for systemic evaluation and follow-up in relation to their primary cancer. In 33% of the CT scans (974), it was impossible to identify whether a pathologic fracture developed within 3 months after each CT image was acquired, and these were excluded. Finally, 1971 cases with a mean age of 59 ± 12 years were included in this study. Pathologic fractures developed within 3 months after CT in 3% (60 of 1971) of cases. A total of 47% (936 of 1971) were women. Sixty cases had an established pathologic fracture within 3 months after each CT scan, and another group of 1911 cases had no established pathologic fracture within 3 months after CT scan. The mean age of the cases in the former and latter groups was 64 ± 11 years and 59 ± 12 years, respectively, and 32% (19 of 60) and 53% (1016 of 1911) of cases, respectively, were female. Digitally reconstructed radiographs were generated with perspective projections of three-dimensional CT volumes onto two-dimensional planes. Then, 1557 images from one hospital were used for a training set. To verify that the deep-learning models could consistently operate even in hospitals with a different medical environment, 414 images from other hospitals were used for external validation. The number of images in the groups with and without a pathologic fracture within 3 months after each CT scan increased from 1911 to 22,932 and from 60 to 720, respectively, using data augmentation methods that are known to be an effective way to boost the performance of deep-learning models. Three CNNs (VGG16, ResNet50, and DenseNet121) were fine-tuned using digitally reconstructed radiographs. For performance measures, the area under the receiver operating characteristic curve, accuracy, sensitivity, specificity, precision, and F1 score were determined. The area under the receiver operating characteristic curve was used to evaluate three CNN models mainly, and the optimal accuracy, sensitivity, and specificity were calculated using the Youden J statistic. Accuracy refers to the proportion of fractures in the groups with and without a pathologic fracture within 3 months after each CT scan that were accurately predicted by the CNN model. Sensitivity and specificity represent the proportion of accurately predicted fractures among those with and without a pathologic fracture within 3 months after each CT scan, respectively. Precision is a measure of how few false-positives the model produces. The F1 score is a harmonic mean of sensitivity and precision, which have a tradeoff relationship. Gradient-weighted class activation mapping images were created to check whether the CNN model correctly focused on potential pathologic fracture regions. The CNN model with the best performance was compared with the performance of clinicians.
Results: DenseNet121 showed the best performance in identifying pathologic fractures; the area under the receiver operating characteristic curve for DenseNet121 was larger than those for VGG16 (0.77 ± 0.07 [95% CI 0.75 to 0.79] versus 0.71 ± 0.08 [95% CI 0.69 to 0.73]; p = 0.001) and ResNet50 (0.77 ± 0.07 [95% CI 0.75 to 0.79] versus 0.72 ± 0.09 [95% CI 0.69 to 0.74]; p = 0.001). Specifically, DenseNet121 scored the highest in sensitivity (0.22 ± 0.07 [95% CI 0.20 to 0.24]), precision (0.72 ± 0.19 [95% CI 0.67 to 0.77]), and F1 score (0.34 ± 0.10 [95% CI 0.31 to 0.37]), and it focused accurately on the region with the expected pathologic fracture. Further, DenseNet121 was less likely than clinicians to mispredict cases in which there was no pathologic fracture than cases in which there was a fracture; the performance of DenseNet121 was better than clinician performance in terms of specificity (0.98 ± 0.01 [95% CI 0.98 to 0.99] versus 0.86 ± 0.09 [95% CI 0.81 to 0.91]; p = 0.01), precision (0.72 ± 0.19 [95% CI 0.67 to 0.77] versus 0.11 ± 0.10 [95% CI 0.05 to 0.17]; p = 0.0001), and F1 score (0.34 ± 0.10 [95% CI 0.31 to 0.37] versus 0.17 ± 0.15 [95% CI 0.08 to 0.26]; p = 0.0001).
Conclusion: CNN models may be able to accurately predict impending pathologic fractures from digitally reconstructed radiographs of the abdomen and pelvis CT images that clinicians may not anticipate; this can assist medical, radiation, and orthopaedic oncologists clinically. To achieve better performance, ensemble-learning models using knowledge of the patients' histories should be developed and validated. The code for our model is publicly available online at https://github.com/taehoonko/CNN_path_fx_prediction .
Level of evidence: Level III, diagnostic study.
Copyright © 2023 by the Association of Bone and Joint Surgeons.
Conflict of interest statement
The institution of one or more of the authors (MWJ) has received, during the study period, funding from the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (2021R1F1A1047841). Each author certifies that there are no funding or commercial associations (consultancies, stock ownership, equity interest, patent/licensing arrangements, etc.) that might pose a conflict of interest in connection with the submitted article related to the author or any immediate family members. All ICMJE Conflict of Interest Forms for authors and Clinical Orthopaedics and Related Research ® editors and board members are on file with the publication and can be viewed on request.
Figures





Comment in
-
CORR Insights®: Development and Validation of a Convolutional Neural Network Model to Predict a Pathologic Fracture in the Proximal Femur Using Abdomen and Pelvis CT Images of Patients With Advanced Cancer.Clin Orthop Relat Res. 2023 Nov 1;481(11):2257-2259. doi: 10.1097/CORR.0000000000002828. Epub 2023 Aug 28. Clin Orthop Relat Res. 2023. PMID: 37638845 Free PMC article. No abstract available.
Similar articles
-
Comparison of Two Modern Survival Prediction Tools, SORG-MLA and METSSS, in Patients With Symptomatic Long-bone Metastases Who Underwent Local Treatment With Surgery Followed by Radiotherapy and With Radiotherapy Alone.Clin Orthop Relat Res. 2024 Dec 1;482(12):2193-2208. doi: 10.1097/CORR.0000000000003185. Epub 2024 Jul 23. Clin Orthop Relat Res. 2024. PMID: 39051924
-
Does Augmenting Irradiated Autografts With Free Vascularized Fibula Graft in Patients With Bone Loss From a Malignant Tumor Achieve Union, Function, and Complication Rate Comparably to Patients Without Bone Loss and Augmentation When Reconstructing Intercalary Resections in the Lower Extremity?Clin Orthop Relat Res. 2025 Jun 26. doi: 10.1097/CORR.0000000000003599. Online ahead of print. Clin Orthop Relat Res. 2025. PMID: 40569278
-
Signs and symptoms to determine if a patient presenting in primary care or hospital outpatient settings has COVID-19.Cochrane Database Syst Rev. 2022 May 20;5(5):CD013665. doi: 10.1002/14651858.CD013665.pub3. Cochrane Database Syst Rev. 2022. PMID: 35593186 Free PMC article.
-
Does the Presence of Missing Data Affect the Performance of the SORG Machine-learning Algorithm for Patients With Spinal Metastasis? Development of an Internet Application Algorithm.Clin Orthop Relat Res. 2024 Jan 1;482(1):143-157. doi: 10.1097/CORR.0000000000002706. Epub 2023 Jun 12. Clin Orthop Relat Res. 2024. PMID: 37306629 Free PMC article.
-
A rapid and systematic review of the clinical effectiveness and cost-effectiveness of paclitaxel, docetaxel, gemcitabine and vinorelbine in non-small-cell lung cancer.Health Technol Assess. 2001;5(32):1-195. doi: 10.3310/hta5320. Health Technol Assess. 2001. PMID: 12065068
References
-
- Chlap P, Min H, Vandenberg N, Dowling J, Holloway L, Haworth A. A review of medical image data. J Med Imaging Radiat Oncol. 2021;65:545-563. - PubMed
-
- Damron TA, Morgan H, Prakash D, Grant W, Aronowitz J, Heiner J. Critical evaluation of Mirels' rating system for impending pathologic fractures. Clin Orthop Relat Res. 2003;415(suppl):S201-S207. - PubMed
-
- Fuller RM, Kim J, An TW, et al. Assessment of flatfoot deformity using digitally reconstructed radiographs: reliability and comparison to conventional radiographs. Foot Ankle Int. 2022;43:983-993. - PubMed
MeSH terms
LinkOut - more resources
Full Text Sources
Medical
Research Materials