Validation of musculoskeletal segmentation model with uncertainty estimation for bone and muscle assessment in hip-to-knee clinical CT images

doi:10.1038/s41598-024-83793-7

Sci Rep. 2025 Jan 2;15(1):125.

Affiliations

¹ Division of Information Science, Graduate School of Science and Technology, Nara Institute of Science and Technology, 8916-5 Takayama-cho, Ikoma, Nara, 630-0192, Japan. msoufi@is.naist.jp.
² Division of Information Science, Graduate School of Science and Technology, Nara Institute of Science and Technology, 8916-5 Takayama-cho, Ikoma, Nara, 630-0192, Japan. otake@is.naist.jp.
³ Department of Orthopedic Surgery, Graduate School of Medicine, Osaka University, 2-2 Yamadaoka, Suita, Osaka, 565-0871, Japan.
⁴ Division of Information Science, Graduate School of Science and Technology, Nara Institute of Science and Technology, 8916-5 Takayama-cho, Ikoma, Nara, 630-0192, Japan.
⁵ Department of Radiology, Keio University School of Medicine, 35 Shinanomachi, Shinjuku-ku, Tokyo, 160-8582, Japan.
⁶ Hitachi Health Care Center, Hitachi Ltd., 4-3-16 Ose, Hitachi, 307-0076, Japan.
⁷ Department of Bone and Joint Surgery, Graduate School of Medicine, Ehime University, Shitsukawa, Toon, Ehime, 791-0295, Japan.
⁸ Department of Orthopaedic Medical Engineering, Graduate School of Medicine, Osaka University, 2-2 Yamadaoka, Suita, Osaka, 565-0871, Japan.
⁹ Division of Information Science, Graduate School of Science and Technology, Nara Institute of Science and Technology, 8916-5 Takayama-cho, Ikoma, Nara, 630-0192, Japan. yoshi@is.naist.jp.

PMID: 39747203
PMCID: PMC11696574
DOI: 10.1038/s41598-024-83793-7

Mazen Soufi et al. Sci Rep. 2025.

Abstract

Deep learning-based image segmentation has enabled fully automated, accurate, and rapid analysis of musculoskeletal (MSK) structures from medical images. However, current approaches have been applied only to 2D cross-sectional images, have addressed few structures, or have been validated on small datasets, which limits their application to large-scale databases. This study aimed to validate an improved deep learning model for volumetric MSK segmentation of the hip and thigh with uncertainty estimation from clinical computed tomography (CT) images. Databases of CT images covering multiple manufacturers/scanners, disease statuses, and patient positionings were used. The segmentation accuracy and the accuracy in estimating each structure's volume and density (i.e., mean HU) were evaluated. An approach for detecting segmentation failure based on predictive uncertainty was also investigated. The model improved all segmentation accuracy and structure volume/density evaluation metrics compared with a shallower baseline model with a smaller training database (N = 20). The predictive uncertainty yielded large areas under the receiver operating characteristic curve (AUROCs ≥ 0.95) in detecting inaccurate and failed segmentations. Furthermore, the study showed that disease severity status affected the model's predictive uncertainties when the model was applied to a large-scale database. The high accuracy in segmentation and muscle volume/density estimation, together with the high accuracy of uncertainty-based failure detection, demonstrated the model's reliability for analyzing individual MSK structures in large-scale CT databases.
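As a rough illustration of the failure-detection idea described in the abstract, the sketch below scores how well a per-case uncertainty value separates failed from successful segmentations using the AUROC. It is not the authors' implementation: the function name `auroc`, the toy uncertainty values, and the failure labels are all hypothetical, and the paper's actual uncertainty measure and failure thresholds are not reproduced here.

```python
def auroc(scores, labels):
    """AUROC via the pairwise (Mann-Whitney U) formulation.

    scores: per-case predictive uncertainty (higher = less certain).
    labels: 1 if the segmentation was judged failed/inaccurate, else 0.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one failed and one successful case")
    # Count (failed, successful) pairs in which the failed case has the
    # higher uncertainty; tied pairs count as half a win.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: seven cases, two of which are labelled as failures.
uncertainty = [0.02, 0.03, 0.05, 0.04, 0.20, 0.03, 0.04]
failed = [0, 0, 0, 0, 1, 0, 1]
print(auroc(uncertainty, failed))
```

An AUROC of 1.0 would mean every failed case has higher uncertainty than every successful one, while 0.5 corresponds to chance-level separation.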

Conflict of interest statement

Declarations. Competing interests: Masahiro Jinzaki received a grant from Canon Medical Systems. However, Canon Medical Systems was not involved in the design and conduct of the study; in the collection, analysis, and interpretation of the data; or in the preparation, review, and approval of the manuscript. The remaining authors have no conflicts of interest to declare. Ethics approval: Ethical approval was obtained from the Institutional Review Boards (IRBs) of the institutions participating in this study (IRB approval numbers: 21115 for Osaka University Hospital, 2023-28 for Hitachi Health Care Center, 2020-M-7 for Nara Institute of Science and Technology, and jRCTs032180267 for Keio University).

Figures

**Fig. 1**
Segmentation labels of the bones and muscles

**Fig. 2**
Overall scheme for validation of musculoskeletal segmentation model for automated assessment of bones and muscles in CT images with uncertainty estimation

**Fig. 3**
Summary of the research questions tackled in the study with the corresponding databases and methodologies used in the experiments. Sect: section numbers in the paper, ROI: region-of-interest, Hip OA: hip osteoarthritis, GT: ground-truth (annotation), DB: database, N: number of cases

**Fig. 4**
Distributions of the segmentation accuracy (a), predictive uncertainty (b), and volume/mean HU accuracy (c) of the bones and muscles (averaged on all structures) by each model applied to DB#1 (N = 50). Horizontal lines in the boxes represent the medians, while blue boxes represent the means. Detailed values are depicted in Supplementary Figs. A.1-A.5. DC: Dice coefficient, ASD: average symmetric surface distance, AVE: average volume error, AIE: average intensity error, n.s.: not significant, *: p < 0.017, Student’s t-test or Wilcoxon signed rank sum test with Bonferroni correction

**Fig. 5**
Receiver operating characteristic (ROC) curves of the inaccurate and failed segmentation detection in DB#1 (N = 50) using the predictive uncertainty. Thresholds were determined based on the median absolute deviations (σ) of the DC

**Fig. 6**
Distributions of the accuracy evaluation metrics and predictive uncertainty of the three MSK structure groups, i.e., thigh (left) and hip (middle) muscles and bones (right), in terms of the disease status of body sides in hip OA patients in internal validation DB#1 (a) and large-scale predictive uncertainty analysis in DB#5 (b). N: number of cases, n.s.: not significant, *: p < 0.004. (Based on Shapiro’s normality test, the hypothesis test was performed using either the Wilcoxon signed-rank test or the Student’s t-test. Bonferroni correction was used for the multiple comparisons.)

**Fig. 7**
Comparison between segmentation model accuracy (a, c) and predictive uncertainty (b) of the GMED muscle in the multi-manufacturer/scanner databases DB#1 (N = 50), DB#2 (N = 18), DB#3 (N = 10) and DB#4 (N = 20). DC: Dice coefficient, ASD: average symmetric surface distance, AVE: average volume error, AIE: average intensity error, su: supine, st: standing, n.s.: not significant, *: p < 0.01. (Based on Shapiro’s normality test, the hypothesis tests were performed using either the Wilcoxon signed-rank test or the Student’s t-test with Bonferroni correction.) The triangles indicate the cases corresponding to the 5th (blue filled triangle) and 95th (red filled upside-down triangle) quantiles of the predictive uncertainty visualized in A.7 and A.8.

**Fig. 8**
Ground-truth (GT) and predicted (Auto) segmentations of the unaffected (Un.) and affected (Aff.) sides of a representative hip OA case (median DC in Fig. 4) with diagnostic biomarkers, histograms, and muscle density visualizations of the gluteus maximus (GMAX) and gluteus medius (GMED) muscles.
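
The quantities named in the figure captions (Dice coefficient, structure volume, and mean HU) can be illustrated with a minimal NumPy sketch. It assumes binary masks on a regular voxel grid; the function names, the toy volume, and the 1 mm voxel spacing are hypothetical. The exact definitions of AVE and AIE used in the paper are not given in this record, so the absolute differences printed at the end are only one plausible reading, and ASD is not covered.

```python
import numpy as np


def dice_coefficient(gt, pred):
    """Dice coefficient between two binary masks of identical shape."""
    gt = np.asarray(gt, dtype=bool)
    pred = np.asarray(pred, dtype=bool)
    denom = gt.sum() + pred.sum()
    if denom == 0:
        return 1.0  # both masks empty: treated as perfect agreement
    return 2.0 * np.logical_and(gt, pred).sum() / denom


def volume_cm3(mask, voxel_spacing_mm):
    """Structure volume in cm^3 from a binary mask and voxel spacing in mm."""
    voxel_mm3 = float(np.prod(voxel_spacing_mm))
    return float(np.asarray(mask, dtype=bool).sum()) * voxel_mm3 / 1000.0


def mean_hu(ct, mask):
    """Mean CT intensity (HU) inside a non-empty binary mask."""
    return float(np.asarray(ct)[np.asarray(mask, dtype=bool)].mean())


# Hypothetical toy volume: a 2x2x2 "muscle" at 50 HU in a -100 HU background.
ct = np.full((4, 4, 4), -100.0)
ct[1:3, 1:3, 1:3] = 50.0
gt = np.zeros((4, 4, 4), dtype=bool)
gt[1:3, 1:3, 1:3] = True
pred = np.zeros((4, 4, 4), dtype=bool)
pred[1:3, 1:3, 1:4] = True  # over-segments by one slice along the last axis

spacing = (1.0, 1.0, 1.0)  # assumed voxel spacing in mm
print("Dice:", dice_coefficient(gt, pred))
# Assumed error definitions: absolute difference between prediction and GT.
print("Volume error (cm^3):", abs(volume_cm3(pred, spacing) - volume_cm3(gt, spacing)))
print("Mean HU error:", abs(mean_hu(ct, pred) - mean_hu(ct, gt)))
```

In this toy case the over-segmentation pulls background voxels into the predicted mask, so the mean HU error is large even though the Dice coefficient remains fairly high.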

Cited by

Evaluating upper leg muscle volume: the reliability of thigh circumference measurement 10 cm above the patella.
Kono S, Takashima K, Uemura K, Mae H, Takagi K, Soufi M, Otake Y, Sato Y, Sugano N, Okada S, Hamada H. Bone Joint Res. 2025 Aug 1;14(8):666-673. doi: 10.1302/2046-3758.148.BJR-2024-0216.R2. PMID: 40744446. Free PMC article.
