Eur J Trauma Emerg Surg. 2023 Apr;49(2):1057-1069.

doi: 10.1007/s00068-022-02136-1. Epub 2022 Nov 14.

Development and external validation of automated detection, classification, and localization of ankle fractures: inside the black box of a convolutional neural network (CNN)

Jasper Prijs¹ ² ³, Zhibin Liao⁴, Minh-Son To⁵ ⁶, Johan Verjans⁴, Paul C Jutte⁷, Vincent Stirler⁷, Jakub Olczak⁸, Max Gordon⁸, Daniel Guss⁹ ¹⁰, Christopher W DiGiovanni⁹ ¹⁰, Ruurd L Jaarsma¹¹, Frank F A IJpma⁷, Job N Doornberg⁷ ¹¹ ⁵; Machine Learning Consortium


Collaborators

Machine Learning Consortium:
Kaan Aksakal, Britt Barvelink, Benn Beuker, Anne Eva Bultra, Luisa E Carmo Oliviera, Joost Colaris, Huub de Klerk, Andrew Duckworth, Kaj Ten Duis, Eelco Fennema, Jorrit Harbers, Ran Hendrickx, Merilyn Heng, Sanne Hoeksema, Mike Hogervorst, Bhavin Jadav, Julie Jiang, Aditya Karhade, Gino Kerkhoffs, Joost Kuipers, Charlotte Laane, David Langerhuizen, Bart Lubberts, Wouter Mallee, Haras Mhmud, Mostafa El Moumni, Patrick Nieboer, Koen Oude Nijhuis, Peter van Ooijen, Jacobien Oosterhoff, Jai Rawat, David Ring, Sanne Schilstra, Jospeph Schwab, Sheila Sprague, Sjoerd Stufkens, Elvira Tijdens, Michel van der Bekerom, Puck van der Vet, Jean-Paul de Vries, Klaus Wendt, Matthieu Wijffels, David Worsley

Affiliations

¹ Department of Orthopaedic Surgery, Groningen University Medical Centre, Groningen, The Netherlands. jasperprijs@icloud.com.
² Department of Surgery, Groningen University Medical Centre, Groningen, The Netherlands. jasperprijs@icloud.com.
³ Department of Orthopaedic & Trauma Surgery, Flinders Medical Centre, Flinders University, Adelaide, Australia. jasperprijs@icloud.com.
⁴ Australian Institute for Machine Learning, Adelaide, Australia.
⁵ College of Medicine and Public Health, Flinders University, Adelaide, Australia.
⁶ Department of Neurosurgery, Flinders Medical Center, Adelaide, Australia.
⁷ Department of Orthopaedic Surgery, Groningen University Medical Centre, Groningen, The Netherlands.
⁸ Institute of Clinical Sciences, Danderyd University Hospital, Karolinska Institute, Solna, Sweden.
⁹ Massachusetts General Hospital, Boston, USA.
¹⁰ Harvard Medical School, Boston, USA.
¹¹ Department of Orthopaedic & Trauma Surgery, Flinders Medical Centre, Flinders University, Adelaide, Australia.

PMID: 36374292
PMCID: PMC10175446
DOI: 10.1007/s00068-022-02136-1


Jasper Prijs et al. Eur J Trauma Emerg Surg. 2023 Apr.


Abstract

Purpose: Convolutional neural networks (CNNs) are increasingly being developed for automated fracture detection in orthopaedic trauma surgery. Studies to date, however, are limited to classification based on the entire image, and produce only heatmaps for approximate fracture localization instead of delineating exact fracture morphology. Therefore, we aimed to answer (1) what is the performance of a CNN that detects, classifies, localizes, and segments an ankle fracture, and (2) would this be externally valid?

Methods: The training set included 326 isolated fibula fractures and 423 non-fracture radiographs. The Detectron2 implementation of Mask R-CNN was trained with labelled and annotated radiographs. The internal validation (or 'test set') and external validation sets consisted of 300 and 334 radiographs, respectively. Consensus agreement between three experienced fellowship-trained trauma surgeons was defined as the ground truth label. Diagnostic accuracy and area under the receiver operating characteristic curve (AUC) were used to assess classification performance. The Intersection over Union (IoU) was used to quantify the accuracy of the segmentation predictions by the CNN, where a value of 0.5 is generally considered to indicate an adequate segmentation.
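As an illustration of the IoU metric named in the Methods, the following is a minimal sketch in plain Python. It is not the study's code: the function names `box_iou` and `mask_iou`, the `(x1, y1, x2, y2)` box format, and the nested-list mask representation are assumptions made for this example.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes, each given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mask_iou(gt, pred):
    """IoU of two binary masks given as equally sized nested lists of 0/1."""
    overlap = 0  # pixels marked in both masks
    union = 0    # pixels marked in at least one mask
    for gt_row, pred_row in zip(gt, pred):
        for g, p in zip(gt_row, pred_row):
            overlap += 1 if (g and p) else 0
            union += 1 if (g or p) else 0
    return overlap / union if union else 0.0


# Hypothetical toy inputs, not data from the study.
gt_box, pred_box = (0, 0, 10, 10), (5, 0, 15, 10)
gt_mask = [[1, 1, 0], [1, 1, 0]]
pred_mask = [[0, 1, 1], [0, 1, 1]]
print(box_iou(gt_box, pred_box), mask_iou(gt_mask, pred_mask))
```

In both toy cases the overlap is one third of the union, so each prediction would fall below the 0.5 threshold mentioned above.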

Results: The final CNN was able to classify fibula fractures according to four classes (Danis-Weber A, B, C and No Fracture) with AUC values ranging from 0.93 to 0.99. Diagnostic accuracy was 89% on the test set with average sensitivity of 89% and specificity of 96%. On external validation, accuracy was 89-90% on a set of radiographs from a different hospital. Accuracy (%)/AUC values observed were 100/0.99 for the 'No Fracture' class, 92/0.99 for 'Weber B', 88/0.93 for 'Weber C', and 76/0.97 for 'Weber A'. For the fracture bounding box prediction by the CNN, a mean IoU of 0.65 (SD ± 0.16) was observed. The fracture segmentation predictions by the CNN resulted in a mean IoU of 0.47 (SD ± 0.17).
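To make the reported summary statistics concrete, the sketch below shows one common way to derive overall accuracy and class-averaged sensitivity and specificity from a multi-class confusion matrix, treating each class one-vs-rest. The abstract does not state how the averages were computed, so the unweighted (macro) averaging here is an assumption, as are the function name `one_vs_rest_metrics` and the toy matrix, which is invented and is not the study's data.

```python
def one_vs_rest_metrics(cm):
    """Accuracy plus macro-averaged sensitivity/specificity.

    cm[i][j] counts cases whose true class is i and predicted class is j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))
    per_class = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                        # true k, predicted other
        fp = sum(cm[r][k] for r in range(n)) - tp   # predicted k, true other
        tn = total - tp - fn - fp
        sens = tp / (tp + fn) if tp + fn else 0.0
        spec = tn / (tn + fp) if tn + fp else 0.0
        per_class.append((sens, spec))
    return {
        "accuracy": correct / total,
        "mean_sensitivity": sum(s for s, _ in per_class) / n,
        "mean_specificity": sum(s for _, s in per_class) / n,
        "per_class": per_class,
    }


# Hypothetical 4-class matrix (rows/columns: No Fracture, Weber A, B, C).
toy_cm = [
    [10, 0, 0, 0],
    [0, 8, 2, 0],
    [0, 1, 9, 0],
    [1, 0, 0, 9],
]
print(one_vs_rest_metrics(toy_cm))
```

Specificity tends to exceed sensitivity in such a setup because each one-vs-rest comparison has many more true negatives than true positives.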

Conclusions: This study presents a look into the 'black box' of CNNs and represents the first automated delineation (segmentation) of fracture lines on (ankle) radiographs. The AUC values presented in this paper indicate good discriminatory capability of the CNN and substantiate further study of CNNs in detecting and classifying ankle fractures.

Level of evidence: II, Diagnostic imaging study.

Keywords: Ankle; Artificial Intelligence; CNN; Lateral Malleolus.


Conflict of interest statement

One author (JP) certifies that he has received an amount less than USD 15,000 from the Michael van Vloten Foundation (Rotterdam, The Netherlands), an amount less than USD 10,000 from ZonMw (Den Haag, The Netherlands), and an amount less than USD 10,000 from the Prins Bernhard Cultuur Fonds (Amsterdam, The Netherlands). One author (JND) certifies that he has received an unrestricted Postdoc Research Grant from the Marti-Keuning-Eckhardt Foundation.

Figures

**Fig. 1**
Workflow used to create the final convolutional neural network (CNN) for the classification of ankle fractures. This involves a two-stage approach. An initial CNN was trained to select cases that were considered difficult for classification (for example, fractures that were hard to appreciate). Subsequently, the final CNN was trained using the radiographs selected by the initial CNN

**Fig. 2**
This figure presents how the final convolutional neural network (CNN) goes from the input image (1) to the final prediction (6). The region proposal network and backbone create many candidate bounding boxes (2), where each box has a high likelihood of containing an object. Then, the region of interest (RoI) step crops the bounding boxes to fit fixed dimensions, in this case 256 × 256 pixels (3). These cropped images are then used to simultaneously segment (4a) and classify (4b). Finally, the cropped images are resized to their original dimensions (5) and presented on top of the input image as predictions (6)

**Fig. 3**
From left to right: Object detection, semantic segmentation, and instance segmentation

**Fig. 4**
From left to right: Ground truth (gt) versus prediction (pred), area of union (gt + pred), and area of overlap

**Fig. 5**
Selection of correct classifications by the final convolutional neural network

**Fig. 6**
AO/OTA 44/Weber A misclassified as a 44/Weber B, AO/OTA 44/Weber C misclassified as a No Fracture

**Fig. 7**
Segmentations and classifications of the final convolutional neural network for AO/OTA 44/Weber A (top), B (middle), and C (bottom)


