Sci Rep. 2024 Apr 10;14(1):8372.
doi: 10.1038/s41598-024-59077-5.

High sensitivity methods for automated rib fracture detection in pediatric radiographs

Jonathan Burkow et al. Sci Rep.

Abstract

Rib fractures are highly predictive of non-accidental trauma in children under 3 years old. Rib fracture detection in pediatric radiographs is challenging because fractures can be obliquely oriented to the imaging detector, obfuscated by other structures, incomplete, and non-displaced. Prior studies have shown up to two-thirds of rib fractures may be missed during initial interpretation. In this paper, we implemented methods for improving the sensitivity (i.e. recall) performance for detecting and localizing rib fractures in pediatric chest radiographs to help augment performance of radiology interpretation. These methods adapted two convolutional neural network (CNN) architectures, RetinaNet and YOLOv5, and our previously proposed decision scheme, "avalanche decision", that dynamically reduces the acceptance threshold for proposed regions in each image. Additionally, we present contributions of using multiple image pre-processing and model ensembling techniques. Using a custom dataset of 1109 pediatric chest radiographs manually labeled by seven pediatric radiologists, we performed 10-fold cross-validation and reported detection performance using several metrics, including F2 score which summarizes precision and recall for high-sensitivity tasks. Our best performing model used three ensembled YOLOv5 models with varied input processing and an avalanche decision scheme, achieving an F2 score of 0.725 ± 0.012. Expert inter-reader performance yielded an F2 score of 0.732. Results demonstrate that our combination of sensitivity-driving methods provides object detector performance approaching the capabilities of expert human readers, suggesting that these methods may provide a viable approach to identify all rib fractures.
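The F2 score named in the abstract is the F-beta score with beta = 2, which weights recall more heavily than precision. The following is a minimal sketch of that standard formula, not the authors' evaluation code; the function name `f_beta` and the input values are illustrative assumptions.

```python
def f_beta(precision: float, recall: float, beta: float = 2.0) -> float:
    """Standard F-beta score; beta > 1 favors recall over precision."""
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    b2 = beta * beta
    denom = b2 * precision + recall
    if denom == 0:
        # No true positives at all: define the score as 0.
        return 0.0
    return (1 + b2) * precision * recall / denom


# Illustrative values only (not results from the paper).
p, r = 0.5, 0.8
print(f"F1 = {f_beta(p, r, beta=1.0):.3f}")
print(f"F2 = {f_beta(p, r, beta=2.0):.3f}")
```

With recall higher than precision, F2 comes out above F1, which is why it suits a task where a missed fracture costs more than a false alarm.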

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
Summary of the major contributions of this paper, including curation of a labeled dataset and the addition of up to three high-sensitivity methods (avalanche decision scheme, varied-input processing, and ensembling) to state-of-the-art pre-trained object detectors. In addition, inter-reader variability between expert radiologists was evaluated on 338 of the 624 fracture-present radiographs.
Figure 2
Representative results from multi-class segmentation showing manually labeled images (left) and U-Net results (right) with final automatically-generated cropped region represented by red box.
Figure 3
Plot of relative decision threshold for bounding box acceptance as a function of the number of accepted proposals.
Figure 4
The three types of image processing used in the varied-input-processing ensemble models. (a) normal-cropped histogram equalized 3x-stacked images; (b) segmentation-masked adaptive thresholding 3x-stacked images (binary); (c) segmentation-masked raw, histogram equalized, and bilateral low-pass filtered (blended).
Figure 5
Test set images with ground truth (teal, red) and model predictions (green, yellow), with true positives (green), false positives (yellow), and false negatives (red). Predictions from the 6x-YOLOv5 ensemble trained on histogram equalized input images with a γ=0.20 avalanche scheme, achieving 0.536±0.044 precision, 0.795±0.022 recall, and 0.723±0.010 F2 score.
Figure 6
Explanation of model nomenclature for ensembling combined with different input processing. The selection of input processing [a,b,c] is described in Fig. 4 and varied input processing [*] uses one from each input processing type. For example, results presented for method 3x-R* would be for an ensemble of three RetinaNet models (trained on different folds) using varied input processing.
Figure 7
F2 scores across all possible confidence thresholds for 1x-Ra models, 3x-Yc ensembles, and the hybrid 3x-R* + 3x-Y* ensemble. Each dashed line represents performance from one model or combination of ensembles. The best performing avalanche scheme (‘Conservative’) is compared to the ‘Standard’ decision scheme. Generally, the avalanche scheme performed better than conventional inferencing, except at low decision thresholds. In every case, the avalanche decision scheme reaches a higher maximum F2 score than the standard scheme.
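The abstract describes the avalanche decision as dynamically reducing the acceptance threshold for proposed regions in each image, and Fig. 3 plots that threshold against the number of accepted proposals. The sketch below illustrates the general idea only. The multiplicative decay rule, the `floor` parameter, and the function name `avalanche_select` are assumptions made for illustration; the page does not give the authors' formula or define how γ enters it.

```python
from typing import List, Sequence


def avalanche_select(
    scores: Sequence[float],
    base_threshold: float = 0.5,
    gamma: float = 0.2,
    floor: float = 0.05,
) -> List[int]:
    """Return indices of accepted proposals for one image.

    Assumed rule (hypothetical): every accepted proposal lowers the
    threshold for the next one by a factor of (1 - gamma), never
    dropping below `floor`.
    """
    accepted: List[int] = []
    threshold = base_threshold
    # Visit proposals from most to least confident.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    for idx in order:
        if scores[idx] < threshold:
            break  # remaining proposals score even lower
        accepted.append(idx)
        threshold = max(floor, threshold * (1.0 - gamma))
    return accepted


# Illustrative confidence scores for one image (not data from the paper).
scores = [0.9, 0.45, 0.35, 0.1]
print(avalanche_select(scores, gamma=0.2))  # avalanche-style acceptance
print(avalanche_select(scores, gamma=0.0))  # fixed threshold for comparison
```

In this sketch the first proposal must still clear the full base threshold, so an image with no confident detection yields no accepted boxes, while an image with one confident fracture accepts further candidates more readily.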
