Acad Radiol. 2022 Dec;29(12):1819-1832.
doi: 10.1016/j.acra.2022.02.020. Epub 2022 Mar 26.

Deep Learning Classification of Spinal Osteoporotic Compression Fractures on Radiographs using an Adaptation of the Genant Semiquantitative Criteria

Qifei Dong et al. Acad Radiol. 2022 Dec.

Abstract

Rationale and objectives: Osteoporosis affects 9% of individuals over 50 in the United States and 200 million women globally. Spinal osteoporotic compression fractures (OCFs), an osteoporosis biomarker, are often incidental and under-reported. Accurate automated opportunistic OCF screening can increase the diagnosis rate and ensure adequate treatment. We aimed to develop a deep learning classifier for OCFs, a critical component of our future automated opportunistic screening tool.

Materials and methods: The dataset from the Osteoporotic Fractures in Men Study comprised 4461 subjects and 15,524 spine radiographs. This dataset was split by subject: 76.5% training, 8.5% validation, and 15% testing. From the radiographs, 100,409 vertebral bodies were extracted, each assigned one of two labels adapted from the Genant semiquantitative system: moderate to severe fracture vs. normal/trace/mild fracture. GoogLeNet, a deep learning model, was trained to classify the vertebral bodies. The classification threshold on the predicted probability of OCF outputted by GoogLeNet was set to prioritize the positive predictive value (PPV) while balancing it with the sensitivity. Vertebral bodies with the top 0.75% predicted probabilities were classified as moderate to severe fracture.
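The top-fraction thresholding rule described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `top_fraction_threshold` and `classify` are hypothetical, and the abstract does not say which data split was used to derive the cutoff, so the sketch simply ranks whatever scores it is given.

```python
import math


def top_fraction_threshold(probs, fraction=0.0075):
    """Return the score cutoff that flags roughly the top `fraction` of items.

    `fraction=0.0075` mirrors the 0.75% figure in the abstract; everything
    else here is an illustrative assumption.
    """
    if not probs:
        raise ValueError("probs must be non-empty")
    # Number of items to flag as positive (at least one).
    k = max(1, math.ceil(fraction * len(probs)))
    ranked = sorted(probs, reverse=True)
    return ranked[k - 1]


def classify(probs, threshold):
    """Label 1 (moderate to severe fracture) when the score reaches the cutoff."""
    return [1 if p >= threshold else 0 for p in probs]


if __name__ == "__main__":
    scores = [i / 1000 for i in range(1000)]
    cutoff = top_fraction_threshold(scores)
    print(cutoff, sum(classify(scores, cutoff)))
```

If several items tie at the cutoff score, slightly more than the requested fraction will be flagged.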

Results: Our model yielded a sensitivity of 59.8%, a PPV of 91.2%, and an F1 score of 0.72. The areas under the receiver operating characteristic curve (AUC-ROC) and the precision-recall curve were 0.99 and 0.82, respectively.
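As a consistency check on the reported figures, the F1 score is the harmonic mean of the PPV and the sensitivity. The short sketch below recomputes it from the two values quoted above; the function name is hypothetical.

```python
def f1_from_ppv_sensitivity(ppv, sensitivity):
    """F1 is the harmonic mean of PPV (precision) and sensitivity (recall)."""
    return 2 * ppv * sensitivity / (ppv + sensitivity)


if __name__ == "__main__":
    # Values taken from the Results paragraph: PPV 91.2%, sensitivity 59.8%.
    f1 = f1_from_ppv_sensitivity(0.912, 0.598)
    print(round(f1, 2))  # 0.72
```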

Conclusion: Our model classified vertebral bodies with an AUC-ROC of 0.99, providing a critical component for our future automated opportunistic screening tool. This could lead to earlier detection and treatment of OCFs.

Keywords: Deep learning; Fragility fracture; Opportunistic screening; Osteoporosis; Semiquantitative.

Conflict of interest statement

The authors declare the following financial interests/personal relationships which may be considered as potential competing interests:

Nathan M. Cross reports financial support was provided by General Electric-Association of University Radiologists Radiology Research Academic Fellowship.

Qifei Dong reports financial support was provided by National Institute of Arthritis and Musculoskeletal and Skin Diseases.

Gang Luo reports financial support was provided by National Institute of Arthritis and Musculoskeletal and Skin Diseases.

Li-Yung Lui reports financial support was provided by National Institutes of Health.

Deborah M. Kado reports financial support was provided by National Institute on Aging.

Peggy M. Cawthon reports financial support was provided by National Institutes of Health.

David Haynor reports financial support was provided by National Institute of Arthritis and Musculoskeletal and Skin Diseases.

Jeffrey G. Jarvik reports financial support was provided by National Institute of Arthritis and Musculoskeletal and Skin Diseases.

Nathan M. Cross reports financial support was provided by National Institute of Arthritis and Musculoskeletal and Skin Diseases.

Deborah M. Kado reports a relationship with National Osteoporosis Foundation that includes: speaking and lecture fees.

Deborah M. Kado reports a relationship with American Bone Health that includes: speaking and lecture fees.

Deborah M. Kado reports a relationship with Interdisciplinary Symposium on Osteoporosis that includes: speaking and lecture fees.

Deborah M. Kado reports a relationship with Veterans Administration Health System that includes: travel reimbursement.

Deborah M. Kado reports a relationship with Stanford University School of Medicine that includes: travel reimbursement.

Deborah M. Kado reports a relationship with American Society of Bone and Mineral Research that includes: travel reimbursement.

Deborah M. Kado reports a relationship with ASBMR Task Force on Long-Term Safety and Efficacy of Vertebral Augmentation that includes: board membership.

Deborah M. Kado reports a relationship with Data Safety Monitoring Board, TOPAZ Trial that includes: board membership.

Deborah M. Kado reports a relationship with NIH NIA Aging Workshop for the American Society of Bone and Mineral Research (ASBMR) that includes: board membership.

Jeffrey G. Jarvik reports a relationship with GE-Association of University Radiologists Radiology Research Academic Fellowship that includes: travel reimbursement.

Gang Luo currently works part-time at Amazon as an Amazon Scholar.

Deborah M. Kado reports Wolters Kluwer/UpToDate: Royalties as a chapter author.

Jeffrey G. Jarvik reports Springer Publishing: Royalties as a book co-editor; and Wolters Kluwer/UpToDate: Royalties as a chapter author.

All other authors report no conflicts of interest.

Figures

Figure 1:
A graphic representation of the Genant semiquantitative (SQ) osteoporotic fracture classification criteria. This approach to fracture classification uses nine fracture classes and one normal class. Fractures are graded by the degree of height loss (mild, moderate, or severe) and by whether the vertebral body height loss is predominantly anterior, posterior, or central. The MrOS dataset assigns the 10 classes the numerical labels 0, 1, 2, 2.5, 3, 4, 4.5, 5, 6, and 7 (green bubbles). The MrOS study modified the original Genant criteria slightly, requiring depression of the endplate to be present for the “Mild deformity” row [35]. This system was simplified into two classes: label 0 (yellow), representing a normal or possible/mild deformity, and label 1 (orange), representing a moderate to severe deformity. Adapted from: Genant HK, Wu CY, van Kuijk C, Nevitt MC. Vertebral fracture assessment using a semiquantitative technique. J Bone Miner Res 1993;8(9):1137-1148. Fig. 1: Semiquantitative visual grading of vertebral deformities: Graphic representation. © 1993 American Society for Bone and Mineral Research.
Figure 2:
The MrOS dataset was divided into training, validation, and test sets by subject. Thoracic and lumbar radiographs were obtained at both the first clinical visit (Visit 1) and the follow-up clinical visit (Visit 2). Because radiographs of the same subject have some commonality, the datasets were divided on a subject basis. To reduce class imbalance in the training set, instances of label 0 (normal/possible/mild deformity) were subsampled to a ratio of 2.5:1 (label 0 to label 1).
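A subject-level split followed by majority-class subsampling can be sketched as below. This is an illustration under stated assumptions rather than the study's pipeline: the helper names, the random seed, and the use of simple random sampling are all hypothetical; only the split fractions (76.5% / 8.5% / 15%) and the 2.5:1 ratio come from the text.

```python
import random


def split_subjects(subject_ids, train=0.765, val=0.085, seed=0):
    """Partition unique subject IDs so no subject lands in more than one set."""
    ids = sorted(set(subject_ids))
    random.Random(seed).shuffle(ids)
    n_train = round(train * len(ids))
    n_val = round(val * len(ids))
    train_ids = set(ids[:n_train])
    val_ids = set(ids[n_train:n_train + n_val])
    test_ids = set(ids[n_train + n_val:])  # remainder, about 15%
    return train_ids, val_ids, test_ids


def subsample_majority(samples, ratio=2.5, seed=0):
    """Keep all label-1 samples and at most `ratio` times as many label-0 samples.

    `samples` is a list of (item, label) pairs with labels 0 or 1.
    """
    positives = [s for s in samples if s[1] == 1]
    negatives = [s for s in samples if s[1] == 0]
    n_keep = min(len(negatives), int(ratio * len(positives)))
    kept = random.Random(seed).sample(negatives, n_keep)
    return positives + kept


if __name__ == "__main__":
    tr, va, te = split_subjects(range(1000))
    print(len(tr), len(va), len(te))
    data = [(i, 1) for i in range(10)] + [(i, 0) for i in range(10, 110)]
    print(len(subsample_majority(data)))
```

Subsampling would apply to the training set only; the validation and test sets keep their natural class distribution.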
Figure 3:
This process of generating a vertebral patch was performed for each vertebral body labeled in a radiograph, using the four corner points indicated by red stars. The blue and purple arrows show the creation of vertebral patches without and with the augmentation steps, respectively. Vertebral patches in the validation and test sets were not augmented. Both the raw and augmented vertebral patches were included in the training set. The steps are: 1) flip the radiograph horizontally to conform to the convention that the subject faces left; 2) form two coordinate axes with the x-axis bisecting the angle between the two diagonals connecting the four corner points; 3) obtain the smallest square that bounds the four corner points with edges parallel to the corresponding coordinate axes; 4) expand the square from its center to quadruple its area, preventing cutoff of part of the vertebral body and providing surrounding image context; 5) augment the vertebral patch by scaling, rotating, and translating the square; 6) extract the square as a vertebral patch; 7) invert the grayscale if the bones are darker than the background; 8) augment the vertebral patch by changing the contrast and brightness and adding Gaussian noise; 9) resize the vertebral patch to 224×224 pixels and normalize each pixel value by subtracting the mean and then dividing by the standard deviation [31].
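Steps 4 and 9 involve simple arithmetic that a short sketch makes concrete: quadrupling a square's area about its center doubles its side length, and normalization is a per-pixel shift and scale. The code below is a simplified illustration, not the authors' implementation. It works in the square's own coordinate frame (ignoring the rotation from step 2), and because the caption does not say which mean and standard deviation are used, both are passed in by the caller. The function names are hypothetical.

```python
import math


def expand_square(cx, cy, side, area_factor=4.0):
    """Grow a square about its center (cx, cy) so its area scales by `area_factor`.

    Returns (x_min, y_min, x_max, y_max). With area_factor=4 the side doubles.
    """
    new_side = side * math.sqrt(area_factor)
    half = new_side / 2
    return (cx - half, cy - half, cx + half, cy + half)


def normalize(pixels, mean, std):
    """Subtract `mean` and divide by `std` for every pixel in a 2D list."""
    if std == 0:
        raise ValueError("std must be non-zero")
    return [[(p - mean) / std for p in row] for row in pixels]


if __name__ == "__main__":
    print(expand_square(10, 10, 4))
    print(normalize([[1, 3]], mean=2, std=1))
```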
Figure 4:
The process to determine whether the grayscale of a vertebral patch is inverted. The steps are: 1) find the endplates using the Sobel operator and hysteresis thresholding; 2) on the endplate, obtain a vertical strip of pixels whose midpoint is the pixel on the endplate; 3) on the vertical strip, find the pixels with the highest and the lowest gray intensities; 4) calculate dh and dl; 5) traverse the pixels on the endplates and repeat Steps 2, 3, and 4 to get all dh and all dl; 6) calculate the means of all dh and of all dl, respectively, and compare them to determine whether the grayscale of the vertebral patch is inverted. If the mean of all dl is less than the mean of all dh, the grayscale of the vertebral patch is inverted; otherwise, the vertebral patch is standard.
Figure 5:
In the entire MrOS dataset, the number of vertebral bodies of each SQ class at each anatomic level of the spine (A) excluding and (B) including the normal class. Each digit in the figure’s legend represents an SQ class shown in Figure 1 (green bubbles).
Figure 6:
On the test set, the final deep learning model achieved an AUC-ROC of 0.99 (A) and an AUC-PR of 0.82 (B) with the associated 95% confidence intervals (CIs). Two thresholding methods are used. Their corresponding sensitivities, specificities, PPVs, NPVs, FDRs, F1 scores, and accuracies are shown in (C). The values in each pair of brackets in (C) represent the 95% CI. Thresholding method 1 provides a favorable PPV and a favorable FDR for large-volume screening. Thresholding method 2 balances sensitivity and specificity by optimizing Youden’s J statistic. The confusion matrices generated using thresholding methods 1 and 2 are shown in (D) and (E), respectively. Each confusion matrix is normalized by dividing every element by the sum of its row; the normalized confusion matrices are shown in (F) and (G). For thresholding methods 1 and 2, (H) and (I) show the accuracy, sensitivity, PPV, and F1 score at each anatomic level of the spine. Note that the numbers of fractured vertebral bodies at T10 and T11 are three and five, respectively. Because these numbers are small, there is limited statistical power to evaluate the model at these two anatomic levels.
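The row normalization and the Youden's J criterion mentioned in this caption can be sketched as follows. This is a generic illustration with hypothetical function names, not the authors' evaluation code. Youden's J is sensitivity + specificity - 1, and the sketch scans every distinct score as a candidate threshold.

```python
def row_normalize(matrix):
    """Divide each element by the sum of its row (rows are true classes)."""
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([v / total if total else 0.0 for v in row])
    return normalized


def youden_threshold(probs, labels):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    A score >= threshold is predicted positive. Requires at least one
    positive and one negative label.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need both classes present")
    best_t, best_j = None, -1.0
    for t in sorted(set(probs)):
        tp = sum(1 for p, y in zip(probs, labels) if p >= t and y == 1)
        tn = sum(1 for p, y in zip(probs, labels) if p < t and y == 0)
        j = tp / n_pos + tn / n_neg - 1
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


if __name__ == "__main__":
    print(row_normalize([[8, 2], [1, 3]]))
    print(youden_threshold([0.1, 0.2, 0.8, 0.9], [0, 0, 1, 1]))
```

The counts in the usage lines are made-up toy values, not results from the study.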
