Aging (Albany NY). 2022 May 17;14(10):4270-4280.
doi: 10.18632/aging.204084. Epub 2022 May 17.

Identification of combined biomarkers for predicting the risk of osteoporosis using machine learning

Zhenlong Zheng et al. Aging (Albany NY).

Abstract

Osteoporosis is a severe chronic skeletal disorder that affects older individuals, especially postmenopausal women. However, molecular biomarkers for predicting the risk of osteoporosis are not well characterized. The aim of this study was to identify combined biomarkers for predicting the risk of osteoporosis using machine learning methods. We merged three publicly available gene expression datasets (GSE56815, GSE13850, and GSE2208) to obtain expression data for 6354 unique genes in postmenopausal women (45 with high bone mineral density and 45 with low bone mineral density). All machine learning methods were implemented in R; the GEOquery and limma packages were used for dataset download and identification of differentially expressed genes, respectively, and a nomogram for predicting the risk of osteoporosis was constructed. We detected 378 significant differentially expressed genes using the limma package, representing 15 major biological pathways. The performance of the predictive models based on combined biomarkers (two or three genes) was superior to that of models based on a single gene. The best predictive gene set among two-gene sets included PLA2G2A and WRAP73. The best predictive gene set among three-gene sets included LPN1, PFDN6, and DOHH. Overall, we demonstrated the advantages of using combined versus single biomarkers for predicting the risk of osteoporosis. Further, the predictive nomogram constructed using combined biomarkers could be used by clinicians to identify high-risk individuals and in the design of efficient clinical trials to reduce the incidence of osteoporosis.

Keywords: combined biomarker; gene expression; machine learning; osteoporosis; risk prediction.

Conflict of interest statement

CONFLICTS OF INTEREST: Zhenlong Zheng, Xianglan Zhang, Bong-Kyeong Oh, and Ki-Yeol Kim declare that they have no conflicts of interest related to this work.

Figures

Figure 1
Study design. Data for duplicated genes in each gene expression dataset were averaged. The datasets were then merged based on gene name. Finally, osteoporosis-predictive genes were identified, as indicated. BMD: bone mineral density; GO: Gene Ontology; KEGG: Kyoto Encyclopedia of Genes and Genomes; ML: machine learning; HTML: Hypertext Markup Language format.
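The preprocessing described in this caption (averaging duplicated genes within a dataset, then merging datasets on gene name) can be sketched as follows. This is a minimal illustration in Python, not the authors' R code. The function names `average_duplicates` and `merge_by_gene`, the gene labels, and all numeric values are hypothetical, and keeping only genes present in every dataset is an assumption about how the merge was done.

```python
from statistics import mean


def average_duplicates(rows):
    """Collapse repeated gene names by averaging their expression vectors.

    rows: list of (gene_name, [expression value per sample]) pairs.
    """
    grouped = {}
    for gene, values in rows:
        grouped.setdefault(gene, []).append(values)
    # Average column-wise so each sample keeps its own value.
    return {
        gene: [mean(column) for column in zip(*vectors)]
        for gene, vectors in grouped.items()
    }


def merge_by_gene(datasets):
    """Concatenate samples across datasets for genes found in all of them.

    Assumption: an inner join on gene name; the source only says
    the datasets were "merged based on gene name".
    """
    shared = set(datasets[0])
    for dataset in datasets[1:]:
        shared &= set(dataset)
    return {
        gene: [value for dataset in datasets for value in dataset[gene]]
        for gene in sorted(shared)
    }


# Toy values invented for illustration only (not study data).
ds_a = average_duplicates([
    ("GENE1", [1.0, 3.0]),
    ("GENE1", [3.0, 5.0]),  # duplicated probe for the same gene
    ("GENE2", [2.0, 2.0]),
])
ds_b = average_duplicates([("GENE1", [4.0]), ("GENE3", [9.0])])
merged = merge_by_gene([ds_a, ds_b])
print(merged)
```

Only `GENE1` survives the merge here because it is the only name shared by both toy datasets; its merged row holds the two averaged samples from the first dataset followed by the single sample from the second.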
Figure 2
Gene expression patterns in the three datasets analyzed. (A) Gene expression pattern in the merged microarray dataset, which includes 6354 genes and data from 90 experiments. (B) Gene expression pattern of significant differentially expressed genes (n = 378) in high-BMD and low-BMD groups. The genes were identified using the limma package in R; among them, 191 genes were down-regulated and 187 genes were up-regulated.
Figure 3
Comparison of prediction accuracies of combinations of different numbers of genes. The gene sets of each size were selected from the 378 significant differentially expressed genes identified in the merged microarray dataset using the limma package. Vertical and horizontal axes represent the prediction accuracy and the number of genes considered in combination, respectively.
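The idea of scoring gene sets of a given size and comparing them with single genes can be sketched as an exhaustive search over combinations. This is a hedged illustration only: the abstract does not state which classifier or validation scheme was used, so the nearest-centroid classifier and leave-one-out accuracy below are assumptions, and the gene names `G1` to `G3`, the helper names, and all values are hypothetical toy data.

```python
from itertools import combinations


def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [sum(column) / len(column) for column in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def loo_accuracy(expr, labels, genes):
    """Leave-one-out accuracy of a nearest-centroid classifier.

    expr: {gene: [value per sample]}, labels: 0/1 per sample,
    genes: tuple of gene names used together as features.
    """
    n = len(labels)
    correct = 0
    for i in range(n):
        train = {0: [], 1: []}
        for j in range(n):
            if j != i:
                train[labels[j]].append([expr[g][j] for g in genes])
        point = [expr[g][i] for g in genes]
        pred = min(train, key=lambda c: sq_dist(point, centroid(train[c])))
        correct += pred == labels[i]
    return correct / n


def best_gene_set(expr, labels, k):
    """Return the k-gene combination with the highest leave-one-out accuracy."""
    return max(
        combinations(sorted(expr), k),
        key=lambda gene_set: loo_accuracy(expr, labels, gene_set),
    )


# Toy data: neither G1 nor G2 separates the groups alone, but together they do.
labels = [0, 0, 0, 1, 1, 1]
expr = {
    "G1": [1.0, 2.0, 3.0, 3.0, 4.0, 5.0],
    "G2": [3.0, 4.0, 5.0, 1.0, 2.0, 3.0],
    "G3": [1.0, 2.0, 1.0, 2.0, 1.0, 2.0],
}
single = loo_accuracy(expr, labels, ("G1",))
pair = loo_accuracy(expr, labels, ("G1", "G2"))
best_pair = best_gene_set(expr, labels, 2)
```

In this toy setup the pair `("G1", "G2")` classifies every held-out sample correctly while `G1` alone does not, which mirrors the kind of comparison the figure reports. Exhaustive search grows quickly with the number of candidate genes, so a real analysis over 378 genes would need to budget for that cost.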
Figure 4
Nomogram for predicting the probability of osteoporosis risk. (A) Identification of the probability of osteoporosis risk for an individual patient. (B) Practical use of the nomogram, available in Hypertext Markup Language (HTML) format.
