J Pers Med. 2022 Mar 31;12(4):550.
doi: 10.3390/jpm12040550.

Development and Evaluation of a Machine Learning Prediction Model for Small-for-Gestational-Age Births in Women Exposed to Radiation before Pregnancy

Xi Bai et al. J Pers Med.

Abstract

Exposure to radiation has been associated with an increased risk of delivering small-for-gestational-age (SGA) newborns. There are no tools to predict SGA newborns in pregnant women exposed to radiation before pregnancy. Here, we aimed to develop an array of machine learning (ML) models to predict SGA newborns in women exposed to radiation before pregnancy. Patient data were obtained from the National Free Preconception Health Examination Project from 2010 to 2012. The data were randomly divided into a training dataset (n = 364) and a testing dataset (n = 91). Eight ML models were compared on the binary classification task of SGA prediction, followed by a post hoc explainability analysis based on Shapley Additive Explanation (SHAP) values to identify and interpret the features that contribute most to the prediction outcome. A total of 455 newborns were included, of whom 60 (13.2%) were SGA births. Overall, the model obtained by extreme gradient boosting (XGBoost) achieved the highest area under the receiver-operating-characteristic curve (AUC) in the testing set (0.844, 95% confidence interval (CI): 0.713-0.974). All models showed satisfactory AUCs, except for the logistic regression model (AUC: 0.561, 95% CI: 0.355-0.768). After feature selection by recursive feature elimination (RFE), 15 features were included in the final prediction model using the XGBoost algorithm, with an AUC of 0.821 (95% CI: 0.650-0.993). ML algorithms can generate robust models to predict SGA newborns in pregnant women exposed to radiation before pregnancy, and may thus be used as a prediction tool for SGA newborns in high-risk pregnant women.
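The models above are ranked by AUC on the testing set, with a 95% CI attached to each estimate. As a minimal sketch of what those quantities measure, the following computes AUC as the probability that a randomly chosen positive (SGA) case receives a higher score than a randomly chosen negative case, and attaches a normal-approximation interval using the Hanley-McNeil standard error. The abstract does not state how the authors computed their intervals, so the choice of Hanley-McNeil, the function names, and the toy labels and scores are all assumptions made for illustration.

```python
import math


def auc_score(labels, scores):
    """AUC as P(positive score > negative score); ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def hanley_mcneil_ci(auc, n_pos, n_neg, z=1.96):
    """Normal-approximation CI for an AUC (Hanley-McNeil standard error).

    Assumption: this is one common choice, not necessarily the method
    used in the paper.
    """
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc * auc / (1.0 + auc)
    var = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc * auc)
        + (n_neg - 1) * (q2 - auc * auc)
    ) / (n_pos * n_neg)
    se = math.sqrt(max(var, 0.0))
    # Clip to [0, 1] because an AUC cannot fall outside that range.
    return max(0.0, auc - z * se), min(1.0, auc + z * se)


# Toy data (hypothetical, not from the study): 1 = SGA, 0 = non-SGA.
labels = [1, 1, 0, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.8, 0.6, 0.4, 0.3, 0.2, 0.1]

auc = auc_score(labels, scores)
low, high = hanley_mcneil_ci(auc, labels.count(1), labels.count(0))
print(f"AUC = {auc:.3f}, 95% CI: {low:.3f}-{high:.3f}")
```

Under this approximation the interval width depends strongly on the number of positive cases, which is one plausible reason the intervals reported for a 91-newborn testing set are wide.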

Keywords: exposure to radiation; machine learning; prediction; small for gestational age.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
A flow chart of the methods used for data extraction, training, and testing. NFPHEP = National Free Preconception Health Examination Project, LR = logistic regression, RF = random forest, GBDT = gradient boosting decision tree, LGBM = light gradient boosting machine, XGBoost = extreme gradient boosting, CatBoost = category boosting, SVM = support vector machine, MLP = multi-layer perceptron, RFE = recursive feature elimination, SHAP = Shapley Additive Explanation.
Figure 2
Receiver operating characteristic (ROC) curves of the eight machine learning (ML) models in predicting small for gestational age (SGA) in the testing dataset. LR = logistic regression, RF = random forest, GBDT = gradient boosting decision tree, LGBM = light gradient boosting machine, XGB = extreme gradient boosting, CB = category boosting, MLP = multi-layer perceptron, SVM = support vector machine.
Figure 3
Receiver operating characteristic (ROC) curves of the final machine learning (ML) model generated after recursive feature elimination (RFE) in predicting small for gestational age (SGA).
Figure 4
The Shapley Additive Explanation (SHAP) values for the most important predictors of small for gestational age (SGA) in the final model. ALT = alanine aminotransferase, PLT = platelet count, BMI = body mass index, Cr = creatinine. Each row represents a feature, and the abscissa is the SHAP value, which represents the degree of influence on the outcome. Each dot represents a sample. Dots are colored red if the value of the feature is high and blue if it is low.
Figure 5
Newborns correctly classified as non-small-for-gestational-age (A) and small-for-gestational-age (B).
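
The SHAP values plotted in Figure 4 build on Shapley values, which attribute a prediction to individual features by averaging each feature's marginal contribution over every order in which the features could be added. The sketch below computes exact Shapley values for a toy three-feature scoring function. The function `toy_model`, the feature names `a`, `b`, `c`, and the all-zero baseline are hypothetical and unrelated to the authors' XGBoost model or their clinical features. Exact enumeration is only feasible for a handful of features, so practical SHAP tooling relies on approximations or model-specific algorithms.

```python
from itertools import permutations


def shapley_values(predict, x, baseline):
    """Exact Shapley values by enumerating all feature orderings.

    Features not yet "added" keep their baseline value; each feature is
    credited with the change in prediction when it switches to x's value.
    """
    names = list(x)
    totals = {name: 0.0 for name in names}
    orders = list(permutations(names))
    for order in orders:
        current = dict(baseline)
        prev = predict(current)
        for name in order:
            current[name] = x[name]
            new = predict(current)
            totals[name] += new - prev
            prev = new
    return {name: total / len(orders) for name, total in totals.items()}


def toy_model(f):
    # Hypothetical score: one additive term plus one interaction term.
    return 2 * f["a"] + f["b"] * f["c"]


x = {"a": 1, "b": 2, "c": 3}
baseline = {"a": 0, "b": 0, "c": 0}

phi = shapley_values(toy_model, x, baseline)
print(phi)
```

In this toy case the additive feature `a` receives its full effect, while the interaction between `b` and `c` is split equally between them, and the attributions sum to the difference between the prediction for `x` and the prediction for the baseline.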
