BMC Pregnancy Childbirth. 2024 Nov 9;24(1):738.
doi: 10.1186/s12884-024-06942-w.

Building a machine learning-based risk prediction model for second-trimester miscarriage

Sangsang Qi et al. BMC Pregnancy Childbirth. 2024.

Abstract

Background: Second-trimester miscarriage is a common adverse pregnancy outcome that places a substantial economic and psychological burden on patients and their families, affecting both physical and mental well-being. Research on predictive models for the risk of second-trimester miscarriage is currently scarce.

Methods: Clinical data were retrospectively collected from patients in the second trimester of pregnancy (14+0 to 27+6 weeks of gestation) whose main diagnosis was "threatened abortion" and who were hospitalized at the Women and Children's Hospital of Ningbo University between January 2020 and October 2023. After preliminary data processing, the patient cohort was randomly divided, with stratification, into a training cohort (70%) and a validation cohort (30%). The Boruta algorithm and multifactor analysis were used to refine the candidate features and identify those most strongly linked to second-trimester miscarriage. Class imbalance in the training cohort was corrected with SMOTE oversampling. Seven machine learning models were built, and their predictive performance was validated and compared in order to select the optimal model. Shapley additive explanations (SHAP) were generated to interpret the model's predictions, and a visual representation of the predictive model was built.
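The split and oversampling steps described in the Methods can be illustrated with a minimal standard-library sketch. It is not the authors' code: the toy data, the function names `stratified_split` and `smote`, the neighbour count `k=5`, and the seeds are all assumptions made for illustration. It shows a 70/30 split that preserves the class ratio, followed by SMOTE-style interpolation applied to the training portion only.

```python
import random

def stratified_split(labels, train_frac=0.7, seed=0):
    """Return (train_idx, valid_idx), keeping the class ratio in both parts."""
    rng = random.Random(seed)
    train, valid = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(train_frac * len(idx))
        train += idx[:cut]
        valid += idx[cut:]
    return sorted(train), sorted(valid)

def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        # Sort the remaining minority samples by squared Euclidean distance.
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Hypothetical toy data: 100 samples, 20% positive, two numeric features.
data_rng = random.Random(42)
labels = [1] * 20 + [0] * 80
features = [[data_rng.gauss(y, 1.0), data_rng.gauss(-y, 1.0)] for y in labels]

train_idx, valid_idx = stratified_split(labels)
train_pos = [features[i] for i in train_idx if labels[i] == 1]
train_neg = [features[i] for i in train_idx if labels[i] == 0]

# Oversample the minority class in the training cohort until classes balance.
new_pos = smote(train_pos, n_new=len(train_neg) - len(train_pos))
balanced_pos = train_pos + new_pos
```

Oversampling only the training portion leaves the validation cohort at its natural class ratio, so validation metrics are computed on unaltered data.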

Results: A total of 2006 patients were included in the study, of whom 395 (19.69%) had a second-trimester miscarriage. Seven models were compared on metrics such as accuracy, precision, recall, the F1 score, precision-recall average precision, the area under the receiver operating characteristic curve, decision curve analysis, and the calibration curve, and XGBoost proved to be the optimal model. The SHAP technique was used to rank features by relevance and identify the top ten features associated with second-trimester miscarriage; the most significant feature was cervical length.
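For readers who want the definitions behind several of the metrics named in the Results, the following is a small standard-library sketch. The labels, scores, and the 0.5 threshold are invented for illustration and are not data from the study; only the 395 of 2006 prevalence figure comes from the abstract. The function names are hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 from binary labels and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative
    (ties count as half), which equals the area under the ROC curve."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Event rate reported in the abstract: 395 miscarriages among 2006 patients.
prevalence = 395 / 2006

# Hypothetical predicted risks for ten patients (three true events).
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

With an event rate near 20%, accuracy alone can look favourable for a model that rarely predicts the event, which is one reason to report precision, recall, and curve-based metrics alongside it.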

Conclusion: The visual risk prediction model, built with the machine learning approach described above, can accurately predict the risk of second-trimester miscarriage.

Keywords: Machine learning; Prediction models; Second-trimester miscarriage.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1
Flowchart of the predictive model
Fig. 2
a Percentage of missing data in each variable. b Missing values, shown as red squares. STM: Second-trimester miscarriage; HPB: History of preterm birth; STMH: Second-trimester miscarriage history; NIP: Number of intrauterine procedures; SCH: Subchorionic haematoma; WBC: White blood cell count; NEU%: Neutrophil percentage; CRP: C-reactive protein; BMI: Body mass index; VP: Vaginitis during pregnancy; AP: Assisted pregnancy; MP: Multiple pregnancies; PA: Placental abnormalities; AAV: Abnormal amniotic fluid volume; UA: Uterine abnormalities; NC: Neoplasms of cervix; ACS: After cervical surgery
Fig. 3
STM: Second-trimester miscarriage; HPB: History of preterm birth; STMH: Second-trimester miscarriage history; NIP: Number of intrauterine procedures; SCH: Subchorionic haematoma; WBC: White blood cell count; NEU%: Neutrophil percentage; CRP: C-reactive protein; BMI: Body mass index; VP: Vaginitis during pregnancy; AP: Assisted pregnancy; MP: Multiple pregnancies; PA: Placental abnormalities; AAV: Abnormal amniotic fluid volume; UA: Uterine abnormalities; NC: Neoplasms of cervix; ACS: After cervical surgery
Fig. 4
Green box plots mark predictors confirmed as important, red box plots mark rejected predictors, and yellow box plots mark tentative predictors. STM: Second-trimester miscarriage; HPB: History of preterm birth; STMH: Second-trimester miscarriage history; NIP: Number of intrauterine procedures; SCH: Subchorionic haematoma; WBC: White blood cell count; NEU%: Neutrophil percentage; CRP: C-reactive protein; BMI: Body mass index; VP: Vaginitis during pregnancy; AP: Assisted pregnancy; MP: Multiple pregnancies; PA: Placental abnormalities; AAV: Abnormal amniotic fluid volume; UA: Uterine abnormalities; NC: Neoplasms of cervix; ACS: After cervical surgery
Fig. 5
WBC: White blood cell count; CRP: C-reactive protein; >MF: more than menstrual flow
Fig. 6
Receiver operating characteristic curves and precision-recall curves. a-b Receiver operating characteristic curves in the training cohort and validation cohort. c-d Precision-recall curves in the training cohort and validation cohort. LR: Logistic Regression; KNN: K-Nearest Neighbors; SVM: Support Vector Machine; DT: Decision Tree; RF: Random Forest; XGBoost: eXtreme Gradient Boosting; ANN: Artificial Neural Network
Fig. 7
Radar map for comparative performance analysis of machine learning models. LR: Logistic Regression; KNN: K-Nearest Neighbors; SVM: Support Vector Machine; DT: Decision Tree; RF: Random Forest; XGBoost: eXtreme Gradient Boosting; ANN: Artificial Neural Network; ROC-AUC: Area under the receiver operating characteristic curve; PR-AP: Precision-Recall Average Precision
Fig. 8
Decision curve analysis and calibration plots. a-b Decision curve analysis in the training cohort and validation cohort. c-d Calibration plots in the training cohort and validation cohort. LR: Logistic Regression; KNN: K-Nearest Neighbors; SVM: Support Vector Machine; DT: Decision Tree; RF: Random Forest; XGBoost: eXtreme Gradient Boosting; ANN: Artificial Neural Network
Fig. 9
a Shapley additive explanations (SHAP). Positive and negative SHAP values of a feature indicate the degree to which the feature increases or decreases the predicted value, respectively. b SHAP feature importance matrix. Each bar represents the contribution of a feature to a particular prediction. c SHAP dependence plots showing the predicted risk versus the feature value. These plots reveal the relationship between features and predictions, as well as the impact of different ranges of feature values on predictions. d, e SHAP explanations for two typical predictions. SCH: subchorionic haematoma; WBC: white blood cell count; NEU%: neutrophil percentage; CRP: C-reactive protein; <MF: less than menstrual flow; >MF: more than menstrual flow; PA: placental abnormalities; AD: abnormal placental development; PP: placenta previa
Fig. 10
A web-based risk assessment tool for second-trimester miscarriage
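The SHAP values shown in Fig. 9 rest on the Shapley value from cooperative game theory. As a rough illustration of that idea, the sketch below computes exact Shapley values for a tiny hand-written risk function by averaging each feature's marginal contribution over all feature orderings. The function `toy_risk`, its coefficients, and the feature names are hypothetical and are not the study's model; brute-force enumeration like this is only feasible for a handful of features and is not how the authors computed their values.

```python
from itertools import permutations

def shapley_values(model, x, baseline):
    """Exact Shapley values: average marginal contribution of each feature
    over every ordering in which features switch from baseline to x."""
    names = list(x)
    totals = {name: 0.0 for name in names}
    orders = list(permutations(names))
    for order in orders:
        current = dict(baseline)
        previous = model(current)
        for name in order:
            current[name] = x[name]
            updated = model(current)
            totals[name] += updated - previous
            previous = updated
    return {name: total / len(orders) for name, total in totals.items()}

def toy_risk(f):
    """Hypothetical risk score with one interaction term (illustration only)."""
    return (0.10
            + 0.30 * f["short_cervix"]
            + 0.10 * f["high_crp"]
            + 0.20 * f["short_cervix"] * f["high_crp"]
            + 0.05 * f["sch"])

patient = {"short_cervix": 1, "high_crp": 1, "sch": 1}
reference = {"short_cervix": 0, "high_crp": 0, "sch": 0}

phi = shapley_values(toy_risk, patient, reference)

# Efficiency property: contributions sum to the gap between the patient's
# prediction and the reference prediction.
gap = toy_risk(patient) - toy_risk(reference)
```

In this toy case the interaction term is shared equally between the two features involved, so `short_cervix` receives 0.40, `high_crp` receives 0.20, and `sch` receives 0.05.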

