Sci Rep. 2025 Jul 2;15(1):22631.
doi: 10.1038/s41598-025-06998-4.

Development and validation of machine learning models for predicting blastocyst yield in IVF cycles

Wen-Jie Huo et al. Sci Rep.

Abstract

Predicting blastocyst formation poses significant challenges in reproductive medicine and critically influences clinical decision-making regarding extended embryo culture. While previous research has primarily focused on determining whether an IVF cycle can produce at least one blastocyst, less attention has been given to quantifying blastocyst yields. This study aims to develop and validate such a quantitative predictive tool for IVF cycles. We employed three machine learning models (SVM, LightGBM, and XGBoost), which demonstrated comparable performance and outperformed traditional linear regression models (R²: 0.673–0.676 vs. 0.587; mean absolute error: 0.793–0.809 vs. 0.943). Ultimately, LightGBM emerged as the optimal model because it used fewer features (8 vs. 10–11 in SVM/XGBoost) and offered superior interpretability. We then stratified predictions and actual yields into three categories (0, 1–2, and ≥ 3 blastocysts) to evaluate the model's discriminative performance. In this multi-class classification task, LightGBM demonstrated robust accuracy (0.675–0.71) with fair-to-moderate agreement (kappa coefficients: 0.365–0.5) across both the overall cohort and poor-prognosis subgroups. Feature importance analysis identified three critical predictors: the number of extended culture embryos, the mean cell number on Day 3, and the proportion of 8-cell embryos. By leveraging the potential of machine learning, this research provides clinicians with valuable insights for making individualized decisions regarding extended embryo culture.
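The evaluation metrics named in the abstract (R², mean absolute error, three-category accuracy, and the kappa coefficient) can be illustrated with a minimal Python sketch. This is not the authors' code: the yield values below are invented for illustration, the function names are hypothetical, and rounding continuous predictions to the nearest integer before binning is an assumption, since the abstract does not state how predictions were mapped to categories.

```python
from collections import Counter


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between actual and predicted yields."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def yield_category(n_blastocysts):
    """Map a yield to the abstract's three bins: 0, 1-2, >=3.

    Assumption: continuous predictions are rounded to the nearest
    integer first (Python rounds exact .5 values to the even integer).
    """
    n = round(n_blastocysts)
    if n <= 0:
        return "0"
    if n <= 2:
        return "1-2"
    return ">=3"


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement between two labelings beyond chance."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (observed - expected) / (1.0 - expected)


# Invented example data, not taken from the study.
actual = [0, 1, 2, 3, 5, 0, 2, 4]
predicted = [0.4, 1.2, 1.8, 2.6, 4.1, 1.1, 2.7, 3.3]

actual_bins = [yield_category(v) for v in actual]
predicted_bins = [yield_category(v) for v in predicted]
accuracy = sum(a == p for a, p in zip(actual_bins, predicted_bins)) / len(actual)

print("R2:", r2_score(actual, predicted))
print("MAE:", mean_absolute_error(actual, predicted))
print("Accuracy:", accuracy)
print("Kappa:", cohen_kappa(actual_bins, predicted_bins))
```

Kappa is reported alongside accuracy because the yield categories are imbalanced: a model can reach high raw accuracy by favoring the most common category, whereas kappa discounts agreement expected by chance.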

Keywords: Blastocyst yield; Clinical decision support; Extended embryo culture; In vitro fertilization; Machine learning.

Conflict of interest statement

Declarations. Ethics approval and consent to participate: Ethical approval for the study was obtained from the Institutional Review Board of Nanfang Hospital, as authorized by the Ethical Committee (approval number: NFEC-2024-326). The procedures followed were in accordance with the ethical standards of the Declaration of Helsinki of the World Medical Association. The Ethical Committee of Nanfang Hospital waived the need for obtaining informed consent from the participants. Competing interests: The authors declare no competing interests.

Figures

Fig. 1
Performance comparison of machine learning models using recursive feature elimination (RFE). The figure illustrates the impact of RFE on model performance across four machine learning algorithms: Light Gradient Boosting Machine (LightGBM), Linear Regression (LR), Support Vector Machine (SVM), and Extreme Gradient Boosting (XGBoost). Features are systematically eliminated from 21 down to 2. The top panel presents the test R² (coefficient of determination), where higher values indicate better model fit, while the bottom panel displays the test Mean Absolute Error (MAE), where lower values represent better prediction accuracy.
Fig. 2
Distribution of predicted versus actual blastocyst yields in bins across the overall cohort and subgroups. The confusion matrices visualized as bar plots show the relationship between predicted and actual blastocyst yields (0, 1–2, and ≥ 3) for the overall cohort and three clinical subgroups. The plots illustrate the class imbalance and prediction patterns across different clinical scenarios, with notably skewed distributions in adverse subgroups.
Fig. 3
Feature importance and partial dependence analysis using LightGBM. (A) The bar plot reveals the relative importance of features in the LightGBM model, with values quantifying each feature’s proportional contribution to the model’s predictive performance. (B) Individual conditional expectation and partial dependence plots illustrate the nuanced effects of the top six features on blastocyst yields. Thirty gray lines track the prediction trajectories of 30 samples, illustrating how predictions dynamically shift as a specific feature varies while other features remain constant. The red line delineates the mean effect across all samples, providing a comprehensive view of each feature’s impact on model predictions.
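The individual conditional expectation (ICE) and partial dependence curves described for Fig. 3B can be sketched in a few lines of Python. This is an illustrative sketch only: `toy_predict` is a hypothetical stand-in for a fitted model (it is not the study's LightGBM model), the two feature columns and all sample values are invented, and the helper names `ice_curves` and `partial_dependence` are assumptions.

```python
def ice_curves(predict, samples, feature_index, grid):
    """One curve per sample: vary a single feature over `grid`
    while holding that sample's other features constant."""
    curves = []
    for row in samples:
        curve = []
        for value in grid:
            modified = list(row)  # copy so the original sample is untouched
            modified[feature_index] = value
            curve.append(predict(modified))
        curves.append(curve)
    return curves


def partial_dependence(curves):
    """Mean of the ICE curves at each grid point (the 'red line')."""
    n = len(curves)
    return [sum(points) / n for points in zip(*curves)]


def toy_predict(row):
    """Hypothetical model, for illustration only.

    row[0]: number of extended culture embryos
    row[1]: mean cell number on Day 3
    """
    n_embryos, mean_cells = row
    return 0.3 * n_embryos * min(mean_cells, 8) / 8


# Invented samples: [extended culture embryos, mean Day 3 cell number].
samples = [[4, 6.0], [6, 8.0], [10, 7.0]]
grid = [4, 6, 8]  # values of the mean Day 3 cell number to sweep

curves = ice_curves(toy_predict, samples, feature_index=1, grid=grid)
pd_curve = partial_dependence(curves)

for row, curve in zip(samples, curves):
    print("ICE for sample", row, "->", curve)
print("Partial dependence:", pd_curve)
```

Plotting each ICE curve in gray and the averaged curve in red would reproduce the layout the caption describes; the spread among the gray lines shows how strongly the effect of one feature depends on the values of the others.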
