Embryologist agreement when assessing blastocyst implantation probability: is data-driven prediction the solution to embryo assessment subjectivity?
- PMID: 35944167
- DOI: 10.1093/humrep/deac171
Abstract
Study question: What is the accuracy and agreement of embryologists when assessing the implantation probability of blastocysts using time-lapse imaging (TLI), and can it be improved with a data-driven algorithm?
Summary answer: The overall interobserver agreement of a large panel of embryologists was moderate and prediction accuracy was modest, while the purpose-built artificial intelligence model generally resulted in higher performance metrics.
What is known already: Previous studies have demonstrated significant interobserver variability amongst embryologists when assessing embryo quality. However, data concerning embryologists' ability to predict implantation probability using TLI is still lacking. Emerging technologies based on data-driven tools have shown great promise for improving embryo selection and predicting clinical outcomes.
Study design, size, duration: TLI video files of 136 embryos with known implantation data were retrospectively collected from two clinical sites between 2018 and 2019 for the performance assessment of 36 embryologists and comparison with a deep neural network (DNN).
Participants/materials, setting, methods: We recruited 39 embryologists from 13 different countries. All participants were blinded to clinical outcomes. A total of 136 TLI videos of embryos that reached the blastocyst stage were used for this experiment. Each embryo's likelihood of successfully implanting was assessed by 36 embryologists, who provided implantation probability grades (IPGs) from 1 to 5, where 1 indicates a very low likelihood of implantation and 5 indicates a very high likelihood. Subsequently, three embryologists with over 5 years of experience provided Gardner scores. All 136 blastocysts were categorized into three quality groups based on their Gardner scores. Embryologist predictions were then converted into predictions of implantation (IPG ≥ 3) and no implantation (IPG ≤ 2). Embryologists' performance and agreement were assessed using the Fleiss kappa coefficient. A DNN was developed using 10-fold cross-validation to provide IPGs for TLI video files. The model's performance was compared to that of the embryologists.
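As an illustration of the agreement analysis described above, the following minimal sketch binarizes IPGs with the stated thresholds (IPG ≥ 3 versus IPG ≤ 2) and computes a Fleiss kappa over the resulting labels. The function names and the toy grades are hypothetical and are not taken from the study; the authors' actual software and data layout are not described in the abstract.

```python
from collections import Counter


def binarize_ipg(ipg):
    """Map an implantation probability grade (1-5) to 1 (implantation) or 0 (no implantation)."""
    if not 1 <= ipg <= 5:
        raise ValueError("IPG must be between 1 and 5")
    return 1 if ipg >= 3 else 0


def fleiss_kappa(ratings, categories=(0, 1)):
    """Fleiss kappa for a list of per-embryo label lists (one label per rater).

    Every embryo must be rated by the same number of raters.
    """
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    if any(len(row) != n_raters for row in ratings):
        raise ValueError("every embryo needs the same number of ratings")

    counts = [Counter(row) for row in ratings]

    # Observed agreement: fraction of agreeing rater pairs, averaged over embryos.
    per_subject = [
        (sum(c[k] ** 2 for k in categories) - n_raters) / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(per_subject) / n_subjects

    # Chance agreement from the overall share of each category.
    shares = [sum(c[k] for c in counts) / (n_subjects * n_raters) for k in categories]
    p_chance = sum(s * s for s in shares)

    if p_chance == 1.0:
        return float("nan")  # kappa is undefined when only one category is ever used
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical toy grades: 4 embryos, each graded by 3 raters (not study data).
toy_grades = [[5, 4, 4], [1, 2, 1], [3, 2, 4], [2, 2, 3]]
toy_labels = [[binarize_ipg(g) for g in row] for row in toy_grades]
toy_kappa = fleiss_kappa(toy_labels)
```

In this toy example two embryos receive unanimous labels and two receive split labels, so the kappa falls well below 1 even though most individual ratings agree.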
Main results and the role of chance: Logistic regression was used to test the following potential confounding variables: country of residence, academic level, embryo scoring system, log years of experience and experience using TLI. None were found to have a statistically significant impact on embryologist performance at α = 0.05. The average implantation prediction accuracy for the embryologists was 51.9% for all embryos (N = 136). The average accuracy of the embryologists when assessing top quality and poor quality embryos (according to the Gardner score categorizations) was 57.5% and 57.4%, respectively, and 44.6% for fair quality embryos. Overall interobserver agreement was moderate (κ = 0.56, N = 136). The best agreement was achieved in the poor + top quality group (κ = 0.65, N = 77), while the agreement in the fair quality group was lower (κ = 0.25, N = 59). The DNN showed an overall accuracy rate of 62.5%, with accuracies of 62.2%, 61% and 65.6% for the poor, fair and top quality groups, respectively. The AUC for the DNN was higher than that of the embryologists overall (0.70 DNN vs 0.61 embryologists) as well as in all of the Gardner groups (DNN vs embryologists: Poor, 0.69 vs 0.62; Fair, 0.67 vs 0.53; Top, 0.77 vs 0.54).
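The two performance metrics reported above, accuracy and AUC, can be sketched as follows. This is a generic illustration, not the study's implementation: the abstract does not state how scores were aggregated for the AUC, so the use of a per-embryo score, the function names and the toy values are all assumptions.

```python
def accuracy(predicted, actual):
    """Fraction of binary predictions that match the known outcomes."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def auc(scores, outcomes):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen implanted embryo has a
    higher score than a randomly chosen non-implanted one; ties count as 0.5.
    """
    positives = [s for s, o in zip(scores, outcomes) if o == 1]
    negatives = [s for s, o in zip(scores, outcomes) if o == 0]
    if not positives or not negatives:
        raise ValueError("both outcome classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical toy values (not study data): one score per embryo on the 1-5 IPG scale.
toy_scores = [4.2, 3.1, 2.5, 1.8, 3.6]
toy_outcomes = [1, 0, 1, 0, 0]  # 1 = implanted, 0 = did not implant

# Same threshold as in the abstract: a score of 3 or more predicts implantation.
toy_predictions = [1 if s >= 3 else 0 for s in toy_scores]
toy_accuracy = accuracy(toy_predictions, toy_outcomes)
toy_auc = auc(toy_scores, toy_outcomes)
```

Accuracy depends on the chosen threshold, whereas the AUC summarizes ranking quality across all thresholds, which is one reason the two metrics can diverge, as they do in the toy data.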
Limitations, reasons for caution: Blastocyst assessment was performed using video files acquired from time-lapse incubators, where each video contained data from a single focal plane. Clinical data regarding the underlying cause of infertility and endometrial thickness before the transfer were not available, yet these factors may explain implantation failure and lower accuracy of IPGs. Implantation was defined as the presence of a gestational sac, whereas the detection of fetal heartbeat is a more robust marker of embryo viability. The raw data were anonymized to the extent that it was not possible to quantify the number of unique patients and cycles included in the study, potentially masking the effect of bias from a limited patient pool. Furthermore, the lack of demographic data makes it difficult to draw conclusions on how representative the dataset was of the wider population. Finally, embryologists were required to assess the implantation potential, not embryo quality. Although this is not the traditional approach to embryo evaluation, morphology/morphokinetics as a means of assessing embryo quality is believed to be strongly correlated with viability and, for some methods, implantation potential.
Wider implications of the findings: Embryo selection is a key element in IVF success and continues to be a challenge. Improving the predictive ability could assist in optimizing implantation success rates and other clinical outcomes and could minimize the financial and emotional burden on the patient. This study demonstrates moderate agreement rates between embryologists, likely due to the subjective nature of embryo assessment. In particular, we found that average embryologist accuracy and agreement were significantly lower for fair quality embryos than for top and poor quality embryos. Using data-driven algorithms as an assistive tool may help IVF professionals increase success rates and promote much-needed standardization in the IVF clinic. Our results indicate a need for further research regarding technological advancement in this field.
Study funding/competing interest(s): Embryonics Ltd is an Israel-based company. Funding for the study was partially provided by the Israeli Innovation Authority, grant #74556.
Trial registration number: N/A.
Keywords: IVF/ICSI outcome; artificial intelligence; assisted reproduction; deep learning; embryo quality; embryologist agreement; embryology; implantation.
© The Author(s) 2022. Published by Oxford University Press on behalf of European Society of Human Reproduction and Embryology. All rights reserved. For permissions, please email: journals.permissions@oup.com.