Predicting polycystic ovary syndrome with machine learning algorithms from electronic health records
- PMID: 38356959
- PMCID: PMC10866556
- DOI: 10.3389/fendo.2024.1298628
Abstract
Introduction: Predictive models have been used to aid early diagnosis of PCOS, though existing models are based on small sample sizes and limited to fertility clinic populations. We built a predictive model using machine learning algorithms based on an outpatient population at risk for PCOS to predict risk and facilitate earlier diagnosis, particularly among those who meet diagnostic criteria but have not received a diagnosis.
Methods: This is a retrospective cohort study using a safety-net hospital's electronic health records (EHR) from 2003 to 2016. The study population included 30,601 women aged 18-45 years without concurrent endocrinopathy who had any visit to Boston Medical Center for primary care, obstetrics and gynecology, endocrinology, family medicine, or general internal medicine. Four prediction outcomes were assessed for PCOS. The first outcome was a PCOS ICD-9 diagnosis; the additional model outcomes were algorithm-defined PCOS. The latter was based on the Rotterdam criteria and merged laboratory values, radiographic imaging, and ICD data from the EHR to define irregular menstruation, hyperandrogenism, and polycystic ovarian morphology on ultrasound.
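As an illustration of the algorithm-defined outcome, the Rotterdam criteria are conventionally read as requiring at least two of the three features named above. The sketch below assumes each feature has already been reduced to a boolean per patient; the abstract does not say how the authors derived those flags from laboratory, imaging, and ICD data, so the names `RotterdamFlags` and `meets_rotterdam` are hypothetical and this is not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class RotterdamFlags:
    """Hypothetical per-patient flags; how each is derived from the EHR is assumed."""
    irregular_menstruation: bool
    hyperandrogenism: bool
    polycystic_ovarian_morphology: bool


def meets_rotterdam(flags: RotterdamFlags) -> bool:
    """Return True when at least two of the three Rotterdam features are present."""
    criteria_met = sum([
        flags.irregular_menstruation,
        flags.hyperandrogenism,
        flags.polycystic_ovarian_morphology,
    ])
    return criteria_met >= 2


# Two features present: irregular cycles and hyperandrogenism, no ultrasound finding.
patient = RotterdamFlags(True, True, False)
print(meets_rotterdam(patient))
```

The rule presumes that other causes have already been excluded, which in this study corresponds to restricting the cohort to women without concurrent endocrinopathy.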
Results: We developed predictive models using four machine learning methods: logistic regression, support vector machine, gradient boosted trees, and random forests. Hormone values (follicle-stimulating hormone, luteinizing hormone, estradiol, and sex hormone binding globulin) were combined to create a multilayer perceptron score using a neural network classifier. Prediction of PCOS prior to clinical diagnosis in an out-of-sample test set of patients achieved an average AUC of 85%, 81%, 80%, and 82% in Models I, II, III, and IV, respectively. Significant positive predictors of PCOS diagnosis across models included hormone levels and obesity; negative predictors included gravidity and positive bHCG.
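To make the reported metric concrete, the AUC can be read as the probability that a randomly chosen patient with PCOS receives a higher predicted risk than a randomly chosen patient without it, with ties counted as one half. The sketch below computes that quantity directly on a small made-up test set; the labels, scores, and the function name `roc_auc` are illustrative assumptions, not study data or the authors' code.

```python
def roc_auc(labels, scores):
    """Pairwise (Mann-Whitney) AUC: fraction of positive/negative pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Made-up held-out set: 1 = PCOS, 0 = no PCOS, with predicted risk scores.
test_labels = [1, 0, 1, 0, 0]
test_scores = [0.90, 0.40, 0.35, 0.20, 0.10]
print(round(roc_auc(test_labels, test_scores), 3))  # 5 of 6 pairs ranked correctly -> 0.833
```

Under this reading, an AUC of 85% means that about 85 of every 100 such patient pairs in the test set were ranked in the right order.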
Conclusion: Machine learning algorithms were used to predict PCOS based on a large at-risk population. This approach may guide early detection of PCOS within EHR-interfaced populations to facilitate counseling and interventions that may reduce long-term health consequences. Our model illustrates the potential benefits of an artificial intelligence-enabled provider assistance tool that can be integrated into the EHR to reduce delays in diagnosis. However, model validation in other hospital-based populations is necessary.
Keywords: artificial intelligence; disease prediction; machine learning; polycystic ovary syndrome (PCOS); predictive model.
Copyright © 2024 Zad, Jiang, Wolf, Wang, Cheng, Paschalidis and Mahalingaiah.
Conflict of interest statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Update of
- Predicting polycystic ovary syndrome (PCOS) with machine learning algorithms from electronic health records. medRxiv [Preprint]. 2023 Oct 1:2023.07.27.23293255. doi: 10.1101/2023.07.27.23293255. Update in: Front Endocrinol (Lausanne). 2024 Jan 30;15:1298628. doi: 10.3389/fendo.2024.1298628. PMID: 37577593.