PLoS One. 2025 Jan 7;20(1):e0313494.
doi: 10.1371/journal.pone.0313494. eCollection 2025.

Screening of serum biomarkers in patients with PCOS through lipid omics and ensemble machine learning


Ji-Ying Chen et al. PLoS One.

Abstract

Polycystic ovary syndrome (PCOS) is a primary endocrine disorder of premenopausal women that involves metabolic dysregulation. We aimed to screen serum biomarkers in PCOS patients using untargeted lipidomics and ensemble machine learning. Serum samples from PCOS patients and non-PCOS subjects were collected for untargeted lipidomics analysis. We analyzed the classification of differential lipid metabolites and the association between these metabolites and clinical indexes. The untargeted lipidomics data were then processed through an ensemble machine learning workflow: data preprocessing, pre-screening by statistical tests, secondary screening by ensemble learning, biomarker verification and evaluation, and construction and verification of a diagnostic panel model. The results indicated that the differential lipid metabolites not only differed between groups but were also closely associated with the corresponding clinical indexes. PI (18:0/20:3)-H and PE (18:1p/22:6)-H were identified as candidate biomarkers. Three machine learning models (logistic regression, random forest, and support vector machine) indicated that the screened biomarkers had good classification ability. In addition, the correlation between the candidate biomarkers was low, indicating little overlap between the selected biomarkers and a better-optimized panel combination. On the test set, the constructed diagnostic panel model reached an AUC of 0.815, with an accuracy of 0.74, a specificity of 0.88, and a sensitivity of 0.7. This study demonstrated the applicability and robustness of machine learning algorithms for analyzing lipid metabolism data for efficient and reliable biomarker screening. PI (18:0/20:3)-H and PE (18:1p/22:6)-H showed great potential in diagnosing PCOS.
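The abstract reports AUC, accuracy, specificity, and sensitivity for the test set. As a minimal sketch of how these four quantities are defined, the Python below computes them from binary labels and model scores. The toy labels, scores, and the 0.5 cutoff are illustrative assumptions and are not data from the study; the function names are hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity from binary labels (1 = PCOS, 0 = control)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


def auc_from_scores(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation of the ROC AUC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (assumed values, not study data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.35, 0.7, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 cutoff is an assumption

print(classification_metrics(y_true, y_pred))
print(auc_from_scores(y_true, scores))
```

Note that AUC is independent of the cutoff, whereas accuracy, sensitivity, and specificity all depend on it, which is why a single AUC can accompany different sensitivity and specificity trade-offs.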

Conflict of interest statement

The authors declare no conflicts of interest.

Figures

Fig 1. Classification of differential lipid metabolites.
(A) Distribution of lipid metabolites across classifications. The abscissa represents the classification, and the ordinate represents the number of metabolites. (B) OPLS-DA model of Control and PCOS samples. (C) Volcano plot comparing lipid metabolites between the Control and PCOS groups. The red dots represent metabolites up-regulated in the Control group, while the blue dots represent metabolites down-regulated in the Control group. Dot size scales with the adjusted P value. The abscissa represents log2 (fold change), and the ordinate represents -log10 (P value). (D) Top 20 metabolites by VIP value in the OPLS-DA model. (E) Classification of differential lipid metabolites.
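The volcano plot axes in panel (C) can be sketched as follows. This is an illustration of the two transforms named in the caption, log2 (fold change) and -log10 (P value), not the authors' code. The direction of the ratio (which group is the numerator) and the example values are assumptions.

```python
import math


def volcano_point(mean_numerator, mean_denominator, p_value):
    """Return (log2 fold change, -log10 P) for one metabolite.

    Which group serves as the numerator is an assumption here;
    the caption does not state the direction of the ratio.
    """
    log2_fc = math.log2(mean_numerator / mean_denominator)
    neg_log10_p = -math.log10(p_value)
    return log2_fc, neg_log10_p


# Hypothetical metabolite: fourfold higher mean intensity, P = 0.001.
x, y = volcano_point(8.0, 2.0, 0.001)
print(x, y)  # approximately 2.0 and 3.0
```

A fold change of 1 maps to 0 on the abscissa, so up- and down-regulated metabolites fall symmetrically on either side, and smaller P values sit higher on the ordinate.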
Fig 2. Associations between differential lipid metabolites and clinical indexes.
(A) Associations between differential metabolite classification and clinical indexes. (B, C) The 620POS and 628POS metabolites are significantly associated with clinical indexes (BMI, LH, T, and LH/FSH). (D, E) The 112POS and 576POS metabolites are significantly associated with clinical indexes (T and LH/FSH).
Fig 3. Screening of candidate biomarkers for PCOS.
(A) Ranking of candidate biomarkers by weight value. Abscissa: weight value; ordinate: substance name; black: biological macromolecules that did not pass the screening criteria; red: candidate biomarkers. (B) Cumulative AUC trend plot of the candidate biomarker classification model. Abscissa: candidate biomarkers; ordinate: cumulative AUC value.
Fig 4. Validation of biomarkers for PCOS.
(A) Boxplots of AUC, specificity, and sensitivity for the three single-model assessments. Abscissa: AUC, specificity, and sensitivity scores (the closer to 1, the better); ordinate: the three algorithm models used (LR, logistic regression; SVM, support vector machine; RF, random forest). (B) ROC plots for the three single-model evaluations. Abscissa: false positive rate (1 - specificity); ordinate: true positive rate (sensitivity); green curve: random forest ROC curve; blue curve: support vector machine ROC curve; red curve: logistic regression ROC curve; the AUC value of each model curve is shown in the lower right corner. (C) Accuracy, sensitivity, and specificity curves of the three single-model evaluations. The three graphs show the accuracy, true positive rate, and true negative rate curves of the LR, SVM, and RF models. Abscissa: cutoff value; ordinate: percentage; red curve: accuracy; green curve: sensitivity; blue curve: specificity.
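Panels (B) and (C) both come from sweeping a cutoff over the model scores: each cutoff yields one accuracy, sensitivity, and specificity value, and the pair (1 - specificity, sensitivity) is one point on the ROC curve. The sketch below shows that sweep under stated assumptions: the labels, scores, and cutoff grid are toy values, and `sweep_cutoffs` is a hypothetical helper, not the study's implementation.

```python
def sweep_cutoffs(y_true, scores, cutoffs):
    """Compute accuracy, sensitivity, and specificity at each cutoff.

    A sample is called positive when its score is >= the cutoff.
    Assumes y_true contains at least one positive (1) and one negative (0).
    """
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = len(y_true) - n_pos
    rows = []
    for c in cutoffs:
        tp = sum(1 for t, s in zip(y_true, scores) if t == 1 and s >= c)
        tn = sum(1 for t, s in zip(y_true, scores) if t == 0 and s < c)
        rows.append({
            "cutoff": c,
            "accuracy": (tp + tn) / len(y_true),
            "sensitivity": tp / n_pos,
            "specificity": tn / n_neg,
        })
    return rows


# Toy example (assumed values, not study data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.35, 0.7, 0.3, 0.2, 0.1]
rows = sweep_cutoffs(y_true, scores, [0.0, 0.25, 0.5, 0.75, 1.0])

# Each row also gives one ROC point: (false positive rate, true positive rate).
roc_points = [(1 - r["specificity"], r["sensitivity"]) for r in rows]
for r in rows:
    print(r)
```

Raising the cutoff trades sensitivity for specificity, which is the crossing pattern the green and blue curves in panel (C) describe.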
Fig 5. Characterization of biomarkers for PCOS.
(A) Biomarker importance calculated by random forest. Abscissa: importance; ordinate: substance name. (B) Boxplots of the expression levels of PI (18:0/20:3)-H and PE (18:1p/22:6)-H in each sample. (C) Correlation analysis plot. Abscissa and ordinate: substance name; circle: correlation coefficient (the stronger the correlation, the larger the area and the darker the color); color: red is a positive correlation, blue is a negative correlation.
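The correlation analysis in panel (C) underlies the claim that the two candidate biomarkers overlap little. The record does not state which correlation coefficient was used; the sketch below assumes the Pearson coefficient, and the intensity values are made up for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical intensities for two biomarkers across four samples.
marker_a = [1.0, 2.0, 3.0, 4.0]
marker_b = [1.0, -1.0, -1.0, 1.0]
print(pearson_r(marker_a, marker_b))  # close to 0: the markers carry different information
```

A coefficient near 0 means one marker says little about the other, so combining them in a panel adds information; a coefficient near 1 or -1 would mean the second marker is largely redundant.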
Fig 6. Construction and ability evaluation of PCOS diagnosis panel model.
(A) Cutoff trend chart. (B) ROC curve.
