EBioMedicine. 2024 Apr:102:105064.
doi: 10.1016/j.ebiom.2024.105064. Epub 2024 Mar 20.

Detection of endometrial cancer in cervico-vaginal fluid and blood plasma: leveraging proteomics and machine learning for biomarker discovery

Kelechi Njoku et al. EBioMedicine. 2024 Apr.

Abstract

Background: The anatomical continuity between the uterine cavity and the lower genital tract allows for the exploitation of uterine-derived biomaterial in cervico-vaginal fluid for endometrial cancer detection based on non-invasive sampling methodologies. Plasma is an attractive biofluid for cancer detection due to its simplicity and ease of collection. In this biomarker discovery study, we aimed to identify proteomic signatures that accurately discriminate endometrial cancer from controls in cervico-vaginal fluid and blood plasma.

Methods: Blood plasma and Delphi Screener-collected cervico-vaginal fluid samples were acquired from symptomatic post-menopausal women with (n = 53) and without (n = 65) endometrial cancer. Digitised proteomic maps were derived for each sample using sequential window acquisition of all theoretical mass spectra (SWATH-MS). Machine learning was employed to identify the most discriminatory proteins. The best diagnostic model was determined based on accuracy and model parsimony.
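The figure legends name the random forest "mean decrease accuracy" metric as the basis for ranking proteins. The idea behind that metric is permutation importance: shuffle one feature and measure how much accuracy falls. The following is a minimal sketch of that idea only, not the authors' pipeline. The threshold classifier `toy_predict`, the two-feature data set and all values are hypothetical, and a real random forest would compute the drop on out-of-bag samples per tree.

```python
import random

def accuracy(predict, X, y):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the label
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(predict, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances

# Hypothetical stand-in for a trained model: uses feature 0 only.
def toy_predict(row):
    return int(row[0] > 0.5)

# Hypothetical data: 1 = cancer, 0 = control; feature 1 is pure noise.
X = [[0.9, 0.3], [0.8, 0.7], [0.7, 0.1], [0.2, 0.6], [0.1, 0.9], [0.3, 0.2]]
y = [1, 1, 1, 0, 0, 0]

importances = permutation_importance(toy_predict, X, y)
```

Because `toy_predict` ignores feature 1, shuffling that column never changes a prediction and its importance is zero, whereas shuffling feature 0 lowers accuracy.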

Findings: A protein signature derived from cervico-vaginal fluid more accurately discriminated cancer from control samples than one derived from plasma. A 5-biomarker panel of cervico-vaginal fluid derived proteins (HPT, LG3BP, FGA, LY6D and IGHM) predicted endometrial cancer with an AUC of 0.95 (0.91-0.98), sensitivity of 91% (83%-98%), and specificity of 86% (78%-95%). By contrast, a 3-marker panel of plasma proteins (APOD, PSMA7 and HPT) predicted endometrial cancer with an AUC of 0.87 (0.81-0.93), sensitivity of 75% (64%-86%), and specificity of 84% (75%-93%). The parsimonious model AUC values for detection of stage I endometrial cancer in cervico-vaginal fluid and blood plasma were 0.92 (0.87-0.97) and 0.88 (0.82-0.95) respectively.
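For readers less familiar with the reported metrics, the sketch below shows how sensitivity, specificity and AUC are defined in general. It is illustrative only: the labels, scores and the 0.5 decision threshold are hypothetical and are not taken from the study, and the abstract does not state how the authors computed these values or their intervals.

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

def auc(y_true, scores):
    """AUC as the probability that a random case outscores a random control.

    Ties between a case and a control count as half a win.
    """
    cases = [s for t, s in zip(y_true, scores) if t == 1]
    controls = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))

# Hypothetical example: 1 = cancer, 0 = control.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
y_pred = [int(s >= 0.5) for s in scores]  # assumed decision threshold of 0.5

sens, spec = sensitivity_specificity(y_true, y_pred)
area = auc(y_true, scores)
```

Sensitivity and specificity depend on the chosen threshold, whereas the AUC summarises discrimination across all thresholds, which is why a panel can be compared on AUC before any cut-off is fixed.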

Interpretation: Here, we leveraged the natural shed of endometrial tumours to potentially develop an innovative approach to endometrial cancer detection. We show proof of principle that endometrial cancers secrete unique protein signatures that can enable cancer detection via cervico-vaginal fluid assays. Confirmation in a larger independent cohort is warranted.

Funding: Cancer Research UK, Blood Cancer UK, National Institute for Health Research.

Keywords: Biomarker; Cervico-vaginal fluid; Endometrial cancer; Plasma; Proteins.

Conflict of interest statement

Declaration of interests: The authors declare no conflict of interest.

Figures

Fig. 1
(a) Volcano plots summarising the differential expression of cervico-vaginal fluid supernatant proteins based on the degree of log2 fold change (FC) and test of statistical significance. Proteins with log2 FC >1 only are represented as orange dots. Those with p < 0.05 only are represented as red dots. Proteins exhibiting both log2 FC >1 and p < 0.05 are represented as green dots and those exhibiting neither are represented as black dots. (b) Principal component analysis showing discrimination between cancers (n = 53) and controls (n = 65) based on all identified cervico-vaginal fluid supernatant proteins (n = 597 proteins). (c) Important discriminatory proteins identified by the random forest machine learning algorithm for cervico-vaginal fluid supernatant proteins and ranked according to their contribution to the overall diagnostic accuracy based on the mean decrease accuracy metric. (d) Principal component analysis showing discrimination between cancers and controls based on the top ten discriminatory cervico-vaginal fluid supernatant proteins. (e) Functional pathway analysis of the top discriminatory cervico-vaginal fluid supernatant proteins. (f) Diagnostic performance of the parsimonious model of cervico-vaginal fluid supernatant proteins (5-biomarker panel) for the detection of endometrial cancer.
Fig. 2
(a) Box plots showing the permutation importance of the cervico-vaginal fluid supernatant proteins confirmed as important by the Boruta algorithm. (b) Crude cumulative AUC analyses for the Boruta-identified proteins based on multiple forward stepwise logistic regression. (c) Gene ontology analysis of the unique Boruta-identified biomarkers using the webserver WebGestalt, showing biological processes (red), cellular components (blue) and molecular functions (green).
Fig. 3
(a) Volcano plots summarising the differential expression of plasma proteins based on the degree of log2 fold change (FC) and test of statistical significance. Proteins with log2 FC >1 only are represented as orange dots. Those with p < 0.05 only are represented as red dots. Proteins exhibiting both log2 FC >1 and p < 0.05 are represented as green dots and those exhibiting neither are represented as black dots. (b) Principal component analysis showing discrimination between cancers and controls based on all identified plasma proteins (n = 533 proteins). (c) Important discriminatory proteins identified by the random forest machine learning algorithm for plasma proteins and ranked according to their contribution to the overall diagnostic accuracy based on the mean decrease accuracy metric. (d) Principal component analysis showing discrimination between cancers and controls based on the top ten discriminatory plasma proteins. (e) Functional pathway analysis of the top discriminatory plasma proteins. (f) Diagnostic performance of the parsimonious model of plasma proteins (3-biomarker panel) for the detection of endometrial cancer.
Fig. 4
(a) Box plots showing the permutation importance of the plasma proteins confirmed as important by the Boruta algorithm. (b) Crude cumulative AUC analyses for the Boruta-identified plasma proteins based on multiple forward stepwise logistic regression. (c) Gene ontology analysis of the unique Boruta-identified plasma biomarkers using the webserver WebGestalt, showing biological processes (red), cellular components (blue) and molecular functions (green).
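The colour coding described for the volcano plots in Figs. 1a and 3a amounts to a two-threshold rule, which can be sketched as follows. This is an illustration under stated assumptions: the legends give the cut-offs (log2 FC >1, p < 0.05) but not whether the fold-change cut-off is applied to the absolute value, so the use of `abs()` here is an assumption, as are the function names and the example intensities.

```python
import math

def log2_fold_change(mean_cancer, mean_control):
    """log2 of the ratio of mean abundance in cancers to that in controls."""
    return math.log2(mean_cancer / mean_control)

def volcano_category(log2_fc, p_value, fc_cutoff=1.0, p_cutoff=0.05):
    """Return the dot colour used in the legends for one protein."""
    large_change = abs(log2_fc) > fc_cutoff  # assumption: two-sided cut-off
    significant = p_value < p_cutoff
    if large_change and significant:
        return "green"   # both criteria met
    if large_change:
        return "orange"  # fold change only
    if significant:
        return "red"     # significance only
    return "black"       # neither criterion met

# Hypothetical protein: mean intensity 4.0 in cancers, 1.0 in controls, p = 0.01.
example_fc = log2_fold_change(4.0, 1.0)
example_colour = volcano_category(example_fc, 0.01)
```

Under this rule only the green dots satisfy both the effect-size and the significance criteria, which is why they are the natural candidates for further ranking.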
