J Clin Med. 2025 Mar 25;14(7):2223.

doi: 10.3390/jcm14072223.

Mapping the Advanced-Stage Epithelial Ovarian Cancer Landscape Goes Beyond Words: Two Large Language Models, Eight Tasks, One Journey

Michela Quaranta¹, Alexandros Laios¹, Charlie Rogers¹, Anastasia Ioanna Mavromatidou², Amudha Thangavelu¹, Georgios Theophilou¹, David Nugent¹, Diederick DeJong¹, Evangelos Kalampokis²

Affiliations

¹ Department of Gynaecologic Oncology, ESGO Centre of Excellence for Ovarian Cancer Surgery, St James's University Hospital, Leeds LS9 7TF, UK.
² Information Systems Lab, Department of Business Administration, University of Macedonia, 54636 Thessaloniki, Greece.

PMID: 40217674
PMCID: PMC11989528
DOI: 10.3390/jcm14072223

Michela Quaranta et al. J Clin Med. 2025.

Abstract

Background/Objectives: The advancement of natural language processing (NLP) technologies has transformed various sectors. However, their application in healthcare, particularly for analysing clinical notes, remains underdeveloped. We investigated the use of deep neural networks, specifically transformer-based models, to predict intraoperative and post-operative outcomes of cytoreductive surgery for advanced-stage epithelial ovarian cancer (aEOC) from unstructured surgical notes.

Methods: We evaluated the performance of RoBERTa, a general-purpose language model, and GatorTron, a domain-specific model, across eight binary classification tasks using the same dataset. The dataset consisted of 560 surgical records from patients with aEOC who underwent cytoreductive surgery at a tertiary UK reference centre. Each outcome to be predicted was converted into a binary feature so that it could serve as a classification target. To enhance the contextual information available to the models, textual data from "operative findings" and "operative notes" were concatenated.

Results: Our findings highlight the tangible benefits of employing domain-specific language models for clinical text analysis. GatorTron outperformed RoBERTa on most predictive tasks, underscoring the advantages of domain-specific pretraining for understanding medical terminology and context. Both models struggled to predict certain outcomes, particularly post-operative events such as major complications and length of hospital stay, despite adjustments to hyperparameters and training strategies. This limitation suggests that operative text alone may not sufficiently capture the complexities of post-operative recovery.

Conclusions: These findings have valuable implications for developing medical AI systems to improve the delivery of modern aEOC healthcare.

Keywords: GatorTron; RoBERTa; epithelial ovarian cancer; natural language processing; operative notes; transfer learning.
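
The data preparation described in the Methods (concatenating the two free-text fields and converting each outcome into a binary target) can be illustrated with a minimal sketch. This is not the authors' code: the field names, the example texts, and the 7-day length-of-stay cut-off are all hypothetical, since the abstract does not state the actual thresholds or schema.

```python
from dataclasses import dataclass


@dataclass
class SurgicalRecord:
    # Hypothetical field names; the real dataset schema is not given in the abstract.
    operative_findings: str
    operative_notes: str
    length_of_stay_days: int


def build_model_input(record: SurgicalRecord, separator: str = " ") -> str:
    """Join the two free-text fields into one string, skipping empty fields."""
    parts = [record.operative_findings.strip(), record.operative_notes.strip()]
    return separator.join(part for part in parts if part)


def binarise(value: float, threshold: float) -> int:
    """Return 1 when the value exceeds the threshold, otherwise 0."""
    return int(value > threshold)


# Invented records, for illustration only.
records = [
    SurgicalRecord("Omental cake, diaphragmatic disease.", "Omentectomy performed.", 9),
    SurgicalRecord("", "Hysterectomy and bilateral salpingo-oophorectomy.", 4),
]

texts = [build_model_input(r) for r in records]
# Assumed cut-off of 7 days; the study's actual definition may differ.
labels = [binarise(r.length_of_stay_days, threshold=7) for r in records]
```

Each `texts[i]` / `labels[i]` pair would then form one training example for a single binary task; the same text would be reused with a different label for each of the eight tasks.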

Conflict of interest statement

The authors declare no conflicts of interest.

Figures

**Figure 1**
Class distribution of integer features.

**Figure 2**
(A) Word count distribution of text variables. The variation reflects the different purposes of the fields. (B) Word cloud visualisation of the concatenated texts extracted from operative notes and operative findings. More frequent terms appear larger.

**Figure 3**
Radar plots comparing the performance of the RoBERTa and GatorTron models across all examined clinical tasks using Matthews correlation coefficient (MCC), recall, precision, F1 score, accuracy, area under the precision–recall curve (AUPRC), and area under the receiver operating characteristic curve (AUROC).
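
As an illustration of the threshold-based metrics named in this caption, the sketch below computes MCC, recall, precision, F1 score, and accuracy from binary labels and predictions using their standard definitions. It is not the authors' evaluation code, the example labels are invented, and AUROC and AUPRC are omitted because they require predicted scores rather than hard class labels.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count true positives, false positives, false negatives, true negatives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def safe_div(numerator, denominator):
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(y_true, y_pred):
    """Return the label-based metrics for one binary classification task."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "precision": safe_div(tp, tp + fp),
        "recall": safe_div(tp, tp + fn),
        "f1": safe_div(2 * tp, 2 * tp + fp + fn),
        "accuracy": safe_div(tp + tn, tp + fp + fn + tn),
        "mcc": safe_div(tp * tn - fp * fn, mcc_denominator),
    }


# Invented labels and predictions, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)

# A classifier that always predicts the majority class on imbalanced labels.
majority_only = binary_metrics([1, 0, 0, 0], [0, 0, 0, 0])
```

In the `majority_only` case, accuracy is 0.75 while MCC is 0, which is one reason MCC is often reported alongside accuracy when the outcome classes are imbalanced.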
