EClinicalMedicine. 2024 Nov 19:78:102923.
doi: 10.1016/j.eclinm.2024.102923. eCollection 2024 Dec.

Development and validation of a deep learning pipeline to diagnose ovarian masses using ultrasound screening: a retrospective multicenter study


Wen-Li Dai et al. EClinicalMedicine.

Abstract

Background: Ovarian cancer has the highest mortality rate among gynaecological malignancies and is initially screened using ultrasound. Owing to the high complexity of ultrasound images of ovarian masses and the anatomical characteristics of the deep pelvic cavity, subjective assessment requires extensive experience and skill. Detecting the ovaries and ovarian masses and diagnosing ovarian cancer are therefore challenging. In the present study, we aimed to develop an automated deep learning framework, the Ovarian Multi-Task Attention Network (OvaMTA), for ovary and ovarian mass detection, segmentation, and classification, as well as further diagnosis of ovarian masses based on ultrasound screening.

Methods: Between June 2020 and May 2022, the OvaMTA model was trained, validated, and tested on a training and validation cohort of 6938 images and an internal testing cohort of 1584 images, recruited from women who underwent ultrasound examinations for ovarian masses at 21 hospitals. Subsequently, we recruited two external test cohorts from another two hospitals: 1896 images obtained between February 2024 and April 2024 formed the image-based external test dataset, and 159 videos obtained between April 2024 and May 2024 formed the video-based external test dataset. The artificial intelligence (AI) system, OvaMTA, diagnoses ovarian masses from ultrasound screening and comprises two models: OvaMTA-Seg, a segmentation model that operates on the entire image for ovary detection, and OvaMTA-Diagnosis, a diagnosis model that predicts the pathological type of an ovarian mass from image patches cropped by OvaMTA-Seg. The performance of the system was evaluated in one internal and two external validation cohorts and compared with doctors' assessments in real-world testing, for which we recruited eight physicians to assess the real-world data. The value of the system in assisting doctors with diagnosis was also evaluated.
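
The hand-off between the two models, in which a segmentation mask determines the image patch passed to the diagnosis model, can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not describe how OvaMTA-Seg crops patches, so the bounding-box rule, the `margin` parameter, and the function names `bounding_box` and `crop_patch` are hypothetical, and the two networks themselves are not modelled.

```python
from typing import List, Optional, Tuple

Mask = List[List[int]]      # binary mask, 1 = pixel predicted as mass (assumed format)
Image = List[List[float]]   # greyscale image with the same shape as the mask


def bounding_box(mask: Mask) -> Optional[Tuple[int, int, int, int]]:
    """Return (top, left, bottom, right), inclusive, of the nonzero region."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    if not rows:
        return None  # nothing segmented, so there is no patch to classify
    return rows[0], cols[0], rows[-1], cols[-1]


def crop_patch(image: Image, mask: Mask, margin: int = 1) -> Optional[Image]:
    """Crop the image around the mask's bounding box, padded by `margin` pixels.

    The padding rule is an assumption made for this sketch, not the paper's method.
    """
    box = bounding_box(mask)
    if box is None:
        return None
    top, left, bottom, right = box
    # Clamp the padded box to the image borders.
    top = max(top - margin, 0)
    left = max(left - margin, 0)
    bottom = min(bottom + margin, len(image) - 1)
    right = min(right + margin, len(image[0]) - 1)
    return [row[left:right + 1] for row in image[top:bottom + 1]]


# Toy 5x6 "image" whose pixel value encodes its position (10 * row + column).
image = [[float(10 * r + c) for c in range(6)] for r in range(5)]
mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
patch = crop_patch(image, mask, margin=1)  # 4x4 patch around the toy "mass"
```

In a real pipeline the mask would come from the segmentation network and the returned patch would be resized and passed to the classifier; both steps are omitted here.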

Findings: For segmentation, OvaMTA-Seg achieved an average Dice score of 0.887 on the internal test set and 0.819 on the image-based external test set. OvaMTA-Seg also performed well in detecting ovarian masses in test images that included both healthy ovaries and masses (internal test area under the curve [AUC]: 0.970; external test AUC: 0.877). For diagnostic classification, OvaMTA-Diagnosis demonstrated high performance on the image-based internal (AUC: 0.941) and external (AUC: 0.941) test sets. In video-based external testing on 159 videos with ovarian masses, OvaMTA achieved an AUC of 0.911, and its performance was comparable to that of senior radiologists (accuracy [ACC]: 86.2 vs. 88.1, p = 0.50; sensitivity [SEN]: 81.8 vs. 88.6, p = 0.16; specificity [SPE]: 89.2 vs. 87.6, p = 0.68). Junior and intermediate radiologists performed significantly better when assisted by AI than when unassisted (ACC: 80.8 vs. 75.3, p = 0.00015; SEN: 79.5 vs. 74.6, p = 0.029; SPE: 81.7 vs. 75.8, p = 0.0032). General practitioners assisted by AI performed comparably to the radiologists' average (ACC: 82.7 vs. 81.8, p = 0.80; SEN: 84.8 vs. 82.6, p = 0.72; SPE: 81.2 vs. 81.2, p > 0.99).
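
For readers less familiar with the metrics reported above, the following minimal sketch shows the standard textbook definitions of the Dice score, AUC, accuracy, sensitivity, and specificity. It is not the authors' evaluation code; the function names and the toy data are made up for illustration, and the choice to score two empty masks as 1.0 is an assumed convention.

```python
from typing import Dict, Sequence


def dice_score(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice = 2|A ∩ B| / (|A| + |B|) for two flattened binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * overlap / total


def auc(scores: Sequence[float], truth: Sequence[int]) -> float:
    """AUC as the probability that a positive case outscores a negative one."""
    pos = [s for s, t in zip(scores, truth) if t == 1]
    neg = [s for s, t in zip(scores, truth) if t == 0]
    # Ties between a positive and a negative score count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def diagnostic_metrics(pred: Sequence[int], truth: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity with 1 = malignant, 0 = benign.

    Assumes at least one malignant and one benign case in `truth`.
    """
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    return {
        "ACC": (tp + tn) / len(truth),
        "SEN": tp / (tp + fn),
        "SPE": tn / (tn + fp),
    }


# Toy data, invented for this example only.
dice = dice_score([0, 1, 1, 1, 0, 0], [0, 0, 1, 1, 1, 0])

truth = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.1, 0.2, 0.4, 0.05, 0.7]
pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 threshold is arbitrary
metrics = diagnostic_metrics(pred, truth)
area = auc(scores, truth)
```

The pairwise AUC above is quadratic in the number of cases, which is fine for a sketch but would normally be replaced by a rank-based computation on large test sets.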

Interpretation: The ultrasound-based OvaMTA system is a simple and practical auxiliary tool, with diagnostic performance comparable to that of senior radiologists, and a potential tool for ovarian cancer screening.

Funding: This work was supported by the National Natural Science Foundation of China (Grant Nos. 12090020, 82071929, and 12090025) and the R&D project of the Pazhou Lab (Huangpu) (Grant No. 2023K0605).

Keywords: Artificial intelligence pipeline; Deep learning; Ovarian cancer; Ovarian mass; Ultrasound.

Conflict of interest statement

All authors declare no competing interests.

Figures

Fig. 1
Flowchart showing the eligibility criteria and the AI system evaluation procedure. US, ultrasound; AI, artificial intelligence.
Fig. 2
Overview of the study design. (a) Ultrasound data and pathological results were collected from 23 hospitals, and the edges of ovarian masses were labelled. (b) An image dataset and a video dataset were composed for model development. (c) The proposed OvaMTA system consists of two neural networks: the OvaMTA-Seg network for automatic tumor detection and the OvaMTA-Diagnosis network for predicting malignancy. (d) Model performance was evaluated using ROC curves, confusion matrices, accuracy, sensitivity, specificity, and the Kappa coefficient, and compared with doctors in video testing. ROC, receiver operating characteristic curve; OvaMTA, ovarian multi-task attention network; ROI, region of interest; AI, artificial intelligence.
Fig. 3
Diagnostic performance of OvaMTA in image-based testing. ROC curves and confusion matrices for the internal test (a, b, c) and external test A (d, e, f) datasets show the diagnostic performance of the AI system in detecting ovarian masses and discriminating benign from malignant ovarian tumors based on images. ROC, receiver operating characteristic curve; OvaMTA, ovarian multi-task attention network; AI, artificial intelligence.
Fig. 4
Comparison of diagnostic performance between the AI system, radiologists, and general practitioners. (a) ROC curve for the external test videos shows the diagnostic performance of the AI system compared with radiologists of different experience levels. The big red star represents the average value of the radiologists. The grey circle represents the performance of junior radiologists, the grey diamond that of intermediate radiologists, the grey triangle that of experts, and the grey square that of general practitioners. (b) Bar plot shows the performance of the eight doctors, with paired McNemar test p-values indicated for group comparisons. (c) Normalized confusion matrix of the AI system's performance in discriminating benign from malignant ovarian tumors in external test videos. (d) Kappa matrix of the eight doctors' assessments in external test videos. (e) Kappa matrix of the eight doctors' AI-assisted assessments in external test videos. Each value in the matrices represents the inter-observer agreement between two doctors. On the axes of the kappa matrices in (d) and (e), general practitioners 1–2 and radiologists 1–6 are arranged from bottom to top and from left to right. ROC, receiver operating characteristic curve; AI, artificial intelligence.
Fig. 5
Two types of ovarian masses detected, segmented, diagnosed, and interpreted by the AI system. (a) Keyframe from greyscale ultrasound videos of a 64-year-old woman with solid component(s) within a multilocular cyst. The AI system detects and maps the edges of the mass and outputs a malignant classification with its probability in real time. The final diagnosis provided by the AI system was consistent with the pathological result after surgery (endometrioid carcinoma). The AI system gives interpretable heat maps in real time, allowing doctors to pay more attention to the irregular compartments and solid components. (b) Keyframe from greyscale ultrasound videos of a 52-year-old woman with a unilocular anechoic cyst. The AI system detects and maps the edges of the mass and outputs a benign classification with its probability in real time. The final diagnosis provided by the AI system was consistent with the pathological result after surgery (serous cystadenofibroma). The AI system gives interpretable heat maps in real time, allowing doctors to pay more attention to the anechoic cystic part. AI, artificial intelligence.
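
The paired McNemar p-values in Fig. 4b and the kappa matrices in Fig. 4d and 4e rest on two standard statistics, sketched below. This is a hedged illustration rather than the authors' analysis: the captions do not say whether an exact or chi-square McNemar test was used (the exact binomial form is assumed here), Cohen's kappa is assumed as the pairwise agreement measure, and all function names and toy readings are invented.

```python
from math import comb
from typing import List, Sequence


def mcnemar_exact(correct_a: Sequence[int], correct_b: Sequence[int]) -> float:
    """Two-sided exact McNemar p-value from per-case correctness (1 = correct)."""
    # Only discordant cases (one reading right, the other wrong) carry information.
    b = sum(1 for x, y in zip(correct_a, correct_b) if x and not y)
    c = sum(1 for x, y in zip(correct_a, correct_b) if y and not x)
    n = b + c
    if n == 0:
        return 1.0  # no discordant cases, so no evidence of a difference
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


def cohen_kappa(rater_a: Sequence[int], rater_b: Sequence[int]) -> float:
    """Cohen's kappa: (observed - expected agreement) / (1 - expected agreement)."""
    n = len(rater_a)
    observed = sum(1 for x, y in zip(rater_a, rater_b) if x == y) / n
    labels = set(rater_a) | set(rater_b)
    expected = sum((rater_a.count(k) / n) * (rater_b.count(k) / n) for k in labels)
    if expected == 1.0:
        return 1.0  # both raters always give the same single label
    return (observed - expected) / (1.0 - expected)


def kappa_matrix(readers: Sequence[Sequence[int]]) -> List[List[float]]:
    """Pairwise kappa between every pair of readers, as in an agreement matrix."""
    return [[cohen_kappa(a, b) for b in readers] for a in readers]


# Toy readings (1 = malignant, 0 = benign), invented for this example.
reader_a = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
reader_b = [1, 0, 0, 0, 1, 0, 1, 1, 0, 1]
kappa = cohen_kappa(reader_a, reader_b)
matrix = kappa_matrix([reader_a, reader_b])

# Per-case correctness of one hypothetical reader without and with AI assistance.
correct_unaided = [1, 0, 1, 0, 1, 1, 0, 1, 0, 1]
correct_aided = [1, 1, 1, 1, 1, 1, 1, 1, 0, 1]
p_value = mcnemar_exact(correct_unaided, correct_aided)
```

With eight readers, `kappa_matrix` would return an 8×8 symmetric matrix with ones on the diagonal, matching the layout described for panels (d) and (e).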
