Ontology-guided machine learning outperforms zero-shot foundation models for cardiac ultrasound text reports

Affiliations

¹ University of California, San Francisco, 521 Parnassus Avenue Rm 6222, San Francisco, CA, 94143, USA.
² University of California, Berkeley, Berkeley, CA, USA.
³ University of Washington, Seattle, WA, USA.
⁴ University of Arizona, Tucson, AZ, USA.
⁵ University of Pittsburgh, Pittsburgh, PA, USA.
⁶ University of Pennsylvania, Philadelphia, PA, USA.
⁷ Indiana University, Indianapolis, IN, USA.
⁸ University of California, San Francisco, 521 Parnassus Avenue Rm 6222, San Francisco, CA, 94143, USA. rima.arnaout@ucsf.edu.

PMID: 39953053
PMCID: PMC11828978
DOI: 10.1038/s41598-024-83540-y

Suganya Subramaniam et al. Sci Rep. 2025 Feb 14;15(1):5456. doi: 10.1038/s41598-024-83540-y.

Abstract

Big data can revolutionize research and quality improvement for cardiac ultrasound, and text reports are a critical part of such analyses. Cardiac ultrasound reports include both structured and free text and vary across institutions, which hampers attempts to mine the text for useful insights. Natural language processing (NLP) can help and includes both statistical and large language model-based techniques. We tested whether NLP could map cardiac ultrasound text to a three-level hierarchical ontology, using statistical machine learning (EchoMap) and zero-shot inference with GPT. We tested eight datasets from 24 different institutions and compared both methods against clinician-scored ground truth. Although all institutions adhered to clinical guidelines, they differed in their structured reporting. EchoMap performed best, with validation accuracy of 98% for the first ontology level, 93% for the first and second levels, and 79% for all three. EchoMap retained performance across external test datasets and could extrapolate to examples not included in training. EchoMap's accuracy was comparable to zero-shot GPT at the first level of the ontology and exceeded GPT's at the second and third levels. We show that statistical machine learning can map text to a structured ontology and may be especially useful for small, specialized text datasets.

Keywords: Echocardiography report; Large language models; Machine learning; Natural language processing; Ontology.
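The abstract reports accuracy for the first level, for the first and second levels, and for all three levels. One reading is that a prediction counts at a given depth only if every level up to that depth matches the clinician label. The sketch below illustrates that reading only; the function name, the tuple representation, and the placeholder labels are assumptions for illustration and do not come from the paper or its ontology.

```python
from typing import Sequence, Tuple

# Hypothetical representation: one (level 1, level 2, level 3) tuple per report sentence.
Label = Tuple[str, str, str]


def cumulative_accuracy(truth: Sequence[Label], pred: Sequence[Label]) -> Tuple[float, ...]:
    """Return the fraction of items correct through level 1, levels 1-2, and levels 1-3."""
    if len(truth) != len(pred):
        raise ValueError("truth and pred must have the same length")
    if not truth:
        raise ValueError("need at least one labelled item")
    hits = [0, 0, 0]
    for t, p in zip(truth, pred):
        for depth in range(3):
            # Count this depth only if every level up to and including it matches.
            if t[: depth + 1] == p[: depth + 1]:
                hits[depth] += 1
            else:
                break  # a mismatch at this level rules out all deeper levels
    return tuple(h / len(truth) for h in hits)


# Placeholder labels, not terms from the paper's ontology.
truth = [("A", "a", "1"), ("A", "b", "2"), ("B", "c", "3"), ("B", "c", "4")]
pred = [("A", "a", "1"), ("A", "b", "9"), ("B", "x", "3"), ("C", "c", "4")]
acc = cumulative_accuracy(truth, pred)
print(acc)
```

Under this definition accuracy can only stay flat or fall as depth increases, which matches the ordering of the reported figures (98%, 93%, 79%).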

Conflict of interest statement

Declarations. Competing interests: The authors declare no competing interests.

Figures

**Fig. 1**
Workflow for the three machine learning approaches evaluated. Data (structured dictionaries and free text from echo reports) were preprocessed, then passed to each of three model types: (A) hierarchical random forest statistical machine learning model, which included additional engineered features and used each level’s prediction to inform the subsequent level; (B) zero-shot GPT making independent predictions per level of ontology; (C) zero-shot GPT making multi-class prediction. GPT, generative pre-trained transformer; RF, random forest; UMLS, Unified Medical Language System; L1, L2, L3, Level 1, Level 2, Level 3, respectively.

**Fig. 2**
Correctness by ontology level, by dataset. Validation (UCSF) and test set (outside hospitals) performance for each of three model architectures: (A) EchoMap, (B) zero-shot GPT, (C) multi-class zero-shot GPT. UCSF, University of California, San Francisco; PITT, University of Pittsburgh; IU, Indiana University; UCSF-OSH, outside hospital reports in the UCSF system; UAZ, University of Arizona; UCSF-FREE, free-text sentences from UCSF reports; UPENN, University of Pennsylvania; UW, University of Washington.

**Fig. 3**
Aggregate performance across all datasets evaluated, by each mapping model and by ontology level. Box plots represent performance of all eight validation and test datasets in order to illustrate differences among mapping models.
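The chaining described in Fig. 1A, where each level's prediction informs the next level, can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' implementation: a simple token-overlap classifier stands in for the random forest, training uses the true parent labels as extra features (an assumption), and all class names, function names, sentences, and labels are hypothetical.

```python
from typing import List, Sequence, Set, Tuple


def tokens(text: str) -> Set[str]:
    """Lowercased whitespace tokens; a stand-in for the paper's preprocessing."""
    return set(text.lower().split())


class OverlapClassifier:
    """Toy stand-in for a random forest: returns the label of the training
    row that shares the most features with the query."""

    def fit(self, rows: Sequence[Set[str]], labels: Sequence[str]) -> "OverlapClassifier":
        self.rows, self.labels = list(rows), list(labels)
        return self

    def predict(self, row: Set[str]) -> str:
        best = max(range(len(self.rows)), key=lambda i: len(self.rows[i] & row))
        return self.labels[best]


def fit_hierarchy(texts: Sequence[str], labels: Sequence[Tuple[str, str, str]]) -> List[OverlapClassifier]:
    """Fit one model per level; level k sees the labels of levels 1..k-1 as features."""
    models = []
    for depth in range(3):
        rows = [
            tokens(t) | {f"L{d + 1}={lab[d]}" for d in range(depth)}
            for t, lab in zip(texts, labels)
        ]
        models.append(OverlapClassifier().fit(rows, [lab[depth] for lab in labels]))
    return models


def predict_hierarchy(models: Sequence[OverlapClassifier], text: str) -> Tuple[str, ...]:
    """Predict level by level, feeding each prediction into the next level's features."""
    feats = tokens(text)
    path = []
    for depth, model in enumerate(models):
        label = model.predict(feats)
        path.append(label)
        feats = feats | {f"L{depth + 1}={label}"}
    return tuple(path)


# Hypothetical training sentences and labels, not taken from the paper's data.
train_texts = [
    "left ventricle size normal",
    "left ventricle size dilated",
    "mitral valve regurgitation mild",
    "mitral valve regurgitation severe",
]
train_labels = [
    ("LV", "size", "normal"),
    ("LV", "size", "dilated"),
    ("MV", "regurgitation", "mild"),
    ("MV", "regurgitation", "severe"),
]
models = fit_hierarchy(train_texts, train_labels)
result = predict_hierarchy(models, "the left ventricle is dilated")
print(result)
```

In this arrangement an error at the first level propagates to the deeper levels, which is one reason a cumulative, level-by-level evaluation is informative for hierarchical models.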

Grants and funding

R01 HL150394/HL/NHLBI NIH HHS/United States
