Proof-of-concept study of a small language model chatbot for breast cancer decision support - a transparent, source-controlled, explainable and data-secure approach

Sebastian Griewing^{1,2,3,4}, Fabian Lechner^{5,6}, Niklas Gremke⁷, Stefan Lukac^{8,9}, Wolfgang Janni⁸, Markus Wallwiener^{10,9}, Uwe Wagner^{7,9}, Martin Hirsch⁶, Sebastian Kuhn⁵

Affiliations

¹ Institute for Digital Medicine, University Hospital Giessen and Marburg, Philipps-University Marburg, Marburg, Germany. s.griewing@uni-marburg.de.
² Stanford Center for Biomedical Informatics Research, Stanford University School of Medicine, Palo Alto, CA, USA. s.griewing@uni-marburg.de.
³ Marburg Gynecological Cancer Center, Giessen and Marburg University Hospital, Philipps-University Marburg, Marburg, Germany. s.griewing@uni-marburg.de.
⁴ Commission Digital Medicine, German Society for Gynecology and Obstetrics (DGGG), Berlin, Germany. s.griewing@uni-marburg.de.
⁵ Institute for Digital Medicine, University Hospital Giessen and Marburg, Philipps-University Marburg, Marburg, Germany.
⁶ Institute for Artificial Intelligence in Medicine, University Hospital Giessen and Marburg, Philipps-University Marburg, Marburg, Germany.
⁷ Marburg Gynecological Cancer Center, Giessen and Marburg University Hospital, Philipps-University Marburg, Marburg, Germany.
⁸ Department of Obstetrics and Gynecology, University Hospital Ulm, University of Ulm, Ulm, Germany.
⁹ Commission Digital Medicine, German Society for Gynecology and Obstetrics (DGGG), Berlin, Germany.
¹⁰ Halle Gynecological Cancer Center, Halle University Hospital, Martin-Luther-University Halle-Wittenberg, Halle (Saale), Germany.

PMID: 39382778
PMCID: PMC11464535
DOI: 10.1007/s00432-024-05964-3


Sebastian Griewing et al. J Cancer Res Clin Oncol. 2024 Oct 9;150(10):451. doi: 10.1007/s00432-024-05964-3.

Abstract

Purpose: Large language models (LLMs) show potential for decision support in breast cancer care. Their use in clinical care is currently precluded by a lack of control over the sources used for decision-making, limited explainability of the decision-making process, and health data security issues. Recently developed small language models (SLMs) are being discussed as a way to address these challenges. This preclinical proof-of-concept study tailors an open-source SLM to the German breast cancer guideline (BC-SLM) and evaluates its initial clinical accuracy and technical functionality in a preclinical simulation.

Methods: A multidisciplinary tumor board (MTB) serves as the gold standard. Initial clinical accuracy is assessed as the concordance of the BC-SLM with the MTB and compared with that of two publicly available LLMs, ChatGPT3.5 and ChatGPT4. The study includes 20 fictional patient profiles and recommendations for 5 treatment modalities, resulting in 100 binary treatment recommendations (recommended or not recommended). Statistical evaluation comprises the percentage concordance with the MTB and Cohen's kappa statistic (κ). Technical functionality is assessed qualitatively in terms of local hosting, adherence to the guideline, and information retrieval.
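
The two statistics named above can be illustrated with a minimal Python sketch. It assumes the standard unweighted Cohen's kappa for two raters and a binary outcome, κ = (p_o − p_e) / (1 − p_e), where p_o is the observed agreement and p_e is the agreement expected by chance from each rater's marginal rates. The function name `concordance_and_kappa` and the toy vectors are hypothetical and are not taken from the study; the abstract does not state which software was used.

```python
from typing import Sequence


def concordance_and_kappa(
    reference: Sequence[int], model: Sequence[int]
) -> tuple[float, float]:
    """Return (observed agreement, Cohen's kappa) for two binary raters.

    Each sequence holds one entry per decision: 1 = recommended,
    0 = not recommended. `reference` plays the role of the gold standard.
    """
    if len(reference) != len(model) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(reference)
    # Observed agreement: share of decisions where both raters match.
    p_o = sum(r == m for r, m in zip(reference, model)) / n

    # Chance agreement from the marginal "recommended" rates of each rater.
    p_ref = sum(reference) / n
    p_mod = sum(model) / n
    p_e = p_ref * p_mod + (1 - p_ref) * (1 - p_mod)

    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (p_o - p_e) / (1 - p_e)
    return p_o, kappa


# Made-up toy data (NOT from the study): 10 binary recommendations.
mtb_toy = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
model_toy = [1, 1, 0, 1, 1, 0, 1, 0, 0, 0]

p_o, kappa = concordance_and_kappa(mtb_toy, model_toy)
print(f"concordance = {p_o:.0%}, kappa = {kappa:.2f}")
# prints: concordance = 80%, kappa = 0.60
```

In the study design described above, each input would hold 100 entries (20 profiles × 5 treatment modalities), with the MTB decisions as the reference vector.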

Results: Overall concordance with the MTB is 86% for BC-SLM (κ = 0.721, p < 0.001), 90% for ChatGPT4 (κ = 0.820, p < 0.001) and 83% for ChatGPT3.5 (κ = 0.661, p < 0.001). Concordance for the individual treatment modalities ranges from 65-100% for BC-SLM, 85-100% for ChatGPT4, and 55-95% for ChatGPT3.5. The BC-SLM is locally functional, adheres to the standards of the German breast cancer guideline and provides referenced sections for its decision-making.
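
To see how the reported percentages and κ values relate, the kappa formula can be inverted to give the chance agreement it implies: p_e = (p_o − κ) / (1 − κ). The sketch below does this back-calculation. It assumes the reported κ is the standard unweighted Cohen's kappa over all 100 decisions and that the published figures are rounded; the resulting p_e values are derived here for illustration and are not reported in the abstract. The function name `implied_chance_agreement` is hypothetical.

```python
def implied_chance_agreement(p_o: float, kappa: float) -> float:
    """Invert kappa = (p_o - p_e) / (1 - p_e) to recover p_e."""
    if kappa >= 1.0:
        raise ValueError("p_e cannot be recovered when kappa is 1")
    return (p_o - kappa) / (1.0 - kappa)


# Overall concordance and kappa as stated in the Results paragraph.
reported = {
    "BC-SLM": (0.86, 0.721),
    "ChatGPT4": (0.90, 0.820),
    "ChatGPT3.5": (0.83, 0.661),
}

for name, (p_o, kappa) in reported.items():
    p_e = implied_chance_agreement(p_o, kappa)
    print(f"{name}: implied chance agreement ~ {p_e:.2f}")
```

Under these assumptions the implied chance agreement falls between roughly 0.44 and 0.50 for all three models, which is why the κ values sit well below the raw concordance percentages.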

Conclusion: The tailored BC-SLM shows initial clinical accuracy and technical functionality, with concordance with the MTB comparable to that of publicly available LLMs such as ChatGPT4 and ChatGPT3.5. This serves as a proof of concept for adapting an SLM to an oncological disease and its guideline in order to address prevailing issues with LLMs by ensuring decision transparency, explainability, source control, and data security. It represents a necessary step towards clinical validation and safe use of language models in clinical oncology.

Keywords: Artificial intelligence; Breast cancer; Clinical oncology; Large language model; Small language model.

Conflict of interest statement

The authors declare no competing interests.

Figures

**Fig. 1**
Generic visualization of the BC-SLM frontend

**Fig. 2**
Overview of the technological design concept

**Fig. 3**
Structure of the preclinical simulation
