Toward Automated Clinical Transcriptions

Mitchell A Klusty¹, W Vaiden Logan¹, Samuel E Armstrong¹, Aaron D Mullen¹, Caroline N Leach¹, Ken Calvert¹, Jeff Talbert¹, V K Cody Bumgardner¹

PMID: 40502215
PMCID: PMC12150720

Mitchell A Klusty et al. AMIA Jt Summits Transl Sci Proc. 2025 Jun 10;2025:235-241. eCollection 2025.


Affiliation

¹ University of Kentucky, Lexington, KY, USA.

Abstract

Administrative documentation is a major driver of rising healthcare costs and is linked to adverse outcomes, including physician burnout and diminished quality of care. This paper introduces a secure system that applies recent advancements in speech-to-text transcription and speaker-labeling (diarization) to patient-provider conversations. This system is optimized to produce accurate transcriptions and highlight potential errors to promote rapid human verification, further reducing the necessary manual effort. Applied to over 40 hours of simulated conversations, this system offers a promising foundation for automating clinical transcriptions.

Figures

**Figure 1**
Diagram of the full system showing how each individual component interacts

**Figure 2**
Example showing calculation of transcription-diarization overlap to predict the speaker

**Figure 3**
A graph showing the distributions of Word Error Rates

**Figure 4**
A graph showing the distributions of mislabeled speakers

**Figure 5**
Pie charts detailing the percentages of words in the original text and transcribed text

**Figure 6**
Bar graph showing the breakdown of incorrect words in the transcription and the percentage of words with the speaker mislabeled, separated by domain
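
The captions for Figures 2 and 3 name two calculations: assigning a speaker from transcription-diarization overlap, and Word Error Rate. The abstract does not give the exact formulas, so the following is only a minimal sketch under stated assumptions. It assumes each transcribed segment and each diarization turn carries start and end times in seconds, that the predicted speaker is the one whose turns overlap the segment for the longest total time, and that Word Error Rate is the word-level edit distance divided by the reference length. The function names, the `UNKNOWN` fallback label, and the lowercase whitespace tokenization are hypothetical choices made for illustration, not the authors' implementation.

```python
from typing import Dict, List, Tuple

# Hypothetical representation of a diarization turn: (start_s, end_s, speaker_label)
Turn = Tuple[float, float, str]


def overlap_seconds(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two time intervals (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speaker(seg_start: float, seg_end: float, turns: List[Turn],
                   default: str = "UNKNOWN") -> str:
    """Pick the speaker whose diarization turns overlap the segment the most.

    Assumption: total overlap duration decides the label; ties go to the
    speaker encountered first in `turns`.
    """
    totals: Dict[str, float] = {}
    for t_start, t_end, speaker in turns:
        shared = overlap_seconds(seg_start, seg_end, t_start, t_end)
        if shared > 0:
            totals[speaker] = totals.get(speaker, 0.0) + shared
    if not totals:
        return default
    return max(totals, key=totals.get)


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance (substitutions, insertions, deletions)
    divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Row-by-row dynamic programming over the edit-distance table.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)
```

For example, with turns `[(0.0, 4.0, "PROVIDER"), (4.0, 9.0, "PATIENT")]`, a segment spanning 3.0 to 6.0 seconds overlaps the provider for 1 second and the patient for 2 seconds, so this sketch labels it `PATIENT`. A production pipeline would likely also normalize punctuation and numerals before scoring Word Error Rate.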
