Deciphering clinical abbreviations with a privacy protecting machine learning system
- PMID: 36460656
- PMCID: PMC9718734
- DOI: 10.1038/s41467-022-35007-9
Abstract
Physicians write clinical notes with abbreviations and shorthand that are difficult to decipher. Abbreviations can be clinical jargon (writing "HIT" for "heparin induced thrombocytopenia"), ambiguous terms that require expertise to disambiguate (using "MS" for "multiple sclerosis" or "mental status"), or domain-specific vernacular ("cb" for "complicated by"). Here we train machine learning models on public web data to decode such text by replacing abbreviations with their meanings. We report a single translation model that simultaneously detects and expands thousands of abbreviations in real clinical notes, with accuracies ranging from 92.1% to 97.1% on multiple external test datasets. The model equals or exceeds the performance of board-certified physicians (97.6% vs 88.7% total accuracy). Our results demonstrate a general method to contextually decipher abbreviations and shorthand that is built without any privacy-compromising data.
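To make the task concrete, the following is a minimal sketch of contextual abbreviation expansion using the three examples from the abstract. It is not the translation model described in the article: it swaps in a plain dictionary lookup with a keyword-overlap heuristic, and the `EXPANSIONS` table, its cue words, and the `expand` function are all hypothetical names introduced only for illustration.

```python
import re

# Hypothetical toy table (an assumption, not from the article): each
# abbreviation maps to candidate expansions, each paired with cue words
# that hint at that meaning when they appear elsewhere in the note.
EXPANSIONS = {
    "HIT": [("heparin induced thrombocytopenia", set())],
    "cb": [("complicated by", set())],
    "MS": [
        ("multiple sclerosis", {"relapsing", "demyelinating", "neurology"}),
        ("mental status", {"altered", "confused", "alert"}),
    ],
}


def expand(note: str) -> str:
    """Replace known abbreviations in `note` with a context-chosen meaning."""
    # Lower-cased bag of words from the whole note, used as context.
    context = {w.lower() for w in re.findall(r"[A-Za-z]+", note)}

    def pick(match: re.Match) -> str:
        token = match.group(0)
        candidates = EXPANSIONS.get(token)
        if candidates is None:
            return token  # not a known abbreviation: leave unchanged
        # Choose the candidate whose cue words overlap the context most;
        # on a tie, max() keeps the first candidate listed.
        best = max(candidates, key=lambda c: len(c[1] & context))
        return best[0]

    return re.sub(r"\b[A-Za-z]+\b", pick, note)


if __name__ == "__main__":
    print(expand("Pt with altered MS cb HIT"))
    print(expand("Relapsing MS followed in neurology clinic"))
```

A fixed table like this cannot cover thousands of abbreviations or unseen contexts, which is the gap a learned model that detects and expands abbreviations in one pass is meant to close.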
© 2022. The Author(s).
Conflict of interest statement
All authors are employed by Google as indicated by the affiliation. Google has filed a provisional patent application 63/269,420 that is related to this article.