Enhancing phenotype recognition in clinical notes using large language models: PhenoBCBERT and PhenoGPT
- PMID: 38264716
- PMCID: PMC10801236
- DOI: 10.1016/j.patter.2023.100887
Abstract
To enhance phenotype recognition in clinical notes of genetic diseases, we developed two models, PhenoBCBERT and PhenoGPT, for expanding the vocabularies of Human Phenotype Ontology (HPO) terms. While HPO offers a standardized vocabulary for phenotypes, existing tools often fail to capture the full scope of phenotypes because of the limitations of traditional heuristic or rule-based approaches. Our models leverage large language models to automate the detection of phenotype terms, including those not in the current HPO. We compared these models with PhenoTagger, another HPO recognition tool, and found that our models identify a wider range of phenotype concepts, including previously uncharacterized ones. Our models also show strong performance in case studies on biomedical literature. We evaluate the strengths and weaknesses of BERT- and GPT-based models in aspects such as architecture and accuracy. Overall, our models enhance automated phenotype detection from clinical texts, improving downstream analyses on human diseases.
Keywords: BERT; GPT; Human Phenotype Ontology; clinical notes; electronic health records; named entity recognition; transformer.
© 2023 The Authors.
Conflict of interest statement
The authors declare no competing interests.
Update of
- Enhancing Phenotype Recognition in Clinical Notes Using Large Language Models: PhenoBCBERT and PhenoGPT. ArXiv [Preprint]. 2023 Nov 9:arXiv:2308.06294v2. Update in: Patterns (N Y). 2023 Dec 05;5(1):100887. doi: 10.1016/j.patter.2023.100887. PMID: 37986722.
