Launching into clinical space with medspaCy: a new clinical text processing toolkit in Python
- PMID: 35308962
- PMCID: PMC8861690
Abstract
Despite the impressive success of machine learning algorithms in clinical natural language processing (cNLP), rule-based approaches still have a prominent role. In this paper, we introduce medspaCy, an extensible, open-source cNLP library based on the spaCy framework that allows flexible integration of rule-based and machine learning-based algorithms adapted to clinical text. MedspaCy includes a variety of components that meet common cNLP needs, such as context analysis and mapping to standard terminologies. By following spaCy's clear and easy-to-use conventions, medspaCy enables the development of custom pipelines that integrate easily with other spaCy-based modules. Our toolkit includes several core components and facilitates rapid development of pipelines for clinical text.
©2021 AMIA - All rights reserved.
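To make the abstract's notion of rule-based "context analysis" concrete, here is a minimal standalone sketch in plain Python. It is not medspaCy's API or implementation. The term lists, the `Entity` class, and the `extract_entities` function are hypothetical names invented for this illustration. It assumes a simplified forward-scope rule: a negation trigger negates every target term that follows it, up to a terminating word or the end of the sentence.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical rule sets for illustration only; not taken from medspaCy.
TARGET_TERMS = {"pneumonia": "PROBLEM", "fever": "PROBLEM"}
NEGATION_TRIGGERS = ("no evidence of", "denies", "no")
TERMINATORS = ("but", "however")


@dataclass
class Entity:
    text: str
    label: str
    start: int
    is_negated: bool = False


def _find(phrase: str, text: str):
    """Yield whole-word matches of a phrase in already-lowercased text."""
    return re.finditer(rf"\b{re.escape(phrase)}\b", text)


def _scope_end(lowered: str, start: int) -> int:
    """Return where a trigger's scope ends: next terminator or end of text."""
    tail = lowered[start:]
    ends = [m.start() for t in TERMINATORS for m in _find(t, tail)]
    return start + min(ends) if ends else len(lowered)


def extract_entities(sentence: str) -> List[Entity]:
    """Match target terms, then flag those inside a negation scope."""
    lowered = sentence.lower()
    entities = [
        Entity(sentence[m.start():m.end()], label, m.start())
        for term, label in TARGET_TERMS.items()
        for m in _find(term, lowered)
    ]
    entities.sort(key=lambda e: e.start)

    for trigger in NEGATION_TRIGGERS:
        for m in _find(trigger, lowered):
            end = _scope_end(lowered, m.end())
            for ent in entities:
                if m.end() <= ent.start < end:
                    ent.is_negated = True
    return entities


if __name__ == "__main__":
    for ent in extract_entities("Patient denies fever but has pneumonia."):
        print(ent.text, ent.label, ent.is_negated)
```

In this sketch "fever" is flagged as negated and "pneumonia" is not, because the terminator "but" closes the scope opened by "denies". A real toolkit would operate on tokenized, sentence-segmented documents and support richer rule types than this single forward-scope rule.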