2022 Feb;5:189-200.
doi: 10.5220/0010903300003123.

The h-ANN Model: Comprehensive Colonoscopy Concept Compilation Using Combined Contextual Embeddings


Shorabuddin Syed et al. Biomed Eng Syst Technol Int Jt Conf BIOSTEC Revis Sel Pap. 2022 Feb.

Abstract

Colonoscopy is a screening and diagnostic procedure for the detection of colorectal carcinomas, with specific quality metrics that monitor and improve adenoma detection rates. These quality metrics are stored in disparate documents, i.e., colonoscopy, pathology, and radiology reports. The lack of integrated standardized documentation is impeding colorectal cancer research. Clinical concept extraction using Natural Language Processing (NLP) and Machine Learning (ML) techniques is an alternative to manual data abstraction. Contextual word embedding models such as BERT (Bidirectional Encoder Representations from Transformers) and FLAIR have enhanced the performance of NLP tasks. Combining multiple clinically trained embeddings can improve word representations and boost the performance of clinical NLP systems. The objective of this study is to extract comprehensive clinical concepts from the consolidated colonoscopy documents using concatenated clinical embeddings. We built high-quality annotated corpora for three report types. BERT and FLAIR embeddings were trained on unlabeled colonoscopy-related documents. We built a hybrid Artificial Neural Network (h-ANN) to concatenate and fine-tune BERT and FLAIR embeddings. To extract concepts of interest from the three report types, three models were initialized from the h-ANN and fine-tuned using the annotated corpora. The models achieved best F1-scores of 91.76%, 92.25%, and 88.55% for colonoscopy, pathology, and radiology reports respectively.
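The core idea of the h-ANN — concatenating two contextual embedding streams and feeding the result to a Bi-LSTM sequence tagger — can be sketched in PyTorch. This is a minimal illustration, not the authors' implementation: all dimensions, the tag count, and the class name are hypothetical, and the CRF decoding layer described in the paper is replaced here by a plain linear emission layer for brevity.

```python
import torch
import torch.nn as nn

class HybridTagger(nn.Module):
    """Sketch of the h-ANN idea: concatenated contextual embeddings
    (e.g. BERT + FLAIR vectors, precomputed per token) feed a Bi-LSTM
    whose per-token emissions would normally be decoded by a CRF layer
    (omitted here; a linear layer stands in)."""

    def __init__(self, bert_dim=768, flair_dim=2048, hidden=256, num_tags=9):
        super().__init__()
        self.bilstm = nn.LSTM(bert_dim + flair_dim, hidden,
                              batch_first=True, bidirectional=True)
        self.emissions = nn.Linear(2 * hidden, num_tags)

    def forward(self, bert_emb, flair_emb):
        # Concatenate the two embedding streams along the feature axis.
        x = torch.cat([bert_emb, flair_emb], dim=-1)
        out, _ = self.bilstm(x)          # (batch, tokens, 2 * hidden)
        return self.emissions(out)       # per-token tag scores

# Toy usage with random tensors standing in for real token embeddings.
model = HybridTagger()
bert_emb = torch.randn(1, 12, 768)    # (batch, tokens, bert_dim)
flair_emb = torch.randn(1, 12, 2048)  # (batch, tokens, flair_dim)
scores = model(bert_emb, flair_emb)
print(scores.shape)
```

In practice the two streams would come from fine-tuned BERT and FLAIR language models, and the emission scores would be decoded jointly by a CRF to enforce valid tag transitions.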

Keywords: Clinical Concept Extraction; Colonoscopy; Deep Learning; Natural Language Processing; Word Embeddings.

Figures

Figure 1: Colonoscopy taxonomy depicting clinical entities and their classifications. Colonoscopy reports were annotated for entities mentioned in the taxonomy.
Figure 2: Pathology taxonomy depicting clinical entities and their classifications. Pathology reports were annotated for entities mentioned in the taxonomy.
Figure 3: Radiology imaging taxonomy depicting clinical entities and their classifications. Radiology reports were annotated for entities mentioned in the taxonomy.
Figure 4: Workflow depicting training of language models, concatenating embeddings, and instantiating and fine-tuning h-ANN models to extract clinical concepts from colonoscopy-related documents. GI: Gastroenterology, h-ANN: Hybrid artificial neural network.
Figure 5: The h-ANN architecture depicting the embedding, Bi-LSTM, and CRF layers. Concatenated BERT and FLAIR embeddings are given as input features to the Bi-LSTM layer. BERT: Bidirectional Encoder Representations from Transformers.
Figure 6: Training curves for the h-ANNpath, h-ANNcol, and h-ANNrad models. F1 score on the test set was measured after completion of each epoch.
