A neural joint model for entity and relation extraction from biomedical text

Fei Li¹, Meishan Zhang², Guohong Fu², Donghong Ji³

Affiliations

¹ School of Computer, Wuhan University, Bayi Road, Wuhan, China.
² School of Computer Science and Technology, Heilongjiang University, Xuefu Road, Harbin, China.
³ School of Computer, Wuhan University, Bayi Road, Wuhan, China. dhji@whu.edu.cn.

PMID: 28359255
PMCID: PMC5374588
DOI: 10.1186/s12859-017-1609-9

BMC Bioinformatics. 2017 Mar 31;18(1):198. doi: 10.1186/s12859-017-1609-9.

Abstract

Background: Extracting biomedical entities and their relations from text has important applications in biomedical research. Previous work primarily used feature-based pipeline models for this task. Feature-based models require substantial effort on feature engineering. Moreover, pipeline models may suffer from error propagation and cannot exploit the interactions between subtasks. We therefore propose a neural joint model that extracts biomedical entities and their relations simultaneously, which alleviates both problems.

Results: Our model was evaluated on two tasks: extracting adverse drug events between drug and disease entities, and extracting resident relations between bacteria and location entities. Compared with the state-of-the-art systems for these tasks, our model improved the F1 score on the first task by 5.1% in entity recognition and 8.0% in relation extraction, and on the second task by 9.2% in relation extraction.
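
The F1 scores reported above combine precision and recall into a single number. The following is a minimal sketch of that computation, assuming exact-match scoring over sets of extracted items. The function name `f1_score` and the example triples are hypothetical and are not taken from the paper's evaluation scripts.

```python
def f1_score(predicted, gold):
    """Return (precision, recall, F1) for two collections of extracted items."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical relation triples: (relation type, argument 1, argument 2).
gold = {("ADE", "gliclazide", "hepatitis"), ("ADE", "drugA", "rash")}
predicted = {("ADE", "gliclazide", "hepatitis"), ("ADE", "drugB", "nausea")}
precision, recall, f1 = f1_score(predicted, gold)
```

Under exact matching, a predicted relation counts as correct only if its type and both arguments agree with a gold relation, so errors in entity recognition lower the relation score as well.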

Conclusions: The proposed model achieves competitive performance with less work on feature engineering. We demonstrate that a model based on neural networks is effective for biomedical entity and relation extraction. In addition, parameter sharing is an alternative way for neural models to process this task jointly. Our work can facilitate research on biomedical text mining.

Keywords: Biomedical text; Entity recognition; Joint model; Neural network; Relation extraction.

Figures

**Fig. 1**
The CNN for extracting character-level representations. A rectangular grid indicates a vector and a square indicates one dimension of this vector, so character embeddings or representations can be denoted as n-dimensional vectors. Shaded rectangular grids indicate special padding vectors
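
The idea in this caption can be sketched in plain Python. This is only an illustration under stated assumptions: a window of three characters, a tanh activation, max pooling over positions, zero vectors as padding, and random weights are all choices made here, not details confirmed by the caption. The names `char_cnn`, `embeddings` and `filters` are hypothetical.

```python
import math
import random


def char_cnn(char_vectors, filters, window=3):
    """Convolve over character embeddings and max-pool each filter over positions.

    char_vectors: list of n-dimensional character embeddings for one word.
    filters: list of weight vectors, each of length window * n.
    """
    dim = len(char_vectors[0])
    pad = [[0.0] * dim] * (window // 2)  # padding vectors at both ends (assumed zero)
    padded = pad + char_vectors + pad
    pooled = []
    for weights in filters:
        activations = []
        for start in range(len(padded) - window + 1):
            # Concatenate the embeddings in the current window into one vector.
            window_vec = [x for vec in padded[start:start + window] for x in vec]
            score = sum(w * x for w, x in zip(weights, window_vec))
            activations.append(math.tanh(score))
        pooled.append(max(activations))  # keep the strongest response per filter
    return pooled


random.seed(0)
dim, n_filters = 4, 5
word = "hepatitis"
# Hypothetical, randomly initialised character embeddings and filters.
embeddings = {c: [random.uniform(-1, 1) for _ in range(dim)] for c in sorted(set(word))}
filters = [[random.uniform(-1, 1) for _ in range(3 * dim)] for _ in range(n_filters)]
representation = char_cnn([embeddings[c] for c in word], filters)
```

The output has one value per filter regardless of word length, which is what allows words of different lengths to share a fixed-size character-level representation.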

**Fig. 2**
The Bi-LSTM-RNN for biomedical entity recognition. *Rectangular grids* indicate vectors of feature embeddings or representations. At the *bottom*, three kinds of vectors are concatenated and fed into LSTMs. *Dashed arrow lines* denote bottom-up computations along the network framework and *solid arrow lines* denote left-to-right computations along the sentence

**Fig. 3**
The Bi-LSTM-RNN for relation classification. The input sentence is tokenized before it is analyzed by a dependency parser. Tokens are indexed by Arabic numerals. The basic (a.k.a. projective) dependency style is used to build the tree. The bold lines in the tree denote the shortest dependency path (SDP) between “gliclazide” and “hepatitis”, whose lowest common ancestor is “induced”. x_i indicates the input vector of an LSTM unit as shown in Eq. 6, where i is the index of a token. In the Bi-LSTM-RNN layer, solid arrow lines denote bottom-up and top-down computations along the SDP in the dependency tree. ↑h_a, ↑h_b, ↓h_a, ↓h_b are listed in Eq. 8
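
The shortest dependency path described in this caption can be found by walking from each entity up to the root and stopping at the first shared ancestor. The sketch below assumes the tree is given as a map from token index to head index, with 0 marking the root. The five-token sentence and its parse are made up for illustration and are not the sentence shown in the figure; the function names are hypothetical.

```python
def path_to_root(token, heads):
    """Return token indices from `token` up to the root, inclusive."""
    path = [token]
    while heads[path[-1]] != 0:  # head index 0 marks the root
        path.append(heads[path[-1]])
    return path


def shortest_dependency_path(a, b, heads):
    """Return (upward path from a to the LCA, upward path from b to the LCA, LCA)."""
    path_a, path_b = path_to_root(a, heads), path_to_root(b, heads)
    ancestors_of_b = set(path_b)
    # The first node on a's upward path that also lies on b's upward path is the LCA.
    lca = next(node for node in path_a if node in ancestors_of_b)
    return path_a[:path_a.index(lca) + 1], path_b[:path_b.index(lca) + 1], lca


# Hypothetical parse of "hepatitis was induced by gliclazide" (1-based indices).
tokens = {1: "hepatitis", 2: "was", 3: "induced", 4: "by", 5: "gliclazide"}
heads = {1: 3, 2: 3, 3: 0, 4: 5, 5: 3}
up_a, up_b, lca = shortest_dependency_path(5, 1, heads)
```

The two sub-paths meet at the lowest common ancestor. Reading each one from the entity to the ancestor gives a bottom-up sequence, and reversing it gives a top-down sequence, which matches the two directions of computation the caption mentions.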

Cited by

OGER++: hybrid multi-type entity recognition.
Furrer L, Jancso A, Colic N, Rinaldi F. J Cheminform. 2019 Jan 21;11(1):7. doi: 10.1186/s13321-018-0326-3. PMID: 30666476. Free PMC article.
Opportunities and obstacles for deep learning in biology and medicine.
Ching T, Himmelstein DS, Beaulieu-Jones BK, Kalinin AA, Do BT, Way GP, Ferrero E, Agapow PM, Zietz M, Hoffman MM, Xie W, Rosen GL, Lengerich BJ, Israeli J, Lanchantin J, Woloszynek S, Carpenter AE, Shrikumar A, Xu J, Cofer EM, Lavender CA, Turaga SC, Alexandari AM, Lu Z, Harris DJ, DeCaprio D, Qi Y, Kundaje A, Peng Y, Wiley LK, Segler MHS, Boca SM, Swamidass SJ, Huang A, Gitter A, Greene CS. J R Soc Interface. 2018 Apr;15(141):20170387. doi: 10.1098/rsif.2017.0387. PMID: 29618526. Free PMC article. Review.
Bio-semantic relation extraction with attention-based external knowledge reinforcement.
Li Z, Lian Y, Ma X, Zhang X, Li C. BMC Bioinformatics. 2020 May 24;21(1):213. doi: 10.1186/s12859-020-3540-8. PMID: 32448122. Free PMC article.
A Joint Extraction System Based on Conditional Layer Normalization for Health Monitoring.
Shi B, Fan R, Zhang L, Huang J, Xiong N, Vasilakos A, Wan J, Zhang L. Sensors (Basel). 2023 May 16;23(10):4812. doi: 10.3390/s23104812. PMID: 37430725. Free PMC article.
Spatial Relation Extraction from Radiology Reports using Syntax-Aware Word Representations.
Datta S, Roberts K. AMIA Jt Summits Transl Sci Proc. 2020 May 30;2020:116-125. eCollection 2020. PMID: 32477630. Free PMC article.
