BMC Bioinformatics. 2020 May 24;21(1):213.
doi: 10.1186/s12859-020-3540-8.

Bio-semantic relation extraction with attention-based external knowledge reinforcement


Zhijing Li et al. BMC Bioinformatics. 2020.

Abstract

Background: Semantic resources such as knowledge bases contain high-quality, structured knowledge and therefore require significant curation effort from domain experts. Using these resources to reinforce information retrieval from unstructured text may further exploit the potential of both the unstructured text and the curated knowledge.

Results: This paper proposes a novel method that incorporates prior knowledge into a deep neural network model to improve the automated extraction of biological semantic relations from the scientific literature. The model is based on a recurrent neural network that combines an attention mechanism with semantic resources, i.e., UniProt and BioModels. Our method is evaluated on the BioNLP and BioCreative corpora, sets of manually annotated biological texts. The experiments demonstrate that the method outperforms current state-of-the-art models and that structured semantic information can improve the results of bio-text mining.

Conclusion: The experimental results show that our approach can effectively make use of external prior knowledge and improve performance on the protein-protein interaction extraction task. Although it is validated on biomedical texts, the method should generalize to other types of data.

Keywords: Attention mechanism; Bio-text-mining; Biological semantic relation; Knowledge base.
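To make the architecture described in the Results concrete, the following is a minimal illustrative sketch, not the authors' implementation: it assumes PyTorch, invents all layer sizes and names, and simply shows a BiLSTM sentence encoder whose pooled representation attends over externally retrieved knowledge-base vectors (e.g., entity vectors derived from UniProt or BioModels) before relation classification.

# Illustrative sketch only (PyTorch assumed); layer sizes, names, and the pooling
# choice are invented here and are not taken from the paper.
import torch
import torch.nn as nn

class KBAttentionRelationClassifier(nn.Module):
    """BiLSTM sentence encoder with attention over external knowledge-base vectors."""

    def __init__(self, vocab_size, emb_dim=200, hidden_dim=128, kb_dim=200, num_relations=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.bilstm = nn.LSTM(emb_dim, hidden_dim, bidirectional=True, batch_first=True)
        self.query_proj = nn.Linear(2 * hidden_dim, hidden_dim)   # sentence -> attention space
        self.kb_proj = nn.Linear(kb_dim, hidden_dim)              # KB vectors -> attention space
        self.classifier = nn.Linear(2 * hidden_dim + kb_dim, num_relations)

    def forward(self, token_ids, kb_vectors):
        # token_ids: (batch, seq_len); kb_vectors: (batch, n_kb, kb_dim)
        states, _ = self.bilstm(self.embed(token_ids))             # (batch, seq_len, 2*hidden)
        sentence = states.mean(dim=1)                              # simple mean-pooled sentence vector
        # Additive attention: score each KB vector against the sentence representation.
        scores = torch.tanh(self.query_proj(sentence).unsqueeze(1) + self.kb_proj(kb_vectors))
        weights = torch.softmax(scores.sum(dim=-1), dim=1)         # (batch, n_kb)
        kb_context = (weights.unsqueeze(-1) * kb_vectors).sum(dim=1)  # weighted KB summary
        return self.classifier(torch.cat([sentence, kb_context], dim=-1))

Concatenating the pooled sentence vector with the attention-weighted knowledge summary is one straightforward way to let curated knowledge reinforce the textual signal; the paper's actual entity representation and fusion details are those shown in Figs. 3, 5, and 6.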


Conflict of interest statement

The authors have no conflict of interest related to the research presented herein.

Figures

Fig. 1
Relation extraction example from the BioNLP-2016 competition. In this figure, we give an example of the relation extraction task in the BioNLP-2016 competition
Fig. 2
Flow chart of the proposed system. The processes of our system include preprocessing, word embedding, prior knowledge from UniProt KB, entity representation, BiLSTM, bio-information retrieval (BioModels), and entity and relation extraction. For the prior knowledge from UniProt KB, we use the Bioservices, urllib, and BeautifulSoup tools to carry out a series of processes (an illustrative sketch of this lookup step follows the figure list). For the bio-information retrieval (BioModels) part, we apply the attention mechanism to import the prior knowledge into the system. We use the method for both entity extraction and relation extraction, with the main focus on relation extraction
Fig. 3
Entity representation. The entity representation includes information from both the KB and the scientific literature
Fig. 4
An example of the process of extracting the entities related to the entity “ATERF1” from BioModels. We give an example of a specific process of searching for the entities related to the given entity
Fig. 5
The architecture of the bio-information retrieval from BioModels, showing how the attention mechanism introduces the information from BioModels into the BiLSTM architecture
Fig. 6
BiLSTM model for entity and relation extraction. The flow chart of the BiLSTM network used to predict the types of entities and relations
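Figures 2 and 4 describe looking up prior knowledge for a protein name in UniProt KB (via the Bioservices, urllib, and BeautifulSoup tools) and collecting related entities from BioModels. The snippet below is only an illustrative sketch of the UniProt lookup step: the paper does not publish this code, and the query form, the "tsv" output format, and the result handling are assumptions about the bioservices UniProt wrapper rather than the authors' pipeline.

# Illustrative sketch only; not the authors' code. The fields returned by UniProt
# depend on the REST API version behind the bioservices wrapper.
from bioservices import UniProt

uniprot = UniProt()

def fetch_uniprot_records(protein_name, limit=5):
    """Query UniProtKB with a free-text protein name and return the raw tabular result."""
    return uniprot.search(protein_name, frmt="tsv", limit=limit)

if __name__ == "__main__":
    # "ATERF1" is the example entity from Fig. 4; a similar lookup against
    # BioModels would follow the same query-then-parse pattern.
    print(fetch_uniprot_records("ATERF1"))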
