Pac Symp Biocomput. 2019;24:112-123.

Automatic Human-like Mining and Constructing Reliable Genetic Association Database with Deep Reinforcement Learning

Haohan Wang et al. Pac Symp Biocomput. 2019.

Abstract

The growing volume of scientific literature in biological and biomedical research makes continuous, reliable curation of newly discovered knowledge difficult, and automatic biomedical text-mining has been one of the answers to this challenge. In this paper, we aim to further improve the reliability of biomedical text-mining by training the system to directly simulate human behaviors such as querying PubMed, selecting articles from the query results, and reading the selected articles for knowledge. We combine the efficiency of biomedical text-mining, the flexibility of deep reinforcement learning, and the massive amount of knowledge collected in UMLS into an integrative artificially intelligent reader that can automatically identify authentic articles and effectively acquire the knowledge they convey. We construct a system whose current primary task is to build a genetic association database linking genes and complex human traits. Our contributions in this paper are three-fold: 1) We propose to improve the reliability of text-mining by building a system that directly simulates the behavior of a researcher, and we develop corresponding methods, such as a bi-directional LSTM for text mining and a Deep Q-Network for organizing behaviors. 2) We demonstrate the effectiveness of our system with an example of constructing a genetic association database. 3) We release our implementation as a generic framework that researchers in the community can use to conveniently construct other databases.
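
To make the idea of "organizing behaviors" with Q-learning concrete, the following is a minimal sketch only, not the authors' implementation. It uses a tabular Q-function in place of the Deep Q-Network named in the abstract (a DQN replaces the table with a neural network). The action labels, state names, reward values, and the hyperparameters `alpha`, `gamma`, and `epsilon` are all hypothetical; the abstract does not specify them.

```python
import random
from collections import defaultdict

# Hypothetical labels for the three behaviors named in the abstract.
ACTIONS = ("query", "select", "read")


def epsilon_greedy(q, state, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


def q_update(q, state, action, reward, next_state,
             alpha=0.5, gamma=0.9, terminal=False):
    """Apply one tabular Q-learning update and return the new Q-value.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = 0.0 if terminal else max(q[(next_state, a)] for a in ACTIONS)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


if __name__ == "__main__":
    q = defaultdict(float)  # unseen (state, action) pairs start at 0.0
    # Assumed reward of 1.0 when reading an article yields a usable fact.
    q_update(q, "article_selected", "read", 1.0, "fact_extracted")
    # The earlier "select" step earns no direct reward but inherits value.
    q_update(q, "results_listed", "select", 0.0, "article_selected")
    print(epsilon_greedy(q, "article_selected", 0.0, random.Random(0)))
```

In this toy setup, value flows backward from the rewarded "read" step to the "select" step that preceded it, which is the mechanism that lets an agent learn which sequence of behaviors tends to lead to reliable knowledge.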

Figures

Fig. 1: Overview of Eir’s possible behaviors
