Review
Database (Oxford). 2016 Nov 25:2016:baw145.
doi: 10.1093/database/baw145. Print 2016.

Text mining resources for the life sciences

Piotr Przybyła et al. Database (Oxford).

Abstract

Text mining is a powerful technology for quickly distilling key information from vast quantities of biomedical literature. However, to harness this power the researcher must be well versed in the availability, suitability, adaptability, interoperability and comparative accuracy of current text mining resources. In this survey, we give an overview of the text mining resources that exist in the life sciences to help researchers, especially those employed in biocuration, to engage with text mining in their own work. We categorize the various resources under three sections: Content Discovery looks at where and how to find biomedical publications for text mining; Knowledge Encoding describes the formats used to represent the different levels of information associated with content that enable text mining, including those formats used to carry such information between processes; Tools and Services gives an overview of workflow management systems that can be used to rapidly configure and compare domain- and task-specific processes, via access to a wide range of pre-built tools. We also provide links to relevant repositories in each section to enable the reader to find resources relevant to their own area of interest. Throughout this work we give a special focus to resources that are interoperable: those that have the crucial ability to share information, enabling smooth integration and reusability.
