Neuroinformatics. 2008 Sep;6(3):195-204.
doi: 10.1007/s12021-008-9031-0. Epub 2008 Oct 24.

Textpresso for neuroscience: searching the full text of thousands of neuroscience research papers

Hans-Michael Müller et al. Neuroinformatics. 2008 Sep.

Abstract

Textpresso is a text-mining system for scientific literature. Its two major features are access to the full text of research papers and the development and use of categories of biological concepts, as well as categories that describe or relate objects. A search engine lets the user search for one or a combination of these categories and/or keywords across an entire body of literature. Here we describe Textpresso for Neuroscience, part of the core Neuroscience Information Framework (NIF). The Textpresso site currently contains 67,500 full-text papers and 131,300 abstracts. We show that using categories in literature searches can make a pure keyword query more refined and meaningful. We also show how semantic queries can be formulated with categories only. We explain the build and content of the database and describe the main features of the web pages and the advanced search options. We also give detailed illustrations of the web service developed to provide programmatic access to Textpresso. This web service is used by the NIF interface to access Textpresso. The standalone website of Textpresso for Neuroscience can be accessed at http://www.textpresso.org/neuroscience/.
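The idea of combining categories and keywords in a sentence-level search can be illustrated with a small sketch. This is not Textpresso's implementation or its web service API. The category lexicon, the function names (`split_sentences`, `sentence_matches`, `search`) and the sample text are all hypothetical, chosen only to show how a category (a set of terms) can stand in for a single keyword, and how an excluded term can be filtered out.

```python
import re

# Hypothetical category lexicon (illustrative only, not Textpresso's):
# each category name maps to a set of lowercase single-word terms.
CATEGORIES = {
    "TRP channel": {"trpc1", "trpv1", "trpm8"},
    "brain area": {"hippocampus", "cerebellum", "amygdala"},
}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def sentence_matches(sentence, categories=(), keywords=(), exclude=()):
    """Return True if the sentence satisfies every part of the query.

    categories: each named category must contribute at least one term.
    keywords:   each keyword (possibly a phrase) must occur as a substring.
    exclude:    none of these words may occur as a whole token.
    """
    lowered = sentence.lower()
    tokens = set(re.findall(r"[a-z0-9]+", lowered))
    if any(word.lower() in tokens for word in exclude):
        return False
    if not all(keyword.lower() in lowered for keyword in keywords):
        return False
    # A non-empty intersection means the sentence mentions the category.
    return all(tokens & CATEGORIES[name] for name in categories)


def search(text, **query):
    """Return the sentences of `text` that match the query."""
    return [s for s in split_sentences(text) if sentence_matches(s, **query)]


# Made-up sample text, not taken from any paper.
SAMPLE = (
    "TRPC1 is expressed in the hippocampus. "
    "TRPV1 was detected in cultured cells. "
    "The cerebellum was imaged."
)

category_only = search(SAMPLE, categories=("TRP channel", "brain area"))
by_category = search(SAMPLE, categories=("TRP channel",))
by_keyword = search(SAMPLE, keywords=("TRPC1",))
```

In this toy setup the category query `by_category` matches two sentences while the keyword query `by_keyword` matches one, which mirrors the abstract's point that a category covers more terms than a single keyword. A real system would need multi-word terms, a proper tokenizer, and an index rather than a linear scan.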

Figures

Fig. 1: Example of complex query without any keywords: What TRP channels are associated with particular neural cell types in specific brain areas? No keywords are used but three categories. 18 sentences are identified in 8 papers from 67,500 papers. Note that this query returns more hits than when replacing the category ‘TRP channel’ with the keyword ‘TRPC1’ as the category comprises more terms than just one keyword.

Fig. 2: Example of a more complex query: Are any drugs of abuse other than nicotine associated with nicotinic receptors? Keywords nicotinic receptor but excluding nicotine, and category Prescription Drugs of Abuse returns 150 matches in 79 documents from 67,500 papers.

Fig. 3: Example search results. Bibliography and matching sentences are displayed.

Fig. 4: Advanced search options are available to refine queries.

Fig. 5: The web service ‘search’ is available for automated queries.

Fig. 6: After an automated search has been performed, bibliographies and matching sentences can be retrieved with the web service ‘retrieve’.
