Text mining and protein annotations: the construction and use of protein description sentences

Martin Krallinger¹, Rainer Malik, Alfonso Valencia

PMID: 17503385

Martin Krallinger et al. Genome Inform. 2006;17(2):121-30.

Affiliation

¹ Dep. Struct. Comp. Biology Spanish National Cancer Centre (CNIO), Melchor Fernández Almagro, 3, E-28029 Madrid, Spain. mkrallinger@cnio.es

Abstract

Existing biological knowledge stored as structured database records has been extracted manually by database curators analyzing the scientific literature. Most of this information was derived from sentences which describe biologically relevant aspects of genes and gene products. We introduce the Protein description sentence (Prodisen) corpus, a useful resource for the automatic identification and construction of text-based protein and gene description records using information extraction and text classification techniques. Basic guidelines and criteria relevant for the construction of a text corpus of functional descriptions of genes and proteins are proposed. The steps used for the corpus construction and its features are presented. Moreover, some of the potential applications of the Prodisen corpus for biomedical text mining purposes are explored and the obtained results are presented.
