GENIA corpus--semantically annotated corpus for bio-textmining

J-D Kim¹, T Ohta, Y Tateisi, J Tsujii

PMID: 12855455
DOI: 10.1093/bioinformatics/btg1023

J-D Kim et al. Bioinformatics. 2003;19 Suppl 1:i180-2.

Affiliation

¹ CREST, Japan Science and Technology Corporation, Hongo, Bunkyo-ku, Tokyo, 113-0033, Japan.

Abstract

Motivation: Natural language processing (NLP) methods are regarded as being useful to raise the potential of text mining from biological literature. The lack of an extensively annotated corpus of this literature, however, causes a major bottleneck for applying NLP techniques. GENIA corpus is being developed to provide reference materials to let NLP techniques work for bio-textmining.

Results: GENIA corpus version 3.0 consisting of 2000 MEDLINE abstracts has been released with more than 400,000 words and almost 100,000 annotations for biological terms.
