Bioinformatics. 2003:19 Suppl 1:i180-2.
doi: 10.1093/bioinformatics/btg1023.

GENIA corpus--semantically annotated corpus for bio-textmining

J-D Kim et al. Bioinformatics. 2003.

Abstract

Motivation: Natural language processing (NLP) methods are regarded as being useful to raise the potential of text mining from biological literature. The lack of an extensively annotated corpus of this literature, however, causes a major bottleneck for applying NLP techniques. GENIA corpus is being developed to provide reference materials to let NLP techniques work for bio-textmining.

Results: GENIA corpus version 3.0 consisting of 2000 MEDLINE abstracts has been released with more than 400,000 words and almost 100,000 annotations for biological terms.
