BMC Bioinformatics. 2012 Jun 26;13 Suppl 11(Suppl 11):S1.
doi: 10.1186/1471-2105-13-S11-S1.

The Genia Event and Protein Coreference tasks of the BioNLP Shared Task 2011

Jin-Dong Kim et al. BMC Bioinformatics. 2012.

Abstract

Background: The Genia task, introduced in 2009, was the first community-wide effort to address fine-grained, structural information extraction from biomedical literature. Arranged for the second time as one of the main tasks of the BioNLP Shared Task 2011, it aimed to measure the community's progress since 2009 and to evaluate how well the technology generalizes to full-text papers. The Protein Coreference task was arranged as one of the supporting tasks, motivated by a lesson of the 2009 task: the abundance of coreference structures in natural language text hinders further improvement on the Genia task.

Results: The Genia task received final submissions from 15 teams. The results show that the community has made significant progress: the best F-score reached 74% for extracting bio-molecular events of simple structure, e.g., gene expression, and 45% to 48% for those of complex structure, e.g., regulation. The Protein Coreference task received 6 final submissions. The results show that coreference resolution performance in the biomedical domain lags behind that in the newswire domain (50% vs. 66% in MUC score). For protein coreference resolution in particular, the best system achieved an F-score of 34%.
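
For readers unfamiliar with the metric, the F-scores reported above can be read as the harmonic mean of precision and recall. The following is a minimal sketch of that standard definition, not the shared task's official evaluation tool; the function name `f_score` and the counts in the usage line are hypothetical and are not taken from the task results.

```python
def f_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Balanced F-score (F1): harmonic mean of precision and recall."""
    predicted = true_positives + false_positives  # items the system output
    gold = true_positives + false_negatives       # items in the gold annotation
    if true_positives == 0 or predicted == 0 or gold == 0:
        return 0.0  # no correct predictions, so the score is zero by convention
    precision = true_positives / predicted
    recall = true_positives / gold
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only: precision 0.75, recall 0.60.
print(f"{f_score(true_positives=60, false_positives=20, false_negatives=40):.3f}")  # prints 0.667
```

Because the harmonic mean is pulled toward the lower of the two values, a system cannot reach a high F-score by trading away recall for precision or the reverse.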

Conclusions: Detailed analysis of the results improves our insight into the problem and suggests directions for further improvement.

Figures

Figure 1. Event annotation example.
Figure 2. Protein coreference annotation.
Figure 3. Event distribution in different sections. The interval of the contour lines is 5%. For example, in the Methods and Caption sections, 40% of the events are Gene_expression events.
