Comparative Study
BMC Bioinformatics. 2006 Nov 24;7 Suppl 3(Suppl 3):S2.
doi: 10.1186/1471-2105-7-S3-S2.

Lexical adaptation of link grammar to the biomedical sublanguage: a comparative evaluation of three approaches

Sampo Pyysalo et al. BMC Bioinformatics.

Abstract

Background: We study the adaptation of Link Grammar Parser to the biomedical sublanguage with a focus on domain terms not found in a general parser lexicon. Using two biomedical corpora, we implement and evaluate three approaches to addressing unknown words: automatic lexicon expansion, the use of morphological clues, and disambiguation using a part-of-speech tagger. We evaluate each approach separately for its effect on parsing performance and consider combinations of these approaches.
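As a rough illustration of the "morphological clues" idea, the sketch below guesses a word class for a term missing from the lexicon by looking at its suffix. The suffix rules, the class labels, the tiny lexicon, and the function name `guess_word_class` are all hypothetical and chosen for illustration; they are not the rules or categories used in the paper or in Link Grammar Parser.

```python
import re

# Hypothetical general-purpose lexicon (word -> word class); illustrative only.
LEXICON = {"the": "determiner", "binds": "verb", "to": "preposition"}

# Hypothetical suffix rules, tried in order. Each maps a surface pattern
# to a guessed word class. These are assumptions, not the paper's rules.
SUFFIX_RULES = [
    (re.compile(r"(ase|in|ine)s?$"), "noun"),
    (re.compile(r"(ated|ized)$"), "verb-past"),
    (re.compile(r"(ic|al|ous)$"), "adjective"),
    (re.compile(r"ly$"), "adverb"),
]


def guess_word_class(word, lexicon=LEXICON):
    """Return the lexicon class if known, else a suffix-based guess."""
    key = word.lower()
    if key in lexicon:
        return lexicon[key]
    for pattern, word_class in SUFFIX_RULES:
        if pattern.search(key):
            return word_class
    # No clue applies: leave the word fully ambiguous for the parser.
    return "unknown"


if __name__ == "__main__":
    for token in ["the", "kinase", "phosphorylated", "cytoplasmic", "xyz"]:
        print(token, "->", guess_word_class(token))
```

Restricting an unknown word to a single likely class, rather than leaving it fully ambiguous, is one plausible way such clues could reduce the search space a parser has to explore.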

Results: In addition to a 45% increase in parsing efficiency, we find that the best approach, incorporating information from a domain part-of-speech tagger, offers a statistically significant 10% relative decrease in error.
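To make the term "relative decrease in error" concrete, here is a minimal sketch of the arithmetic. The function name and the error rates in the usage lines are hypothetical and chosen only so that the result comes out near 10%; the abstract does not report the absolute error figures.

```python
def relative_error_reduction(baseline_error, adapted_error):
    """Fraction of the baseline error removed by the adapted system."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - adapted_error) / baseline_error


if __name__ == "__main__":
    # Hypothetical error rates, NOT taken from the paper.
    baseline = 0.20  # assumed error rate of the unadapted parser
    adapted = 0.18   # assumed error rate after adaptation
    reduction = relative_error_reduction(baseline, adapted)
    print(f"relative error reduction: {reduction:.1%}")
```

Under these assumed numbers, an absolute drop of two percentage points corresponds to a relative decrease of about one tenth, which is why relative and absolute figures should not be read interchangeably.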

Conclusion: When available, a high-quality domain part-of-speech tagger is the best solution to unknown word issues in the domain adaptation of a general parser. In the absence of such a resource, surface clues can provide remarkably good coverage and performance when tuned to the domain. The adapted parser is available under an open-source license.

Figures

Figure 1. Example Link Grammar Parser linkages.
Figure 2. Vocabulary handling in the interaction and transcript corpora.
