PLoS One. 2013;8(2):e55674.
doi: 10.1371/journal.pone.0055674. Epub 2013 Feb 18.

Utilizing descriptive statements from the biodiversity heritage library to expand the Hymenoptera Anatomy Ontology

Katja C Seltmann et al. PLoS One. 2013.

Abstract

Hymenoptera, the insect order that includes sawflies, bees, wasps, and ants, exhibits an incredible diversity of phenotypes, with over 145,000 species described in a corpus of textual knowledge since Carolus Linnaeus. Without specialized training, which often spans decades, these articles can be challenging to decipher. Much of the vocabulary is domain-specific (e.g., Hymenoptera biology), historically lacks a comprehensive glossary, and contains much homonymous and synonymous terminology. The Hymenoptera Anatomy Ontology (HAO) was developed to surmount this challenge and to aid future communication related to hymenopteran anatomy, as well as to support domain experts so that they may actively benefit from the anatomy ontology development. As part of HAO development, an active learning, dictionary-based, natural language recognition tool was implemented to facilitate Hymenoptera anatomy term discovery in literature. We present this tool, referred to as the 'Proofer', as part of an iterative approach to growing phenotype-relevant ontologies, regardless of domain. The process of ontology development results in a critical mass of terms that is applied as a filter to the source collection of articles in order to reveal term occurrence and biases in natural language species descriptions. Our results indicate that taxonomists use domain-specific terminology that follows taxonomic specialization, particularly at superfamily- and family-level groupings, and that the developed Proofer tool is effective for term discovery, facilitating ontology construction.
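
To make the dictionary-based idea concrete, the following is a minimal sketch of matching known terms against OCR text and surfacing unmatched words as candidates for a curator to review. It is not the Proofer's actual implementation: the function names, the stopword list, and the sample sentence are all assumptions made for illustration.

```python
import re
from collections import Counter


def match_terms(text, dictionary):
    """Count whole-word, case-insensitive occurrences of each dictionary term."""
    counts = Counter()
    lowered = text.lower()
    for term in dictionary:
        # \b anchors keep "wing" from matching inside "wingless".
        pattern = r"\b" + re.escape(term.lower()) + r"\b"
        hits = len(re.findall(pattern, lowered))
        if hits:
            counts[term] = hits
    return counts


def candidate_terms(text, dictionary, stopwords):
    """Return words covered by neither the dictionary nor the stopwords.

    These are only suggestions; a domain expert decides which ones
    are real anatomical terms (hypothetical workflow).
    """
    known = {word for term in dictionary for word in term.lower().split()}
    words = re.findall(r"[a-z]+", text.lower())
    return sorted({w for w in words if w not in known and w not in stopwords})


# Hypothetical OCR snippet and starting dictionary.
sample = "Mesoscutum smooth; fore wing with pterostigma. Fore wing hyaline."
terms = ["mesoscutum", "fore wing"]

found = match_terms(sample, terms)
proposed = candidate_terms(sample, terms, stopwords={"with"})
```

In an iterative workflow of the kind the abstract describes, candidates that an expert accepts would be added to the dictionary before the next pass, so the dictionary grows with each round of review.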

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. Screenshot of the mx interface for string matching terms in the database with OCR text. Possible additional new terms are proposed for the user to include.
Figure 2. Most commonly used anatomical terms in Hymenoptera. Terms in this figure are ranked by occurrence among all articles (the number of articles in which a term occurred). The number on the chart and the size of the pie represent the total number of times the term occurred in all articles.
Figure 3. The number of characters (terms) present in at least 2, 10, 50, and 100 articles.
Figure 4. Variation in the number of returned clusters based on clustering method and term occurrence in articles.
Figure 5. Sorensen Average tree with superfamily name, and number of groupings calculated to superfamily level. The tree represented is the entire, untrimmed tree, and the number after the superfamily is the number of groupings retrieved when the tree is trimmed.
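
As a rough illustration of the measure named in Figure 5, the Sørensen dissimilarity between two sets of terms is 1 - 2|A ∩ B| / (|A| + |B|). The sketch below assumes each article is reduced to a presence/absence set of anatomical terms; that representation, the function name, and the example terms are assumptions, not the authors' documented procedure.

```python
def sorensen_distance(terms_a, terms_b):
    """Sørensen dissimilarity between two term sets, from 0.0 (identical) to 1.0 (disjoint)."""
    a, b = set(terms_a), set(terms_b)
    if not a and not b:
        # Convention chosen here: two empty sets are treated as identical.
        return 0.0
    shared = len(a & b)
    return 1.0 - (2.0 * shared) / (len(a) + len(b))


# Hypothetical term sets extracted from two articles.
article_1 = {"mesoscutum", "propodeum", "clypeus"}
article_2 = {"propodeum", "clypeus", "petiole"}

distance = sorensen_distance(article_1, article_2)
```

A matrix of such pairwise distances is the usual input to an average-linkage clustering, which is one plausible reading of "Sorensen Average" in the caption.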
