BMC Med Inform Decis Mak. 2020 Dec 15;20(Suppl 10):305.
doi: 10.1186/s12911-020-01319-3.

Missing lateral relationships in top-level concepts of an ontology

Ling Zheng et al. BMC Med Inform Decis Mak.

Abstract

Background: Ontologies house various kinds of domain knowledge in formal structures, primarily in the form of concepts and the associative relationships between them. Ontologies have become integral components of many health information processing environments. Hence, quality assurance of the conceptual content of any ontology is critical. Relationships are foundational to the definition of concepts. Missing relationship errors (i.e., unintended omissions of important definitional relationships) can have a deleterious effect on the quality of an ontology. An abstraction network is a structure that overlays an ontology and provides an alternate, summarization view of its contents. One kind of abstraction network is called an area taxonomy, and a variation of it is called a subtaxonomy. A methodology based on these taxonomies for more readily finding missing relationship errors is explored.

Methods: The area taxonomy and the subtaxonomy are deployed to help reveal concepts that have a high likelihood of exhibiting missing relationship errors. A specific top-level grouping unit found within the area taxonomy and subtaxonomy, when deemed to be anomalous, is used as an indicator that missing relationship errors are likely to be found among certain concepts. Two hypotheses pertaining to the effectiveness of our Quality Assurance approach are studied.

Results: Our Quality Assurance methodology was applied to the Biological Process hierarchy of the National Cancer Institute thesaurus (NCIt) and SNOMED CT's Eye/vision finding subhierarchy within its Clinical finding hierarchy. Many missing relationship errors were discovered and confirmed in our analysis. For both test-bed hierarchies, our Quality Assurance methodology yielded a statistically significantly higher number of concepts with missing relationship errors in comparison to a control sample of concepts. Two hypotheses are confirmed by these findings.
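The abstract does not name the statistical test used to compare the flagged concepts against the control sample. Purely as an illustration of how such a comparison could be checked, the sketch below swaps in a one-sided Fisher's exact test on a 2x2 table (rows: flagged sample vs. control sample; columns: concepts with vs. without a missing relationship error). The function name `fisher_exact_greater` and all counts in the usage lines are hypothetical and are not taken from the paper.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    a: flagged concepts with an error      b: flagged concepts without
    c: control concepts with an error      d: control concepts without
    Returns P(X >= a) under the hypergeometric null with fixed margins.
    """
    flagged_total = a + b          # size of the flagged sample
    error_total = a + c            # erroneous concepts across both samples
    n = a + b + c + d              # all reviewed concepts
    upper = min(flagged_total, error_total)
    # Count the tables at least as extreme as the observed one.
    favourable = sum(
        comb(error_total, k) * comb(n - error_total, flagged_total - k)
        for k in range(a, upper + 1)
    )
    return favourable / comb(n, flagged_total)


# Hypothetical counts, made up for illustration only.
p_value = fisher_exact_greater(30, 20, 10, 40)
print(f"one-sided p-value: {p_value:.3g}")
```

A small p-value here would indicate that the flagged sample contains more erroneous concepts than chance alone would explain, which is the shape of the claim made in the Results.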

Conclusions: Quality assurance is a critical part of an ontology's lifecycle, and automated or semi-automated tools for supporting this process are invaluable. We introduced a Quality Assurance methodology targeted at missing relationship errors. Its successful application to the NCIt's Biological Process hierarchy and SNOMED CT's Eye/vision finding subhierarchy indicates that it can be a useful addition to the arsenal of tools available to ontology maintenance personnel.

Keywords: Abstraction network; Error concentration; Missing relationship error; National Cancer Institute thesaurus (NCIt); Omission error; Ontology modeling; Ontology quality assurance; SNOMED CT; Taxonomy.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1. Concept Cellular Process from NCIt shown in Protégé, including the subclass (IS-A) relationship to Biological Process, and the relationship (role) Biological Process Has Associated Location to Cell.

Fig. 2. (a) Excerpt of 13 concepts from the NCIt’s Biological Process hierarchy. Upward arrows represent IS-A relationships. Concepts with the same set of relationships are enclosed in a common, colored area. For example, Cancer Cell Growth Regulation and Morphogenesis have one relationship, Part of Process. Areas with the same number of relationships have the same color. For example, the area {Location} and the area {Part of Process} are green. Area roots, e.g., Cellular Process, have bold outlines. (b) Area taxonomy for (a), composed of five areas. Areas are represented by colored boxes labeled with their sets of relationships and numbers of concepts. They are organized in color-coded levels, according to number of relationships. The three concepts having the Location relationship are now represented by an area box named {Location}. Child-of links between areas are bold arrows; e.g., {Location, Part of Process} on Level 2 and {Location, Initiator BP, Part of Process} on Level 3 are child-of area {Location}.

Fig. 3. Complete area taxonomy for the NCIt’s Biological Process hierarchy. Most child-of links have been omitted to avoid overload. Note how the importance of the relationship Location is reflected in the area taxonomy. Area {Location} has 207 concepts, and Location appears in 20 of 37 area names.

Fig. 4. An excerpt of the subtaxonomy for the Eye/vision finding subhierarchy in SNOMED CT, presenting 48 of the 97 areas in the complete subtaxonomy.

Fig. 5. Path of seven IS-As to the root in the NCIt Biological Process hierarchy.

Fig. 6. Revised area taxonomy for the NCIt BP hierarchy incorporating the confirmed corrections. Pink highlights the areas that are different from the original in Fig. 3.
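
The construction described in the Fig. 2 caption (group concepts by their exact set of relationships, place each area on a level equal to its number of relationships, and link areas by child-of) can be sketched roughly as follows. This is a minimal illustration, not the authors' tooling. It assumes that an area root is a concept with no parent inside its own area, and that child-of links are derived from the IS-A links of area roots. Only the Cellular Process IS-A Biological Process link comes from Fig. 1; every other IS-A link, the two "Hypothetical Concept" entries, and the empty relationship set given to Biological Process are made up for the example.

```python
from collections import defaultdict


def build_area_taxonomy(relationships, parents):
    """Group concepts into areas and derive area roots and child-of links.

    relationships: concept -> set of relationship (role) types it has
    parents:       concept -> list of its IS-A parents
    """
    # An area is the set of all concepts sharing exactly the same relationships.
    areas = defaultdict(set)
    for concept, rels in relationships.items():
        areas[frozenset(rels)].add(concept)

    # Assumption: a root is a concept with no parent inside its own area.
    roots = {
        key: {c for c in members
              if not any(p in members for p in parents.get(c, ()))}
        for key, members in areas.items()
    }

    # Assumption: area B is child-of area A when a root of B has a parent in A.
    # A root's parents always lie outside its own area, so no self-links arise.
    child_of = defaultdict(set)
    for key, area_roots in roots.items():
        for root in area_roots:
            for parent in parents.get(root, ()):
                child_of[key].add(frozenset(relationships[parent]))
    return dict(areas), roots, dict(child_of)


# Toy data; see the lead-in for which parts are hypothetical.
RELATIONSHIPS = {
    "Biological Process": set(),
    "Cellular Process": {"Location"},
    "Cancer Cell Growth Regulation": {"Part of Process"},
    "Morphogenesis": {"Part of Process"},
    "Hypothetical Concept A": {"Location"},
    "Hypothetical Concept B": {"Location", "Part of Process"},
}
PARENTS = {
    "Cellular Process": ["Biological Process"],
    "Cancer Cell Growth Regulation": ["Biological Process"],  # hypothetical
    "Morphogenesis": ["Biological Process"],                  # hypothetical
    "Hypothetical Concept A": ["Cellular Process"],           # hypothetical
    "Hypothetical Concept B": ["Hypothetical Concept A"],     # hypothetical
}

areas, roots, child_of = build_area_taxonomy(RELATIONSHIPS, PARENTS)
for key in sorted(areas, key=lambda k: (len(k), sorted(k))):
    # The level of an area is the number of relationships in its name.
    print(f"Level {len(key)} {sorted(key)}: {len(areas[key])} concept(s)")
```

In this view, a small or unexpected area near the top of the taxonomy is easy to spot, which is the kind of anomaly the Methods describe using as a signal for likely missing relationships.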
