J Am Med Inform Assoc. 2009 Jan-Feb;16(1):89-102.
doi: 10.1197/jamia.M2541. Epub 2008 Oct 24.

Auditing the semantic completeness of SNOMED CT using formal concept analysis

Guoqian Jiang et al. J Am Med Inform Assoc. 2009 Jan-Feb.

Abstract

Objective: This study sought to develop and evaluate an approach for auditing the semantic completeness of the SNOMED CT contents using a formal concept analysis (FCA)-based model.

Design: We developed a model for formalizing the normal forms of SNOMED CT expressions using FCA. Anonymous nodes, identified through the analyses, were retrieved from the model for evaluation. Two quasi-Poisson regression models were developed: Model 1 tested whether anonymous nodes can be used to evaluate the semantic completeness of SNOMED CT contents, and Model 2 tested whether such completeness differs between 2 clinical domains. The data were randomly sampled from all the contexts that could be formed in the 2 largest domains: Procedure and Clinical Finding. Case studies (n = 4) were performed on randomly selected anonymous node samples for validation.
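The lattice construction named in the Design can be illustrated with a minimal sketch. It assumes a formal context whose objects are concepts and whose attributes are the attribute-value conditions from their normal forms, and it assumes that an anonymous node is a lattice node whose intent is not the exact definition of any concept in the context; the abstract spells out neither detail. The toy concepts other than hypophysectomy, the attribute strings, and all function names are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Context = Dict[str, FrozenSet[str]]

# Hypothetical toy context: each concept maps to its defining conditions.
CONTEXT: Context = {
    "hypophysectomy": frozenset({"method=excision", "site=pituitary"}),
    "biopsy of pituitary": frozenset({"method=biopsy", "site=pituitary"}),
    "adrenalectomy": frozenset({"method=excision", "site=adrenal"}),
}


def concept_intents(context: Context) -> Set[FrozenSet[str]]:
    """Return the intents of all formal concepts (closure under intersection)."""
    all_attributes = frozenset().union(*context.values())
    intents: Set[FrozenSet[str]] = {all_attributes}
    for attributes in context.values():
        # Intersect the new object's attributes with every intent found so far.
        intents |= {attributes & known for known in intents}
    return intents


def extent(context: Context, intent: FrozenSet[str]) -> FrozenSet[str]:
    """Objects that satisfy every attribute in the intent."""
    return frozenset(obj for obj, attrs in context.items() if intent <= attrs)


def anonymous_nodes(context: Context) -> List[Tuple[FrozenSet[str], FrozenSet[str]]]:
    """Lattice nodes whose intent is not the definition of any object (assumed meaning)."""
    defined = set(context.values())
    return [
        (extent(context, intent), intent)
        for intent in sorted(concept_intents(context), key=sorted)
        if intent not in defined
    ]


if __name__ == "__main__":
    for node_extent, node_intent in anonymous_nodes(CONTEXT):
        print(sorted(node_intent), "<-", sorted(node_extent))
```

Under this definition the top and bottom of the lattice also count as anonymous, so an audit would probably exclude them and inspect only intermediate nodes such as the one defined by "site=pituitary" alone, which suggests a possibly missing grouping concept.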

Measurements: In Model 1, the outcome variable is the number of fully defined concepts within a context, while the explanatory variables are the number of lattice nodes and the number of anonymous nodes. In Model 2, the outcome variable is the number of anonymous nodes and the explanatory variables are the number of lattice nodes and a binary category for domain (Procedure/Clinical Finding).
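The structure of Model 1 can be sketched as a quasi-Poisson regression with a log link: the coefficients are those of an ordinary Poisson fit, and a dispersion factor estimated from the Pearson residuals rescales their variance. The sketch below is a plain-Python illustration under those assumptions, not the software or exact specification used in the study; the eight data rows are synthetic and the function names are hypothetical.

```python
import math
from typing import List, Tuple

# Synthetic per-context counts (illustration only, not study data).
LATTICE_NODES = [10, 12, 15, 18, 20, 22, 25, 30]
ANONYMOUS_NODES = [2, 5, 3, 8, 4, 10, 6, 12]
FULLY_DEFINED = [10, 8, 12, 8, 13, 8, 14, 10]

# Design matrix: intercept, number of lattice nodes, number of anonymous nodes.
X = [[1.0, float(n), float(a)] for n, a in zip(LATTICE_NODES, ANONYMOUS_NODES)]


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a small dense linear system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def predict(x_rows: List[List[float]], beta: List[float]) -> List[float]:
    """Fitted means under the log link."""
    return [math.exp(sum(b * v for b, v in zip(beta, row))) for row in x_rows]


def fit_quasi_poisson(x_rows: List[List[float]], y: List[int],
                      max_iter: int = 50) -> Tuple[List[float], float]:
    """Fit by Newton steps (equivalent to IRLS); return coefficients and dispersion."""
    p = len(x_rows[0])
    beta = [math.log(sum(y) / len(y))] + [0.0] * (p - 1)  # column 0 is the intercept
    for _ in range(max_iter):
        mu = predict(x_rows, beta)
        # Solve (X^T W X) delta = X^T (y - mu) with W = diag(mu).
        xtwx = [[sum(m * row[j] * row[k] for m, row in zip(mu, x_rows))
                 for k in range(p)] for j in range(p)]
        score = [sum((yi - m) * row[j] for yi, m, row in zip(y, mu, x_rows))
                 for j in range(p)]
        delta = solve(xtwx, score)
        beta = [b + d for b, d in zip(beta, delta)]
        if max(abs(d) for d in delta) < 1e-10:
            break
    mu = predict(x_rows, beta)
    # Pearson chi-square divided by residual degrees of freedom.
    dispersion = sum((yi - m) ** 2 / m for yi, m in zip(y, mu)) / (len(y) - p)
    return beta, dispersion


if __name__ == "__main__":
    coefficients, phi = fit_quasi_poisson(X, FULLY_DEFINED)
    print("coefficients:", coefficients)
    print("dispersion:", phi)
```

A negative coefficient on the anonymous-node column corresponds to the direction of association reported in the Results. Model 2 would follow the same pattern with the number of anonymous nodes as the outcome and a 0/1 domain indicator replacing the anonymous-node column.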

Results: A total of 5,450 contexts from the 2 domains were collected for analyses. Our findings revealed that the number of anonymous nodes had a significant negative correlation with the number of fully defined concepts within a context (p < 0.001). Further, the Clinical Finding domain had fewer anonymous nodes than the Procedure domain (p < 0.001). Case studies demonstrated that the anonymous nodes are an effective index for auditing SNOMED CT.

Conclusion: The anonymous nodes retrieved from FCA-based analyses are a candidate proxy for the semantic completeness of the SNOMED CT contents. Our novel FCA-based approach can be useful for auditing the semantic completeness of SNOMED CT contents, or any large ontology, within or across domains.

Figures

Figure 1
A long normal form for “hypophysectomy.” The long normal form can be read as follows: hypophysectomy is a subtype of procedure defined by the conditions “method = excision-action” and “procedure site = pituitary structure.” Square brackets enclose a concept identifier paired with its preferred name (separated by a bar, “|”). Curly brackets enclose the conditions used to define “procedure.”
Figure 2
A line diagram of the concept lattice for the domain “hypophysectomy.” The arrows indicate the nodes referred to as “anonymous nodes.”
Figure 3
The sample domain “Open wound of shoulder region and upper limb with tendon involvement (SCTID_269176007)” in the 20070131 version of SNOMED CT. This figure is part of a screenshot of CliniClue 2006—Terminology Browser (http://www.clinical-info.co.uk).
Figure 4
The sample domain “Phlebitis of intracranial venous sinus (SCTID_18058007)” in the 20070131 version of SNOMED CT. This figure is part of a screenshot of CliniClue 2006—Terminology Browser (http://www.clinical-info.co.uk).
Figure 5
The sample domain “Operation on vas deferens (SCTID_23304006)” in the 20070131 version of SNOMED CT. This figure is part of a screenshot of CliniClue 2006—Terminology Browser (http://www.clinical-info.co.uk).
Figure 6
The sample domain “Serologic test for herpes virus (SCTID_14421005)” in the 20070131 version of SNOMED CT. This figure is part of a screenshot of CliniClue 2006—Terminology Browser (http://www.clinical-info.co.uk).
