J Biomed Inform. 2015 Aug;56:333-47.
doi: 10.1016/j.jbi.2015.06.026. Epub 2015 Jul 4.

An ontology for Autism Spectrum Disorder (ASD) to infer ASD phenotypes from Autism Diagnostic Interview-Revised data

Omri Mugzach et al. J Biomed Inform. 2015 Aug.

Abstract

Objective: Our goal is to create an ontology that will allow data integration and reasoning with subject data to classify subjects, and based on this classification, to infer new knowledge on Autism Spectrum Disorder (ASD) and related neurodevelopmental disorders (NDD). We take a first step toward this goal by extending an existing autism ontology to allow automatic inference of ASD phenotypes and Diagnostic & Statistical Manual of Mental Disorders (DSM) criteria based on subjects' Autism Diagnostic Interview-Revised (ADI-R) assessment data.

Materials and methods: Knowledge regarding diagnostic instruments, ASD phenotypes and risk factors was added to augment an existing autism ontology via Web Ontology Language (OWL) class definitions and Semantic Web Rule Language (SWRL) rules. We developed a custom Protégé plugin for enumerating combinatorial OWL axioms to support the many-to-many relations of ADI-R items to diagnostic categories in the DSM. We utilized a reasoner to infer whether 2642 subjects, whose data were obtained from the Simons Foundation Autism Research Initiative, meet DSM-IV-TR (DSM-IV) and DSM-5 diagnostic criteria based on their ADI-R data.
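
As a rough illustration of what enumerating a combinatorial "k-of-N" condition involves, the Python sketch below expands "at least k of these N classes" into a union of intersections over every k-sized subset. It is not the authors' Protégé plugin: the function name, the output syntax and the criterion names are all hypothetical and chosen only for illustration.

```python
from itertools import combinations


def k_of_n_expression(class_names, k):
    """Expand 'at least k of these classes' into a union of intersections.

    Returns a readable string; this only illustrates the combinatorial
    expansion and is not the plugin's actual output format.
    """
    if not 1 <= k <= len(class_names):
        raise ValueError("k must be between 1 and the number of classes")
    # One conjunction per k-sized subset of the N classes.
    conjunctions = [
        "(" + " and ".join(subset) + ")"
        for subset in combinations(class_names, k)
    ]
    return " or ".join(conjunctions)


# Hypothetical criterion names, used only for illustration.
expression = k_of_n_expression(["A1a", "A1b", "A1c"], 2)
print(expression)
```

The number of conjunctions equals the binomial coefficient C(N, k), which grows quickly with N; that growth is a plausible reason to generate such expressions automatically rather than by hand.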

Results: We extended the ontology by adding 443 classes and 632 rules that represent phenotypes, along with their synonyms, environmental risk factors, and frequency of comorbidities. Applying the rules to the data set showed that the method produced accurate results: the true positive and true negative rates for inferring autistic disorder diagnosis according to DSM-IV criteria were 1 and 0.065, respectively; the true positive rate for inferring ASD based on DSM-5 criteria was 0.94.
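
For reference, the two rates quoted above follow the standard definitions: the true positive rate is TP / (TP + FN) and the true negative rate is TN / (TN + FP). The Python sketch below only restates those definitions; the function names are hypothetical, and the counts in the usage lines are made-up placeholders, not values from the study.

```python
def true_positive_rate(true_positives, false_negatives):
    """Fraction of actual positives that were inferred as positive."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("no actual positives in the sample")
    return true_positives / total


def true_negative_rate(true_negatives, false_positives):
    """Fraction of actual negatives that were inferred as negative."""
    total = true_negatives + false_positives
    if total == 0:
        raise ValueError("no actual negatives in the sample")
    return true_negatives / total


# Placeholder counts for illustration only (not study data).
example_tpr = true_positive_rate(true_positives=9, false_negatives=1)
example_tnr = true_negative_rate(true_negatives=3, false_positives=1)
print(example_tpr, example_tnr)
```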

Discussion: The ontology allows automatic inference of subjects' disease phenotypes and diagnosis with high accuracy.

Conclusion: The ontology may benefit future studies by serving as a knowledge base for ASD. In addition, by adding knowledge of related NDDs, commonalities and differences in manifestations and risk factors could be automatically inferred, contributing to the understanding of ASD pathophysiology.

Keywords: Autism; Diagnosis; Ontology; Ontology Web Language; Reasoning.

Figures

Figure 1
Top-level class diagram of the autism ontology, showing key classes and relationships. Gray squares represent the classes that extend the ontology by Tu et al. White classes are taken from the BFO.
Figure 2
Ontology population process overview – Basic phenotypes representation. (1) The top-level classes of the basic phenotype hierarchy were taken from McCray et al. (2) The Personal_Traits class from (1) was integrated as a child of the Autism_Phenotype (ASD_Related_Phenotype) class, which is a child of the BFO disposition class. The ADI-R items and their range of values (e.g., ImaginativePlay_NotAvailable) were integrated as children of the concepts in McCray’s hierarchy. (3) Vocabulary terms (where available) were added to the concepts in the hierarchy as annotations. (4) SWRL rules were then used to (5) associate with a human subject a basic phenotype from the hierarchy corresponding to an ADI-R item in this human’s ADI-R data.
Figure 3
DSM-IV and DSM-5 class hierarchies
Figure 4. Inferring the “Head_Shaking_Never” basic phenotype of a Human from ADI-R data
(1) An individual of the ADI-R assessment result belonging to a patient whose subjectKey is 11000. The item functional communication head shaking (funcon_chshake) has a value of 2. (2) A SWRL rule infers the “Head_Shaking_Never” phenotype for subjects who scored 2 for item 44 in the ADI-R. (3) A specific individual of the Human class (in this example, the individual with ID 11000) with his set of inferred phenotypes, including the one inferred by this SWRL rule.
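
The rule described in this caption can be approximated in plain Python as a lookup from an (item, score) pair to a basic phenotype. This is a stand-in for the idea only, not SWRL syntax and not the authors' rule base: the `RULES` table and the function name are hypothetical, and only the single entry shown is taken from the caption.

```python
# Hypothetical rule table: (ADI-R item name, score) -> basic phenotype.
# The single entry below mirrors the example in the Figure 4 caption.
RULES = {
    ("funcon_chshake", 2): "Head_Shaking_Never",
}


def infer_basic_phenotypes(adir_result):
    """Return the set of basic phenotypes triggered by one ADI-R record."""
    return {
        phenotype
        for (item, score), phenotype in RULES.items()
        if adir_result.get(item) == score
    }


# ADI-R record for the subject used in the caption (subjectKey 11000).
record = {"subjectKey": 11000, "funcon_chshake": 2}
phenotypes = infer_basic_phenotypes(record)
print(phenotypes)
```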
Figure 5
Ontology population process overview – DSM diagnostic criteria representation as OWL class hierarchy. (1) To define a DSM criterion in OWL, we obtain from Huerta’s mapping a list of ADI-R items (see second row in the table shown in the figure). (2) The basic phenotypes corresponding to the ADI-R items are logically combined into an OWL class expression (see Figure 6). (3) For higher-level (L2, L1) criteria, the k-of-N Protégé plugin is used to create class expressions. (4) The resulting L1, L2, and L3 classes are arranged in a hierarchy. Note that the second part of the DSM criterion in Figure 6 (gesture or mime) was represented using additional ADI-R items related to gesture or mime as provided by the professional experts with whom we consulted.
Figure 6. Combining basic subject phenotypes with logical operators
This example shows the OWL class definition corresponding to DSM-IV’s diagnostic criterion A2(a): “delay in, or total lack of, the development of spoken language (not accompanied by an attempt to compensate through alternative modes of communication such as gesture or mime)”. This is a union of five basic phenotypes related to the “most abnormal 4–5” (the most severe phenotype the subject exhibited at age 4–5) or the “current finding” (the phenotype that is currently exhibited). The phenotypes described here are related to the following ADI-R items: (1) overall level of language; (2) nodding; (3) head shaking; (4) conventional or instrumental gestures; (5) direct gaze.
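
A union of basic phenotypes behaves like an "any of" test: a subject belongs to the criterion class if at least one of the listed phenotypes has been inferred for that subject. The Python sketch below illustrates that reading under stated assumptions. The phenotype names are hypothetical placeholders derived from the five item descriptions in the caption, not the class names used in the ontology.

```python
# Hypothetical placeholder names, one per ADI-R item listed in the caption.
A2A_PHENOTYPES = frozenset({
    "Overall_Level_Of_Language_Phenotype",
    "Nodding_Phenotype",
    "Head_Shaking_Phenotype",
    "Conventional_Or_Instrumental_Gestures_Phenotype",
    "Direct_Gaze_Phenotype",
})


def meets_union_criterion(subject_phenotypes, criterion_phenotypes):
    """True if the subject has at least one phenotype in the union."""
    return not criterion_phenotypes.isdisjoint(subject_phenotypes)


# A subject with one matching phenotype satisfies the union.
subject = {"Nodding_Phenotype", "Some_Unrelated_Phenotype"}
print(meets_union_criterion(subject, A2A_PHENOTYPES))
```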
Figure 7
An overview of the inference of ASD-related phenotypes from SFARI data. Shapes in white show sources and software that were available to us; shapes in gray show our own development. (1) A Protégé plugin was used to generate ADI-R OWL individuals corresponding to ADI-R questionnaire results of patients from the SFARI data set. (2) Each ADI-R result item was translated via a SWRL rule which was executed by the SWRL engine to populate for each OWL Human individual a set of basic phenotypes corresponding to the ADI-R items for that patient. (3) Based on DSM criteria, OWL classes of Human_with_DSM_Diagnostic_Criterion were defined. Combinatorial class expressions were created automatically via a Protégé plugin for enumeration of combinatorial k-of-N expressions. (4) A reasoner was used to infer for each Human patient which DSM diagnostic criteria he meets based on his SWRL-inferred basic phenotypes.
Figure 8
An Individual representing the concept Autism along with its synonyms Autistic Disorder, Childhood Autism and Infantile Autism. All concepts are instances of the Concept class. The synonyms in this figure are type-of Autism but are still considered as synonyms of the same concept.
Figure 9
The Autism_High_Level_Visualizer class enables a high-level visualization of autism risk factors and comorbidities knowledge.
Figure 10
Conditional Probability individual. The probability (1) that a subject will be diagnosed with autism (3) given that he was diagnosed with autoimmune disease (2) is 0.006. These data were gathered from healthcare systems in the Boston area (4) as reported by Kohane et al. (5). Possible types of healthcare systems are: hospital_outpatient, hospital_inpatient, community_clinic, private_clinic.
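
The pieces of information attached to such an individual can be pictured as a small record type. The Python sketch below is one possible shape, assuming hypothetical field names; it is not the ontology's actual property schema. The example values are the ones stated in the caption, and the healthcare-system type is left unset because the caption does not say which type applies.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class HealthcareSystem(Enum):
    """The healthcare-system types listed in the caption."""
    HOSPITAL_OUTPATIENT = "hospital_outpatient"
    HOSPITAL_INPATIENT = "hospital_inpatient"
    COMMUNITY_CLINIC = "community_clinic"
    PRIVATE_CLINIC = "private_clinic"


@dataclass(frozen=True)
class ConditionalProbability:
    """P(outcome | given); field names are hypothetical."""
    probability: float
    outcome: str
    given: str
    location: str
    source: str
    system_type: Optional[HealthcareSystem] = None

    def __post_init__(self):
        # A probability must lie in the closed interval [0, 1].
        if not 0.0 <= self.probability <= 1.0:
            raise ValueError("probability must be between 0 and 1")


# Values taken from the Figure 10 caption.
example = ConditionalProbability(
    probability=0.006,
    outcome="autism",
    given="autoimmune disease",
    location="Boston area",
    source="Kohane et al.",
)
print(example)
```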
Figure 11
An individual of the Risk Factor class. (1) Gestational diabetes is an environmental risk factor for autism, occurring (2) during pregnancy to (3) the mother of a child who develops ASD. The exposure is of class (4) obstetric complications and is cited by Gardener (6).
Figure 12
Percentage of subject records that fit the represented DSM-IV criteria
Figure 13
Percentage of subject records that fit the represented DSM-5 criteria

References

    1. Rossignol DA, Frye RE. A review of research trends in physiological abnormalities in autism spectrum disorders: immune dysregulation, inflammation, oxidative stress, mitochondrial dysfunction and environmental toxicant exposures. Mol Psychiatry. 2011;17(4):389–401.
    2. Huguet G, Ey E, Bourgeron T. The genetic landscapes of autism spectrum disorders. Annu Rev Genomics Hum Genet. 2013 Jan;14:191–213.
    3. Stessman HA, Bernier R, Eichler EE. A genotype-first approach to defining the subtypes of a complex disease. Cell. 2014 Feb;156(5):872–7.
    4. Mathur S, Dinakarpandian D. Finding disease similarity based on implicit semantic similarity. J Biomed Inform. 2012 Apr;45(2):363–71.
    5. Kohane IS, McMurry A, Weber G, MacFadden D, Rappaport L, Kunkel L, et al. The Co-Morbidity Burden of Children and Young Adults with Autism Spectrum Disorders. PLoS One. 2012 Apr;7(4):e33224.
