Am J Hum Genet. 2016 Jan 7;98(1):149-64.
doi: 10.1016/j.ajhg.2015.11.024.

Systematic Phenomics Analysis Deconvolutes Genes Mutated in Intellectual Disability into Biologically Coherent Modules

Korinna Kochinke et al. Am J Hum Genet.

Abstract

Intellectual disability (ID) disorders are genetically and phenotypically extremely heterogeneous. Can this complexity be depicted in a comprehensive way as a means of facilitating the understanding of ID disorders and their underlying biology? We provide a curated database of 746 currently known genes, mutations in which cause ID (ID-associated genes [ID-AGs]), classified according to ID manifestation and associated clinical features. Using this integrated resource, we show that ID-AGs are substantially enriched with co-expression, protein-protein interactions, and specific biological functions. Systematic identification of highly enriched functional themes and phenotypes revealed typical phenotype combinations characterizing process-defined groups of ID disorders, such as chromatin-related disorders and deficiencies in DNA repair. Strikingly, phenotype classification efficiently breaks down ID-AGs into subsets with significantly elevated biological coherence and predictive power. Custom-made functional Drosophila datasets revealed further characteristic phenotypes among ID-AGs and specific clinical classes. Our study and resource provide systematic insights into the molecular and clinical landscape of ID disorders, represent a significant step toward overcoming current limitations in ID research, and prove the utility of systematic human and cross-species phenomics analyses in highly heterogeneous genetic disorders.

Figures

Figure 1
Systematic Analyses of Genes Implicated in ID Reveal Functional Groups and Molecular Modules. (A) Gene Ontology-based annotation of ID-AG function. Bar diagrams show enrichments of ID-AGs in each of the indicated Gene Ontology-based groups against the genome-wide background. The total number of genes per group is displayed in the respective bar. (Benjamini-Hochberg, ∗padj < 0.05, ∗∗padj < 0.01, ∗∗∗padj < 0.001.) (B) Physical PPI network of ID-AG products. Circles indicate highly connected ID-AG communities; similar colors illustrate functional proximity (Figure S1). Genes directly connecting to communities are colored if they share Gene Ontology-based terms with the connected communities. Dark gray indicates nodes without associated Gene Ontology-based terms, and light gray indicates nodes with at least a first-degree connection to communities. ID-AGs without connections to other ID-AGs are not shown.
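The significance stars in this caption rest on Benjamini-Hochberg adjusted p values. As a rough illustration of that correction only (not the authors' pipeline), a minimal Python sketch follows; the function names `benjamini_hochberg` and `stars` and the input p values are assumptions made for the example.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values, in input order."""
    m = len(pvalues)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def stars(padj):
    """Map an adjusted p value to the star notation used in the captions."""
    if padj < 0.001:
        return "***"
    if padj < 0.01:
        return "**"
    if padj < 0.05:
        return "*"
    return ""


# Hypothetical raw p values for four functional groups (not from the paper).
raw = [0.01, 0.04, 0.03, 0.005]
for p, q in zip(raw, benjamini_hochberg(raw)):
    print(f"p={p:.3f}  padj={q:.3f}  {stars(q)}")
```

Processing the p values from largest to smallest keeps each adjusted value no larger than the one ranked above it, which is the usual step-up form of the procedure.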
Figure 2
Bipartite Clinical ID-Classification System. (A) Main clinical classes. The “syndromicity” axis of ID entities is defined as follows: classes 1, 4, and 7 comprise disorders that are syndromic with structural malformations (SWSM); classes 2, 5, and 8 include disorders that are syndromic without structural malformations (SWOSM); and classes 3, 6, and 9 comprise non-syndromic (NS) ID disorders. The “manifestation, severity, and penetrance” axis of ID entities is defined as follows: classes 1–3 contain disorders with severe and fully penetrant manifestation of ID (CS), classes 4–6 include disorders with mild to moderate or very variable ID (CM), and classes 7–9 comprise disorders with ID in a rare (8a) or atypical (e.g., progressive, neurodegenerative features) (8b) manifestation (NC). (B) ID-accompanying phenotypes: ID-accompanying clinical features that occur with an estimated frequency of >20% within the respective disorder. Abbreviations are as follows: lifesp, lifespan; and an, anomalies. Clinical features marked with an asterisk are explained as follows: progression/regression, progression of disease and regression of development; neurological symptoms, e.g., hypotonia, ataxia, and tremor; metabolic/mitochondrial an., e.g., enzymatic defects; vegetative anomalies, e.g., breathing anomalies and increased sweating; behavioral anomalies, e.g., autism and aggression; ectodermal anomalies, e.g., skin, hair, and nail anomalies; and eye anomalies, structural and functional. Figure S2 shows the numbers of genes per clinical class and ID-accompanying phenotype, a network view of the distribution of genes per clinical class, and a distribution of ID-accompanying phenotypes over main clinical classes.
Figure 3
Relationships among Genes, Phenotypes, and Molecular Function in ID. Hierarchical clustering of ID-AGs and ID-accompanying phenotypes. Phenotypic similarity of (groups of) ID-AGs is indicated by the proximity of genes (x axis) and the proximity of ID-accompanying clinical features based on their co-occurrence in ID disorders (y axis). Gene Ontology-based terms that were significantly enriched after multiple-testing corrections in two or more adjacent ID-accompanying phenotypes are displayed on the right-hand side. Colored rectangles highlight randomly chosen clusters. These are highly enriched with cilia (red), chromatin (yellow), synapses (turquoise), and mitochondria (blue). Genes within the clusters are shown in the boxes below in the same color code. Those genes that are already associated with the respective Gene Ontology-based term are highlighted in bold. Abbreviations are as follows: an, anomalies; malf, malformation; non-struct, non-structural MRI anomalies; hedgehog, hedgehog signaling; Wnt, Wnt signaling; MAPK, MAPK signaling; and response to GF, response to growth factor.
Figure 4
Phenotype Delineation of Groups of Process-Defined ID Disorders. The typical phenotype combinations characterizing ID disorders associated with a specific molecular process or system were defined according to the Gene Ontology-based groups shown in Figure 1A. The volcano plots show relative enrichments (x axes in log10 scale) of ID-accompanying phenotypes (A–X) among the indicated molecular process or system in relation to their occurrence among all 650 ID-AGs, plotted against the corresponding p values (y axes in −log10 scale). Letters (A–X) refer to ID-accompanying phenotypes as listed in Figure 2B. ID-accompanying phenotypes highlighted in red show significant specificity (Benjamini-Hochberg, padj < 0.05), thus identifying clinical features that are characteristic of the respective molecular-process-defined ID disorder group.
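To make the volcano-plot axes concrete, the sketch below computes the two coordinates for one phenotype: the log10 of its relative enrichment in a process-defined group versus all ID-AGs, and the −log10 of a p value supplied by the caller. The function name `volcano_point` and all counts are invented for illustration; the caption does not state which statistical test produced the p values, so none is assumed here.

```python
import math


def volcano_point(hits_in_group, group_size, hits_in_all, all_size, p_value):
    """Return (log10 relative enrichment, -log10 p) for one phenotype.

    Relative enrichment is the phenotype frequency within the group divided
    by its frequency among all genes. Counts of zero are not handled here.
    """
    freq_group = hits_in_group / group_size
    freq_all = hits_in_all / all_size
    x = math.log10(freq_group / freq_all)
    y = -math.log10(p_value)
    return x, y


# Made-up example: a phenotype seen in 30 of 100 genes of one group and in
# 65 of all 650 ID-AGs, with a hypothetical p value of 0.001.
x, y = volcano_point(30, 100, 65, 650, 0.001)
print(f"x = {x:.3f}, y = {y:.3f}")
```

Points to the right of zero on the x axis are over-represented in the group, and points higher on the y axis have smaller p values.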
Figure 5
Genomic, Proteomic, and Phenotype Datasets Define Predictive Patterns in ID. (A–E) 650 ID-AGs and clinical subsets were matched to public datasets and show patterns relating to clinical classes. (A) PPIs, (B) co-expression in BrainSpan, (C) co-expression in GTEx, (D) hPSD, and (E) autism candidate genes. Enrichment scores are provided for nine main classes (1–8b) and the total set of 650 ID-AGs (outer frame). (Benjamini-Hochberg, ∗padj < 0.05; ∗∗padj < 0.01; ∗∗∗padj < 0.001.) Note that class 8b, belonging to the SWOSM super-class column, is depicted in the third column because of symmetry reasons and because class 9 contains only a single gene. (F) The predictive power of 650 ID-AGs to identify ID-AGs in leave-one-out analysis on the basis of proximity in the reference gene network is illustrated by standard precision-recall analysis. Precision is defined as the number of correctly predicted ID-AGs as a proportion of all genes predicted for a given recall. Recall is the proportion of all ID-AGs that are recovered. The significance of these predictions was determined by comparison with precision-recall curves obtained with number-matched random genes. These are represented by gray areas shaded to indicate the p values as shown in the legend and reveal the highly significant power of the 650 ID-AGs to predict each other from the genome-wide background. (G) Examples of precision-recall for individual ID clinical classes and ID-accompanying phenotype categories, notably from the 650 ID-AG background. Thus, deconvoluting ID-AGs according to phenotypes results in added predictive value (compared to that of random ID-AGs).
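The precision and recall definitions given in panel (F) can be sketched in a few lines of Python. This example assumes a ranked list of candidate genes is already available; the ranking step itself (leave-one-out scoring by proximity in the reference gene network) is not reproduced, and the gene labels and the function name `precision_recall_points` are placeholders.

```python
def precision_recall_points(ranked_genes, true_genes):
    """Return (recall, precision) pairs, one per true gene recovered.

    ranked_genes: candidates ordered from most to least likely.
    true_genes:   the known positives (here standing in for ID-AGs).
    """
    true_set = set(true_genes)
    hits = 0
    points = []
    for k, gene in enumerate(ranked_genes, start=1):
        if gene in true_set:
            hits += 1
            recall = hits / len(true_set)   # share of all positives recovered
            precision = hits / k            # share of predictions that are correct
            points.append((recall, precision))
    return points


# Placeholder ranking: G1, G2, G3 are positives; X1 and X2 are background.
ranking = ["G1", "X1", "G2", "X2", "G3"]
for recall, precision in precision_recall_points(ranking, ["G1", "G2", "G3"]):
    print(f"recall={recall:.2f}  precision={precision:.2f}")
```

Running the same function on number-matched random gene sets would give the kind of background curves the caption describes, although the shading of p value bands is not shown here.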
Figure 6
Custom-Made Functional Datasets in Drosophila Reveal Additional Patterns. (A) Schematic representation of the neuronal screen and assessed phenotypes. Viable pan-neuronal knockdown ID models were tested at two different time points for their ability to escape from a platform. (B) Phenotypes evaluated in the wing screen. Examples of genes (human orthologs) and associated phenotypes are shown. (C) Phenotypes and their frequencies upon neuronal knockdown. Note that knockdown of ID-AGs (green) tended to cause early phenotypes, whereas knockdown of non-ID-AGs (gray) caused significantly more late phenotypes. (D) Phenotypes and their frequencies upon knockdown in the wing. ID-AGs are highly enriched with lethal, posterior-margin, and wing-field phenotypes. Broad morphological phenotypes are evenly represented among ID-AGs and non-ID-AGs. Bar graphs in (C) and (D) show genes in each phenotype group as a percentage of all genes in each dataset (264 ID-AGs in the neuronal dataset, 261 ID-AGs in the wing dataset, and 31 non-ID-AGs in both assays). The p values were determined with Fisher’s exact test and corrected for multiple testing (Benjamini-Hochberg, ∗padj < 0.05; ∗∗padj < 0.01; ∗∗∗padj < 0.001). Note that each gene can be associated with more than one phenotype. (E and F) Enrichment of early behavioral (early walker and early sitter, E) and wing morphological phenotypes (trichome density and missing veins, F), resolved according to ID clinical classes, shows that the increased abundance of the phenotypes among ID-AGs (Figure 6C) arises from enrichment of phenotypes in specific clinical classes. (E′ and F′) Precision-recall analysis (see Figure 5 for details) shows the significant predictive power of the custom-made phenotypes to identify other ID-AG orthologs associated with the same phenotype. p value curves from number-matched, randomly sub-sampled ID-AG sets are indicated.
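The caption reports Fisher's exact test on phenotype counts in ID-AGs versus non-ID-AGs. A minimal, standard-library Python sketch of a two-sided Fisher's exact test on a 2×2 table is given below. The group sizes (264 and 31) come from the caption, but the phenotype counts are hypothetical, and the caption does not say whether the authors used a one-sided or two-sided test, so the two-sided form is an assumption.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Probability of x in the top-left cell given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)  # tolerance for float ties
    )
    return min(1.0, p_value)


# Hypothetical counts: 120 of 264 ID-AGs and 5 of 31 non-ID-AGs show a phenotype.
p = fisher_exact_two_sided(120, 264 - 120, 5, 31 - 5)
print(f"two-sided p = {p:.4g}")
```

With one such p value per phenotype, a Benjamini-Hochberg step would then give the adjusted values that the stars in the bar graphs refer to.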
