Nat Commun. 2017 Oct 17;8(1):774.
doi: 10.1038/s41467-017-00831-x.

Integrating evolutionary and regulatory information with a multispecies approach implicates genes and pathways in obsessive-compulsive disorder

Hyun Ji Noh et al. Nat Commun.

Abstract

Obsessive-compulsive disorder is a severe psychiatric disorder linked to abnormalities in glutamate signaling and the cortico-striatal circuit. We sequenced coding and regulatory elements for 608 genes potentially involved in obsessive-compulsive disorder in human, dog, and mouse. Using a new method that prioritizes likely functional variants, we compared 592 cases to 560 controls and found four strongly associated genes, validated in a larger cohort. NRXN1 and HTR2A are enriched for coding variants altering postsynaptic protein-binding domains. CTTNBP2 (synapse maintenance) and REEP3 (vesicle trafficking) are enriched for regulatory variants, of which at least six (35%) alter transcription factor-DNA binding in neuroblastoma cells. NRXN1 achieves genome-wide significance (p = 6.37 × 10⁻¹¹) when we include 33,370 population-matched controls. Our findings suggest synaptic adhesion as a key component in compulsive behaviors, and show that targeted sequencing plus functional annotation can identify potentially causative variants, even when genomic data are limited.

Obsessive-compulsive disorder (OCD) is a neuropsychiatric disorder with symptoms including intrusive thoughts and time-consuming repetitive behaviors. Here Noh and colleagues identify genes enriched for functional variants associated with increased risk of OCD.

Conflict of interest statement

The authors declare no competing financial interests.

Figures

Fig. 1
PolyStrat analysis of pooled-targeted-sequencing data. a Venn diagrams showing the number of SNPs annotated as functional and/or conserved by PolyStrat. Each of the four dashed circles represents the 41,504 total high-confidence SNPs detected. Within each circle, SNPs are stratified by their annotations. Each colorful interior circle represents SNPs annotated as exonic (blue), regulatory (green), conserved (red), or diverged (gray) bases. SNPs with multiple annotations are represented by circle overlaps, and SNPs without any of the included annotations are within the white space of the dashed outer circle. b PolyStrat p-values for 608 genes (circles) stratified by the 16 (12 shown) annotation categories tested show that just five genes (NRXN1, HTR2A, LIPH, CTTNBP2, and REEP3) have p-values below the experiment-wide significance threshold after correction for multiple testing (red dashed line). Two moderately associated, OCD-relevant genes discussed in the text are also noted (STRN and CACNA1C). “Evo” (=evolutionary) are SNPs either conserved (“Cons”) or divergent (“Div”). The vast majority of genes tested fail to exceed the significance threshold, with the median p-value for each category shown as a dark black line separating two boxes representing the 25–75% quantile. Notch in boxes shows the 95% confidence interval around median. c p-values for the five genes robustly implicated in animal models of OCD are significantly lower than p-values for the rest of the genes in our sequencing set (603 genes), and this difference increases when just rare variants are tested. The solid horizontal line shows median p-value, the boxed area the 25–75% quantiles, and the vertical black lines extend from the minimum to maximum p-values observed. AF, allele frequency
Fig. 2
Targeted-sequencing detects both coding and regulatory candidate variants in the four top-scoring genes: a NRXN1, b HTR2A, c CTTNBP2, and d REEP3. Sequenced “Target regions” are shown as gray boxes above the red “–log10 p single” track displaying the association p-values for all detected variants and the blue “AF ca/co ratio” track showing the ratio of OCD AF over control AF. “Candidate variants” are annotated as missense (red), synonymous (green), untranslated (purple), DHS variants (blue), or unannotated (black). Lastly, the gene is shown as a horizontal blue line with exons (solid boxes) and arrows indicating direction of transcription. The highest scoring isoform of NRXN1, NRXN1a-2, is shown in (a)
Fig. 3
Validated top candidate variants disrupt functional elements active in brain. a Frequencies of the top candidate variants show that most are rare (AF < 0.01) in our cohort, illustrating the value of sequencing rather than array-based genotyping for detecting candidate variants. b–g Allele frequencies from pooled sequencing of individuals in the original cohort (cohort 1) are validated by genotyping in b cases and c controls, and by d allele-frequency differences between cases and controls. For the vast majority of variants, the pooled-sequencing allele frequencies are also highly correlated with frequencies observed in 33,370 ExAC individuals in both cases (b, gray dots) and controls (c, gray dots). Genotyping an independent cohort (cohort 2) of cases and controls reveals genotyping allele frequencies from cohort 1 are correlated in e cases and f controls; and for g allele-frequency differences between cases and controls. Correlation test was performed using Fisher’s Z transform. h The top candidate variants in CTTNBP2 and REEP3, the two genes enriched for regulatory variants, disrupt DNase hypersensitivity sites (DHS), enhancers, and transcription factor-binding sites (TFBS) annotated as functional in brain tissues and cell lines in ENCODE and Roadmap Epigenomics. All 17 variants disrupt elements active in either the dorsolateral prefrontal cortex and/or substantia nigra, which are among the i brain regions involved in the CSTC circuit implicated in OCD, illustrated with black arrows. Image adapted from Creative Commons original by Patrick J. Lynch and C. Carl Jaffe, MD. j The CSTC circuit requires a balance between a direct, GABAergic signaling pathway and an indirect pathway that involves both GABAergic and glutamate signaling. In OCD patients, an imbalance favoring the direct over the indirect pathway disrupts the normal functioning of the CSTC circuit
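The Fig. 3 caption mentions a correlation test based on Fisher's Z transform. As a rough illustration only, the sketch below shows one common form of that test: it computes a Pearson correlation between two sets of allele frequencies, applies the transform z = atanh(r), and derives a two-sided p-value against the null of zero correlation. The function names and the toy allele frequencies are hypothetical, and the caption does not say which exact variant of the test the authors used.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_z_test(x, y):
    """Test H0: correlation = 0 using Fisher's Z transform.

    Returns (r, z_statistic, two_sided_p). Needs at least 4 points because
    the standard error of z is 1 / sqrt(n - 3). A perfect correlation
    (|r| == 1) is outside the domain of atanh and raises ValueError.
    """
    n = len(x)
    if n != len(y) or n < 4:
        raise ValueError("need two sequences of equal length >= 4")
    r = pearson_r(x, y)
    z = math.atanh(r)                      # Fisher's Z transform
    stat = z * math.sqrt(n - 3)            # approximately N(0, 1) under H0
    p = math.erfc(abs(stat) / math.sqrt(2))  # two-sided normal tail
    return r, stat, p


# Hypothetical allele frequencies (not data from the paper).
pooled_af = [0.010, 0.020, 0.050, 0.100, 0.200, 0.030]
genotyped_af = [0.012, 0.018, 0.055, 0.090, 0.210, 0.035]
r, stat, p = fisher_z_test(pooled_af, genotyped_af)
```

With real data, `pooled_af` and `genotyped_af` would hold the per-variant frequencies being compared, for example pooled-sequencing versus genotyping estimates in the same cohort.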
Fig. 4
Top genes validate in functional assays in context of known protein structure and in comparison to ExAC. EMSA of all 17 candidate regulatory variants in a REEP3 and b CTTNBP2 reveals that six variants either decrease (black arrows) or increase (red) protein binding when the variant sequence (MT) is compared with the reference allele (WT). The signal disappears when competing unlabeled probes (Cold+) are added, confirming the specificity of the DNA-protein binding. Raw images as well as EMSA results for experimental replicates and for other candidate regulatory variants that showed weak-binding changes are shown in Supplementary Fig. 8. c All seven of the candidate missense variants in NRXN1, shown here as colored elements, alter the extracellular postsynaptic-binding region of our top-scoring protein isoform NRXN1a-2. Of the six extracellular LNS (laminin, nectin, sex-hormone-binding globulin) domains in the longer isoform NRXN1a, five with known protein structure are shown and labeled L2-L6. NRXN1a-2 includes four of these domains (L2–L5; dashed ellipse). d The isoform-based test comparing OCD cases to ExAC finds NRXN1 as the top-scoring gene with genome-wide significance. The ExAC analysis score is defined as the ratio of χ² statistics ($\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$, where $n$ = total number of isoforms, $O_i$ = number of non-reference alleles observed in isoform $i$, $E_i$ = number of non-reference alleles expected from ExAC in isoform $i$) between OCD vs. ExAC comparison and control vs. ExAC comparison
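The ExAC analysis score described in the Fig. 4 caption can be sketched in a few lines. This is a minimal illustration under stated assumptions: the observed and expected non-reference allele counts per isoform are taken as given inputs (the caption does not say how the expected counts are derived from ExAC), the function names `chi_square` and `exac_score` are hypothetical, and the toy counts are not data from the paper.

```python
def chi_square(observed, expected):
    """Chi-square statistic summed over isoforms: sum((O_i - E_i)^2 / E_i)."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def exac_score(case_obs, case_exp, control_obs, control_exp):
    """Ratio of the case-vs-ExAC statistic to the control-vs-ExAC statistic.

    Assumption: the control statistic is non-zero; a zero denominator
    raises ZeroDivisionError rather than being handled here.
    """
    return chi_square(case_obs, case_exp) / chi_square(control_obs, control_exp)


# Toy counts for a gene with two isoforms (hypothetical values).
case_stat = chi_square([12, 5], [6, 5])        # (12-6)^2/6 + 0 = 6.0
control_stat = chi_square([9, 5], [6, 5])      # (9-6)^2/6 + 0 = 1.5
score = exac_score([12, 5], [6, 5], [9, 5], [6, 5])  # 6.0 / 1.5 = 4.0
```

A score well above 1 means the cases deviate from the ExAC expectation more than the controls do, which is the pattern the caption reports for NRXN1.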
Fig. 5
All four top candidate genes function at the synapse and interact with proteins implicated in OCD by previous studies in the three species. Human, dog, and mouse are marked with superscripts h, d, and m, respectively. Genes identified in this study are shown in red. Solid lines indicate direct interactions, and dashed lines indicate indirect interactions
