Nat Genet. 2022 Sep;54(9):1275-1283.
doi: 10.1038/s41588-022-01156-2. Epub 2022 Aug 29.

Large-scale sequencing identifies multiple genes and rare variants associated with Crohn's disease susceptibility

Aleksejs Sazonovs #  1 Christine R Stevens #  2   3   4 Guhan R Venkataraman #  5 Kai Yuan #  3   4 Brandon Avila  2 Maria T Abreu  6 Tariq Ahmad  7 Matthieu Allez  8 Ashwin N Ananthakrishnan  9 Gil Atzmon  10   11 Aris Baras  12 Jeffrey C Barrett  13 Nir Barzilai  11   14 Laurent Beaugerie  15 Ashley Beecham  16   17 Charles N Bernstein  18 Alain Bitton  19 Bernd Bokemeyer  20 Andrew Chan  21   22 Daniel Chung  23 Isabelle Cleynen  24 Jacques Cosnes  25 David J Cutler  26   27 Allan Daly  28 Oriana M Damas  29 Lisa W Datta  30 Noor Dawany  31 Marcella Devoto  31   32   33   34 Sheila Dodge  35 Eva Ellinghaus  36 Laura Fachal  1 Martti Farkkila  37 William Faubion  38 Manuel Ferreira  12 Denis Franchimont  39 Stacey B Gabriel  35 Tian Ge  3   40   41 Michel Georges  42 Kyle Gettler  43 Mamta Giri  43 Benjamin Glaser  44 Siegfried Goerg  45 Philippe Goyette  46 Daniel Graham  47   48   49 Eija Hämäläinen  50 Talin Haritunians  51 Graham A Heap  7 Mikko Hiltunen  52 Marc Hoeppner  53 Julie E Horowitz  12 Peter Irving  54   55 Vivek Iyer  28 Chaim Jalas  56 Judith Kelsen  31 Hamed Khalili  21 Barbara S Kirschner  57 Kimmo Kontula  58 Jukka T Koskela  50 Subra Kugathasan  27 Juozas Kupcinskas  59 Christopher A Lamb  60   61 Matthias Laudes  45 Chloé Lévesque  46 Adam P Levine  62 James D Lewis  34   63 Claire Liefferinckx  39 Britt-Sabina Loescher  36 Edouard Louis  42 John Mansfield  60   61 Sandra May  36 Jacob L McCauley  16   17 Emebet Mengesha  51 Myriam Mni  42 Paul Moayyedi  64 Christopher J Moran  23 Rodney D Newberry  65 Sirimon O'Charoen  63 David T Okou  27   66 Bas Oldenburg  67 Harry Ostrer  68 Aarno Palotie  2   3   4   50   69   70 Jean Paquette  46 Joel Pekow  57 Inga Peter  43 Marieke J Pierik  71 Cyriel Y Ponsioen  72 Nikolas Pontikos  62 Natalie Prescott  73 Ann E Pulver  74 Souad Rahmouni  42 Daniel L Rice  1 Päivi Saavalainen  75 Bruce Sands  43 R Balfour Sartor  76 Elena R Schiff  62 Stefan Schreiber  36 L Philip Schumm  77 Anthony W Segal  62 
Philippe Seksik  15 Rasha Shawky  78 Shehzad Z Sheikh  76 Mark S Silverberg  79 Alison Simmons  80 Jurgita Skeiceviciene  59 Harry Sokol  15 Matthew Solomonson  2 Hari Somineni  81 Dylan Sun  12 Stephan Targan  51 Dan Turner  82 Holm H Uhlig  83   84 Andrea E van der Meulen  85 Séverine Vermeire  86   87 Sare Verstockt  87 Michiel D Voskuil  88 Harland S Winter  23 Justine Young  63 Belgium IBD Consortium; Cedars-Sinai IBD; International IBD Genetics Consortium; NIDDK IBD Genetics Consortium; NIHR IBD BioResource; Regeneron Genetics Center; SHARE Consortium; SPARC IBD Network; UK IBD Genetics Consortium; Richard H Duerr  89 Andre Franke  36 Steven R Brant  30   90 Judy Cho  43 Rinse K Weersma  88 Miles Parkes  91 Ramnik J Xavier  47   48   49   92   93   94   95   96 Manuel A Rivas  5 John D Rioux  46   97 Dermot P B McGovern  51 Hailiang Huang  98   99 Carl A Anderson  100 Mark J Daly  101   102   103   104
Aleksejs Sazonovs et al. Nat Genet. 2022 Sep.

Abstract

Genome-wide association studies (GWASs) have identified hundreds of loci associated with Crohn's disease (CD). However, as with all complex diseases, robust identification of the genes dysregulated by noncoding variants typically driving GWAS discoveries has been challenging. Here, to complement GWASs and better define actionable biological targets, we analyzed sequence data from more than 30,000 patients with CD and 80,000 population controls. We directly implicate ten genes in general onset CD for the first time to our knowledge via association to coding variation, four of which lie within established CD GWAS loci. In nine instances, a single coding variant is significantly associated, and in the tenth, ATG4C, we see additionally a significantly increased burden of very rare coding variants in CD cases. In addition to reiterating the central role of innate and adaptive immune cells as well as autophagy in CD pathogenesis, these newly associated genes highlight the emerging role of mesenchymal cells in the development and maintenance of intestinal inflammation.

Conflict of interest statement

A.B., M.F., J.E.H., and D.S. are current or former employees and/or stockholders of Regeneron Genetics Center or Regeneron Pharmaceuticals. M.A. is consulting for or part of the advisory board for AbbVie Inc, Bellatrix Pharmaceuticals, Bristol Myers Squibb, Eli Lilly Pharmaceuticals, Gilead, Janssen Ortho, LLC, and Prometheus Biosciences; teaching, lecturing, or speaking at Alimentiv, Arena Pharmaceuticals, Janssen, Prime CME, and Takeda Pharmaceuticals. A.B. is an employee of Regeneron and owns stock in Regeneron. O.M.D. has served on the IBD fellowship funding committee for Pfizer and has a research project funded by Pfizer. H.K. receives grant funding from Takeda and Pfizer and has received consulting fees from Takeda. A.P. is a member of AstraZeneca's Genomics Advisory Board. M.A.R. is on the SAB of 54gene and has advised BioMarin, Third Rock Ventures, MazeTx, and Related Sciences. G.A.H. is an employee of Takeda, a former employee of AbbVie, and owns stock in Takeda and AbbVie. C.A.L. reports grants from Genentech, grants and personal fees from Janssen, grants and personal fees from Takeda, grants from AbbVie, personal fees from Ferring, grants from Eli Lilly, grants from Pfizer, grants from Roche, grants from UCB Biopharma, grants from Sanofi Aventis, grants from Biogen IDEC, grants from Orion OYJ, personal fees from Dr Falk Pharma, and grants from AstraZeneca, outside the submitted work. H.H.U. reports research collaboration or consultancy with Janssen, Eli Lilly, UCB Pharma, Celgene, MiroBio, OMass, and Mestag. D.P.B.M. has consulted for Takeda, Boehringer Ingelheim, Palatin Technologies, Bridge Biotherapeutics, Pfizer, and Gilead. M.P. received an unrestricted research grant from Pfizer UK and speaker fees from Janssen. P.I. received lecture fees from AbbVie, BMS, Celgene, Celltrion, Falk Pharma, Ferring, Galapagos, Gilead, MSD, Janssen, Pfizer, Takeda, Tillotts, Sapphire Medical, Sandoz, Shire and Warner Chilcott; financial support for research from Celltrion, MSD, Pfizer and Takeda; advisory fees from AbbVie, Arena, Boehringer-Ingelheim, BMS, Celgene, Celltrion, Genentech, Gilead, Hospira, Janssen, Lilly, MSD, Pfizer, Pharmacosmos, Prometheus, Roche, Sandoz, Samsung Bioepis, Takeda, Topivert, VH2, Vifor Pharma and Warner Chilcott. Cedars-Sinai and D.P.B.M. have financial interests in Prometheus Biosciences, a company which has access to the data and specimens in Cedars-Sinai's MIRIAD Biobank (including the Cedars-Sinai data and specimens used in this study) and seeks to develop commercial products. H.H. has received consultancy fees from Ono Pharmaceutical and honorarium from Xian Janssen Pharmaceutical. C.A.A. has received consultancy fees from Genomics plc and BridgeBio Inc. and lecture fees from GSK. M.J.D. is a founder of Maze Therapeutics. The remaining authors declare no competing interests.

Figures

Extended Data Fig. 1. Overview of the study design
We used a logistic mixed model for the association analysis, followed by a fixed-effect meta-analysis to combine multiple cohorts. The multiple cohorts serve the purpose of replication. Two large cohorts sequenced at the Broad Institute on different exome capture platforms (Nextera WES and Twist WES) were used to discover candidate variants. Two independent cohorts at Sanger (Sanger WGS and Sanger WES) and one Kiel/Regeneron cohort (Regeneron WES) were used to replicate the findings.
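A fixed-effect meta-analysis of this kind is commonly carried out by inverse-variance weighting of the per-cohort effect estimates. The sketch below assumes that approach; the caption does not state the exact weighting scheme, so this is an illustration and not the authors' pipeline. The function name `fixed_effect_meta` and the per-cohort log odds ratios and standard errors are hypothetical values chosen for the example.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse-variance-weighted fixed-effect meta-analysis (illustrative).

    betas: per-cohort effect estimates (e.g. log odds ratios)
    ses:   matching standard errors
    Returns (pooled beta, pooled SE, z score, two-sided p value).
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("need one standard error per effect estimate")
    # Each cohort is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    # Two-sided normal p value; erfc stays accurate in the far tail.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p


# Hypothetical per-cohort log odds ratios and standard errors.
cohort_betas = [0.20, 0.40]
cohort_ses = [0.10, 0.10]
pooled_beta, pooled_se, z_score, p_value = fixed_effect_meta(cohort_betas, cohort_ses)
print(f"pooled log OR = {pooled_beta:.3f}, SE = {pooled_se:.4f}, p = {p_value:.3g}")
```

With equal standard errors the pooled estimate is the simple mean of the cohort estimates; cohorts with smaller standard errors pull the pooled estimate toward their own value.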
Extended Data Fig. 2. Quality control procedures applied in the Broad sequencing pipeline
As an example, we show the quality control steps performed on variants and subjects from the Broad sequencing platform. Quality control of data from other platforms follows a similar plan and is described in Methods. Quality control steps that use external information from gnomAD are colored green. Thresholds and details can be found in Methods.
Extended Data Fig. 3. QQ plots of the heterogeneity-of-effect test between the Nextera and Twist discovery cohorts
Only QC-passed variants with minor allele frequency in NFE between 0.0001 and 0.10 were included. a, All variants. b, Non-synonymous variants. c, Synonymous variants. In a and b, the y axis is capped at -log10 p = 30, while the top four variants (three in NOD2 and one in IL23R) have -log10 p > 100. In c, to remove synonymous variants that tag causal non-synonymous variants and artifacts through LD, we removed loci hosting large-effect coding variants (IL23R, NOD2, LRRK2, TYK2, ATG16L1, SLC39A8, PTGER4, IRGM, CARD9), loci implicated by variants removed in the heterogeneity test (AHNAK2, LILRA), and loci with long-range LD (MHC).
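One common way to test for heterogeneity of effect between two cohorts is Cochran's Q, which for two estimates reduces to a one-degree-of-freedom chi-square test on their difference. The caption does not say which statistic was used, so the sketch below is an assumption for illustration only; the function name `two_cohort_heterogeneity` and the input values are hypothetical.

```python
import math


def two_cohort_heterogeneity(beta1, se1, beta2, se2):
    """Cochran's Q for two cohorts (illustrative).

    With two estimates, Q = (beta1 - beta2)^2 / (se1^2 + se2^2) and follows
    a chi-square distribution with 1 degree of freedom under homogeneity.
    Returns (Q, p value).
    """
    q = (beta1 - beta2) ** 2 / (se1 ** 2 + se2 ** 2)
    # Upper tail of chi-square(1) equals the two-sided normal tail of sqrt(Q).
    p = math.erfc(math.sqrt(q / 2.0))
    return q, p


# Hypothetical log odds ratios for one variant in two discovery cohorts.
q_stat, q_p = two_cohort_heterogeneity(0.5, 0.1, 0.1, 0.1)
print(f"Q = {q_stat:.2f}, heterogeneity p = {q_p:.3g}")
```

A variant whose effect differs sharply between capture platforms yields a small heterogeneity p value, which is the kind of signal a QQ plot of this test is meant to expose.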
Extended Data Fig. 4. Power to detect single-variant associations
We performed a series of power calculations using the methodology described by Johnson and Abecasis (2017). Our initial ‘exome-wide scan’ (two cohorts) had fewer samples and a more lenient significance threshold than the subsequent meta-analysis (five cohorts). However, both analyses had similar power to detect true associations at their respective significance levels. Our single-variant association analyses did not have the power to uncover associations with variants at MAF = 0.0001 and below (unless the variant has a very strong effect, e.g. 0.76 power at OR = 8). Similarly, the exome-wide scan had limited power to detect associations with variants at MAF = 0.001 and OR < 2, but was well powered above these thresholds. a, Power of the exome-wide scan analysis. b, Power of the meta-analysis. c, Power to detect single-variant associations at different minor allele frequencies at α = 0.0002 (‘scan’; dashed lines) and 3 × 10⁻⁷ (‘meta’; solid lines), assuming a Crohn’s disease population prevalence of 276 in 100,000 and an additive effect model.
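To give a feel for how such a power calculation works, the sketch below approximates the power of a case-control allelic test from the minor allele frequency, effect size, prevalence, sample sizes and significance threshold. It is a simplification and not the study's calculation: it assumes a multiplicative genotype-relative-risk model (the caption refers to an additive model and odds ratios) and a normal approximation for the allele-frequency difference. The function name `allelic_power` and the sample sizes are hypothetical; the prevalence and α come from the caption.

```python
import math
from statistics import NormalDist


def allelic_power(maf, grr, prevalence, n_cases, n_controls, alpha):
    """Approximate power of a two-sided case-control allelic test.

    Assumes (illustratively) a multiplicative model: genotype relative
    risks 1, grr, grr^2 for 0, 1, 2 copies of the risk allele.
    """
    # Baseline penetrance chosen so the population prevalence is matched.
    f0 = prevalence / (1.0 - maf + maf * grr) ** 2
    # Expected risk-allele frequency among cases and among controls.
    p_case = maf * grr / (1.0 - maf + maf * grr)
    p_ctrl = (
        maf ** 2 * (1.0 - f0 * grr ** 2)
        + maf * (1.0 - maf) * (1.0 - f0 * grr)
    ) / (1.0 - prevalence)
    # Each person contributes two alleles to the frequency estimate.
    variance = (
        p_case * (1.0 - p_case) / (2 * n_cases)
        + p_ctrl * (1.0 - p_ctrl) / (2 * n_controls)
    )
    ncp = (p_case - p_ctrl) / math.sqrt(variance)
    norm = NormalDist()
    z_crit = norm.inv_cdf(1.0 - alpha / 2.0)
    return norm.cdf(ncp - z_crit) + norm.cdf(-ncp - z_crit)


# Prevalence and alpha are taken from the caption; sample sizes are hypothetical.
power = allelic_power(
    maf=0.001, grr=2.0, prevalence=276 / 100_000,
    n_cases=10_000, n_controls=20_000, alpha=0.0002,
)
print(f"approximate power = {power:.2f}")
```

Under the null (relative risk of 1) the function returns α, and power rises with allele frequency, effect size and sample size, which matches the qualitative pattern the caption describes.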
Extended Data Fig. 5. Relation to known IBD associations
Numbers in brackets give the number of variants assigned to each category, out of the 45 exome-wide significant variants.
Extended Data Fig. 6. WES variants from this study implicating known IBD loci
a-c, A novel CD variant implicating TAGAP. d-g, CD variants tagging fine-mapped IBD associations in LRRK2. a and d, P values for variants from the fine-mapping study (ref. 5). b and e, PIP from fine-mapping. c, f and g, P values for variants from this study. Open circles indicate that LD information is missing. For panel c, LD was calculated between the plotted variant and the best variant in b; for panels f and g, LD was calculated with the variants with the best PIP in credible sets 1 and 2 (panel e), respectively.
Extended Data Fig. 7. Nextera and Twist callset population assignment
Principal components a and c, before removing non-European samples for Twist and Nextera, respectively; b and d, after removing non-European samples for Twist and Nextera, respectively. Principal components were generated from the 1000 Genomes Project Phase 3 data, and different colors represent different continental superpopulations. Study subjects (black dots) were projected onto the principal components.
Figure 1. Odds ratio and minor allele frequency for exome-wide significant findings that are not tagging stronger, established non-coding association signals
Known causal candidate: in a credible set from a fine-mapping study with posterior inclusion probability > 5%, or reported in previous studies (Online Methods). New locus: in a locus not yet implicated by GWAS. New variant in known locus: in a known GWAS locus, but representing an association independent of previously reported IBD putative causal variants (Online Methods).
Figure 2. Schematic representation of inflamed mucosa showing the mesenchymal-related genes with newly identified mutations
(1) Following mucosal injury, mesenchymal cells (MCs) are highly activated by pro-inflammatory signals such as TNF-α, CCL19/CCL21, PAF and TGF-β1. (2) Among these, TNF-α increases PAF-R expression in intestinal epithelial cells during wound repair. However, prolonged exposure to PAF dissolves cell junctions and increases epithelial permeability. In endothelial cells, the PAF-R/PAF axis has a similar effect on Vascular Endothelial-Cadherin (VE-CAD) assembly. A leaky endothelium can result in an increase in immune cell infiltration and aggravate inflammation at injured sites. (3) Secretion of CCL19/CCL21 by activated stromal fibroblasts in response to epithelial damage or infection attracts dendritic cells (DC) and other immune cells, which then migrate to mesenteric lymph nodes, where the immune response is coordinated. CCR7+ DC, macrophages and T cells also exacerbate inflammation through pro-inflammatory mediators such as CCL19/CCL21. Plasma cells express SDF2L1 in response to inflammation and ER stress due to massive antibody production. (4) Mucosal repair mediated by TGF-β1/SMAD3 signaling has a key role in epithelial homeostasis after tissue injury. Importantly, a causal variant in SMAD3* further supports the importance of this pathway in disease susceptibility. PDLIM5 is a known regulator of SMAD3 stability during EMT. Uncontrolled EMT increases fibroblast proliferation and excessive ECM production, leading to fibrosis. Active HGF released by HGFAC antagonizes TGF-β1, resulting in a decrease of EMT. (5) HGF secreted by fibroblasts plays a role in maintaining the stem cell niche in intestinal crypts. SDF2L1 expressed by Paneth cells in response to ER stress may participate in this process. (6) The current genetic findings provide support for the scientific rationale for targeting EMT and fibrosis for the treatment of CD, such as with anti-integrin antibodies (anti-αvβ6), recombinant human HGF (rhHGF) and Rho Kinase Inhibitor (ROCKi); shown in red.

References

    1. Jostins L et al. Host-microbe interactions have shaped the genetic architecture of inflammatory bowel disease. Nature 491, 119–124 (2012).
    2. Liu JZ et al. Association analyses identify 38 susceptibility loci for inflammatory bowel disease and highlight shared genetic risk across populations. Nat. Genet. 47, 979–986 (2015).
    3. Luo Y et al. Exploring the genetic architecture of inflammatory bowel disease by whole-genome sequencing identifies association at ADCY7. Nat. Genet. 49, 186–192 (2017).
    4. de Lange KM et al. Genome-wide association study implicates immune activation of multiple integrin genes in inflammatory bowel disease. Nat. Genet. 49, 256–261 (2017).
    5. Huang H et al. Fine-mapping inflammatory bowel disease loci to single-variant resolution. Nature 547, 173–178 (2017).
