Nat Commun. 2024 Sep 13;15(1):8029.
doi: 10.1038/s41467-024-51819-3.

NOTCH3 p.Arg1231Cys is markedly enriched in South Asians and associated with stroke

Juan Lorenzo Rodriguez-Flores et al. Nat Commun.

Abstract

The genetic factors of stroke in South Asians are largely unexplored. Exome-wide sequencing and association analysis (ExWAS) in 75 K Pakistanis identified NM_000435.3(NOTCH3):c.3691C>T, encoding the missense amino acid substitution p.Arg1231Cys, which is enriched in South Asians (alternate allele frequency = 0.58% compared to 0.019% in Western Europeans) and associated with subcortical hemorrhagic stroke (odds ratio (OR) = 3.39, 95% confidence interval (CI) = [2.26, 5.10], p = 3.87 × 10⁻⁹) and all strokes (OR [CI] = 2.30 [1.77, 3.01], p = 7.79 × 10⁻¹⁰). NOTCH3 p.Arg1231Cys was strongly associated with white matter hyperintensity on MRI in United Kingdom Biobank (UKB) participants (effect [95% CI] in SD units = 1.1 [0.61, 1.5], p = 3.0 × 10⁻⁶). The variant accounts for approximately 2.0% of hemorrhagic strokes and 1.1% of all strokes in South Asians. These findings highlight the value of diversity in genetic studies and have major implications for genomic medicine and therapeutic development in South Asian populations.
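
The summary statistics above can be sanity-checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' analysis: it computes the fold enrichment of the allele frequency and recovers an approximate standard error, z-score and two-sided p-value from the reported odds ratio and 95% CI under a Wald (normal) approximation. The function names are hypothetical, and because the paper reports likelihood ratio test p-values, the Wald values should only be expected to land in the same order of magnitude.

```python
import math

def fold_enrichment(freq_a: float, freq_b: float) -> float:
    """Ratio of two allele frequencies given in the same units."""
    return freq_a / freq_b

def wald_from_ci(or_point: float, ci_low: float, ci_high: float,
                 z_crit: float = 1.959964):
    """Approximate SE, z and two-sided p from an odds ratio and its 95% CI.

    Assumes the CI is symmetric on the log-odds scale (Wald interval).
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(or_point) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return se, z, p

# Allele frequencies (in percent) quoted in the abstract.
enrichment = fold_enrichment(0.58, 0.019)

# Subcortical hemorrhagic stroke: OR = 3.39, 95% CI = [2.26, 5.10].
se_hem, z_hem, p_hem = wald_from_ci(3.39, 2.26, 5.10)

# All strokes: OR = 2.30, 95% CI = [1.77, 3.01].
se_all, z_all, p_all = wald_from_ci(2.30, 1.77, 3.01)

print(f"fold enrichment: {enrichment:.1f}x")
print(f"hemorrhagic stroke: SE={se_hem:.3f}, z={z_hem:.2f}, p={p_hem:.2e}")
print(f"all strokes:        SE={se_all:.3f}, z={z_all:.2f}, p={p_all:.2e}")
```

The enrichment works out to roughly 30-fold, and both associations remain below the conventional 5 × 10⁻⁸ threshold under this approximation.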

Conflict of interest statement

The authors declare the following competing interests. Funding. Fieldwork for this study was funded by the Center for Non-Communicable Diseases, Pakistan. DNA sequencing was funded by Regeneron Pharmaceuticals Inc. Employment. J.L.R.F., A.R.S., N.B., D.eS., M.C., J.O., J.R., B.Y., M.K., J.B., G.T., S.G., T.D., N.V., L.A.L., A.Z., N.P., F.S., J.M., G.C., P.X., A.B., S.A.D.G., H.M., I.T., K.K., A.E., D.D.A., S.F., W.L., T.C., and R.G.C. consortium members are or were employees of Regeneron Genetics Center L.L.C. or Regeneron Pharmaceuticals Inc. and contributed to this manuscript as part of their regular duties as salaried employees. E.T. and S.C. are or were student interns of Regeneron Genetics Center L.L.C. or Regeneron Pharmaceuticals Inc. and contributed to this manuscript as part of their internship activities. A.R., M.J., M.Z., M.R.M., M.B.L., P.F., D.S., and S.K. are or were employees of the Center for Non-Communicable Diseases and received salaried compensation for their contribution to this manuscript. Personal Financial Interests. J.L.R.F., A.R.S., N.B., D.eS., M.C., J.O., J.R., B.Y., M.K., J.B., G.T., S.G., T.D., N.V., L.A.L., A.Z., N.P., F.S., J.M., G.C., P.X., A.B., S.A.D.G., H.M., I.T., K.K., A.E., D.D.A., S.F., W.L., T.C., and R.G.C. consortium members are or were employees of Regeneron Genetics Center L.L.C. or Regeneron Pharmaceuticals Inc. and received stock and stock options as part of their compensation as employees. J.L.R.F., A.R.S., D.S., A.B., and S.K. are named inventors on pending patent US 20230000897A1, which discloses methods of treating subjects having a cerebrovascular disease by administering Neurogenic Locus Notch Homolog Protein 3 (NOTCH3) agents, and methods of identifying subjects having an increased risk of developing a cerebrovascular disease. The remaining authors have no competing interests.

Figures

Fig. 1. ExWAS identifies NOTCH3 p.Arg1231Cys associated with subcortical hemorrhagic stroke in Pakistan Genome Resource 31 K discovery cohort.
A Flow chart of the study described in this report. The discovery cohort consisted of a 31 K stroke case-control cohort (n = 31,737, including n = 5135 stroke cases and n = 26,602 controls) from the Pakistan Genome Resource (PGR) (green boxes). A second PGR follow-up cohort of 44 K (n = 44,082) included 30 K participants with self-reported stroke case:control status for replication (n = 30,399, including n = 160 cases and n = 30,239 controls). UK Biobank data from 450 K sequenced participants were used for further analysis in a predominantly European ancestry population (blue boxes), 380 K of whom had known stroke case:control status (n = 9143 cases and n = 371,403 controls), and 35 K of whom had brain MRI data (n = 35,344). B Manhattan plot of subcortical hemorrhagic stroke ExWAS in PGR discovery cohort participants (n = 1388 hemorrhagic stroke cases and n = 26,602 controls) with likelihood ratio test -log10 p values calculated using REGENIE (y-axis) across chromosomes (alternating gray and black dots) and variants (x-axis). A single variant (NC_000019.10:g.15179052G>A) on chromosome 19 predicting a missense variant p.Arg1231Cys in NOTCH3 (pink diamond) exceeded the genome-wide significance threshold of 5 × 10⁻⁸ (red line). C NOTCH3 locus zoom plot of subcortical stroke ExWAS. The likelihood ratio test -log10 p values for variants tested are shown on the y-axis. The p.Arg1231Cys variant is labeled as a diamond. Other variants (circles) are colored based on linkage disequilibrium with the reference variant in 1000 Genomes. Gene exon (thick line) and intron (thin line) model shown below the graph.
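
As a reading aid for panels A and B, the sketch below checks that the stated case and control counts add up to the stated cohort sizes and converts p-values to the -log10 scale used on the Manhattan plot y-axis. It is only an illustration under stated assumptions: the helper names are hypothetical, the example p-value is the subcortical hemorrhagic stroke value quoted in the abstract, and nothing here reproduces the REGENIE analysis itself.

```python
import math

GENOME_WIDE_P = 5e-8  # significance threshold drawn as the red line in panel B

def neg_log10(p: float) -> float:
    """Transform a p-value to the -log10 scale of a Manhattan plot."""
    return -math.log10(p)

def is_genome_wide_significant(p: float, threshold: float = GENOME_WIDE_P) -> bool:
    """A variant is significant when its p-value is below the threshold."""
    return p < threshold

# (cases, controls, stated total) taken from the caption of panel A.
cohorts = {
    "PGR discovery": (5135, 26602, 31737),
    "PGR replication": (160, 30239, 30399),
}

# True when cases + controls equals the stated cohort size.
counts_consistent = {
    name: cases + controls == total
    for name, (cases, controls, total) in cohorts.items()
}

# UK Biobank participants with known stroke status (described as "380 K").
ukb_with_status = 9143 + 371403

threshold_height = neg_log10(GENOME_WIDE_P)  # y-position of the red line
example_p = 3.87e-9                          # p-value quoted in the abstract
example_height = neg_log10(example_p)

print(counts_consistent)
print(f"UKB with stroke status: {ukb_with_status}")
print(f"threshold at -log10 p = {threshold_height:.2f}")
print(f"example variant at -log10 p = {example_height:.2f}, "
      f"significant: {is_genome_wide_significant(example_p)}")
```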
Fig. 2. NOTCH3 EGFr domain disruption by p.Arg1231Cys.
Shown is NOTCH3 p.Arg1231 in context of human NOTCH3 protein domains and cross-species alignment of NOTCH3 amino acid sequences. A Human NOTCH3 Protein Domains. Shown is the position of the associated variant in context of transcript exons (top, alternating blue and purple with numbering) and protein domains (bottom, color coded). NOTCH3 can be divided into four major regions, from left to right: the signal peptide (light blue), the extra-cellular domain (ECD, brown), the transmembrane domain (orange), and the intra-cellular domain (ICD, blue). The majority of the ECD is composed of n = 34 Epidermal Growth Factor-like repeat (EGFr) domains (in purple with white numbers). Domains involved in signaling are highlighted, including EGFr domains 10 to 11 involved in ligand binding (light purple, numbered), three cleavage domains (S1 in yellow, S2 in yellow with diagonal black stripes, S3 in yellow with black horizontal stripes), and three Lin12/NOTCH repeats (light green, numbered). The ICD contains the Recombination signal-binding protein for Ig of κ region (RAM) domain for transcription factor interaction (green and white checkers), the Nuclear Localization Sequences (NLS, orange with black stripes), five Ankyrin repeats involved in signal transduction (green with white numbers), and the Proline, glutamic acid, serine, and threonine-rich (PEST) domain essential for degradation (green with white stripes). The p.Arg1231Cys variant (red line top to bottom) replaces an arginine with a cysteine, adding a cysteine to the 31st EGFr domain of the ECD, coded by the 22nd exon.
B Cross-species Alignment of NOTCH3 EGFr Domain 31 Amino Acid Sequences calculated using BLAST. Shown is an amino acid alignment of the 31st EGFr domain of NOTCH3 (human sequence amino acids 1205 to 1244), including (top to bottom) human reference, human p.Arg1231Cys mutant, chimpanzee (Pan troglodytes), mouse (Mus musculus), zebrafish (Danio rerio), western clawed frog (Xenopus tropicalis), and green sea turtle (Chelonia mydas), indicating conservation of the arginine (R) at position 1231 in mammals. Highly conserved cysteine (C) residues (normally 6 per EGFr) are highlighted in yellow.
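
To make the variant notation concrete, here is a small sketch that parses the protein-level name p.Arg1231Cys into its one-letter form and locates it inside the 31st EGFr domain using the residue range given in the caption (1205 to 1244). The parser is a hypothetical helper that handles only simple missense names of this shape, and the cysteine count relies on the caption's statement that an EGFr normally carries six conserved cysteines.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

def parse_missense(name: str):
    """Parse a simple missense name such as 'p.Arg1231Cys'.

    Returns (reference residue, position, alternate residue) in
    one-letter codes. Other variant types are not handled.
    """
    match = re.fullmatch(r"p\.([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})", name)
    if match is None:
        raise ValueError(f"not a simple missense name: {name!r}")
    ref, pos, alt = match.group(1), int(match.group(2)), match.group(3)
    return THREE_TO_ONE[ref], pos, THREE_TO_ONE[alt]

# Residue range of the 31st EGFr domain, as stated in the caption.
EGFR31_START, EGFR31_END = 1205, 1244
CONSERVED_CYS_PER_EGFR = 6  # "normally 6 per EGFr" per the caption

ref, pos, alt = parse_missense("p.Arg1231Cys")
short_name = f"{ref}{pos}{alt}"
in_domain = EGFR31_START <= pos <= EGFR31_END
offset_in_domain = pos - EGFR31_START  # 0-based column within the domain

# Net change in cysteine count caused by the substitution.
cys_after = CONSERVED_CYS_PER_EGFR + (alt == "C") - (ref == "C")

print(short_name, in_domain, offset_in_domain, cys_after)
```

Under these assumptions the substitution leaves the domain with an odd number of cysteines, which is why the variant is described as disrupting the EGFr domain.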
Fig. 3. Forest plot showing replication of NOTCH3 p.Arg1231Cys association with stroke across 61 K Pakistan Genome Resource meta-analysis.
Shown is the cohort name, trait, odds ratio with 95% confidence interval, likelihood ratio test p value calculated using REGENIE, alternate allele frequency, case count, and control count for five stroke phenotypes in the PGR 31 K discovery cohort (n = 5135 stroke cases and n = 26,602 controls) (top), and Inverse Variance Weighted (IVW) Meta-Analysis using METAL of stroke in 61 K PGR cohort, including 31 K PGR discovery cohort and 30 K PGR replication cohort (n = 160 cases and n = 30,239 controls) subset of the 44 K PGR follow-up cohort (bottom).
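
For readers unfamiliar with inverse variance weighting, the following is a minimal fixed-effect sketch of how per-cohort log odds ratios are pooled. It is a generic illustration rather than METAL's implementation, and the two input estimates are hypothetical placeholders, not values from the paper.

```python
import math

def ivw_meta(betas, ses):
    """Fixed-effect inverse-variance-weighted pooling.

    betas: per-cohort effect estimates (log odds ratios)
    ses:   their standard errors
    Returns the pooled estimate and its standard error.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se

def odds_ratio_with_ci(beta, se, z_crit=1.959964):
    """Convert a log odds ratio and SE to an OR with a 95% CI."""
    return (math.exp(beta),
            math.exp(beta - z_crit * se),
            math.exp(beta + z_crit * se))

# Hypothetical inputs: a large discovery cohort with a precise estimate
# and a small replication cohort with a noisier one.
example_betas = [0.8, 0.4]
example_ses = [0.2, 0.4]

pooled_beta, pooled_se = ivw_meta(example_betas, example_ses)
pooled_or, ci_low, ci_high = odds_ratio_with_ci(pooled_beta, pooled_se)

print(f"pooled log OR = {pooled_beta:.3f} (SE {pooled_se:.3f})")
print(f"pooled OR = {pooled_or:.2f} [{ci_low:.2f}, {ci_high:.2f}]")
```

Because weights are the inverse of the variance, the cohort with far fewer cases contributes little to the pooled estimate, which is worth keeping in mind when reading the 61 K meta-analysis rows of the forest plot.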
