Am J Hum Genet. 2020 Aug 6;107(2):196-210.
doi: 10.1016/j.ajhg.2020.06.002. Epub 2020 Jun 25.

Regional Variation of Splicing QTLs in Human Brain

Yida Zhang et al. Am J Hum Genet.

Abstract

A major question in human genetics is how sequence variants of broadly expressed genes produce tissue- and cell type-specific molecular phenotypes. Genetic variation of alternative splicing is a prevalent source of transcriptomic and proteomic diversity in human populations. We investigated splicing quantitative trait loci (sQTLs) in 1,209 samples from 13 human brain regions, using RNA sequencing (RNA-seq) and genotype data from the Genotype-Tissue Expression (GTEx) project. Hundreds of sQTLs were identified in each brain region. Some sQTLs were shared across brain regions, whereas others displayed regional specificity. These "regionally ubiquitous" and "regionally specific" sQTLs showed distinct positional distributions of single-nucleotide polymorphisms (SNPs) within and outside essential splice sites, respectively, suggesting their regulation by distinct molecular mechanisms. Integrating the binding motifs and expression patterns of RNA binding proteins with exon splicing profiles, we uncovered likely causal variants underlying brain region-specific sQTLs. Notably, SNP rs17651213 created a putative binding site for the splicing factor RBFOX2 and was associated with increased splicing of MAPT exon 3 in cerebellar tissues, where RBFOX2 was highly expressed. Overall, our study reveals a more comprehensive spectrum and regional variation of sQTLs in human brain and demonstrates that such regional variation can be used to fine map potential causal variants of sQTLs and their associated neurological diseases.

Keywords: RNA-seq; alternative splicing; genetic variation; single-nucleotide polymorphism; splicing quantitative trait loci; transcriptome.

Conflict of interest statement

Y.X. is a scientific co-founder of Panorama Medicine Inc.

Figures

Figure 1
Overview of Study and Data (A) Overview of study. RNA-seq data derived from human brain tissue samples were downloaded from the GTEx Portal. Data were processed to quantify gene expression (in TPM) and alternative splicing (in PSI) across 13 brain regions. Splicing QTLs (sQTLs) were identified for each individual brain region, based on alternative splicing and genotype information. The sQTL data were used to study the relationship among the positional distribution of sQTLs, significance of sQTLs, and brain region specificity. RNA binding protein (RBP) expression was incorporated with alternative splicing (AS) and single-nucleotide polymorphism (SNP) data to identify potential causal cis variants and trans regulators. Data were used to identify sQTLs potentially associated with 14 neurological disorders. (B) t-SNE clustering of all brain tissue samples based on gene expression (left) and alternative splicing (right). Samples are color coded by brain region. Dashed ovals encircle samples from physically proximate brain regions.
Figure 2
Identification of sQTLs in Each Brain Region (A) Stacked bar plot showing the number of sQTLs (including SE, A5SS, and A3SS events) identified in each brain region, with the number in parentheses referring to the number of tissue samples with available genotype information. (B) Stacked bar plot showing the histogram for the number of brain regions where sQTLs (including SE, A5SS, and A3SS events) were determined to be significant. (C) Heatmap showing the number of disease sQTLs (including SE, A5SS, and A3SS events) associated with each neurological disorder in each brain region. Each row represents one brain region. Each column represents one neurological disorder. Bar plot above heatmap shows the total number of unique sQTLs associated with each neurological disorder.
Figure 3
Relationship among SNP Position, Significance, and Brain Region Specificity of sQTLs (SE Events) (A) Fraction of sQTL with at least one significant SNP within 300 bp of the splice sites as a function of the significance (−log10(p value)) cutoff for significant sQTLs. Each curve represents the result in one brain region. (B) Boxplot showing the distribution of significance (−log10(p value)) of all SNPs within 200 kb of all significant (p ≤ 10⁻⁵) sQTLs (SE events). SNPs are grouped based on SNP position relative to the splice sites. (C) Boxplot showing the relationship between overall sQTL significance (−log10(p value)) and the number of brain regions where sQTLs are significant. Each dot represents one sQTL (SE event). An sQTL having at least one significantly associated SNP in a brain region is considered significant in that brain region. Overall sQTL significance is calculated by considering all brain regions where the sQTL is significant, and taking the median of the smallest p values. (D) Bar plots (outside) and cumulative distribution function (CDF) (inside) showing the relationship between SNP position and brain region specificity of sQTLs (SE events). The sQTLs are grouped based on the position of their significant SNP (e.g., dinucleotide, splice site, etc.). For each group, the bar plot shows the histogram of the percentage of sQTLs that are significant in a given number of brain regions. Each bar is labeled above with the number of significant sQTLs in the given number of brain regions.
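The "overall sQTL significance" rule described for Figure 3C can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' code: the function name, the dictionary layout (one smallest SNP p value per brain region), and the example numbers are hypothetical; only the p ≤ 10⁻⁵ cutoff and the "median of the smallest p values across significant regions" rule come from the caption.

```python
import math
from statistics import median

# Significance cutoff stated in the caption (p <= 1e-5).
P_CUTOFF = 1e-5


def overall_significance(min_p_by_region, cutoff=P_CUTOFF):
    """Summarize one sQTL across brain regions.

    min_p_by_region: hypothetical mapping of region name -> smallest
    SNP p value observed for this sQTL in that region.

    Returns (number of significant regions, -log10 of the median of the
    smallest p values over those regions), or (0, None) if the sQTL is
    not significant anywhere.
    """
    significant = [p for p in min_p_by_region.values() if p <= cutoff]
    if not significant:
        return 0, None
    return len(significant), -math.log10(median(significant))


if __name__ == "__main__":
    # Made-up p values for illustration only.
    example = {"cerebellum": 1e-12, "cortex": 1e-8,
               "hippocampus": 1e-6, "amygdala": 1e-3}
    print(overall_significance(example))
```

Note that the median is taken over p values and then log-transformed, following the caption's wording; taking the median of the −log10 values instead would give the same result for an odd number of regions but can differ for an even number.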
Figure 4
Regionally Ubiquitous Versus Regionally Specific sQTLs (A) Heatmap showing effect sizes (correlation coefficients between alternative splicing levels and SNP genotypes) of regionally ubiquitous sQTLs. Each row represents one sQTL. Each column represents one brain region. (B) Radar plot showing significance (−log10(p value)) of a regionally ubiquitous sQTL (RTEL exon 23 and rs6062302) in each brain region. Red circle indicates uncorrected p = 10⁻⁵. Significant brain regions are highlighted in red text. (C) Boxplot showing the significant association (p value) of SNP rs6062302 with PSI value (exon inclusion level) of RTEL exon 23 in two brain regions: cerebellar hemisphere and nucleus accumbens basal ganglia. (D) LD plot showing a GWAS variant (rs2297440, GWAS p value = 4 × 10⁻⁴⁶, glioma; green) in high LD with the sQTL SNP (rs6062302; purple). SNP rs6062302 itself is also a GWAS variant related to glioma (GWAS p value = 1 × 10⁻¹³). (E) Heatmap showing effect sizes (correlation coefficients between alternative splicing levels and SNP genotypes) of regionally specific sQTLs. (F) Radar plot showing significance (−log10(p value)) of a regionally specific sQTL (SLC26A10 exon 12 and rs1871417) in each brain region. Red circle indicates uncorrected p = 10⁻⁵. Significant brain regions are highlighted in red text. (G) Boxplot showing the association (p value) of SNP rs1871417 with PSI value (exon inclusion level) of SLC26A10 exon 12 in two brain regions: cerebellar hemisphere (significant) and nucleus accumbens basal ganglia (not significant). (H) LD plot showing a GWAS variant (rs10876993, GWAS p value = 4 × 10⁻⁶, immune system disease; green) in high LD with the sQTL SNP (rs1871417; purple).
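The effect size used in Figure 4, a correlation coefficient between exon inclusion level (PSI) and SNP genotype, can be illustrated with a minimal sketch. Several things here are assumptions rather than details from the paper: genotypes are coded as alternative-allele counts (0, 1, 2), PSI is approximated as inclusion reads over total reads without any length normalization, and Pearson correlation is used because the caption does not say which coefficient was computed. The function names and numbers are hypothetical.

```python
import math


def psi(inclusion_reads, skipping_reads):
    """Naive exon inclusion level: inclusion / (inclusion + skipping).

    Assumption: no correction for effective isoform length, which
    dedicated splicing tools typically apply.
    """
    total = inclusion_reads + skipping_reads
    if total == 0:
        return None  # exon not covered in this sample
    return inclusion_reads / total


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Made-up data: one value per individual in a single brain region.
    genotypes = [0, 0, 1, 1, 2, 2]          # alternative-allele counts
    psi_values = [0.21, 0.25, 0.48, 0.52, 0.77, 0.80]
    print(pearson_r(genotypes, psi_values))
```

Computing this coefficient separately in each brain region would give one row of a heatmap like the ones in panels A and E.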
Figure 5
Regionally Specific sQTLs and RBP Expression (A–C) Radar plots showing significance (−log10(p value)) of three regionally specific sQTLs (A: rs6580200 and CXXC5 exon 2; B: rs971570 and TRIM26 exon 2; C: rs6580806 and POU6F1 exon 4) in each brain region. (D) Heatmap showing gene expression (z-score-transformed TPM) of selected RBPs across all 13 brain regions. Each row represents one RBP. Each column represents one brain region. In addition to all 102 Homo sapiens RBPs with DeepBind RBP-RNA binding models, the heatmap includes RBFOX2, RBFOX3, NOVA1, and NOVA2.
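The z-score transformation of TPM mentioned for Figure 5D can be sketched as follows. This is an assumption-laden illustration: it standardizes each RBP across brain regions (one heatmap row at a time) using the population standard deviation and no prior log transform, none of which the caption specifies. The function name and TPM values are hypothetical.

```python
from statistics import mean, pstdev


def zscore_across_regions(tpm_by_region):
    """Standardize one RBP's expression across brain regions.

    tpm_by_region: hypothetical mapping of region name -> mean TPM.
    Returns a mapping of region name -> z-score. A gene with identical
    expression everywhere gets 0.0 in every region.
    """
    values = list(tpm_by_region.values())
    mu = mean(values)
    sd = pstdev(values)  # assumption: population SD
    if sd == 0:
        return {region: 0.0 for region in tpm_by_region}
    return {region: (v - mu) / sd for region, v in tpm_by_region.items()}


if __name__ == "__main__":
    # Made-up TPM values for a single RBP.
    rbp_tpm = {"cerebellum": 120.0, "cortex": 60.0, "hippocampus": 45.0}
    for region, z in zscore_across_regions(rbp_tpm).items():
        print(f"{region}: {z:+.2f}")
```

Row-wise standardization of this kind makes regions of relatively high expression stand out for each RBP regardless of its absolute abundance.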
Figure 6
Using Regional Specificity of sQTL Signals to Prioritize Causal sQTL cis Variants and trans Regulators (A) Top: Gene structure and six isoforms of human MAPT (tau) gene. Exons −1 and 14 are in the untranslated regions (UTRs). Primary transcript of human MAPT contains 13 exons (exons 4A, 6, and 8 are not transcribed in human brain). Among the 13 exons, exons 2, 3, and 10 are alternatively spliced, generating six mRNA isoforms. Bottom: Functional domains of the longest full-length MAPT protein isoform (including exons 2, 3, and 10). (B) Plots showing distribution of all SNPs within 200 kb (upper) and all significant SNPs within 300 bp (lower) of MAPT exon 3 in cerebellum. Window is extended 30 kb (upper) and 30 bp (lower). Each dot represents one SNP, color-coded according to its LD with the top sQTL SNP (rs62055489). y axis shows the significance of association (−log10(p value)) between each SNP and the sQTL exon. Horizontal line indicates the significance cutoff (p = 10⁻⁵). Genes in the UCSC Genome Browser (see Web Resources) are shown in panels below plots. (C) Bubble plot showing effects of six SNPs on RBP-RNA binding, as predicted by DeepBind. Axes show RBP binding scores of sequences with reference allele (x axis) or alternative allele (y axis) for each RBP. Bubble size is proportional to the difference in DeepBind scores between the two alleles. (D) DeepBind variant map for SNP rs17651213 with the RBFOX binding site. Star indicates the position of the SNP. (E) Radar plots showing significance (−log10(p value)) of the sQTL (MAPT exon 3 and rs17651213) and mean gene expression level (TPM) of the RBP (RBFOX2) in each brain region. (F) LD plot showing SNP rs17651213 (blue) in high LD with the top sQTL SNP (rs62055489; purple), 11 PD GWAS SNPs (GWAS p values ranging from 2 × 10⁻¹¹⁸ to 2 × 10⁻⁶, green), and 1 AD GWAS SNP (GWAS p value = 6 × 10⁻⁶, green). (G) Diagram illustrating a mechanistic model for tissue-/cell-type-specific sQTLs.
