Cell Genom. 2022 Dec 30;3(1):100244.
doi: 10.1016/j.xgen.2022.100244. eCollection 2023 Jan 11.

The landscape of expression and alternative splicing variation across human traits

Raquel García-Pérez et al. Cell Genom.

Abstract

Understanding the consequences of individual transcriptome variation is fundamental to deciphering human biology and disease. We implement a statistical framework to quantify the contributions of 21 individual traits as drivers of gene expression and alternative splicing variation across 46 human tissues and 781 individuals from the Genotype-Tissue Expression project. We demonstrate that ancestry, sex, age, and BMI make additive and tissue-specific contributions to expression variability, whereas interactions are rare. Variation in splicing is dominated by ancestry and is under genetic control in most tissues, with ribosomal proteins showing a strong enrichment of tissue-shared splicing events. Our analyses reveal a systemic contribution of types 1 and 2 diabetes to tissue transcriptome variation with the strongest signal in the nerve, where histopathology image analysis identifies novel genes related to diabetic neuropathy. Our multi-tissue and multi-trait approach provides an extensive characterization of the main drivers of human transcriptome variation in health and disease.

Keywords: BMI; age; alternative splicing; ancestry; diabetes; gene expression; human traits; sex; tissue; transcriptome.

Conflict of interest statement

The authors declare no competing interests.

Figures

Graphical abstract
Figure 1
Contributions of demographic traits to gene expression variation. (A) Number of DEGs per tissue and demographic trait. Heatmap cell colors are normalized to the maximum value per trait. Tissues are sorted by sample size, and tissue labels correspond to tissue names described in Figure S1. (B) Proportion of total tissue expression variation explained by each demographic trait. Top bars show the number of tissues for which each demographic trait explains the largest proportion of variation. (C) Mean gene expression variation explained by each demographic trait in each tissue. This decreases with sample size since larger numbers of samples provide power to detect smaller contributions. (D) Examples of genes with a large proportion of expression variation explained by a demographic trait.
Figure 2
Tissue sharing of DEGs and contribution of genetic variants to expression differences between human populations. (A) Distribution of the number of tissues in which a gene is DE with each demographic trait. Labeled ancestry- and age-DEGs correspond to highly tissue-shared genes enriched in glutathione-related metabolic processes and p53 pathways, respectively. For sex-DEGs, black points correspond to highly shared X-chromosome-inactivation (XCI) escapees and Y genes. As examples, the well-known XCI escapee XIST and the ubiquitously transcribed Y gene UTY are labeled. The top three most tissue-shared BMI-DEGs are labeled. Bottom bars show the proportions of tissue-specific DEGs and DEGs in a low (2–5), moderate (6–9), or high (≥10) number of tissues with each demographic trait. (B) Median tissue expression values for a highly tissue-shared ancestry (top) and age (bottom) DEG (EA, European American; AA, African American). (C) Percentage of cis-driven DEGs out of the total ancestry eGenes that are DE, across tissues. (D) Example of a cis-driven DEG in sun-exposed skin. Left: bar plot showing the allele frequency of the eQTL variant in each population. Right: violin plots of PWP2 expression levels stratified by population and individual genotype. (E) cis-driven DEGs are associated with eQTLs with larger Fst values (Wilcoxon signed-rank test, p = 1.9e−10). Violin plots show the distribution of tissue median Fst values for cis-driven (left) and cis-independent (right) DEGs. (F) cis-driven DEGs are more tissue shared (Wilcoxon signed-rank test, p = 3.9e−07). Violin plots show the distribution of median tissue-sharing values for cis-driven (left) and cis-independent (right) DEGs. (G) cis-eQTLs explain a larger amount of gene expression variation than ancestry (Wilcoxon signed-rank test, p = 2.8e−14). Violin plots show the distribution of tissue median gene expression variation explained by eQTLs in cis-driven and cis-independent DEGs (left) and by ancestry in cis-independent DEGs (right).
Figure 3
Additive contributions of demographic traits are common and interactions rare. (A) Bar plot indicating the number of tissues with at least 20 DEGs with two demographic traits, of those how many have a significant overlap, and of those how many have a significant bias in the direction of change. (B) Proportion of DEGs with sex and age (left) or sex and BMI (right) in each tissue that fall in each color-coded category. (C and D) Left: examples of two tissues with more DEGs with two traits than expected that also have a bias in the direction of change. The scatterplots show the log2 fold change associated with each demographic trait, and each point represents a gene. Red, genes in categories with larger observed versus expected ratios. Labeled genes are among the ones with larger fold changes with both traits. Right: violin plots of expression levels for example genes, stratified by age range or sex. Bars at the bottom indicate the proportion of expression variation explained by each demographic trait. (E) Comparison of age fold changes calculated separately for males and females in genes with a significant interaction between sex and age (Wilcoxon signed-rank test, p < 2.2e−16). (F) Example of a gene with a significant interaction between sex and age: its expression increases with age in males but decreases with age in females. Expression levels are stratified by sex and age range.
Figure 4
Contribution of demographic traits to AS variation. (A) Schematic illustration of the different types of splicing events. For each type of splicing event, we present the spliced-in and spliced-out versions of the splicing event. In black is the exonic/intronic sequence that is included in the spliced-in isoform and for which PSI values are calculated. (B) Cumulative distribution of the number of tissues in which splicing events are AS. (C) Functional characterization of AS events. (D) Proportion of AS events associated with a switch between a non-coding and a coding isoform per type of event. Boxplots show the distribution of the proportion of AS events per tissue. (E) Number of DSEs per tissue and demographic trait. Heatmap cell colors are normalized to the maximum value per column. (F) Proportion of the total tissue AS variation explained by each demographic trait. Top bars are the numbers of tissues for which each demographic trait explains the largest proportion. (G) Examples of the potential functional consequences of DSEs. Shown are schematic representations of the PFAM domain and the transcript structure of isoforms that either include or exclude the splicing event and that contribute to the DSEs. For each event, PSI values are represented as boxplots with samples stratified by population or age range. Violin plots show the PSI distribution. Points correspond to individual PSI values. The number of individuals in each group is shown within the plot. Bars at the bottom indicate the proportion of alternative splicing variation explained by each demographic trait. (H) Comparison of the relative contribution of each demographic trait to the total tissue expression and splicing variation explained. For each trait, the average value across tissues is plotted. The error bars correspond to the standard deviation. For each demographic trait, we considered only tissues with at least five DEGs and five DSEs.
Figure 5
Splicing patterns of ribosomal proteins vary across human populations. (A) Percentage of cis-driven DSEs across tissues. (B) cis-driven DSEs are associated with sQTLs with larger Fst values (Wilcoxon signed-rank test, p = 1.563e−12). Violin plots show the distribution of tissue median Fst values for cis-driven (left) and cis-independent (right) DSEs. (C) cis-driven DSEs are more tissue shared (Wilcoxon signed-rank test, p = 1.395e−06). Violin plots show the distribution of median tissue-sharing values for cis-driven (left) and cis-independent (right) DSEs. (D) cis-sQTLs explain a larger amount of splicing variation than ancestry (Wilcoxon signed-rank test, p = 1.421e−13). Violin plots show the distribution of tissue median splicing variation explained by sQTLs in cis-driven and cis-independent DSEs (left) and by ancestry in cis-independent DSEs (right). (E) Distribution of the number of tissues in which a splicing event is DS with each demographic trait. Ancestry-DSEs in ribosomal proteins are highlighted in black and labeled if shared in 10 or more tissues. Bottom bars show the proportions of tissue-specific DSEs and DSEs in a low (2–5), moderate (6–9), or high (≥10) number of tissues with each demographic trait. (F) Functional enrichment of genes with highly shared ancestry-DSEs. (G) Bar plot shows the proportion and number of ancestry-DSEs in ribosomal proteins in two or more tissues that have the same or different directionality. (H) Example of a genetic variant associated with the splicing pattern of a ribosomal protein not previously reported as an sGene. Bar plot shows the allele frequencies in each population. Violin plots show the PSI distribution stratified by population and genotype. Points correspond to individual PSI values (EA, European American; AA, African American). (I) Examples of two highly tissue-shared ancestry-DSEs on ribosomal proteins that affect a protein-coding domain.
Figure 6
Types 1 and 2 diabetes alter the transcriptome of multiple tissues, especially of the tibial nerve. (A) Clinical traits and affected tissues. (B) Number of DEGs per tissue and clinical trait. Bars are colored according to the tissues. (C) Proportion of the total tissue expression variation explained by each clinical trait. (D) DEGs with type 1 or 2 diabetes in three or more tissues. The y axis corresponds to the fold change between healthy and diseased samples per tissue. In bold are known disease-related genes. (E) Left: the overlap between DEGs with types 1 and 2 diabetes in the tibial nerve. Right: DEGs shared by both types of diabetes have the same directionality. Genes driving functional enrichments (Table S6H) are labeled in the plot. (F) On the left, tissue images from a healthy and a diabetic individual. Note the larger diameter of the fascicles (circles) and smaller interstitial spaces in the diabetic donor, consistent with previous observations. The right shows the ROC curves of the top and a median performing classifier. (G) Top: DEGs with age and clinical traits show a biased directionality. Our observations are not confounded by age differences between healthy and diseased individuals (Figure S7F). Bars indicate the proportion (and number) of DEGs in each of the four possible directionalities. Bottom: scatterplot shows the fold change associated with age (x axis) versus the fold change between healthy and type 2 diabetes (y axis) in the tibial nerve. Labeled genes (24) have been previously associated with type 2 diabetes susceptibility in the tibial nerve through transcriptome-wide association studies. (H) LPL expression changes with age and type 2 diabetes. Gene expression levels are represented as boxplots with samples stratified by both traits. Bars at the bottom indicate the proportions of expression variation explained by age and type 2 diabetes.
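
The captions for Figures 4 and 5 refer to PSI (percent spliced in) values for splicing events but do not say how they were computed. As a rough illustration only, one common read-count definition is the fraction of informative reads that support the spliced-in form. The sketch below assumes that definition; the function names and the read counts are hypothetical and are not taken from the paper.

```python
from typing import List, Optional, Tuple


def psi(inclusion_reads: int, exclusion_reads: int) -> Optional[float]:
    """PSI under the assumed read-count definition:
    inclusion / (inclusion + exclusion).
    Returns None when no informative reads cover the event."""
    total = inclusion_reads + exclusion_reads
    if total == 0:
        return None
    return inclusion_reads / total


def mean_psi(samples: List[Tuple[int, int]]) -> Optional[float]:
    """Mean PSI over samples, skipping samples with no coverage."""
    values = [psi(inc, exc) for inc, exc in samples]
    covered = [v for v in values if v is not None]
    if not covered:
        return None
    return sum(covered) / len(covered)


def delta_psi(group_a: List[Tuple[int, int]],
              group_b: List[Tuple[int, int]]) -> Optional[float]:
    """Difference in mean PSI between two groups (group_a minus group_b)."""
    a, b = mean_psi(group_a), mean_psi(group_b)
    if a is None or b is None:
        return None
    return a - b


# Hypothetical (inclusion, exclusion) read counts for one splicing event
# in two groups of samples, for example two populations or age ranges.
group_a = [(30, 10), (20, 20)]
group_b = [(10, 30), (10, 10)]
print(delta_psi(group_a, group_b))
```

A positive difference means the event is included more often in the first group. Deciding whether an event is differentially spliced would also require a statistical model across individuals, which this sketch does not attempt.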
