Genome Biol. 2020 Sep 11;21(1):235.
doi: 10.1186/s13059-020-02129-6.

sn-spMF: matrix factorization informs tissue-specific genetic regulation of gene expression

Yuan He et al. Genome Biol.

Abstract

Genetic regulation of gene expression, revealed by expression quantitative trait loci (eQTLs), exhibits complex patterns of tissue-specific effects. Characterization of these patterns may allow us to better understand mechanisms of gene regulation and disease etiology. We develop a constrained matrix factorization model, sn-spMF, to learn patterns of tissue-sharing and apply it to 49 human tissues from the Genotype-Tissue Expression (GTEx) project. The learned factors reflect tissues with known biological similarity and identify transcription factors that may mediate tissue-specific effects. sn-spMF, available at https://github.com/heyuan7676/ts_eQTLs, can be applied to learn biologically interpretable patterns of eQTL tissue-specificity and generate testable mechanistic hypotheses.

Keywords: Matrix factorization; Tissue-specific eQTLs; Transcription factors; Ubiquitous eQTLs.

Conflict of interest statement

FA is an inventor on a patent application related to TensorQTL; HKI has received speaker honoraria from GSK and AbbVie.

Figures

Fig. 1
Matrix factorization model to dissect eQTL effects across tissues. a Simplified examples of the relationship between eQTL effect sizes and factors. eQTL1: the effect of an eQTL in the spleen can be represented by a spleen-specific factor. eQTL2: the effect of an eQTL in all nine tissues can be summarized as a ubiquitous effect across all tissues. eQTL3: the effect of an eQTL in four brain tissues and three skin tissues can be summarized as the summation of brain-specific effect and skin-specific effect. b Learning factors underlying eQTL effects from GTEx. X matrix represents the effect size of eQTLs across tissues (see the “Methods” section). Patterns of tissue-sharing and tissue-specificity are observed in X. Matrix factorization is implemented to learn the factor matrix F, where each factor captures a pattern of eQTL effect sizes across tissues. c Matrix W represents the weights for each eQTL across tissues. Each weight is the reciprocal of the standard error. d The objective function in sn-spMF, where α and λ are sparsity penalty parameters, and D is the number of eQTLs
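The objective in panel d appears only in the figure, so the following is a minimal sketch of one plausible form rather than the authors' implementation. It assumes a weighted squared reconstruction error scaled by 1/(2D), plus L1 penalties α on the loadings and λ on the factors. The loading matrix name L, the tissues-by-factors layout of F, the non-negativity check on F, and both function names are assumptions made for illustration.

```python
def weights_from_se(se):
    """Each weight is the reciprocal of the standard error (Fig. 1c)."""
    return [[1.0 / s for s in row] for row in se]


def snspmf_objective(X, W, L, F, alpha, lam):
    """Evaluate an assumed sn-spMF-style objective.

    X, W: D x T nested lists (eQTLs x tissues): effect sizes and weights.
    L:    D x K loadings (hypothetical name; not given in the caption).
    F:    T x K factors, assumed non-negative in this sketch.
    """
    D, T, K = len(X), len(X[0]), len(F[0])
    if any(v < 0 for row in F for v in row):
        raise ValueError("this sketch assumes a non-negative factor matrix")
    sq_err = 0.0
    for d in range(D):
        for t in range(T):
            fitted = sum(L[d][k] * F[t][k] for k in range(K))
            sq_err += (W[d][t] * (X[d][t] - fitted)) ** 2
    l1_loadings = sum(abs(v) for row in L for v in row)
    l1_factors = sum(abs(v) for row in F for v in row)
    return sq_err / (2 * D) + alpha * l1_loadings + lam * l1_factors


# Toy data (made up): two eQTLs, two tissues, two factors, exact fit.
X = [[1.0, 0.0], [0.5, 0.5]]
W = weights_from_se([[1.0, 1.0], [1.0, 1.0]])
L = [[1.0, 0.0], [0.0, 0.5]]
F = [[1.0, 1.0], [0.0, 1.0]]
objective = snspmf_objective(X, W, L, F, alpha=0.1, lam=0.2)
```

With an exact fit the error term vanishes, so the value is driven entirely by the two penalties; raising α or λ favors sparser loadings or factors, respectively.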
Fig. 2
Assignment of eQTLs to factors. Effect sizes and 95% confidence intervals of four eQTLs across 49 tissues are illustrated. The fitted linear combination of factors for the eQTL is displayed in gray scale at the right of each panel. Faded colors indicate factors with coefficients with FDR ≥ 0.05. Asterisk on the tissue indicates that this eQTL was significant with FDR < 0.05 in that tissue. a A liver-specific eQTL (GLT1D1-rs1012994). b An eQTL (AATF-rs76014915) with activity in brain tissues and tibial nerve. c A ubiquitous eQTL (U2AF1-rs234719). d An eQTL (CD14-rs2563249) with ubiquitous and testis-specific effects
Fig. 3
Identification of tissue-specific and ubiquitous eQTLs. a Fraction of tested eQTLs that load on each factor. b Fraction of eQTLs that load on ubiquitous and tissue-specific factors. c The overlap of tested eQTLs that loaded on the ubiquitous factor (u-eQTLs) and any tissue-specific factor (ts-eQTLs). d Fraction of eQTLs that load on different numbers of tissue-specific factors. eQTLs that load with a specific number of ts-factors can fall into one of two categories: those with the ubiquitous factor and those with only ts-factors. The figure shows the fraction of tested eQTLs that load on each number of ts-factors with colors to show the contribution for each category. e Fraction of eQTLs with activity in different numbers of tissues. The numbers of unique tissues represented in the set of factors for each eQTL are summed
Fig. 4
Enriched GO terms for eQTL genes from sn-spMF at FDR < 0.1. Color represents the level of enrichment (−log10(P value)). The significantly enriched GO terms are annotated by numbers representing the odds ratio (OR). To compute the OR for each factor, background genes include all genes tested for the represented tissues in the factor. GO terms and factors are ordered by hierarchical clustering. Examples of relevant GO terms in related tissues are annotated
Fig. 5
Enrichment of TFBS for u-eQTLs and ts-eQTLs. a Number of TFs whose binding sites are enriched for eQTLs across factors at FDR < 0.05 for sn-spMF, flashr bf, and heuristic 1 methods. Enh, enhancers; TssA, active transcription start sites. b Total number of TFs with binding sites enriched for either only u-eQTLs, or only ts-eQTLs, or both. c Distribution of the number of tissue-specific factors each TF is enriched in. d–f Enrichment for example TFs among eQTLs across each factor (−log10(P value)) where the TF was expressed in corresponding tissues for d FOSL2, e GATA4, and f HNF4A. Black bars indicate a BH-corrected P value < 0.05
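The BH-corrected P values mentioned in this caption refer to the Benjamini-Hochberg procedure. As a rough illustration, the standard step-up adjustment can be sketched in plain Python as follows; the function name and the example P values are hypothetical and are not taken from the paper.

```python
def benjamini_hochberg(pvalues):
    """Return BH-adjusted P values in the same order as the input."""
    m = len(pvalues)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up enrichment P values for four hypothetical factors.
raw = [0.01, 0.04, 0.03, 0.5]
adj = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(adj) if q < 0.05]
```

Here only the first test stays below 0.05 after adjustment, even though three raw P values were below that threshold.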
Fig. 6
Example liver-specific eQTL, TNKS-rs9987289, in a TFBS of HNF4A that co-localizes with liver-specific phenotypes. a Effect size and 95% confidence interval of TNKS-rs9987289 across 49 tissues in GTEx. b Allele-specific HNF4A ChIP-seq reads over rs9987289 in the liver (see the “Methods” section, two-sided binomial test P value = 8.8 × 10^−5). c Normalized expression levels of TNKS in the liver among individuals with different genotypes at rs9987289. P value = 3.4 × 10^−4 from GTEx eQTL analysis. d Schematic illustration of hypothesized mechanism: allele-specific binding of HNF4A at rs9987289 and altered levels of expression of TNKS. e Manhattan plot (LocusZoom v0.4.8) [54] of TNKS expression levels in the liver around rs9987289. f Manhattan plot for LDL GWAS around rs9987289
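Panel b reports a two-sided binomial test on allele-specific ChIP-seq reads. The caption gives neither the read counts nor the exact two-sided convention, so the sketch below is only an illustration: it assumes a null of equal reads from both alleles (p = 0.5) and sums the probabilities of all outcomes no more likely than the observed one. The function name and the read counts are hypothetical.

```python
from math import comb


def binomial_two_sided_p(k, n, p=0.5):
    """Two-sided exact binomial P value for k successes in n trials.

    Sums the probabilities of every outcome whose probability does not
    exceed that of the observed count (an assumed convention).
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    # Small relative tolerance so ties are not lost to rounding.
    threshold = pmf[k] * (1 + 1e-7)
    return min(1.0, sum(q for q in pmf if q <= threshold))


# Hypothetical counts: 0 of 4 reads carry the reference allele.
p_imbalanced = binomial_two_sided_p(0, 4)
# Hypothetical counts: 2 of 4 reads carry the reference allele.
p_balanced = binomial_two_sided_p(2, 4)
```

A small P value under this null would suggest that reads over the site come preferentially from one allele, which is the kind of evidence the caption cites for allele-specific binding.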

References

    1. GTEx Consortium. Genetic effects on gene expression across human tissues. Nature. 2017;550:204–13.
    2. Nica AC, Parts L, Glass D, Nisbet J, Barrett A, Sekowska M, Travers M, Potter S, Grundberg E, Small K, Hedman AK, Bataille V, Bell J, Surdulescu G, Dimas AS, Ingle C, Nestle FO, Di Meglio P, Min J, Spector T. The architecture of gene regulatory variation across multiple human tissues: the MuTHER study. PLoS Genet. 2011;7:e1002003.
    3. Battle A, Mostafavi S, Zhu X, Potash J, Weissman M, McCormick C, Haudenschild C, Beckman K, Shi J, Mei R, Urban A, Montgomery SB, Levinson DF, Koller D. Characterizing the genetic basis of transcriptome diversity through RNA-sequencing of 922 individuals. Genome Res. 2013;24. doi:10.1101/gr.155192.113.
    4. Innocenti F, Cooper GM, Stanaway I, Gamazon E, Smith JD, Mirkov S, Ramirez J, Liu W, Lin YS, Moloney C, Force Aldred S, Trinklein ND, Schuetz E, Nickerson DA, Thummel KE, Rieder MJ, Rettie A, Ratain MJ, Cox NJ, Brown C. Identification, replication, and functional fine-mapping of expression quantitative trait loci in primary human liver tissue. PLoS Genet. 2011;7:e1002078.
    5. Gibbs J, van der Brug MP, Hernandez D, Traynor BJ, Nalls MA, Lai S-L, Arepalli S, Dillman A, Rafferty I, Troncoso J, Johnson R, Zielke HR, Ferrucci L, Longo D, Cookson M, Singleton AB. Abundant quantitative trait loci exist for DNA methylation and gene expression in human brain. PLoS Genet. 2010;6:e1000952.