Genetics. 2009 May;182(1):79-90.
doi: 10.1534/genetics.109.100362. Epub 2009 Mar 6.

Expression quantitative trait loci mapping with multivariate sparse partial least squares regression

Hyonho Chun et al. Genetics. 2009 May.

Abstract

Expression quantitative trait loci (eQTL) mapping concerns finding genomic variation to elucidate variation of expression traits. This problem poses significant challenges due to high dimensionality of both the gene expression and the genomic marker data. We propose a multivariate response regression approach with simultaneous variable selection and dimension reduction for the eQTL mapping problem. Transcripts with similar expression are clustered into groups, and their expression profiles are viewed as a multivariate response. Then, we employ our recently developed sparse partial least-squares regression methodology to select markers associated with each cluster of genes. We demonstrate with extensive simulations that our eQTL mapping with multivariate response sparse partial least-squares regression (M-SPLS eQTL) method overcomes the issue of multiple transcript- or marker-specific analyses, thereby avoiding potential elevation of type I error. Additionally, joint analysis of multiple transcripts by multivariate response regression increases power for detecting weak linkages. We illustrate that M-SPLS eQTL compares competitively with other approaches and has a number of significant advantages, including the ability to handle highly correlated genotype data and computational efficiency. We provide an application of this methodology to a mouse data set concerning obesity and diabetes.
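
The core idea of the abstract, selecting markers jointly for a cluster of transcripts treated as a multivariate response, can be illustrated with a minimal one-component sketch. This is not the authors' implementation: the function names (`soft_threshold`, `first_sparse_direction`), the tuning parameter `eta`, the simulated data, and the shortcut of soft-thresholding the leading singular vector of the marker-by-transcript cross-product matrix are all assumptions made for illustration. The published method is more elaborate and involves further components and tuning.

```python
import numpy as np

def soft_threshold(v, eta):
    """Shrink entries of v toward zero; the cutoff is eta * max|v| (assumed form)."""
    cutoff = eta * np.max(np.abs(v))
    return np.sign(v) * np.maximum(np.abs(v) - cutoff, 0.0)

def first_sparse_direction(X, Y, eta=0.5):
    """Return one sparse marker-weight vector for a cluster of transcripts.

    X: samples x markers genotype matrix.
    Y: samples x transcripts expression matrix (one cluster).
    """
    Xc = X - X.mean(axis=0)            # center markers
    Yc = Y - Y.mean(axis=0)            # center transcripts
    M = Xc.T @ Yc                      # markers x transcripts cross-product
    U, _, _ = np.linalg.svd(M, full_matrices=False)
    w = soft_threshold(U[:, 0], eta)   # sparsify the leading direction
    norm = np.linalg.norm(w)
    return w / norm if norm > 0 else w

# Simulated example (hypothetical data): 50 markers, 8 transcripts in a
# cluster, and two markers (indices 4 and 17) linked to every transcript.
rng = np.random.default_rng(0)
n, p, q = 200, 50, 8
X = rng.integers(0, 3, size=(n, p)).astype(float)   # genotype codes 0/1/2
B = np.zeros((p, q))
B[[4, 17], :] = 1.0
Y = X @ B + rng.normal(scale=1.0, size=(n, q))

w = first_sparse_direction(X, Y, eta=0.5)
selected = np.flatnonzero(w)           # markers with nonzero weight
print("selected markers:", selected)
```

Because the weight vector is computed once for the whole cluster, the sketch performs a single selection step rather than one test per transcript-marker pair, which is the multiplicity issue the abstract describes.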

Figures

Figure 1.
(A) Set of true linkages. (B) Absolute values of the linkages estimated by M-SPLS regression. (C) Absolute values of the estimated linkages after considering bootstrap confidence intervals. In A–C, the x-axis represents markers, and the y-axis (on the left) represents transcripts. The shading of each pixel represents the strength of linkage signal. (D) Ninety-five percent C.I.'s for marker 137 across all the transcripts in the cluster. The y-axis depicts the size of the coefficients.
Figure 2.
Results for simulation C-1 (noisy cluster with many nonmapping transcripts). Symbols represent different numbers of markers (○, r = 3; •, r = 10) associated with ρ ∈ {0.1, 0.3, 0.6, 0.9} proportion of transcripts in the cluster. Different line types indicate weak (dashed line) or strong (solid line) control by a single eQTL architecture.
Figure 3.
Results for simulations C-2 (heterogeneous cluster: C-2.1 represents weaker control by the two eQTL architectures compared to C-2.2) and C-3 (cluster with weak linkages). Top panels represent type-I error and power for U-SPLS (U) and M-SPLS (M) with vertical lines representing simulation standard errors. Bottom panels report the proportion of linked transcripts for each marker by U-SPLS (bottom left panel) and M-SPLS (bottom right panel) in simulation C-3. Hotspot markers are indicated with solid lines.
Figure 4.
M-SPLS solution for a cluster of 83 transcripts including the 3 lipid metabolism transcripts.

