Genetics. 2011 Dec;189(4):1449-59.
doi: 10.1534/genetics.111.131425. Epub 2011 Sep 16.

Bayesian detection of expression quantitative trait loci hot spots

Leonardo Bottolo et al. Genetics. 2011 Dec.

Abstract

High-throughput genomics allows genome-wide quantification of gene expression levels in tissues and cell types and, when combined with sequence variation data, permits the identification of genetic control points of expression (expression QTL or eQTL). Clusters of eQTL influenced by single genetic polymorphisms can inform on hotspots of regulation of pathways and networks, although very few hotspots have been robustly detected, replicated, or experimentally verified. Here we present a novel modeling strategy to estimate the propensity of a genetic marker to influence several expression traits at the same time, based on a hierarchical formulation of related regressions. We implement this hierarchical regression model in a Bayesian framework using a stochastic search algorithm, HESS, that efficiently probes sparse subsets of genetic markers in a high-dimensional data matrix to identify hotspots and to pinpoint the individual genetic effects (eQTL). Simulating complex regulatory scenarios, we demonstrate that our method outperforms current state-of-the-art approaches, in particular when the number of transcripts is large. We also illustrate the applicability of HESS to diverse real-case data sets, in mouse and human genetic settings, and show that it provides new insights into regulatory hotspots that were not detected by conventional methods. The results suggest that the combination of our modeling strategy and algorithmic implementation provides significant advantages for the identification of functional eQTL hotspots, revealing key regulators underlying pathways.

Figures

Figure 1
ROC curves for hotspot detection using HESS (blue line), MOM (red line), BAYES (green line), and M-SPLS (black star) in the four simulated scenarios (Figure S2). From top to bottom, left to right: SIM1, q = 100 and six hotspots; SIM2, q = 100 and three hotspots; SIM3, q = 1000 and six hotspots; SIM4, q = 1000 and three hotspots. For M-SPLS, type I error and power were calculated conditionally on the list of latent vector components. (Top) MOM is indicated by a red dashed line to highlight that it is not designed for cases in which the number of markers is larger than the number of traits.
Figure 2
ROC curves for transcript–marker associations using HESS (blue line), MOM (red line), BAYES (green line), and M-SPLS (black star) in the four simulated scenarios (Figure S2). From top to bottom, left to right: SIM1, q = 100 and six hotspots; SIM2, q = 100 and three hotspots; SIM3, q = 1000 and six hotspots; SIM4, q = 1000 and three hotspots. For M-SPLS, power is calculated conditionally on the list of transcript–marker associations selected by bootstrap confidence interval at a fixed type I error (α = 10⁻⁴, 10⁻³, 10⁻², 0.05). (Top) MOM is indicated by a red dashed line to highlight that it is not designed for cases in which the number of markers is larger than the number of traits.
Figure 3
Proportion of transcripts associated with each marker in the mouse data example (n = 60, P = 145, and q = 1573). Transcript–marker association was declared at 5% local FDR with marginal probability of inclusion >0.95. The 16 red triangles indicate markers (two of them are overlapping and hence are not distinguishable) that have been identified as hotspots with tail posterior probability >0.8.
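The quantity plotted in Figure 3 can be illustrated with a minimal sketch. It assumes the marginal probabilities of inclusion are available as a q × P table (one row per transcript, one column per marker) and applies only the >0.95 threshold from the caption; the local FDR step is not reproduced. The function name `hotspot_proportions`, the data layout, and the toy values are hypothetical and are not taken from the HESS software. Hotspots in the figure are called from the tail posterior probability, not from this proportion.

```python
from typing import List


def hotspot_proportions(mpi: List[List[float]], threshold: float = 0.95) -> List[float]:
    """Return, for each marker, the fraction of transcripts whose marginal
    probability of inclusion exceeds `threshold`.

    mpi[k][j] is the (assumed) marginal probability of inclusion for
    transcript k and marker j.
    """
    q = len(mpi)      # number of transcripts
    p = len(mpi[0])   # number of markers
    counts = [0] * p
    for row in mpi:
        for j, prob in enumerate(row):
            if prob > threshold:
                counts[j] += 1
    return [c / q for c in counts]


# Toy input: 4 transcripts x 3 markers (made-up values for illustration only).
toy = [
    [0.99, 0.10, 0.02],
    [0.97, 0.05, 0.96],
    [0.98, 0.20, 0.01],
    [0.40, 0.03, 0.00],
]
proportions = hotspot_proportions(toy)
print(proportions)
```

In this toy table the first marker is associated with three of the four transcripts, which is the kind of column that would stand out as a candidate hotspot in a plot like Figure 3.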
Figure 4
Heat map of the marginal probabilities of inclusion for each transcript–marker pair in the mouse data example (n = 60, P = 145, and q = 1573). The 16 red triangles indicate markers that have been identified as hotspots with tail posterior probability >0.8.
Figure 5
Tail posterior probability for each marker in the human data example (Gutenberg Heart Study, n = 1490, P = 209, and q = 648). Red triangles indicate markers that have been identified as hotspots with tail posterior probability >0.8. The vertical gray line highlights the physical position of the annotated SNPs rs9557217 and rs9585056, which were previously associated with the IDIN network in the Cardiogenics Study cohort and with EBI2 expression (Heinig et al. 2010). Thick horizontal bars at the top of the figure display the physical positions of genes in the 1-Mb region, obtained from the Ensembl database.

References

    1. Altshuler D., Brooks L. D., Chakravarti A., Collins F. S., Daly M. D., et al., 2005. A haplotype map of the human genome. Nature 437: 1299–1320
    2. Banerjee S., Yandell B. S., Yi N., 2008. Bayesian quantitative trait loci mapping for multiple traits. Genetics 179: 2275–2289
    3. Bottolo L., Richardson S., 2010. Evolutionary stochastic search for Bayesian model exploration. Bayesian Anal. 5: 583–618
    4. Bottolo L., Chadeau-Hyam M., Hastie D. I., Langley S. R., Petretto E., et al., 2011. ESS++: a C++ objected-oriented algorithm for Bayesian stochastic search model exploration. Bioinformatics 27: 587–588
    5. Breitling R., Li Y., Tesson B. M., Fu J., Wiltshire T., et al., 2008. Genetical genomics: spotlight on QTL hotspots. PLoS Genet. 4: e1000232