Genetics. 2023 May 26;224(2):iyad060.
doi: 10.1093/genetics/iyad060.

A gene-level test for directional selection on gene expression

Laura L Colbran et al. Genetics. 2023.

Abstract

Most variants identified in human genome-wide association studies and scans for selection are noncoding. Interpretation of their effects and the way in which they contribute to phenotypic variation and adaptation in human populations is therefore limited by our understanding of gene regulation and the difficulty of confidently linking noncoding variants to genes. To overcome this, we developed a gene-wise test for population-specific selection based on combinations of regulatory variants. Specifically, we use the QX statistic to test for polygenic selection on cis-regulatory variants based on whether the variance across populations in the predicted expression of a particular gene is higher than expected under neutrality. We then applied this approach to human data, testing for selection on 17,388 protein-coding genes in 26 populations from the Thousand Genomes Project. We identified 45 genes with significant evidence (FDR<0.1) for selection, including FADS1, KHK, SULT1A2, ITGAM, and several genes in the HLA region. We further confirm that these signals correspond to plausible population-level differences in predicted expression. While the small number of significant genes (0.2%) is consistent with most cis-regulatory variation evolving under genetic drift or stabilizing selection, it remains possible that there are effects not captured in this study. Our gene-level QX score is independent of standard genomic tests for selection, and may therefore be useful in combination with traditional selection scans to specifically identify selection on regulatory variation. Overall, our results demonstrate the utility of combining population-level genomic data with functional data to understand the evolution of gene expression.
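The QX calculation described above can be illustrated with a minimal sketch. It assumes the commonly stated form of the statistic, QX = Z'ᵀ F⁻¹ Z' / (2 V_A), where Z' holds the mean-centered predicted genetic values per population, F is the population covariance estimated from matched non-associated variants, and V_A = 4 Σ α²ε(1−ε). The function names, array layout (populations × variants), and random toy data are hypothetical, and this is not the authors' pipeline. The split into a within-variant (FST-like) and a cross-variant (LD-like) term follows the decomposition mentioned in the caption of Fig. 2b.

```python
import numpy as np

def centering_matrix(m):
    """(m-1) x m matrix that mean-centers across populations and drops the last one."""
    return (np.eye(m) - np.full((m, m), 1.0 / m))[:-1]

def neutral_covariance(neutral_freqs):
    """Estimate F from an (m x k) array of matched, non-associated variant frequencies."""
    m, k = neutral_freqs.shape
    eps = neutral_freqs.mean(axis=0)  # mean frequency of each variant
    x = centering_matrix(m) @ neutral_freqs / np.sqrt(eps * (1.0 - eps))
    return x @ x.T / k

def additive_variance(freqs, effects):
    """V_A = 4 * sum(alpha^2 * eps * (1 - eps)) over the gene's regulatory variants."""
    eps = freqs.mean(axis=0)
    return 4.0 * np.sum(effects**2 * eps * (1.0 - eps))

def qx(freqs, effects, F):
    """QX for one gene; freqs is (m populations x L variants), effects has length L."""
    z = centering_matrix(freqs.shape[0]) @ (2.0 * freqs @ effects)
    return float(z @ np.linalg.solve(F, z) / (2.0 * additive_variance(freqs, effects)))

def qx_components(freqs, effects, F):
    """Split QX into within-variant (FST-like) and cross-variant (LD-like) terms."""
    q = centering_matrix(freqs.shape[0]) @ freqs  # (m-1) x L centered frequencies
    # pairwise[l, l2] = alpha_l * alpha_l2 * q_l^T F^-1 q_l2
    pairwise = np.outer(effects, effects) * (q.T @ np.linalg.solve(F, q))
    scale = 4.0 / (2.0 * additive_variance(freqs, effects))
    fst_like = scale * np.trace(pairwise)
    ld_like = scale * (pairwise.sum() - np.trace(pairwise))
    return float(fst_like), float(ld_like)

# Toy data (hypothetical): 4 populations, 6 regulatory variants, 500 matched variants.
rng = np.random.default_rng(0)
neutral = rng.uniform(0.05, 0.95, size=(4, 500))
freqs = rng.uniform(0.05, 0.95, size=(4, 6))
effects = rng.normal(size=6)

F = neutral_covariance(neutral)
total = qx(freqs, effects, F)
fst_like, ld_like = qx_components(freqs, effects, F)
```

Under neutrality this form is approximately chi-squared distributed with one fewer degree of freedom than the number of populations. The gamma-corrected P-value used in the article is not reproduced here.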

Keywords: evolution; gene regulation; human evolution; quantitative genetics; selection.

Conflict of interest statement

The authors report no conflicts of interest.

Figures

Fig. 1.
We adapted the QX statistic to test for selection on regulatory variants. a) Spearman ρ between observed and predicted expression in 1 kG for 7,251 JTI models trained in GTEx LCLs, and b) that ρ plotted against the in-sample training R². c) Schematic of the QX calculation as applied to JTI models. The QX score is based on the JTI effect sizes and allele frequencies across populations for the set of regulatory variants for each gene. The F matrix contains frequencies across the same populations for variants that had the same frequency in the JTI study population but were not associated with expression of that gene, thereby modeling the expected covariance for these variants. QX is higher when the regulatory variants for a gene exceed those expected patterns.
Fig. 2.
Using a gamma-corrected P-value, we identify 45 genes with evidence of selection. a) Power curves for each P-value method, based on simulations (Methods). We calculated the power for the gamma-corrected version of the QX test, as well as for 2 variations of the effect-permuted test. In the first, we drew the effect sizes from the simulation that modeled the corresponding selection strength, and in the second, from the neutral effect model. FOS, fitness optimum shift. b) The QX score can be decomposed into its FST-like component and its LD-like component. Significant (FDR<0.1) genes in 1 kG for the gamma-corrected and effect-permuted P-values are highlighted in blue and red, respectively, while the red line indicates where FST=LD. The QX score for each gene is obtained by adding the two components together. The Spearman rank correlation between the components is 0.15. c) Manhattan plot and d) QQ-plot for gamma-corrected QX P-values for 17,388 genes. The horizontal line in (c) at −log10(p) = 3.95 corresponds to FDR<0.1.
Fig. 3.
The QX statistic is not correlated with other selection statistics. Pairwise heatmap of Spearman rank correlations between QX P-values and various selection-related scores.
Fig. 4.
Different combinations of variant effects can drive predicted differences. a) Median predicted expression in each 1 kG population for the top gene in each peak of the gamma-corrected P-values. For display purposes, predicted expression for each gene is standardized across populations. b) For FADS1 there is one primary haplotype (tagged by rs174549), while c) for ACO2 there are 3 variants driving the upregulation in PEL. Cells are colored by the product of the JTI effect size and the effect allele frequency in each population. Values in b) and c) are mean-centered for each variant.
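The per-cell quantity described for Fig. 4b and 4c (effect size times effect allele frequency, mean-centered per variant) can be sketched as follows. The function name, the variants × populations layout, and the numbers are hypothetical illustrations, not values from the study.

```python
import numpy as np

def heatmap_cells(effects, freqs):
    """Per-variant, per-population contribution, mean-centered within each variant.

    effects: length-L array of effect sizes.
    freqs:   (L variants x M populations) array of effect allele frequencies.
    """
    contrib = effects[:, None] * freqs  # effect size times allele frequency
    return contrib - contrib.mean(axis=1, keepdims=True)

# Hypothetical example: 2 variants in 3 populations.
effects = np.array([0.5, -0.2])
freqs = np.array([[0.1, 0.3, 0.8],
                  [0.5, 0.5, 0.2]])
cells = heatmap_cells(effects, freqs)
```

Because each row is centered on its own mean, a cell shows how far one population's contribution sits from the cross-population average for that variant, so a single dominant haplotype and several smaller contributing variants produce visibly different row patterns.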

