Genome Res. 2022 Mar;32(3):409-424.
doi: 10.1101/gr.276064.121. Epub 2022 Feb 22.

Functional noncoding SNPs in human endothelial cells fine-map vascular trait associations

Anu Toropainen et al. Genome Res. 2022 Mar.

Abstract

Functional consequences of genetic variation in the noncoding human genome are difficult to ascertain despite demonstrated associations to common, complex disease traits. To elucidate properties of functional noncoding SNPs with effects in human endothelial cells (ECs), we utilized our previous molecular quantitative trait locus (molQTL) analysis for transcription factor binding, chromatin accessibility, and H3K27 acetylation to nominate a set of likely functional noncoding SNPs. Together with information from genome-wide association studies (GWASs) for vascular disease traits, we tested the ability of 34,344 variants to perturb enhancer function in ECs using the highly multiplexed STARR-seq assay. Of these, 5711 variants validated, whose enriched attributes included: (1) mutations to TF binding motifs for ETS or AP-1 that are regulators of the EC state; (2) location in accessible and H3K27ac-marked EC chromatin; and (3) molQTL associations whereby alleles associate with differences in chromatin accessibility and TF binding across genetically diverse ECs. Next, using pro-inflammatory IL1B as an activator of cell state, we observed robust evidence (>50%) of context-specific SNP effects, underscoring the prevalence of noncoding gene-by-environment (GxE) effects. Lastly, using these cumulative data, we fine-mapped vascular disease loci and highlighted evidence suggesting mechanisms by which noncoding SNPs at two loci affect risk for pulse pressure/large artery stroke and abdominal aortic aneurysm through respective effects on transcriptional regulation of POU4F1 and LDAH. Together, we highlight the attributes and context dependence of functional noncoding SNPs and provide new mechanisms underlying vascular disease risk.

Figures

Figure 1.
Construction and characterization of the STARR-seq library. (A) Schematic of the STARR-seq library design. A candidate set of 198-bp oligos was selected based on their overlap with binding (b)QTLs, histone mark (hm)QTLs, chromatin accessibility (ca)QTLs, and cis-eQTLs. Reference and alternative variants were cloned into a plasmid vector under the origin-of-replication core promoter. The reporter library was transfected into cultured teloHAECs, and deep sequencing was conducted. The enrichment of reporter RNA expression over the input DNA library directly and quantitatively reflects enhancer activity. (B) Chromatin state distribution of the STARR-seq library regions across the core set of ENCODE samples. STARR-seq library regions were intersected with previously published genome-wide chromatin partitioning into 18 chromatin states by ChromHMM. (C) STARR-seq library regions that overlap an enhancer, relative to the total number of enhancers called in the sample. Enhancer coordinates and sample grouping were obtained from EpiMap. (D) Sharing of STARR-seq library enhancers and promoters across a diverse set of epigenomes. STARR-seq library regions were intersected with the ChromHMM chromatin states (18-state model) for every sample in EpiMap, an epigenetic compendium spanning the human body. For each STARR-seq region, the number of samples with a promoter or enhancer overlap was counted. The different enhancer states (active, bivalent, genic, and weak) were combined, as were the promoter states (active TSS, bivalent TSS, flanking TSS, flanking TSS upstream, and flanking TSS downstream).
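The per-region sample counting described for panel D can be illustrated with a minimal sketch. It assumes each sample's chromatin-state annotation is available as a list of (chromosome, start, end, state) segments; the state labels, the function names, and the toy data below are illustrative assumptions, not the authors' pipeline or the EpiMap file format.

```python
# Assumed state labels, loosely following the 18-state ChromHMM mnemonics.
ENHANCER_STATES = {"EnhA1", "EnhA2", "EnhG1", "EnhG2", "EnhWk", "EnhBiv"}
PROMOTER_STATES = {"TssA", "TssFlnk", "TssFlnkU", "TssFlnkD", "TssBiv"}


def overlaps(start_a, end_a, start_b, end_b):
    """True if two half-open intervals [start, end) share at least one base."""
    return start_a < end_b and start_b < end_a


def count_samples_with_state(region, annotations, states):
    """Count samples in which `region` overlaps any segment in one of `states`."""
    chrom, start, end = region
    count = 0
    for segments in annotations.values():
        if any(
            seg_chrom == chrom
            and state in states
            and overlaps(start, end, seg_start, seg_end)
            for seg_chrom, seg_start, seg_end, state in segments
        ):
            count += 1
    return count


# Toy annotations for three hypothetical samples (not real data).
annotations = {
    "sample_A": [("chr1", 100, 300, "EnhA1"), ("chr1", 300, 500, "Quies")],
    "sample_B": [("chr1", 0, 150, "TssA"), ("chr1", 150, 600, "EnhWk")],
    "sample_C": [("chr1", 0, 600, "Quies")],
}
region = ("chr1", 120, 318)  # a hypothetical 198-bp library region

n_enhancer = count_samples_with_state(region, annotations, ENHANCER_STATES)
n_promoter = count_samples_with_state(region, annotations, PROMOTER_STATES)
print(n_enhancer, n_promoter)
```

Merging the enhancer states into one set and the promoter states into another mirrors the combination of states described in the caption; a genome-scale analysis would more likely rely on an interval-tree or a dedicated interval-intersection tool than on this linear scan.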
Figure 2.
Enhancer activity profile of the STARR-seq reporter regions in teloHAEC cells. (A) STARR-seq replicate correlations for the measured reporter activity of library inserts normalized to DNA amounts in the input plasmid pool (the RNA/DNA ratio). (B) Cumulative distribution of enhancer activity (RNA/DNA ratio) for oligos overlapping active enhancers in HUVECs, compared to 100 scrambled regions and 100 negative region oligos (that did not overlap any active chromatin marks in the ENCODE study). The activity distributions of all STARR-seq regions and the top 10% most active regions are also shown. When selecting the top 10% most active STARR-seq regions, only the allele with the highest activity was considered for each region. (C) For different epigenomes, the fraction of enhancer-overlapping STARR-seq regions that are among the top 10% most active STARR-seq regions, as measured in teloHAECs. The active enhancers of each ENCODE sample were intersected with STARR-seq library regions to obtain the epigenome-specific set of enhancer-overlapping oligos, which was then intersected with the overall strongest reporter signals from teloHAEC STARR-seq. The top 10% regions were selected as in panel B. (D) De novo motif analysis of the STARR-seq regions showing enhancer activity within the upper 10th percentile in teloHAECs. The top 10% regions were selected as in panel B, and motif enrichment was calculated against all other regions in the library.
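The RNA/DNA ratio and the top-10% selection rule (highest-activity allele per region) can be sketched as follows. The depth normalization to counts per million, the pseudocount, the function names, and the toy read counts are all assumptions made for illustration; the caption does not specify how the authors normalized their counts.

```python
def activity_ratios(rna_counts, dna_counts, pseudocount=1.0):
    """RNA/DNA ratio per oligo after scaling each library to counts per million.

    Keys are (region, allele) tuples. The pseudocount is an assumption that
    guards against division by zero for oligos with no reads.
    """
    rna_total = sum(rna_counts.values())
    dna_total = sum(dna_counts.values())
    ratios = {}
    for oligo, dna in dna_counts.items():
        rna_cpm = (rna_counts.get(oligo, 0) + pseudocount) / rna_total * 1e6
        dna_cpm = (dna + pseudocount) / dna_total * 1e6
        ratios[oligo] = rna_cpm / dna_cpm
    return ratios


def top_fraction_regions(ratios, fraction=0.10):
    """Keep the most active allele per region, then return the top regions."""
    best = {}
    for (region, _allele), ratio in ratios.items():
        if region not in best or ratio > best[region]:
            best[region] = ratio
    ranked = sorted(best, key=best.get, reverse=True)
    n_top = max(1, round(len(ranked) * fraction))  # always keep at least one
    return ranked[:n_top]


# Toy read counts for two hypothetical regions, each with two alleles.
rna = {("r1", "ref"): 99, ("r1", "alt"): 199, ("r2", "ref"): 49, ("r2", "alt"): 49}
dna = {("r1", "ref"): 99, ("r1", "alt"): 99, ("r2", "ref"): 99, ("r2", "alt"): 99}

ratios = activity_ratios(rna, dna)
top_regions = top_fraction_regions(ratios, fraction=0.5)
print(ratios, top_regions)
```

Collapsing to the best allele before ranking means each region contributes one value to the ranking regardless of how many alleles were tested, which is the behavior the caption describes.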
Figure 3.
Allele-specific enhancer function is associated with motif mutations. (A) Volcano plot of allele-specific effects in STARR-seq with the log2 fold change on the x-axis and −log10 false discovery rate on the y-axis. All oligos, representing both reference and alternative alleles for each variable position, from each replicate were analyzed. (B) Dot-plot presenting the hypergeometric enrichment –log(P-value) and KS testing –log(P-value) of the top 50 most enriched motifs mutated in the validated set from STARR-seq. Motifs that correspond directly to ETS family or AP-1 family transcription factors are indicated. (C) PWM motif mutation score density for ERG motif mutations in validated (red) and nonvalidated (gray) SNPs. (D) PWM motif mutation score density for ATF3 motif mutations in validated (red) and nonvalidated (gray) SNPs. (E) PWM motif mutation score (x-axis) versus the SVM motif mutation score for ERG motif mutation SNPs. (F) PWM motif mutation score (x-axis) versus the SVM motif mutation score for ATF3 motif mutation SNPs. (G) Enrichment of SVM (blue) and PWM (orange) motif mutation scores in the STARR-seq-validated set.
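The hypergeometric enrichment reported in panel B asks whether SNPs that mutate a given motif appear among validated SNPs more often than expected by chance. A minimal sketch of a one-sided test is given below; the function names, the use of base-10 logarithms, and the toy counts are assumptions for illustration and do not reproduce the authors' analysis.

```python
from math import comb, log10


def hypergeom_enrichment_p(total, with_motif, validated, validated_with_motif):
    """Upper-tail hypergeometric P-value.

    Probability of seeing at least `validated_with_motif` motif-mutating SNPs
    when `validated` SNPs are drawn without replacement from `total` SNPs,
    of which `with_motif` mutate the motif.
    """
    upper = min(with_motif, validated)
    tail = sum(
        comb(with_motif, k) * comb(total - with_motif, validated - k)
        for k in range(validated_with_motif, upper + 1)
    )
    return tail / comb(total, validated)


def neg_log10(p_value):
    """Transform a P-value for plotting; assumes p_value > 0."""
    return -log10(p_value)


# Toy counts (not from the study): 10 SNPs tested, 4 mutate the motif,
# 3 validated, and 2 of the validated SNPs mutate the motif.
p = hypergeom_enrichment_p(total=10, with_motif=4, validated=3, validated_with_motif=2)
print(p, neg_log10(p))
```

Exact integer binomial coefficients avoid floating-point underflow for large libraries; with many motifs tested, the resulting P-values would normally also need multiple-testing correction.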
Figure 4.
Epigenetic traits are linked to validation in STARR-seq. (A) Venn diagram comparing the number of SNPs in an open chromatin region in HAECs (bottom), in an H3K27ac peak (right), or validated in STARR-seq (left). (B) Enrichment of SNPs in epigenetic trait peaks in the STARR-seq-validated SNP set. H3K27ac (top four) restricted to various sizes around the center of the peak. (C) Peak size density of ATAC-seq peaks and H3K27ac ChIP-seq peaks. (D) Enrichment of SNPs in ENCODE open regions that are shared and not shared with HAECs. (E) Venn diagram comparing the number of SNPs that are molQTLs (right) and validated in STARR-seq (left). (F) Enrichment of molQTL SNPs in STARR-seq-validated SNPs. (G) Heat map of the number of STARR-seq-validated SNPs in subsignificant molQTL bins.
Figure 5.
IL1B treatment trends with molQTLs and motif mutations. (A) Diagram of treatment design and Venn diagram of SNPs that validated in untreated (left), IL1B-treated for 6 h (right), and IL1B-treated for 24 h (bottom). (B) Heat map of enrichment scores comparing SNPs uniquely in specific treatment conditions between STARR-seq and the molQTLs (i.e., SNPs only significant in STARR-seq 0-h and not in 6-h or 24-h IL1B-treated). (C) Heat map of enrichment scores comparing SNPs uniquely in specific treatment conditions between STARR-seq and the motif mutations.
Figure 6.
Large artery stroke- and pulse pressure–associated locus at gene POU4F1. (A) Locus zoom plots depicting the SNPs present in the region surrounding the POU4F1 locus and the –log(P-values) in the following data sets: eQTL within HAECs (Stolze et al. 2020); eQTLs in artery aorta by GTEx; GWAS statistics for pulse pressure (Evangelou et al. 2018); and GWAS statistics for large artery stroke (Dichgans et al. 2014). Additional information about the genomic region is at the bottom, containing the putative enhancer where rs4304924 resides and the surrounding genes. Tracks display H3K27ac, ATAC-seq, ERG binding, RELA binding, and RNA-seq expression from HAECs. (B) POU4F1 gene expression in HAECs by microarray (Probe set ID: 206940_s_at) and three arterial tissues from GTEx compared between genotypes at rs4304924. (C) RPKM-normalized H3K27ac tags in HAECs surrounding the rs4304924 SNP by genotype. (D) PWM for the CRX/GSC motif and the motif sequences created by the two alleles at rs4304924. (E) STARR-seq allele-specific expression (RNA/DNA ratio) for rs4304924 across the different treatment points.

References

    The 1000 Genomes Project Consortium. 2015. A global reference for human genetic variation. Nature 526: 68–74. doi: 10.1038/nature15393
    Alasoo K, Rodrigues J, Mukhopadhyay S, Knights AJ, Mann AL, Kundu K, HIPSCI Consortium, Hale C, Dougan G, Gaffney DJ. 2018. Shared genetic effects on chromatin and gene expression indicate a role for enhancer priming in immune response. Nat Genet 50: 424–431. doi: 10.1038/s41588-018-0046-7
    Arnold CD, Gerlach D, Stelzer C, Boryń LM, Rath M, Stark A. 2013. Genome-wide quantitative enhancer activity maps identified by STARR-seq. Science 339: 1074–1077. doi: 10.1126/science.1232542
    Birdsey GM, Shah AV, Dufton N, Reynolds LE, Osuna Almagro L, Yang Y, Aspalter IM, Khan ST, Mason JC, Dejana E, et al. 2015. The endothelial transcription factor ERG promotes vascular stability and growth through Wnt/β-catenin signaling. Dev Cell 32: 82–96. doi: 10.1016/j.devcel.2014.11.016
    Boix CA, James BT, Park YP, Meuleman W, Kellis M. 2021. Regulatory genomic circuitry of human disease loci by integrative epigenomics. Nature 590: 300–307. doi: 10.1038/s41586-020-03145-z
