Circ Res. 2021 Jul 9;129(2):240-258.
doi: 10.1161/CIRCRESAHA.121.318971. Epub 2021 May 24.

Single-Cell Epigenomics and Functional Fine-Mapping of Atherosclerosis GWAS Loci

Tiit Örd et al. Circ Res.

Abstract

[Figure: see text].

Keywords: atherosclerosis; coronary artery disease; genetics; genome-wide association study; myocardial infarction.

Figures

Figure 1.
Clustering and identification of cell type clusters in human atherosclerosis single-nucleus assay for transposase-accessible chromatin with sequencing (snATAC-Seq) data. A and B, t-distributed stochastic neighbor embedding (tSNE) projection of the 7009 snATAC-Seq profiles represented as (A) 15 clusters identified using automated clustering and (B) the 5 manually annotated clusters corresponding to smooth muscle cells (SMCs), endothelial cells (ECs), macrophages (MPs), T/natural killer (NK) cells, and B cells. C, Dot plot showing the top chromatin accessibility marker genes of each atherosclerotic lesion cell type. Dot size corresponds to the proportion of cells within the cluster that display gene accessibility (% access.), and dot color intensity corresponds to the average accessibility level (avg. acc.; counts in the gene body and promoter region, depth-normalized to 10,000 total counts per cell and log-transformed). For each cell type, the top 10 markers by fold change (provided false discovery rate [FDR]<0.05 by Wilcoxon rank-sum test) were selected for plotting. D, Heatmap showing differences in chromatin accessibility for the TF (transcription factor) motifs with the greatest accessibility variability between the 5 major atherosclerotic lesion cell populations. Color bar displays the chromVAR (Chromatin Variation Across Regions) accessibility deviation Z score, and motifs were required to satisfy FDR<0.05 (Wilcoxon rank-sum test) comparing one cell type vs all other cells. E, tSNE projection of selected TF motif accessibility Z scores. F, UpSet plot of intersected peaks among cell types. The 20 most populated intersection groups are presented. G, Pie chart representing the relative fraction of common or shared snATAC-Seq peaks in the different genomic annotations. ETS1 indicates ETS proto-oncogene 1, transcription factor; PU.1, purine-rich box 1; SOX, SRY-related HMG-box; TEAD3, TEA domain transcription factor 3; and UTR, untranslated region.
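The accessibility values in panel C are described as depth-normalized to 10,000 total counts per cell and log-transformed. A minimal sketch of that kind of transformation follows. It assumes a natural-log `log1p` transform (the caption does not state the log base or pseudocount), and the function name `depth_normalize_log` and the toy counts are hypothetical, not taken from the study.

```python
import math

def depth_normalize_log(cells, scale=10_000):
    """Scale each cell's counts to `scale` total counts, then apply log(1 + x).

    cells: list of per-cell count lists (one value per gene).
    Assumption: natural log with a pseudocount of 1; the caption only says
    "log-transformed".
    """
    normalized = []
    for counts in cells:
        total = sum(counts)
        if total == 0:
            # A cell with no counts stays all-zero instead of dividing by zero.
            normalized.append([0.0 for _ in counts])
            continue
        normalized.append([math.log1p(c / total * scale) for c in counts])
    return normalized

# Toy example: 2 cells x 3 genes with different sequencing depths.
toy_cells = [[2, 0, 6], [10, 30, 0]]
normalized = depth_normalize_log(toy_cells)
```

After this step, cells sequenced at different depths are on a comparable scale: undoing the log on any non-empty cell gives values that sum to 10,000.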
Figure 2.
Identification of subtypes of macrophages (MPs) and smooth muscle cells (SMCs) based on the integration of single-cell gene expression and chromatin accessibility data. A and B, Coembedded UMAP representation of both single-nucleus (sn)ATAC-Seq cells and single-cell RNA sequencing (scRNA-Seq) cells, showing (A) the cell type annotation and (B) the data type of origin. C, Row-normalized chromatin accessibility profiles for snATAC-Seq peaks reveal high similarity of cell subtypes. D, Dot plot demonstrating the top gene expression markers of the 3 MP subtypes. Dot size corresponds to the proportion of cells within the cluster that express the gene, and dot color intensity corresponds to average expression level (gene counts depth-normalized to 10,000 total counts per cell and log-transformed). E, Heatmap showing differences in chromatin accessibility for the TF (transcription factor) motifs with the greatest accessibility variability between the 3 MP subtypes. Color bar displays the chromVAR accessibility deviation Z score. F–H, Single-cell ATAC trajectory analysis of SMC lineage cells shown colored by (F) pseudotime, (G) cluster assignment, and (H) examples of differentially accessible motifs that vary significantly along the inferred trajectory. In D and E, results were required to pass Wilcoxon rank-sum test false discovery rate (FDR)<0.05 comparing one cell type vs all other cells. AP-1 indicates activator protein 1; BATF, basic leucine zipper ATF-like transcription factor; B, B cell; EC, endothelial cells; FB, fibroblast; JUN, Jun proto-oncogene, AP-1 transcription factor subunit; MO, monocyte; NFATC, nuclear factor of activated T cells 1; NK, natural killer; STAT, signal transducer and activator of transcription 1; and T, T cell.
Figure 3.
Comparison of in vivo–specific accessible regulatory elements to chromatin accessibility in cell culture models. A, Overlap of in vitro and in vivo regular (nonsuperenhancer) cis-regulatory element (CRE) counts by cell type. B, Motif enrichment within in vivo–specific single-nucleus (sn)ATAC-Seq peaks. Selected JASPAR (http://jaspar.genereg.net/) motif logos are presented. C, Top molecular function and biological process of the genes within 100 kb from the in vivo–specific CREs, analyzed using GREAT (Genomic Regions Enrichment of Annotations Tool). D, Overlap of in vitro and in vivo superenhancer (SE) counts by cell type. E, Top biological process of the nearby genes located within 1 Mb of the superenhancer. F, Pseudobulk coverage track visualization of snATAC-Seq signal at the cell type–specific superenhancers. B indicates B cell; chr, chromosome; EC, endothelial cells; ECM, extracellular matrix; FOX, Forkhead box; HOX, homeobox; IRF1, interferon regulatory factor 1; MEF2C, myocyte enhancer factor 2C; MP, macrophage; NFATC2, nuclear factor of activated T cells 2; NK, natural killer; NR2F2, nuclear receptor subfamily 2 group F member 2; NS, not significant; POU4F2, POU class 4 homeobox 2; ROS, reactive oxygen species; SMC, smooth muscle cell; and T, T cell.
Figure 4.
Enrichment of coronary artery disease (CAD)/myocardial infarction (MI) genome-wide association study (GWAS) signals in cell type–specific open chromatin and their linkage to potential target genes. A, Enrichment of GWAS single-nucleotide polymorphisms (SNPs) for cardiometabolic traits in single-nucleus (sn)ATAC-Seq peaks of the 5 major cell types detected in atherosclerotic lesion samples. B, Mean accessibility of all the peaks within the 3427 cis-coaccessibility networks (CCANs) displayed as row-normalized values by cell type. C, Pseudobulk snATAC-Seq coverage track (light blue) visualization of 2 smooth muscle cell (SMC)-specific CCANs centered around the TBX2 (chr17:59018300-59570950) and the SEMA5A (chr5:9014150-9768900) genes. The peak-peak cis-coaccessibility is shown by arches for the pairs that exhibit Cicero score >0.5. D, Selected promoter-associated snATAC-Seq peaks that harbor CAD/MI GWAS SNPs and exhibit cell type–specific chromatin accessibility and gene expression. The row normalization was performed separately for snATAC-Seq (peak cuts per cell) and single-cell RNA sequencing (scRNA-Seq) (transcripts per million; TPM) data. The top 10% row value is shown in red. E, Gene ontology enrichment for cell type–specific connections between a promoter and a peak containing a CAD/MI SNP (peak-promoter coaccessibility score >0.5). A connection was considered cell type specific when both the peak accessibility and the gene expression signal were within the top 10% among all the cell types studied and passed that threshold in only one cell type. The resulting gene lists were profiled for over-representation of Gene Ontology Biological Process categories. The results were corrected for multiple testing to an experiment-wide threshold of α=0.05, and up to 7 most significant categories (provided Padj<0.05) were picked for each gene list for plotting. F, The arterial cis–expression quantitative trait loci (eQTL) SNPs and the associated gene are shown for snATAC-Seq peaks that exhibit peak-promoter coaccessibility score >0.5. Peaks and genes were further filtered for cell type specificity by requiring that peak accessibility and gene expression level fall within the top 10% among all the cell types studied in only one cell type (see Figure XVI in the Data Supplement). Only one SNP per peak is shown. If ≥2 peaks demonstrate cell type–specific accessibility, only one SNP is listed and p denotes the number of total cell type–specific peaks. For the full list, see Table XIII in the Data Supplement. B indicates B cell; chr, chromosome; EC, endothelial cells; dep., dependent; ECM, extracellular matrix; MP, macrophage; NK, natural killer; pos, position (coordinate) in chromosome; reg., regulation; SRP, signal recognition particle; T, T cell; and Wnt, Wingless-related integration site.
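The cell type specificity rule used in panels E and F (signal within the top 10% in exactly one cell type, for both the peak and the gene) can be illustrated with a small sketch. The caption does not spell out how "top 10%" is computed, so the code below assumes one possible reading: each row is compared with its own maximum, and a cell type passes when its value reaches at least 90% of that maximum. The function names, the `fraction` parameter, and the toy values are hypothetical.

```python
def specific_cell_type(values, fraction=0.9):
    """Return the only cell type whose signal is within the top 10% of the row.

    values: dict mapping cell type -> signal (peak accessibility or expression).
    Assumption: "top 10%" means value / row maximum >= 0.9.
    Returns None when no cell type, or more than one, passes the threshold.
    """
    row_max = max(values.values())
    if row_max <= 0:
        return None
    passing = [ct for ct, v in values.items() if v / row_max >= fraction]
    return passing[0] if len(passing) == 1 else None

def cell_type_specific_pair(peak_access, gene_expr, fraction=0.9):
    """Require the peak and its linked gene to be specific to the same cell type."""
    peak_ct = specific_cell_type(peak_access, fraction)
    gene_ct = specific_cell_type(gene_expr, fraction)
    if peak_ct is not None and peak_ct == gene_ct:
        return peak_ct
    return None

# Toy values: a peak and gene that are both clearly SMC-specific.
peak = {"SMC": 0.80, "EC": 0.10, "MP": 0.05, "T/NK": 0.02, "B": 0.01}
expr = {"SMC": 120.0, "EC": 15.0, "MP": 8.0, "T/NK": 1.0, "B": 0.5}
# A peak that is almost equally accessible in SMCs and ECs is not specific.
shared_peak = {"SMC": 0.80, "EC": 0.78, "MP": 0.05, "T/NK": 0.02, "B": 0.01}

specific_result = cell_type_specific_pair(peak, expr)
shared_result = cell_type_specific_pair(shared_peak, expr)
```

Under this reading, a peak shared by two cell types is rejected even if its linked gene is expressed in only one of them, because both signals must single out the same cell type.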
Figure 5.
Identification of the coronary artery disease (CAD)/myocardial infarction (MI) functional variants using information from molecular quantitative trait loci (molQTLs). A, Schematic representing the experimental fine-mapping of the causal variants in human aortic endothelial cells (HAECs) and human aortic smooth muscle cells (HASMCs). B, Dot plot summarizing the significant molQTLs that overlap CAD/MI single-nucleotide polymorphism (SNP)-containing single-nucleus (sn)ATAC-Seq peaks and where both the peak and the promoter are accessible in ECs. The epigenetic assays were carried out using bulk epigenomics methods on a panel of human aortic EC cultures derived from genetically diverse donors. Dot size corresponds to statistical significance (false discovery rate [FDR]<0.05, determined by RASQUAL, n=21–44) and dot color corresponds to the allele-specific ratio (blue, more reference allele reads; red, more alternative allele reads). The most likely predicted target genes, which demonstrate peak-promoter coaccessibility score >0.5, peak accessibility–gene expression correlation >0.5, or cis–expression quantitative trait loci (eQTL) association in Genotype-Tissue Expression (GTEx) v8 arterial tissues (Q value≤0.05) or HAECs (FDR<0.05), are shown. *Presence of other SNPs with identical effect in the same snATAC-Seq peak. For the complete gene list, see Table XIV in the Data Supplement. C, Viewpoint plots centered at the molQTL SNPs for FES and BCAR1 loci. Arcs depict peak-promoter coaccessibility (blue arcs) and the correlation between peak accessibility and gene expression (red arcs). The most highly supported target genes are highlighted in bold. Ctrl indicates control conditions; ERG, ETS transcription factor ERG; H3K27ac, histone 3 lysine 27 acetylation; hap, haplotype; IL, interleukin; LD, linkage disequilibrium; ORI, origin of replication with core promoter functionality; pA, poly(A) site; and TF, transcription factor.
Figure 6.
Identification of the coronary artery disease (CAD)/myocardial infarction (MI) functional variants using self-transcribing active regulatory region sequencing (STARR-Seq) in human aortic smooth muscle cells (HASMCs). A, Bar plot summarizing the CAD/MI single-nucleotide polymorphisms (SNPs) that demonstrated significant allele-specific enhancer (ASE) activity in STARR-Seq performed in cholesterol-loaded primary HASMCs (false discovery rate [FDR]<0.05 determined by mpralm, n=3). Whenever available, the top 3 target genes predicted by peak-promoter coaccessibility, peak accessibility–gene expression correlation, and cis–expression quantitative trait loci (eQTL) association in Genotype-Tissue Expression (GTEx) v8 arterial tissues (Q value≤0.05) are shown. Peak-promoter pairs with a peak-promoter coaccessibility score >0.5 and peak accessibility–gene expression correlation >0.5 were considered connected. rs6496126 is not shown because no target genes were predicted. For the complete list, see Table XV in the Data Supplement. B, Viewpoint plot of peak-promoter coaccessibility, peak accessibility–gene expression correlation, and the single-nucleus ATAC-Seq tracks centered at the STARR-Seq significant SNPs rs734780, rs28522673, and rs61776719. The predicted target genes are highlighted in bold. B indicates B cell; EC, endothelial cells; FMC, fibromyocyte; MP, macrophage; NK, natural killer; SMC, smooth muscle cell; and T, T cell.
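The rule in panel A for calling a peak-promoter pair connected (coaccessibility score >0.5 and peak accessibility–gene expression correlation >0.5) amounts to a two-condition filter. The sketch below illustrates it on made-up records. The `PeakPromoterPair` class, its field names, and the example peaks and genes are hypothetical and do not come from the study's data.

```python
from dataclasses import dataclass

@dataclass
class PeakPromoterPair:
    peak: str                      # hypothetical peak identifier
    gene: str                      # hypothetical target gene name
    coaccessibility: float         # peak-promoter coaccessibility score
    expression_correlation: float  # peak accessibility vs gene expression

def connected_pairs(pairs, threshold=0.5):
    """Keep pairs where BOTH scores are strictly above the threshold."""
    return [
        p for p in pairs
        if p.coaccessibility > threshold and p.expression_correlation > threshold
    ]

# Made-up records: only the first satisfies both conditions.
pairs = [
    PeakPromoterPair("peak_1", "GENE_A", 0.72, 0.61),
    PeakPromoterPair("peak_2", "GENE_B", 0.55, 0.30),  # correlation too low
    PeakPromoterPair("peak_3", "GENE_C", 0.50, 0.90),  # score not above 0.5
]
connected = connected_pairs(pairs)
```

The comparison is strict, matching the ">0.5" wording in the caption, so a pair scoring exactly 0.5 is excluded.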
Figure 7.
Association of the coronary artery disease (CAD)/myocardial infarction (MI) variants with smooth muscle cell (SMC) phenotypes. A, Heatmap showing the effect sizes for significant associations between 12 of 34 CAD/MI variants and 12 SMC phenotypes determined from 151 human aortic smooth muscle cell (HASMC) donors using FaST-LMM (Factored Spectrally Transformed Linear Mixed Models). Twelve of the 34 CAD/MI loci identified in genome-wide association study (GWAS) showed a nominal association (P<0.05) with at least one SMC phenotype. Rows show 12 SMC phenotypes, and columns show the index variants in the CAD/MI loci. The color key of the correlations is shown on the left. The colors refer to single-nucleotide polymorphism (SNP) weight (β) direction and magnitude, ranging from −2.5 (blue) to 2 (red). Significant associations (P<0.05) are indicated with a colored box. Negative effect sizes (blue) indicate that the CAD/MI risk allele was associated with a lower value of the SMC phenotype. In contrast, positive effect sizes (red) indicate that the CAD/MI risk allele was associated with a higher value of the SMC phenotype. Whenever the expression of a predicted target gene also correlates with the trait, the plot (B–E) is listed after the rsID. B, Correlation of KLF4 (rs944172 locus) expression with SMC proliferation relative to control and to PDGF-BB (platelet-derived growth factor BB) stimulus. C, Correlation between HAPLN3 (rs734780 locus) expression and the proliferation response to TGF (transforming growth factor) β1. D, Correlation between CSK, ULK3, SCAMP5, C15orf39, and UBL7 (rs1543927 locus) and calcification under osteogenic stimulus. E, Correlation between TMED9, DBN1, PRR7, and FAM193B (rs335428 locus) and proliferation response to IL (interleukin) 1β. AUC indicates area under the receiver operator characteristic curve; and TPM, transcripts per million.
