Genome Res. 2017 Jan;27(1):38-52.
doi: 10.1101/gr.212092.116. Epub 2016 Nov 9.

A systematic comparison reveals substantial differences in chromosomal versus episomal encoding of enhancer activity

Fumitaka Inoue et al. Genome Res. 2017 Jan.

Abstract

Candidate enhancers can be identified on the basis of chromatin modifications, the binding of chromatin modifiers and transcription factors and cofactors, or chromatin accessibility. However, validating such candidates as bona fide enhancers requires functional characterization, typically achieved through reporter assays that test whether a sequence can increase expression of a transcriptional reporter via a minimal promoter. A longstanding concern is that reporter assays are mainly implemented on episomes, which are thought to lack physiological chromatin. However, the magnitude and determinants of differences in cis-regulation for regulatory sequences residing in episomes versus chromosomes remain almost completely unknown. To address this systematically, we developed and applied a novel lentivirus-based massively parallel reporter assay (lentiMPRA) to directly compare the functional activities of 2236 candidate liver enhancers in an episomal versus a chromosomally integrated context. We find that the activities of chromosomally integrated sequences are substantially different from the activities of the identical sequences assayed on episomes, and furthermore are correlated with different subsets of ENCODE annotations. The results of chromosomally based reporter assays are also more reproducible and more strongly predictable by both ENCODE annotations and sequence-based models. With a linear model that combines chromatin annotations and sequence information, we achieve a Pearson's R² of 0.362 for predicting the results of chromosomally integrated reporter assays. This level of prediction is better than with either chromatin annotations or sequence information alone and also outperforms predictive models of episomal assays. Our results have broad implications for how cis-regulatory elements are identified, prioritized, and functionally validated.

Figures

Figure 1.
Study design for lentiMPRA. (A) Schematic diagram of lentiMPRA. Candidate enhancers and barcode tags were synthesized in tandem as a microarray-derived oligonucleotide library and cloned into the pLS-mP vector, followed by cloning of a minimal promoter (mP) and reporter (EGFP) between them. The resulting lentiMPRA library was packaged with either wild-type or mutant integrase and infected into HepG2 cells. Both DNA and mRNA were extracted, and the barcode tags were sequenced to test enhancer activity in an episomal versus a genome-integrated context. (B) HepG2 cells infected with lentiviral reporter constructs bearing no enhancer (pLS-mP), an SV40 enhancer (pLS-SV40-mP), or Ltv1 (pLS-Ltv1-mP), a known liver enhancer (Patwardhan et al. 2012), with or without antirepressors. The inclusion of antirepressors results in stronger and more consistent expression, but expression is still dependent on the presence of an enhancer. (C) FACS analyses quantifying the fluorescence intensity of pLS-mP, pLS-SV40-mP, and pLS-Ltv1-mP with (blue lines) or without (red lines) antirepressors following infection into HepG2 with 1 copy of viral molecule per cell. Analysis of all GFP-expressing cells (fluorescence more than 500 intensity units) shows a higher proportion of cells that strongly express GFP (more than 2000 units) when antirepressors are included: 54.8% versus 45.1% for SV40, and 35.6% versus 29.0% for Ltv1, of cells with and without antirepressors, respectively. (D) Venn diagram showing the composition of the lentiMPRA library. Two thousand two hundred thirty-six enhancer candidate sequences were chosen on the basis of having ENCODE HepG2 ChIP-seq peaks for EP300 and H3K27ac marks. The candidates overlapped with or without ChIP-seq peaks for FOXA1, FOXA2, or HNF4A. Half the candidates overlapped with ChIP-seq peaks for RAD21, SMC3, and CHD2. In addition, the library included 102 positive and 102 negative controls.
Figure 2.
Pairwise correlation of per-insert RNA/DNA ratios between replicates, within and between MT versus WT experiments. The lower left triangle shows pairwise scatter plots. The diagonal provides replicate names and the respective histogram of the RNA/DNA ratios for that replicate. The upper triangle provides Pearson (p) and Spearman (s) correlation coefficients. MT versus MT (green box) or WT versus WT (blue box) comparisons are substantially more correlated than MT versus WT (yellow boxes) comparisons, consistent with systematic differences between the episomal versus integrated contexts for reporter assays that exceed technical noise. The two right-most columns and two bottom-most rows correspond to MT and WT after combining across the three replicates, with the combined MT versus the combined WT comparison in the red box.
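The replicate comparison described in this caption can be illustrated with a minimal sketch. It assumes each replicate is a simple mapping from insert name to summed barcode counts; the function names (`rna_dna_ratios`, `pearson`, `ranks`, `spearman`) and the input layout are hypothetical and are not taken from the paper, whose actual normalization and barcode aggregation steps are not described here.

```python
from math import sqrt

def rna_dna_ratios(rna_counts, dna_counts):
    """Per-insert RNA/DNA ratio; inserts with no DNA counts are skipped."""
    return {k: rna_counts[k] / dna_counts[k]
            for k in rna_counts if dna_counts.get(k, 0) > 0}

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))
```

To compare two replicates, compute the ratios for each, keep the inserts present in both, and pass the two aligned lists of ratios to `pearson` and `spearman`.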
Figure 3.
Comparisons between the nonintegrating (MT) and integrating (WT) libraries. (A) Scatter plot of combined MT versus WT RNA/DNA ratios. MT ratios show a smaller dynamic range and thus seem compressed compared to WT results. Data points are colored by the type of insert sequence, including two types of controls: a total of four positive and negative controls (black) as well as the highest 100 and lowest 100 synthetic regulatory element sequences (SRES, red) identified by Smith et al. (2013). The four classes of putative enhancer elements are the following: regions of FOXA1, FOXA2, or HNF4A binding that overlap H3K27ac and EP300 calls as well as at least one of three factors RAD21, CHD2, or SMC3 (type 1); regions like in type 1 but with no RAD21, CHD2, or SMC3 overlapping (type 2); EP300 peak regions overlapping H3K27ac as well as at least one overlap with RAD21, CHD2, or SMC3, but without peaks in FOXA1, FOXA2, or HNF4A (type 3); regions like in type 3 but with no RAD21, CHD2, or SMC3 overlapping (type 4). As shown here and in Supplemental Figure S11, we do not observe major differences between the four design types, either with respect to activity or MT versus WT. (B,C) Enhancer activity of 200 synthetic regulatory element sequences (SRES) in the MT (B) and WT experiments (C). Scatter plot of RNA/DNA ratios for the top 100 positive and top 100 negative synthetic regulatory element (SRE) sequences in HepG2 experiments by Smith et al. (2013). Plots show the combined RNA/DNA ratios on the y-axis and measurements by Smith et al. (2013) on the x-axis. Intervals indicate the mean, minimum, and maximum values observed for three replicates performed with each experiment.
Figure 4.
Squared Kendall's tau (τ²) values for available genome annotations for predicting the activity of candidate enhancer sequences in the nonintegrating (MT) and integrating (WT) experiments. (A) WT RNA/DNA ratios correlate better with annotations than the respective MT values. The left panel highlights the top correlated annotations for WT and MT ratios. The right panel highlights annotations with the largest difference in τ² values between the MT and WT experiments. (B) Same analysis for the 20% most active elements (Supplemental Table S1).
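As an illustration of the statistic named in this caption, the sketch below computes Kendall's tau between one annotation and the measured activities, then squares it. It uses the tie-corrected tau-b variant as an assumption, since the caption does not say which variant the authors used; `kendall_tau_b` is a hypothetical helper name.

```python
from math import sqrt

def kendall_tau_b(x, y):
    """Kendall's tau-b over all pairs, with the standard correction for ties."""
    concordant = discordant = ties_x_only = ties_y_only = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied in both variables: excluded from all counts
            if dx == 0:
                ties_x_only += 1
            elif dy == 0:
                ties_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    denom = sqrt((concordant + discordant + ties_x_only)
                 * (concordant + discordant + ties_y_only))
    return (concordant - discordant) / denom if denom else 0.0

# Toy example: a binary annotation (peak present or absent) versus activity.
annotation = [0, 0, 1, 1]
activity = [1.0, 2.0, 3.0, 4.0]
tau_squared = kendall_tau_b(annotation, activity) ** 2
```

Squaring discards the sign, so a strongly negative association ranks as highly as a strongly positive one.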
Figure 5.
Prediction models. (A,B) Correlation of gkm-SVM scores obtained for a combined HepG2 model with RNA/DNA ratios obtained from the mutant (MT) and wild-type integrase (WT) experiments. Data points are colored by the type of insert sequence, including two types of controls: 200 synthetic regulatory element sequences (SRES, red) identified by Smith et al. (2013), and four other control sequences (dark gray). The four classes of putative enhancer elements are the following: (type 1) regions of FOXA1, FOXA2, or HNF4A binding that overlap H3K27ac and EP300 calls as well as at least one of three factors RAD21, CHD2, or SMC3; (type 2) regions like in type 1 but with no RAD21, CHD2, or SMC3 overlapping; (type 3) EP300 peak regions overlapping H3K27ac as well as at least one overlap with RAD21, CHD2, or SMC3, but without peaks in FOXA1, FOXA2, or HNF4A; (type 4) regions like in type 3 but with no remodeling factor overlapping. Correlations are partially driven by the SRES; when excluding all controls, Spearman's R² values drop from 0.080 to 0.039 and from 0.128 to 0.076 for MT and WT, respectively. (C,D) Scatter plots of measured RNA/DNA ratios with predicted activity from linear Lasso models using annotations (numerical and categorical) as well as sequence-based (individual LS-GKM scores) information. Correlation coefficients are 0.45 Pearson/0.40 Spearman for the nonintegrated experiment (MT) and 0.60 Pearson/0.57 Spearman for the integrated constructs (WT). The models selected 110 (MT) and 133 (WT) of a total of 384 annotation features. Based on Pearson R² values, these combined models explain 20.6% (MT) and 36.2% (WT) of the variance observed in these experiments.
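The variance-explained figures in panels C and D follow from squaring the Pearson coefficients. The short check below assumes the reported coefficients (0.45 and 0.60) are rounded to two decimals and asks whether the reported fractions (0.206 and 0.362) fall inside the range of squares that rounding allows. The helper name `r2_interval` is hypothetical.

```python
def r2_interval(r_reported, decimals=2):
    """Range of r**2 compatible with a positive r rounded to `decimals` places."""
    half_step = 0.5 * 10 ** (-decimals)
    low, high = r_reported - half_step, r_reported + half_step
    return low ** 2, high ** 2

# (label, reported Pearson r, reported fraction of variance explained)
reported = [("MT", 0.45, 0.206), ("WT", 0.60, 0.362)]

consistent = {}
for label, r, r2 in reported:
    low, high = r2_interval(r)
    consistent[label] = low <= r2 <= high
    print(f"{label}: r^2 in [{low:.3f}, {high:.3f}], reported {r2}")
```

Both reported values fall inside their intervals, so the percentages agree with the rounded coefficients.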
