Nature. 2019 Jun;570(7759):122-126.
doi: 10.1038/s41586-019-1210-7. Epub 2019 May 15.

Transcriptional cofactors display specificity for distinct types of core promoters

Vanja Haberle et al. Nature. 2019 Jun.

Abstract

Transcriptional cofactors (COFs) communicate regulatory cues from enhancers to promoters and are central effectors of transcription activation and gene expression [1]. Although some COFs have been shown to prefer certain promoter types [2-5] over others (for example, see refs 6,7), the extent to which different COFs display intrinsic specificities for distinct promoters is unclear. Here we use a high-throughput promoter-activity assay in Drosophila melanogaster S2 cells to screen 23 COFs for their ability to activate 72,000 candidate core promoters (CPs). We observe differential activation of CPs, indicating distinct regulatory preferences or 'compatibilities' [8,9] between COFs and specific types of CPs. These functionally distinct CP types are differentially enriched for known sequence elements [2,4], such as the TATA box, downstream promoter element (DPE) or TCT motif, and display distinct chromatin properties at endogenous loci. Notably, the CP types differ in their relative abundance of H3K4me3 and H3K4me1 marks (see also refs 10-12), suggesting that these histone modifications might distinguish trans-regulatory factors rather than promoter- versus enhancer-type cis-regulatory elements. We confirm the existence of distinct COF-CP compatibilities in two additional Drosophila cell lines and in human cells, for which we find COFs that prefer TATA-box or CpG-island promoters, respectively. Distinct compatibilities between COFs and promoters can explain how different enhancers specifically activate distinct sets of genes [9], alternative promoters within the same genes, and distinct transcription start sites within the same promoter [13]. Thus, COF-promoter compatibilities may underlie distinct transcriptional programs in species as divergent as flies and humans.

Conflict of interest statement

The authors declare no competing financial interests.

Figures

Extended Data Figure 1 | Selection of core promoter candidates and cofactors.
a, List of initial 13 Drosophila melanogaster cofactors (COFs) used in this study (see Extended Data Figure 7 for 10 additional COFs). For each COF, relevant information about its function is shown (functional domain / enzymatic activity / protein complex) and the name of the respective mammalian homolog from Ensembl database. b, Core promoter (CP) candidates from the D. melanogaster genome were selected sequentially (in order of the white arrow) based on transcription start sites (TSSs) from datasets that map endogenous transcription initiation (CAGE and RAMPAGE), TSSs in reporter assays (STAP-seq), or FlyBase (version 5.57) and Ensembl (version 78) gene annotations (for each new dataset, only TSSs that were more than 10 base-pairs (bp) away from TSSs already present in the selection were added). As negative controls, random positions without any evidence of initiation were selected. A total of 72,000 TSSs were used as reference points to design core-promoter oligos encompassing 66 bp upstream and 66 bp downstream of the TSS. c, Overview of COF-recruitment STAP-seq (COF-STAP-seq), a high-throughput activator-bypass-like assay that we created by combining a plasmid-based high-throughput promoter-activity assay, Self-Transcribing Active Core Promoter-sequencing (STAP-seq), with the GAL4-DNA-binding-domain (GAL4-DBD)-mediated recruitment of individual COFs as in ref. . The D. melanogaster CP candidate library, pre-mixed with the Drosophila pseudoobscura (D. pseudoobscura) CP spike-in mix, was co-transfected with an expression plasmid for one of the GAL4-DBD-COF fusion proteins. If binding of a GAL4-DBD-COF to the 4x-UAS array activates transcription from a candidate CP, this generates reporter RNAs with a short 5’ sequence tag, derived from the 3’ end of the corresponding CP. These reporter transcripts are captured with a 5’ RNA linker that includes a 10 nucleotide (nt) long unique molecular identifier (UMI), allowing counting of individual reporter RNA molecules.
In addition, the RNA linker contains a 4 nt sample barcode (BC), used for sample identification, enabling pooled processing of up to 8 samples after linker ligation. This is followed by selective reverse transcription, PCR amplification, deep sequencing and mapping of the 5’ sequence tags to quantify productive initiation events at single base-pair resolution for all candidate CPs in the library and spike-in CPs.
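The sequential candidate selection described in panel b can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names `select_tss` and `oligo_window` are hypothetical, distances are compared only within the same chromosome and strand, and the window is taken to include the TSS base itself (66 + 1 + 66 = 133 bp).

```python
from collections import defaultdict

MIN_DISTANCE = 10  # a new TSS must be MORE than 10 bp from every selected TSS
FLANK = 66         # bp taken upstream and downstream of the TSS


def select_tss(datasets):
    """Add TSSs dataset by dataset, in priority order.

    `datasets` is a list of lists of (chrom, strand, position) tuples.
    Comparing only within the same chromosome and strand is an assumption.
    """
    selected = defaultdict(list)  # (chrom, strand) -> accepted positions
    for dataset in datasets:
        # Compare against the selection as it stood before this dataset.
        previous = {key: tuple(pos) for key, pos in selected.items()}
        for chrom, strand, pos in dataset:
            earlier = previous.get((chrom, strand), ())
            if all(abs(pos - other) > MIN_DISTANCE for other in earlier):
                selected[(chrom, strand)].append(pos)
    return {key: sorted(set(pos)) for key, pos in selected.items()}


def oligo_window(pos):
    """Return inclusive (start, end) coordinates of the oligo around a TSS.

    Including the TSS base itself in the window is an assumption.
    """
    return pos - FLANK, pos + FLANK


# Toy input: the first dataset has priority over the second.
cage = [("chr2L", "+", 1000), ("chr2L", "+", 1005)]
stap = [("chr2L", "+", 1010), ("chr2L", "+", 1016), ("chr2L", "-", 1000)]
chosen = select_tss([cage, stap])
```

In this toy input, position 1010 is dropped because it lies exactly 10 bp from an already selected TSS, whereas 1016 is kept because it is more than 10 bp from both earlier positions.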
Extended Data Figure 2 | Cofactor recruitment reproducibly activates transcription preferentially from annotated core promoter sequences.
a, Pairwise comparisons of normalized STAP-seq tag counts between 3 independent biological replicates per cofactor (COF) across all 72,000 tested core promoter (CP) candidates. The Pearson’s correlation coefficient (PCC) is denoted for each comparison. b, Total unique STAP-seq tag counts for P65, GFP and the 13 COFs (left: raw counts; right: counts relative to spike-in). Bar heights: mean counts; error bars: standard deviation (SD). n=3 independent biological replicates for each COF. c, Distribution of normalized STAP-seq tag counts from all COFs at candidates grouped by different annotated genomic regions (FlyBase version 5.57). ‘Core promoter’ regions were defined as 100bp regions from 50 bp upstream to 50 bp downstream of annotated gene TSSs, and ‘Proximal promoter’ as regions up to 250 bp upstream of annotated gene TSSs. ‘Gene body’ includes both exons and introns, but excludes 5’ UTRs, which form a separate category. ‘Random negative regions’ represent candidates selected as negative controls (see Extended Data Fig. 1b) irrespective of their genomic location. n: number of independent CP candidates per box; boxes: median and interquartile range; dots: mean; whiskers: 5th and 95th percentiles. d, Genomic distribution of CP candidates (top; n=72,000) and of unique STAP-seq tags, i.e. transcripts initiated at CP candidates upon activation by any of the COFs (bottom; n= 41,069,770). Annotated gene core promoters (red) are highly enriched for STAP-seq tags.
Extended Data Figure 3 | Transcriptional cofactors have characteristically different core-promoter activation profiles.
a, COF-STAP-seq signals (transcription initiation events) of each of the 13 cofactors (COFs) and the positive and negative controls (P65 and GFP, respectively) from core promoter (CP) candidates in the representative genomic locus (same as in Fig. 1b but showing all 13 COFs). Negative values denote transcription initiation on the antisense strand. b, Principal component analysis of STAP-seq tag counts normalized to spike-ins for 30,936 CPs significantly activated above GFP by at least one COF (≥ 2-fold enrichment over GFP and Student’s t-test FDR ≤ 0.06; see Methods) in 3 biological replicates per tested COF and controls. Scatterplot of projections onto the first two principal components (left) and the percent of variance explained by each principal component (right) are shown. c, Hierarchical clustering of individual biological replicates per COF based on Pearson’s correlation coefficient (PCC) across 30,936 CPs activated by at least one COF. All biological replicates cluster closely together and reproduce the functional COF groups shown in Fig. 1c derived from merged replicates. Blue-to-red shading indicates the PCC for each comparison. d, Comparison of CP activation above GFP (induction) in STAP-seq (x-axis) and luciferase (y-axis) for 50 CPs tested with P65 and 4 different COFs. The PCC is indicated for each comparison.
Extended Data Figure 4 | Cofactor−core-promoter compatibilities are cell type independent.
a, Representative genomic locus showing differential COF-STAP-seq signals for recruitment of MED25, Lpt, Chro and Mof in three Drosophila melanogaster cell lines. Each COF preferentially activates the same CPs in all 3 cell lines (S2, OSC and Kc167 cells), and these preferences differ between COFs. STAP-seq data: merge of 3 independent biological replicates. b, Hierarchical clustering of P65 and 6 COFs tested in all 3 cell lines based on Pearson’s correlation coefficient (PCC) of CP activation in each cell line. c, Activation of all 72,000 CP candidates by different COFs in the 3 cell lines. For each COF, the CPs are first sorted by activation in S2 cells and then the activation in OSC and Kc167 cells is displayed in the same order. PCCs (right) were calculated by comparing OSC or Kc167 with S2 cells, respectively. d, COF-STAP-seq activation of 50 CPs selected for luciferase assays in S2 cells (see Fig. 1d) by different COFs and P65 in the 3 cell lines (subset of c). Differential activation of CPs by each COF is consistent across all cell lines. e, Pairwise comparison of CP activation by different COFs above GFP (induction) in OSC vs. S2 cells (top row) and Kc167 vs. S2 cells (bottom row) for all 72,000 CP candidates.
Extended Data Figure 5 | Cofactors preferentially activate core promoters of their endogenously bound and regulated target genes.
a-e, Binding of Trr (a), Lpt (b), Mof (c) and Trx (e) in S2 cells and Chro in Drosophila melanogaster embryos (d) to 5,933 CPs active in COF-STAP-seq and endogenously in S2 cells (as in Fig. 1e but for additional COFs). Per COF, CPs are sorted by STAP-seq activation (left) and ChIP-seq coverage is shown in heatmaps and boxplots (-150 to +50bp window around the TSS; n=297 independent CPs per box; box shading: mean STAP-seq tag count; boxes: median and interquartile range; whiskers: 5th and 95th percentiles; P values: one-sided Wilcoxon rank sum test; all ChIP-seq data from previous publications; see Supplementary Table 1 for details and references). For all COFs, the most strongly activated CPs in COF-STAP-seq are significantly more strongly bound by the respective COF in their endogenous genomic context compared to CPs that are activated weakly (note that even though this also holds for Lpt, the trend for Lpt starts only after the most strongly activated CPs (first two bins), which are less strongly bound than expected). f, Expression fold change upon Trx depletion by RNA interference (RNAi) for genes associated with top and bottom 25% of CPs by activation with Trx (RNA-seq data from ref. ; see also Supplementary Table 1). Only CPs associated with genes active in S2 cells and activated in COF-STAP-seq by at least one COF are included. g, STAP-seq tag count for CPs of genes down-regulated upon Trx depletion by RNAi versus CPs of all other genes expressed in S2 cells and activated by at least one COF (RNA-seq data from ref. ; n denotes number of independent CPs; boxes: median and interquartile range; whiskers: 5th and 95th percentiles; P values: one-sided Wilcoxon rank sum test).
Extended Data Figure 6 | Defining and validating core promoter groups activated preferentially by different cofactors.
a, Spike-in normalized COF-STAP-seq tag counts (left heatmap) for 30,936 core promoter (CP) candidates (columns) clustered based on their preferential activation by different cofactors (COFs; rows). These tag counts were transformed for each CP separately into Z-scores (right heatmap) to highlight the differential activation by different COFs independently of the overall activity of the CP. We then used these Z-score-transformed values to cluster the CPs into 5 groups of respectively similar activation profiles across all COFs irrespective of absolute activation levels using k-means clustering (the CPs in both heatmaps are organized identically according to these groups, see coloured bar on top). Line-plot on the left shows the average spike-in normalized COF-STAP-seq tag count across all CPs of each group for each of the 13 COFs and the 2 controls. b, Percent of variance in the data explained by clustering CPs into different number of clusters with k-means (k ranging from 1 to 10). Increasing the number of clusters beyond 5 does not add much to explaining the variance in the data. c, Gain of percent variance explained by increasing the number of clusters in steps of one from 3 to 6. d, Distribution of sum of squared distances to centroids of the clusters for number of clusters ranging from 1 to 10, using a 5-fold cross-validation approach. The data was binned randomly into 5 equally sized bins, one bin was left aside as a test set and clustering was performed on the remaining 4 bins. Sum of squared distances to the nearest centroid for each data point in the test set was then calculated. The procedure was repeated for each number of clusters (k). Increasing the number of clusters beyond 5 does not lead to substantially more coherent or dense clusters. For each box n=30,936 independent CPs. e-g, Clustering of 30,936 CPs (columns) based on their preferential activation by different COFs (rows) as in a, but using data for only one replicate as indicated. 
k-means clustering (k=5) for each individual replicate reproduces qualitatively the same groups obtained with the merged replicates (see a). h, Agreement between assignment of CPs to groups in individual replicates and in the pooled data (left). In each replicate, around 85% of CPs are assigned to the same group as in the assignment based on pooled replicates. Barplot: Number of replicates that reproduce group assignment for individual CPs is shown on the right. For around 94% of CPs, the group assignment is reproduced in at least two replicates. i, Pairwise distances in CP response to 6 COFs and two controls for CPs belonging to the same (intra-) or different (inter-) clusters (defined in S2 cells) in all 3 Drosophila melanogaster cell lines. n = 115,508,123 and 362,994,457 independent CP pairs for intra and inter-cluster boxes, respectively. * P-value ≤ 0.01; one-sided Wilcoxon rank sum test. j, Induction (activation above GFP) of CPs (5 groups defined in S2 cells; see panel a) by P65 and 6 COFs in S2 (top), OSC (middle) and Kc167 (bottom) cells. Each of the 6 COFs preferentially activates the same CP groups in all 3 cell lines, i.e. COFs’ CP preferences appear to be cell type independent. n = 5723, 11538, 3203, 5038 and 5434 CPs, for Groups 1 to 5 respectively. (d, i and j) boxes: median and interquartile range; whiskers: 5th and 95th percentiles.
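The per-CP Z-score transformation in panel a and the "percent of variance explained" in panel b can be illustrated with a small standard-library sketch. It assumes the population standard deviation for the Z-scores and defines variance explained as 1 minus the ratio of within-cluster to total sum of squares; both choices, and the function names, are assumptions rather than the authors' documented implementation. The cluster labels are taken as given here; in the legend they come from k-means.

```python
from statistics import mean, pstdev


def zscore_rows(matrix):
    """Z-score each row (one CP across all COFs) independently."""
    scaled = []
    for row in matrix:
        mu, sigma = mean(row), pstdev(row)
        # A CP with identical counts for every COF shows no preference.
        scaled.append([0.0 if sigma == 0 else (x - mu) / sigma for x in row])
    return scaled


def variance_explained(rows, labels):
    """Fraction of total variance explained by a cluster assignment.

    Computed as 1 - (within-cluster sum of squares / total sum of squares).
    """
    def centroid(group):
        return [mean(col) for col in zip(*group)]

    def sum_sq(group, centre):
        return sum((x - c) ** 2 for row in group for x, c in zip(row, centre))

    total = sum_sq(rows, centroid(rows))
    within = 0.0
    for label in set(labels):
        members = [row for row, lab in zip(rows, labels) if lab == label]
        within += sum_sq(members, centroid(members))
    return 1.0 - within / total if total else 0.0


# Toy matrix: rows are CPs, columns are COFs. Rows 1-2 and rows 3-4 share a
# preference profile but differ tenfold in absolute activity.
counts = [[1, 2, 3], [10, 20, 30], [3, 2, 1], [30, 20, 10]]
z = zscore_rows(counts)
explained = variance_explained(z, [0, 0, 1, 1])
```

After the row-wise transformation the weak and strong CPs with the same preference profile become indistinguishable, which is why clustering on Z-scores groups CPs by relative COF preference rather than by overall activity.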
Extended Data Figure 7 | Core-promoter preferences of ten additional cofactors.
a, List of ten additionally tested Drosophila melanogaster cofactors (COF). For each COF, relevant information about its function is shown (functional domain / enzymatic activity / protein complex) and the name of the respective mammalian homolog. b, Total COF-STAP-seq tag counts relative to spike-in for GFP (negative control) and the ten COFs. Bar heights: mean counts; error bars: standard deviation (SD); n=3 independent biological replicates per COF. c, Percent of variance in the data explained by clustering core promoters (CPs) into different number of clusters with k-means (k ranging from 1 to 10) using the original dataset containing 13 COFs, P65 and GFP (as in Extended Data Figure 6b; blue) or the extended dataset with 10 additional COFs (23 total; red). The curves are highly similar for both datasets, i.e. the same number of clusters explains the same amount of variance in both the original and the extended dataset. d, as Extended Data Figure 6a but for extended dataset of 23 COFs: spike-in normalized STAP-seq tag counts (left heatmap) for 30,936 CPs (columns) clustered based on their preferential activation by 23 different COFs and 2 controls (rows). Tag counts were transformed into Z-scores (right heatmap), which were used to cluster CPs into 5 clusters with k-means. For comparison, groups defined on the dataset containing 13 COFs and 2 controls (Extended Data Fig. 6a) are shown in the top row and groups defined with this extended dataset are shown below. e, Correlation between each of the six activating COFs in the extended dataset and the 13 COFs of the original dataset. Pearson’s correlation coefficients ≥ 0.9 are marked by an asterisk.
Extended Data Figure 8 | Core promoters activated by distinct cofactors discriminate between housekeeping and developmental gene regulation.
a, Expression variability between around 8,000 single cells of a stage 6 Drosophila melanogaster embryo for genes associated with each of the 5 different core promoter (CP) groups (single cell RNA-seq data from ref. 27). b, Gene-ontology (GO) term enrichment analysis (GOStats R/Bioconductor package version 2.34.0) for genes associated with the 5 different CP groups. c, d, Activation of 72,000 CP candidates by a developmental (dev; from the gene zfh1) and a housekeeping (hk; from the gene ssp3) enhancer (enhancers and enhancer-less control obtained from refs and 14). CPs are grouped into 5 groups as in Extended Data Fig. 6a. The enhancer-less control reflects the basal activity of the CPs. Group 3 CPs have the highest basal activity but are further activated by the hk enhancer. n = 5723, 11538, 3203, 5038 and 5434 independent CPs, for Groups 1 to 5 respectively; boxes: median and interquartile range; whiskers: 5th and 95th percentiles. e, f Transcription factor motif enrichment analysis in the sequence 500 bp upstream of the TSS (e) or within the nearest developmental or housekeeping enhancer (from ref. ; f) for the 5 CP groups. n = 5723, 11538, 3203, 5038 and 5434 independent CPs, for Groups 1 to 5 respectively. NS = not significant (two-sided Fisher’s exact test; P-values corrected for multiple testing by Benjamini-Hochberg procedure; FDR > 0.01).
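The Benjamini-Hochberg correction mentioned for panels e and f can be sketched in a few lines. This is the standard step-up procedure written from its textbook definition; the legend does not say which software the authors used, and the function name is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values are
    # monotone: each is the minimum of p * m / rank over equal or higher ranks.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy example: four motif-enrichment p-values.
raw = [0.01, 0.04, 0.03, 0.005]
fdr = benjamini_hochberg(raw)
significant = [q <= 0.01 for q in fdr]
```

With the threshold stated in the legend, a test would be reported as "NS" whenever its adjusted value exceeds 0.01.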
Extended Data Figure 9 | Core promoters activated preferentially by distinct cofactors differ in their sequence and in endogenous chromatin features.
a, Occurrence of specific dinucleotides (see label in each heatmap) relative to TSSs for core promoters (CPs) of the five groups defined in Extended Data Fig. 6a. Within each group, CPs are sorted decreasingly by the COF-STAP-seq tag count of the respective strongest COFs (denoted on the left). Darker shade reflects higher density of the respective dinucleotides at specific positions. b, c, Examples of genomic loci with CPs active in S2 cells that are differentially activated by COFs in STAP-seq. All supporting data tracks are from S2 cells and re-analysed from previous publications (see Supplementary Table 1 for details and references). (b) CPs of KLHL18 and Spt3 (Group 3), and GCC185 and DCAF12 (Group 4), are preferentially activated by Mof and Chro, respectively, and have high levels of H3K4me3 downstream of their TSSs. In contrast, the CP of Ect3 (Group 1) is preferentially activated by P300 and has high levels of H3K4me1 both upstream and downstream of the TSS but almost no H3K4me3, although Ect3 is expressed and the CP is endogenously active in S2 cells. (c) CPs of CkIIalpha-i3 (Group 4) and CG13896 (Group 3) are preferentially activated by Chro and Mof, respectively, and both bear high levels of H3K4me3 and low levels of H3K4me1 downstream of the TSS. In contrast, the CP of CG13895 (Group 1) is preferentially activated by P300 and is marked by higher levels of H3K4me1, but lower levels of H3K4me3, although the gene is expressed in S2 cells. d, Average H3K4me1 ChIP-seq coverage in the 500 bp window upstream (left) and 500 bp window downstream (right) of the TSS for 5 groups of CPs active in S2 cells (as in Fig. 3b). n = 646, 363, 1842, 1885 and 179 CPs, for Groups 1 to 5, respectively. e, Heatmaps showing endogenous expression (as measured by RNA-seq [left] and GRO-seq [right]) of genes associated with CPs active in S2 cells from the 5 CP groups (RNA-seq and GRO-seq data from refs and , respectively; see Supplementary Table 1 for details and references). 
Within each group, CPs are sorted decreasingly by STAP-seq of the respective strongest COFs (denoted on the left). f, Gene expression for genes associated with 5 groups of CPs as in e but shown as box plots. n = 646, 363, 1842, 1885 and 179 CPs, for Groups 1 to 5, respectively. (d and f) boxes: median and interquartile range; whiskers: 5th and 95th percentiles. g, Example of differentially activated alternative promoters. h, Example of differentially activated closely-spaced TSSs (g and h: merge of three independent biological replicates).
Extended Data Figure 10 | Sequence-encoded cofactor−core-promoter compatibility is conserved in human.
a, Total unique STAP-seq tag counts relative to spike-in for P65, GFP and five human cofactors (COFs) from COF-STAP-seq in human HCT116 cells. Bar heights: mean counts; error bars: standard deviation (SD); n=3 independent biological replicates for each COF. b, COF-STAP-seq signals (transcription initiation) activated by P65 and the five human COFs for the CPs of MMP1 (TATA-box promoter; left) and CIZ1 (CpG-island promoter; right; STAP-seq data: merge of 3 independent biological replicates). c, Hierarchical clustering of independent biological replicates for all tested human COFs based on Pearson’s correlation coefficients (PCCs) across 12,000 human CP candidates. d, Occurrence of different dinucleotides (TA, AT, AA, CG and GC) around TSSs in CPs sorted by the ratio between COF-STAP-seq signals with MED15 and MLL3, for 9,607 CPs activated by either COF.
Figure 1 | Differential activation of core promoter candidates by transcriptional cofactors.
a, Schematic overview of the COF-STAP-seq high-throughput promoter-activity assay. Cofactors (COFs) are recruited one-by-one via a GAL4-DNA-binding domain to a core-promoter (CP) candidate library; reporter-transcript tags are quantified by sequencing. b, Transcription from CPs activated by four COFs and GFP in a representative genomic locus (negative values: antisense transcription). c, Hierarchical clustering of COFs based on Pearson correlation coefficients (PCCs) of COF-STAP-seq tag counts across 30,936 CPs activated by at least one COF. d, Heatmap of STAP-seq (S) and luciferase (L) signals for activation of 50 CPs by four COFs (right: PCCs between STAP-seq and luciferase values; see Extended Data Figure 3d). e, COF binding in S2 cells to 5,933 CPs active in COF-STAP-seq and endogenously. Per COF, CPs are sorted by STAP-seq activation (left) and ChIP-seq coverage is shown in heatmaps and boxplots (-150 to +50bp window around the TSS; n=297 independent CPs per box; box shading: mean STAP-seq tag count). f, Expression fold change upon COF inhibition for genes associated with top and bottom 25% COF-STAP-seq CPs for the respective COF. g, COF-STAP-seq tag count for CPs of genes down-regulated upon COF inhibition and CPs of all other genes. (e-g) considering only CPs (or genes) active in COF-STAP-seq and endogenously in S2; data reanalysed from refs in Supplementary Table 1; n denotes number of independent CPs; boxes: median and interquartile range; whiskers: 5th and 95th percentiles; P values: one-sided Wilcoxon rank sum test.
Figure 2 | Groups of core promoters activated preferentially by different cofactors contain different core-promoter motifs.
a, Occurrence of known fly core-promoter (CP) motifs for CPs separated into 5 groups based on cofactor (COF) responsiveness in STAP-seq (Extended Data Figure 6). Within each group, CPs are sorted by decreasing STAP-seq tag count for the respective strongest COFs (left). Inset: occurrences of TCT in top 10% Group 4 CPs. b, Mutual enrichments (red) or depletions (blue) of motifs in CP groups. n = 5723, 11538, 3203, 5038 and 5434 CPs for Groups 1 to 5 respectively. NS = not significant (two-sided Fisher’s exact test P > 0.01). c, General-transcription-factor ChIP-seq coverage around TSSs for CPs active in S2 cells (sorted as in a; inset: Trf2 ChIP-seq coverage at top 10% Group 4 CPs; ChIP-seq data (S2 and Kc167 cells, embryos) from refs in Supplementary Table 1). d, as c but showing average coverage in -150 to +50bp windows around the TSS (n = 646, 363, 1842, 1885 and 179 CPs for Groups 1 to 5, respectively; boxes: median and interquartile range; whiskers: 5th and 95th percentiles).
Figure 3 | H3K4me1 and H3K4me3 differentially mark promoters activated by distinct cofactors.
a, Endogenous chromatin properties of 5 core-promoter (CP) groups activated by distinct cofactors (COFs; left). Heatmaps show coverage (MNase-, DHS- or ChIP-seq from S2 cells) around the TSS of CPs grouped and sorted as in Fig. 2c. b, as a but average coverage in a defined window around the TSS (-150 to +50 for DHS-seq, Trr, and Pol II; -150 to +250 for Set1; +1 to +500 for MNase-seq, H3K4me1 and H3K4me3). n = 646, 363, 1842, 1885 and 179 CPs, for Groups 1 to 5, respectively. c, Examples of differentially activated and H3K4me1- versus H3K4me3 marked CPs (COF-STAP-seq data: merge of three independent biological replicates). d, COF-STAP-seq signals of CPs sorted by decreasing H3K4me1-versus-H3K4me3 ratios at endogenous loci in S2 cells. e, COF-STAP-seq signals of top and bottom 20% CPs from d (n = 983 CPs per box). a-e considering only CPs active in S2 cells; all data but COF-STAP-seq reanalysed from refs in Supplementary Table 1; boxes: median and interquartile range; whiskers: 5th and 95th percentiles; P values: two-sided Wilcoxon rank sum test.
Figure 4 | Cofactor−core-promoter compatibility is a conserved regulatory principle that underlies differential gene and alternative promoter activation.
a, Transcription activated by human P65, MED15, and MLL3 from the CPs of the REN (left) and IRAK1 (right) genes (human COF-STAP-seq: merge of 3 independent biological replicates). b, Hierarchical clustering of COFs based on Pearson’s correlation of STAP-seq tag counts across 12,000 CP candidates. c, Occurrence of TA and CG dinucleotides in CPs sorted by MED15 versus MLL3 activation (left). d, Distribution of TATA-box and Initiator scores, GC content and CpG dinucleotide observed-over-expected (O/E) ratio for top vs. bottom 10% of CPs from c (n = 961 CPs for each box; boxes: median and interquartile range; whiskers: 5th and 95th percentiles; P values: two-sided Wilcoxon rank sum test). e, f, Examples of differentially activated alternative promoters (e) or closely-spaced TSSs (f; see also Extended Data Fig. 9g, h; merge of three independent biological replicates). g, Model of COF−CP regulatory compatibility, which allows independent regulation of different genes, alternative promoters of the same gene (left) or individual TSSs within a single promoter (right).
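Two of the sequence features compared in panel d, GC content and the CpG observed-over-expected (O/E) ratio, can be computed as sketched below. The O/E formula used here, (number of CpG × sequence length) / (number of C × number of G), is a commonly used definition; that the authors used exactly this formula is an assumption, and the function names and toy sequences are hypothetical.

```python
def gc_content(seq):
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq):
    """CpG observed/expected ratio: (#CpG * length) / (#C * #G).

    Returns 0.0 when the sequence has no C or no G, because the
    expected count is then zero and the ratio is undefined.
    """
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)


# Toy sequences, not taken from the study.
cpg_rich = "CGCGCGAT"   # CpG-island-like
at_rich = "TATAAAAG"    # TATA-box-like
features = {
    "cpg_rich": (gc_content(cpg_rich), cpg_obs_exp(cpg_rich)),
    "at_rich": (gc_content(at_rich), cpg_obs_exp(at_rich)),
}
```

A CpG-island-like sequence scores high on both measures, whereas a TATA-box-like sequence scores low, which mirrors the contrast the panel draws between the two ends of the MED15 versus MLL3 ranking.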

References

    1. Zabidi MA, Stark A. Regulatory Enhancer–Core-Promoter Communication via Transcription Factors and Cofactors. Trends Genet. 2016;32:801–814.
    2. Ohler U, Liao G-C, Niemann H, Rubin GM. Computational analysis of core promoters in the Drosophila genome. Genome Biol. 2002;3:RESEARCH0087.
    3. Rach EA, Yuan H-Y, Majoros WH, Tomancak P, Ohler U. Motif composition, conservation and condition-specificity of single and alternative transcription start sites in the Drosophila genome. Genome Biol. 2009;10:R73.
    4. Parry TJ, et al. The TCT motif, a key component of an RNA polymerase II transcription system for the translational machinery. Genes Dev. 2010;24:2013–2018.
    5. Hoskins RA, et al. Genome-wide analysis of promoter architecture in Drosophila melanogaster. Genome Res. 2011;21:182–192.
