Elife. 2021 Sep 6;10:e67403.
doi: 10.7554/eLife.67403.

Information content differentiates enhancers from silencers in mouse photoreceptors


Ryan Z Friedman et al. Elife. 2021.

Abstract

Enhancers and silencers often depend on the same transcription factors (TFs) and are conflated in genomic assays of TF binding or chromatin state. To identify sequence features that distinguish enhancers and silencers, we assayed massively parallel reporter libraries of genomic sequences targeted by the photoreceptor TF cone-rod homeobox (CRX) in mouse retinas. Both enhancers and silencers contain more TF motifs than inactive sequences, but relative to silencers, enhancers contain motifs from a more diverse collection of TFs. We developed a measure of information content that describes the number and diversity of motifs in a sequence and found that, while both enhancers and silencers depend on CRX motifs, enhancers have higher information content. The ability of information content to distinguish enhancers and silencers targeted by the same TF illustrates how motif context determines the activity of cis-regulatory sequences.
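The abstract does not state the formula for information content. One definition that matches "the number and diversity of motifs" is a Boltzmann-style entropy: treat each TF as a distinct type of molecule, take its predicted occupancy on the sequence as its count, and report log2 of the number of distinguishable orderings, W = N! / (N_1! N_2! ...). The sketch below assumes that definition; the function name and the occupancy values are hypothetical and are not taken from the study.

```python
from math import lgamma, log


def information_content(occupancies):
    """Return log2(W), where W = N! / prod(N_i!) and N = sum(N_i).

    `occupancies` holds one non-negative number per TF (an assumed input
    format). lgamma(x + 1) equals log(x!) and also handles non-integer
    occupancies.
    """
    total = sum(occupancies)
    log_w = lgamma(total + 1) - sum(lgamma(n + 1) for n in occupancies)
    return log_w / log(2)  # convert natural log to bits


# Toy sequences, each with four bound sites in total.
one_tf = information_content([4, 0, 0, 0])     # four sites for a single TF
two_tfs = information_content([2, 2, 0, 0])    # two TFs, two sites each
four_tfs = information_content([1, 1, 1, 1])   # four different TFs
print(one_tf, two_tfs, four_tfs)
```

Under this assumed definition, a sequence bound by a single TF scores zero bits however many sites it has, and the score rises as the same number of sites is spread over more TFs. That ordering is the property the abstract attributes to enhancers relative to silencers.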

Keywords: computational biology; enhancers; genetics; genomics; information theory; massively parallel reporter assays; mouse; silencers; systems biology.

Plain language summary

Different cell types are established by activating and repressing the activity of specific sets of genes, a process controlled by proteins called transcription factors. Transcription factors work by recognizing and binding short stretches of DNA in parts of the genome called cis-regulatory sequences. A cis-regulatory sequence that increases the activity of a gene when bound by transcription factors is called an enhancer, while a sequence that causes a decrease in gene activity is called a silencer. To establish a cell type, a particular transcription factor will act on both enhancers and silencers that control the activity of different genes. For example, the transcription factor cone-rod homeobox (CRX) is critical for specifying different types of cells in the retina, and it acts on both enhancers and silencers. In rod photoreceptors, CRX activates rod genes by binding their enhancers, while repressing cone photoreceptor genes by binding their silencers. However, CRX always recognizes and binds to the same DNA sequence, known as its binding site, making it unclear why some cis-regulatory sequences bound by CRX act as silencers, while others act as enhancers.

Friedman et al. sought to understand how enhancers and silencers, both bound by CRX, can have different effects on the genes they control. Since both enhancers and silencers contain CRX binding sites, the difference between the two must lie in the sequence of the DNA surrounding these binding sites. Using retinas that had been explanted from mice and kept alive in the laboratory, Friedman et al. tested the activity of thousands of CRX-binding sequences from the mouse genome. This showed that both enhancers and silencers have more copies of CRX-binding sites than sequences of the genome that are inactive. Additionally, the results revealed that enhancers have a diverse collection of binding sites for other transcription factors, while silencers do not. Friedman et al. developed a new metric they called information content, which captures the diverse combinations of different transcription factor binding sites that cis-regulatory sequences can have. Using this metric, Friedman et al. showed that it is possible to distinguish enhancers from silencers based on their information content.

It is critical to understand how the DNA sequences of cis-regulatory regions determine their activity, because mutations in these regions of the genome can cause disease. However, since every person has thousands of benign mutations in cis-regulatory sequences, it is a challenge to identify specific disease-causing mutations, which are relatively rare. One long-term goal of models of enhancers and silencers, such as Friedman et al.’s information content model, is to understand how mutations can affect cis-regulatory sequences, and, in some cases, lead to disease.

Conflict of interest statement

RF, DG, CM, JC, BC, MW: No competing interests declared.

Figures

Figure 1. Activity of putative cis-regulatory sequences with cone-rod homeobox (CRX) motifs.
(a) Volcano plot of activity scores relative to the Rho promoter alone. Sequences are grouped as strong enhancers (dark blue), weak enhancers (light blue), inactive (green), silencers (red), or ambiguous (gray). Horizontal line, false discovery rate (FDR) q = 0.05. Vertical lines, twofold above and below Rho. (b) Fraction of ChIP-seq and ATAC-seq peaks that belong to each activity group. (c) Predicted CRX occupancy of each activity group. Horizontal lines, medians; enh., enhancer. Numbers at top of (b and c) indicate n for groups.
Figure 1—figure supplement 1. Reproducibility of massively parallel reporter assay (MPRA) measurements.
Each row represents a different library and experiment. For each column, the first replicate in the title is the x-axis and the second replicate is the y-axis.
Figure 1—figure supplement 2. Calibration of massively parallel reporter assay (MPRA) libraries with the Rho promoter.
Probability density histogram of the same 150 scrambled sequences in two libraries after normalizing to the basal Rho promoter.
Figure 2. Strong enhancers contain a diverse array of motifs.
(a) Receiver operating characteristic for classifying strong enhancers from silencers. Solid black, 6-mer support vector machine (SVM); orange, eight transcription factors (TFs) predicted occupancy logistic regression; aqua, predicted cone-rod homeobox (CRX) occupancy logistic regression; dashed black, chance; shaded area, 1 standard deviation based on fivefold cross-validation. (b and c) Total predicted TF occupancy (b) and frequency of TF motifs (c) in each activity class. (d) Frequency of co-occurring TF motifs in strong enhancers. Lower triangle is expected co-occurrence if motifs are independent. (e) Frequency of activity classes, colored as in (b), for sequences in CRX, NRL, and/or MEF2D ChIP-seq peaks. (f) Frequency of TF ChIP-seq peaks in activity classes. TFs in (c) are sorted by feature importance of the logistic regression model in (a).
Figure 2—figure supplement 1. Precision recall curve for strong enhancer vs. silencer classifiers.
Solid black, 6-mer support vector machine (SVM); orange, eight transcription factors (TFs) predicted occupancy logistic regression; aqua, predicted cone-rod homeobox (CRX) occupancy logistic regression; dashed black, chance; shaded area, 1 standard deviation based on fivefold cross-validation.
Figure 2—figure supplement 2. Results from de novo motif analysis.
Motifs enriched in strong enhancers (a) and silencers (b). Bottom, de novo motif identified with DREME; top, matched known motif identified with TOMTOM.
Figure 2—figure supplement 3. Additional validation of the eight transcription factors (TFs) predicted occupancy logistic regression model.
(a and b) Predictions of the 6-mer support vector machine (SVM) (black) and eight TFs predicted occupancy logistic regression model (orange) on an independent test set. (c and d) Null distribution of 100 logistic regression models trained using randomly selected motifs (gray) compared to the true features (orange). Shaded area, 1 standard deviation based on fivefold cross-validation. (a and c) Receiver operating characteristic, (b and d) precision recall curve. Dashed black line represents chance in all panels.
Figure 3. Information content classifies strong enhancers.
(a) Information content for different activity classes. (b) Receiver operating characteristic of information content to classify strong enhancers from silencers (orange) or inactive sequences (indigo).
Figure 3—figure supplement 1. Precision recall curve of logistic regression classifier using information content.
Orange, strong enhancer vs. silencer; indigo, strong enhancer vs. inactive; shaded area, 1 standard deviation based on fivefold cross-validation.
Figure 4. Sequence features of autonomous and non-autonomous strong enhancers.
(a) Activity of library in the presence (x-axis) or absence (y-axis) of the Rho promoter. Dark blue, strong enhancers; light blue, weak enhancers; green, inactive; red, silencers; gray, ambiguous; horizontal line, cutoff for autonomous activity. Points on the far left and/or very bottom are sequences that were present in the plasmid pool but not detected in the RNA. (b–d) Comparison of autonomous and non-autonomous strong enhancers for information content (b), predicted cone-rod homeobox (CRX) occupancy (c), and frequency of transcription factor (TF) motifs (d).
Figure 5. Independence of transcription factor (TF) motifs in strong enhancers.
(a) Activity of sequences with and without cone-rod homeobox (CRX) motifs. Points are colored by the activity group with CRX motifs intact: dark blue, strong enhancers; light blue, weak enhancers; green, inactive; red, silencers; gray, ambiguous; horizontal dotted lines and color bar represent the cutoffs for the same groups when CRX motifs are mutated. Solid black line is the y = x line. (b–d) Comparison of strong enhancers with high and low CRX dependence for information content (b), predicted CRX occupancy (c), and residual information content (d). (e) Representative strong enhancers with high (top) or low (bottom) CRX dependence.
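
Figures 2a and 3b report receiver operating characteristic curves for separating strong enhancers from silencers. As a rough sketch of what the area under such a curve measures for a single scalar score, the block below computes it as the probability that a randomly chosen positive outscores a randomly chosen negative. The `auroc` helper and all score values are invented for illustration; they are not the study's code or data, and the cross-validated logistic regression described in the captions is not reproduced here.

```python
def auroc(pos_scores, neg_scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the fraction of (positive, negative) pairs in which the positive
    has the higher score; ties count as half a win.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical information-content scores (bits) for two small groups.
toy_enhancers = [4.6, 2.6, 3.2, 1.0]
toy_silencers = [0.0, 1.0, 2.6, 0.0]
print(auroc(toy_enhancers, toy_silencers))
```

A value of 0.5 corresponds to the dashed chance line in the figures, and 1.0 to perfect separation.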


