Genomics. 2010 Apr;95(4):185-95.
doi: 10.1016/j.ygeno.2010.01.002. Epub 2010 Jan 15.

Conservation and regulatory associations of a wide affinity range of mouse transcription factor binding sites

Savina A Jaeger et al. Genomics. 2010 Apr.

Abstract

Sequence-specific binding by transcription factors (TFs) interprets regulatory information encoded in the genome. Using recently published universal protein binding microarray (PBM) data on the in vitro DNA binding preferences of these proteins for all possible 8-base-pair sequences, we examined the evolutionary conservation and enrichment within putative regulatory regions of the binding sequences of a diverse library of 104 nonredundant mouse TFs spanning 22 different DNA-binding domain structural classes. We found that not only high affinity binding sites, but also numerous moderate and low affinity binding sites, are under negative selection in the mouse genome. These 8-mers occur preferentially in putative regulatory regions of the mouse genome, including CpG islands and non-exonic ultraconserved elements (UCEs). Of TFs whose PBM "bound" 8-mers are enriched within sets of tissue-specific UCEs, many are expressed in the same tissue(s) as the UCE-driven gene expression. Phylogenetically conserved motif occurrences of various TFs were also enriched in the noncoding sequence surrounding numerous gene sets corresponding to Gene Ontology categories and tissue-specific gene expression clusters, suggesting involvement in transcriptional regulation of those genes. Altogether, our results indicate that many of the sequences bound by these proteins in vitro, including lower affinity DNA sequences, are likely to be functionally important in vivo. This study not only provides an initial analysis of the potential regulatory associations of 104 mouse TFs, but also presents an approach for the functional analysis of TFs from any other metazoan genome as their DNA binding preferences are determined by PBMs or other technologies.

Figures

Figure 1. Enrichment of particular TFs’ 8-mers within putative regulatory regions
(A) CpG islands are enriched for PBM ‘bound’ 8-mers for E2F and ETS proteins. Results for 8-mers bound at E ≥ 0.43 or E ≥ 0.45 are shown in Supplementary Figure S2. (B) ‘Moderate’ (0.40 ≤ E < 0.45) and ‘high’ (E ≥ 0.45) affinity 8-mers of TFs in the BRIGHT and homeodomain classes are enriched within non-exonic UCEs as compared to shuffled sequences generated to have the same dinucleotide content. In both panels, “score” and P-values are as described in Supplementary Figure 1.
Figure 2. TFs expressed in same tissues as the UCEs in which their PBM ‘bound’ 8-mers are enriched
TF binding site sequence logos are presented for graphical convenience; binding sequence enrichment analysis was based on PBM k-mer data. Fold-enrichment of PBM ‘bound’ 8-mers (E ≥ 0.45) within UCEs is calculated as compared to 10 sets of UCE sequences shuffled at the di-nucleotide level. Because of space limitations, only a subset of the expression patterns and TFs are shown. A full listing of PBM 8-mer enrichment results for all TFs and all examined UCE expression patterns is provided in Supplementary Table S2. (UCE reporter assay whole-mount in situ images at mouse E11.5 adapted by permission from Macmillan Publishers Ltd: [Nature] [20], copyright (2006).)
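The fold-enrichment calculation described in this legend can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the overlapping sliding-window count, and the choice to divide by the mean count over the shuffled sets are all assumptions, and the dinucleotide-preserving shuffle itself is not shown.

```python
from typing import Iterable, List, Set


def count_bound_kmers(seqs: Iterable[str], bound: Set[str], k: int = 8) -> int:
    """Count overlapping k-mer windows in `seqs` that belong to `bound`."""
    total = 0
    for seq in seqs:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            if seq[i:i + k] in bound:
                total += 1
    return total


def fold_enrichment(real: List[str],
                    shuffled_sets: List[List[str]],
                    bound: Set[str],
                    k: int = 8) -> float:
    """Observed count divided by the mean count over shuffled sets (assumed definition)."""
    if not shuffled_sets:
        raise ValueError("need at least one shuffled set")
    observed = count_bound_kmers(real, bound, k)
    background = [count_bound_kmers(s, bound, k) for s in shuffled_sets]
    mean_background = sum(background) / len(background)
    if mean_background == 0:
        raise ValueError("no bound k-mers in the shuffled sets; ratio undefined")
    return observed / mean_background


# Toy data only: one hypothetical 'bound' 8-mer and two hand-made background sets.
bound_8mers = {"AAAAAAAA"}
real_seqs = ["AAAAAAAAAA"]
background_sets = [["AAAAAAAAC"], ["CAAAAAAAA"]]
print(fold_enrichment(real_seqs, background_sets, bound_8mers))
```

In the setting of the legend, `shuffled_sets` would hold the 10 dinucleotide-shuffled versions of the UCE sequences and `bound` the 8-mers with E ≥ 0.45 for one TF.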
Figure 3. The median 8-mer substitution rate decreases monotonically with increasing binding site affinity
All bins of 8-mers – ‘high’ affinity (0.45 ≤ E ≤ 0.50), ‘moderate’ affinity (0.40 ≤ E < 0.45), ‘low’ affinity (0.35 ≤ E < 0.40), ‘very low’ affinity (0.30 ≤ E < 0.35), and ‘nonspecific’ (E < 0.30) categories – within (A) high CpG regions and (B) low CpG regions exhibited mean substitution rates significantly different from each other (P < 0.05, Tukey’s Honestly Significant Differences test). In each box plot, the central bar indicates the median, the edges of the box indicate the 25th and 75th percentiles, and whiskers extend to the most extreme data points not considered outliers.
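The affinity bins named in this legend map directly onto E-score thresholds. The sketch below encodes only the cutoffs stated above; the function name and the error raised for E > 0.50 are illustrative assumptions.

```python
def affinity_category(e_score: float) -> str:
    """Return the affinity bin for a PBM E-score, using the legend's cutoffs."""
    if e_score > 0.50:
        # The legend caps the 'high' bin at E = 0.50.
        raise ValueError("E-score above 0.50 is outside the binned range")
    if e_score >= 0.45:
        return "high"
    if e_score >= 0.40:
        return "moderate"
    if e_score >= 0.35:
        return "low"
    if e_score >= 0.30:
        return "very low"
    return "nonspecific"


for e in (0.47, 0.42, 0.36, 0.31, 0.10):
    print(e, affinity_category(e))
```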
Figure 4. Evolutionary conservation properties of 8-mers of different E-scores
Scatter plot point density is indicated by the color bar in each panel. Horizontal lines at E=0 and E=0.45 are shown in each plot for convenience. For each of 104 TFs examined, the 8-mers belonging to either the ‘high’ affinity category (0.45 ≤ E ≤ 0.50) or those bound most weakly (E < 0) were ranked according to their substitution rates. Significance was assessed by both the area under the receiver operating characteristic curve (AUC > 0.5; shown in each panel) and the Wilcoxon-Mann-Whitney test (P < 0.05); AUC > 0.5 at P < 0.05 indicates that ‘high’ affinity 8-mers are significantly more highly conserved than the most weakly bound 8-mers for a particular TF. For E2F2, the ‘high’ affinity 8-mers exhibited greater conservation than the 8-mers bound most weakly by the TF within both the (A) low and (B) high CpG regions. For Zic2, (C) within the low CpG regions the ‘high’ affinity 8-mers exhibited greater conservation overall (P < 0.05) than the 8-mers bound most weakly, while (D) within high CpG regions the ‘high’ affinity 8-mers were overall less conserved (P < 0.05) than the most weakly bound 8-mers.
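The AUC comparison in this legend is closely related to the Wilcoxon-Mann-Whitney statistic: the AUC equals the U statistic divided by the number of cross-group pairs. The sketch below computes it directly by pair counting, assuming that a lower substitution rate means higher conservation; the function name and the orientation of the comparison are assumptions for illustration, and the significance test itself is omitted.

```python
from typing import Sequence


def conservation_auc(high_rates: Sequence[float],
                     weak_rates: Sequence[float]) -> float:
    """Probability that a 'high' affinity 8-mer has a lower substitution
    rate than a weakly bound 8-mer, with ties counted as one half."""
    if not high_rates or not weak_rates:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for h in high_rates:
        for w in weak_rates:
            if h < w:
                wins += 1.0
            elif h == w:
                wins += 0.5
    return wins / (len(high_rates) * len(weak_rates))


# Toy substitution rates only, not data from the study.
print(conservation_auc([0.1, 0.2], [0.3, 0.4]))
print(conservation_auc([0.1, 0.4], [0.2, 0.3]))
```

Under this convention, a value above 0.5 corresponds to the 'high' affinity 8-mers being more conserved, and a value below 0.5 to the reverse pattern reported for Zic2 within high CpG regions.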
Figure 5. Lever screen of GO categories
Heatmap color gradient indicates Lever AUC values. The columns have been restricted such that only those GO categories that exhibit significant enrichment (AUC ≥ 0.8, Q ≤ 0.01) for at least one of the TFs’ binding sites are shown. The rows (TFs) were sorted according to TF structural class, while the columns (GO categories) were clustered hierarchically according to the AUC values calculated by Lever [40].

References

    1. Tanay A. Extensive low-affinity transcriptional interactions in the yeast genome. Genome Res. 2006.
    2. Segal E, Raveh-Sadka T, Schroeder M, Unnerstall U, Gaul U. Predicting expression patterns from regulatory sequence in Drosophila segmentation. Nature. 2008;451:535–40.
    3. Benos PV, Bulyk ML, Stormo GD. Additivity in protein-DNA interactions: how good an approximation is it? Nucleic Acids Res. 2002;30:4442–51.
    4. Bulyk ML, Johnson PL, Church GM. Nucleotides of transcription factor binding sites exert interdependent effects on the binding affinities of transcription factors. Nucleic Acids Res. 2002;30:1255–61.
    5. Man TK, Stormo GD. Non-independence of Mnt repressor-operator interaction determined by a new quantitative multiple fluorescence relative affinity (QuMFRA) assay. Nucleic Acids Res. 2001;29:2471–8.