Plant Commun. 2021 Aug 12;2(6):100232.
doi: 10.1016/j.xplc.2021.100232. eCollection 2021 Nov 8.

DNA features beyond the transcription factor binding site specify target recognition by plant MYC2-related bHLH proteins

Irene López-Vidriero et al. Plant Commun.

Abstract

Transcription factors (TFs) regulate gene expression by binding to cis-regulatory sequences in the promoters of target genes. Recent research is helping to decipher in part the cis-regulatory code in eukaryotes, including plants, but it is not yet fully understood how paralogous TFs select their targets. Here we addressed this question by studying several proteins of the basic helix-loop-helix (bHLH) family of plant TFs, all of which recognize the same DNA motif. We focused on the MYC-related group of bHLHs, which redundantly regulate the jasmonate (JA) signaling pathway, and we observed a high correspondence between DNA-binding profiles in vitro and MYC function in vivo. We demonstrated that A/T-rich modules flanking the MYC-binding motif, conserved from bryophytes to higher plants, are essential for TF recognition. We observed particular DNA-shape features associated with A/T modules, indicating that the DNA shape may contribute to MYC DNA binding. We extended this analysis to 20 additional bHLHs and observed correspondence between in vitro binding and protein function, but it could not be attributed to A/T modules as in MYCs. We conclude that different bHLHs may have their own codes for DNA binding and specific selection of targets that, at least in the case of MYCs, depend on the TF-DNA interplay.

Keywords: bHLH; plants; target specificity; transcription factor.

Figures

Figure 1
Genome bHLH-PBMs provide high specificity and predict function in vivo. (A) Signal distributions of the probes with the DNA elements indicated after incubation with the MYC proteins. Signal intensities of the probes are presented as Log2[box/Gm], where Gm represents the mutant G-box. Letters above the boxes represent groups of statistical significance at p < 0.05 (ANOVA and post hoc classification with Tukey's test; different letters denote different groups). Data correspond to the genome design. (B) Logos representing sequence alignment of the top 500 (left) and bottom 500 (right) probes, sorted by Log2[box/Gm], in the genome design. Short A/T-rich stretches centered at ±4–7 bp in top probes are highlighted in dark blue. (C) [A + T] content at each position of the 500 probes bound with the highest (top) and lowest intensities (bottom). Nucleotide coordinates are numbered from the G-box toward the ends. (D) Signal distributions (in Log2[box/Gm]) of the sets of probes containing an A/T at each of the positions of the probe. Nucleotide coordinates are as in (C). Letters below the boxes represent groups of statistical significance at p < 0.05 (ANOVA and post hoc classification with Tukey's test; different letters denote different groups). (E) Expression in response to jasmonate (JA) of the genes represented by the probes in the gene features indicated. Dark and light blue represent D1 and D10, respectively, from the MYC2 experiment. Corrected expression values are presented as the mean Log2RPKM[JA/mock]∗[proportion induced genes] from a time-course experiment in response to the hormone (Hickman et al., 2017). Numbers indicate the genes analyzed in the D1 (dark blue) and D10 (light blue) groups.
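The per-position [A + T] content described in (C) can be illustrated with a minimal sketch. It assumes equal-length probe sequences already aligned on the G-box; the function name `at_content_by_position` and the toy sequences are hypothetical and are not taken from the study.

```python
def at_content_by_position(probes):
    """Return the fraction of A or T at each column of equal-length sequences."""
    if not probes:
        raise ValueError("need at least one probe")
    length = len(probes[0])
    if any(len(p) != length for p in probes):
        raise ValueError("probes must be aligned to the same length")
    fractions = []
    for i in range(length):
        # Collect the base found at column i in every probe.
        column = [p[i].upper() for p in probes]
        fractions.append(sum(base in "AT" for base in column) / len(probes))
    return fractions


# Hypothetical probes: a central CACGTG G-box with two flanking bases per side.
toy_probes = ["ATCACGTGAT", "TTCACGTGAA", "GACACGTGTC"]
profile = at_content_by_position(toy_probes)
```

Running the same function separately on the top and bottom probe sets would give the two profiles that the panel compares.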
Figure 2
DAP-seq profiles of Arabidopsis MYCs and their orthologs in tomato and Marchantia. (A) Fold enrichments of the MYC-binding sites (G-, T/G-box, and PBE) in the peaks significantly enriched in the Arabidopsis MYC2 experiment and in tomato (SlMYC2) and Marchantia (MpMYCx). Fold enrichments were calculated relative to the average distributions of 25 sets of random sequences with the same number and length. (B) Distribution of the fold enrichment of the DAP-seq peaks in relation to their number of the MBS elements in 201-bp fragments. Letters above the boxes represent groups of statistical significance at p < 0.05 (ANOVA and post hoc classification with Tukey's test). (C) Relative binding of MYCs to all the MBS doublets in the genome, represented as the mean ratio Log2[CPMDAP/CPMinput] and normalized to the total number of reads in MBSs. Numbers above the heatmap indicate the number of nucleotides between the two modules of the doublet, from 0 to 9. (D) Ath indicates the mean expression values in response to JA of genes containing multimeric MBSs in their promoter regions (3 kb, including the 5′ UTR). Expression data correspond to the Arabidopsis time-course experiment in Figure 1. Sly indicates the mean expression values in response to wounding of Solanum lycopersicum genes containing multimeric MBSs in their promoter regions. Expression data were obtained from a time-course experiment in response to wounding in tomato (Du et al., 2017). Mpo indicates the mean expression values in response to OPDA and wounding of genes containing multimeric MBSs in their promoter regions in Marchantia wild-type Tak1 (T) and the mpmycx (m) mutant (Peñuelas et al., 2019). (E) Mean expression values of genes containing different MBS doublet configurations in their promoter regions (3 kb, including the 5′ UTR). Expression values are as in (D).
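The fold-enrichment calculation in (A) compares motif occurrence in peaks with the average over sets of random sequences. The sketch below is one possible reading of that calculation, not the authors' pipeline: it counts sequences that contain the G-box hexamer CACGTG, and `random_set` draws bases uniformly, which is an assumption (the study may have sampled random sequences differently). All function names are hypothetical.

```python
import random

G_BOX = "CACGTG"  # canonical G-box hexamer (palindromic, so strand does not matter)


def count_sequences_with_motif(seqs, motif):
    """Number of sequences containing the motif at least once."""
    return sum(motif in s.upper() for s in seqs)


def random_set(n, length, rng):
    """n random sequences of a given length; uniform base usage is an assumption."""
    return ["".join(rng.choice("ACGT") for _ in range(length)) for _ in range(n)]


def fold_enrichment(peaks, background_sets, motif=G_BOX):
    """Observed motif count in peaks divided by the mean count over background sets."""
    observed = count_sequences_with_motif(peaks, motif)
    expected = sum(
        count_sequences_with_motif(b, motif) for b in background_sets
    ) / len(background_sets)
    if expected == 0:
        return float("inf") if observed else float("nan")
    return observed / expected


# Usage with 25 background sets matched in number and length (toy sizes).
rng = random.Random(0)
toy_peaks = ["AACACGTGTT", "CACGTGAAAA", "TTTTTTTTTT", "GGCACGTGCC"]
toy_background = [random_set(len(toy_peaks), 10, rng) for _ in range(25)]
toy_fold = fold_enrichment(toy_peaks, toy_background)
```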
Figure 3
Highly specific MYC DNA binding in vitro. (A) Screen captures of Integrative Genomics Viewer visualizations of DAP- and ChIP-seq binding profiles of Arabidopsis MYCs in some predicted targets. Arrowheads indicate the positions of different MBSs: red, G-box; green, PBE; yellow, T/G-box. Scale bars represent 1 kb. (B and C) Same as in (A) but with data from tomato (B) and Marchantia (C). (D) Sequence logos for Arabidopsis, tomato, and Marchantia MYCs obtained from MEME analysis. A/T modules centered at position ±5 relative to the MBS are boxed in red. (E) Same as in (D) but peak sequences did not include a canonical MBS. (F) Diagram of the ML pipeline to generate new binding models for better target prediction. (G) ROC curves and AUC values of the four binding models generated with the Arabidopsis, tomato, and Marchantia MYC2 proteins. A theoretical diagonal with an AUC value of 0.5 would indicate that the models have no predictive potential (true-positive rate = false-positive rate), whereas AUC = 1 would represent a perfect predictive model. (H) Proportion of Arabidopsis MYC2 targets identified by ChIP-seq and predicted from the PWM and PWM + DNA shape models with binding score ≥0.98. (I) Same as in (H) but with tomato SlMYC2 targets. (J) Expression of the genes corresponding to binding score ≥0.98 identified with any of the four models (PWM, TFFM, PWM + shape, TFFM + shape). Gene expression data are from the wild-type (Tak1) and the mpmycx (mycx) mutant in response to OPDA and wounding (W).
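The AUC values in (G) can be read as the probability that a bound sequence receives a higher model score than an unbound one. A minimal sketch of that computation follows, using the pairwise (Mann-Whitney) formulation rather than any specific library; the function name `roc_auc` and the toy labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for bound sequences, 0 for unbound sequences.
    scores: model binding scores, higher meaning more likely bound.
    Ties between a positive and a negative count as half a correct pair.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# A model that separates the classes perfectly versus one that cannot separate them.
perfect = roc_auc([1, 1, 0, 0], [0.9, 0.8, 0.2, 0.1])
uninformative = roc_auc([1, 1, 0, 0], [0.5, 0.5, 0.5, 0.5])
```

The quadratic pair loop is chosen for readability; a rank-based version would scale better to genome-wide peak sets.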
Figure 4
Flanking A/T-rich modules contribute to DNA shape. (A) Sequence alignment of the bound and unbound MBS-containing sequences from the MYC2 experiment. Higher specificity was obtained from the alignment of the sequences from the first decile (D1) of bound sequences. Red rectangles highlight the A/T-rich modules. Alignments correspond to Arabidopsis MYC2 (left), tomato SlMYCx (middle), and Marchantia MpMYCx (right). (B) Mean values of the DNA shape feature ProT of the bound, bound D1, and unbound MBS sequences. The central MBS is shadowed in gray to highlight the flanking sequences. Data correspond to MYC2 (left), SlMYCx (middle), and MpMYCx (right). (C) Same as in (A) but alignments correspond to non-MBS-containing sequences. (D) Same as in (B) but ProT values correspond to non-MBS-containing sequences. (E) Left, MBS-containing amplicons (wt) and their corresponding loss-of-affinity mutants (m) used in focused DAP-seq experiments. Mutated positions at ±4, 5, 6 are in red, and MBS elements are shaded in gray. Right, relative binding of MYC3 to wt and mutant MBS-containing amplicons. Binding results correspond to 12 cycles of amplification. (F) Left, MBS-containing amplicons (wt) and their corresponding gain-of-affinity mutants (m), with mutated positions in red. Right, relative binding of MYC3 to wt and gain-of-affinity mutants.
Figure 5
DAP-seq experiments in vitro predict MYC function in vivo. (A) Top, gene clustering relative to the Arabidopsis MYC DAP and ChIP signals in 2-kb upstream regions. Only clusters with the strongest signal within 1 kb upstream are shown. Clusters were obtained from the combined analysis of the four Arabidopsis MYC2-related DAP experiments and the MYC2 ChIP-seq experiments. Bottom, enrichment of GO terms (biological process) associated with the gene clusters shown above. (B) Same as in (A) but data correspond to tomato SlMYC2-related.
Figure 6
Binding specificities of bHLHs other than MYCs. (A) De novo discovery with MEME of motifs recognized by the indicated bHLH proteins. (B) ROC and AUC values of the four binding models generated with the indicated bHLHs. (C) ProT means of the bound and unbound MBS-containing sequences from PIF5 and bHLH18 experiments. (D) A + T content of the same fragments and experiments as in (C).
Figure 7
Differential binding of bHLHs to the G-boxes in the genome. Heatmap of relative binding of the indicated bHLHs to all the G-boxes in the Arabidopsis genome (n = 14 508). Binding data are presented as Log2[CPMDAP/CPMinput], normalized to the total reads in G-boxes, and clustered with k-means (n = 8 clusters) and additional hierarchical clustering.
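The binding measure Log2[CPMDAP/CPMinput] can be sketched as follows. This shows only the counts-per-million scaling and the log ratio; the pseudocount is an assumption added to avoid division by zero, the further normalization to total reads in G-boxes is not reproduced, and the function names are hypothetical.

```python
import math


def cpm(count, total_reads):
    """Counts per million: reads at a site scaled by library size."""
    return count * 1_000_000 / total_reads


def log2_cpm_ratio(dap_count, dap_total, input_count, input_total, pseudocount=1.0):
    """Log2 of DAP CPM over input CPM at one site.

    pseudocount is an assumption, not a value from the study; set it to 0
    to obtain the plain ratio when both counts are non-zero.
    """
    dap = cpm(dap_count, dap_total) + pseudocount
    inp = cpm(input_count, input_total) + pseudocount
    return math.log2(dap / inp)


# Toy numbers: 40 DAP reads and 10 input reads at a site, 1 million reads per library.
plain_ratio = log2_cpm_ratio(40, 1_000_000, 10, 1_000_000, pseudocount=0.0)
```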
