The Arabidopsis basic/helix-loop-helix transcription factor family

Gabriela Toledo-Ortiz¹, Enamul Huq, Peter H Quail

PMID: 12897250
PMCID: PMC167167
DOI: 10.1105/tpc.013839

Comparative Study

Plant Cell. 2003 Aug;15(8):1749-70.

Affiliation

¹ Department of Plant and Microbial Biology, University of California, Berkeley, CA 94720, USA.

Abstract

The basic/helix-loop-helix (bHLH) proteins are a superfamily of transcription factors that bind as dimers to specific DNA target sites and that have been well characterized in nonplant eukaryotes as important regulatory components in diverse biological processes. Based on evidence that the bHLH protein PIF3 is a direct phytochrome reaction partner in the photoreceptor's signaling network, we have undertaken a comprehensive computational analysis of the Arabidopsis genome sequence databases to define the scope and features of the bHLH family. Using a set of criteria derived from a previously defined consensus motif, we identified 147 bHLH protein-encoding genes, making this one of the largest transcription factor families in Arabidopsis. Phylogenetic analysis of the bHLH domain sequences permits classification of these genes into 21 subfamilies. The evolutionary and potential functional relationships implied by this analysis are supported by other criteria, including the chromosomal distribution of these genes relative to duplicated genome segments, the conservation of variant exon/intron structural patterns, and the predicted DNA binding activities within subfamilies. Considerable diversity in DNA binding site specificity among family members is predicted, and marked divergence in protein sequence outside of the conserved bHLH domain is observed. Together with the established propensity of bHLH factors to engage in varying degrees of homodimerization and heterodimerization, these observations suggest that the Arabidopsis bHLH proteins have the potential to participate in an extensive set of combinatorial interactions, endowing them with the capacity to be involved in the regulation of a multiplicity of transcriptional programs. 
We provide evidence from yeast two-hybrid and in vitro binding assays that two related phytochrome-interacting members in the Arabidopsis family, PIF3 and PIF4, can form both homodimers and heterodimers and that all three dimeric configurations can bind specifically to the G-box DNA sequence motif CACGTG. These data are consistent, in principle, with the operation of this combinatorial mechanism in Arabidopsis.
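The combinatorial argument in the abstract can be illustrated with a minimal Python sketch. It is not the authors' analysis pipeline. It assumes the standard E-box definition CANNTG, of which the G-box CACGTG named above is one instance, and the function names (`find_e_boxes`, `dimer_configurations`, `count_dimer_configurations`) are hypothetical. The pair count is only an upper bound on homodimer and heterodimer combinations; it says nothing about which pairs actually form.

```python
import re
from itertools import combinations_with_replacement
from math import comb

G_BOX = "CACGTG"  # G-box motif named in the abstract
# E-box consensus CANNTG (assumed standard definition); the lookahead
# lets overlapping sites be reported as well.
E_BOX_PATTERN = re.compile(r"(?=(CA[ACGT]{2}TG))")


def find_e_boxes(seq):
    """Return (position, site, is_g_box) for every E-box found in seq."""
    seq = seq.upper()
    return [
        (m.start(), m.group(1), m.group(1) == G_BOX)
        for m in E_BOX_PATTERN.finditer(seq)
    ]


def dimer_configurations(monomers):
    """Enumerate all unordered pairs, homodimers included."""
    return list(combinations_with_replacement(monomers, 2))


def count_dimer_configurations(n):
    """Number of unordered pairs with repetition: n * (n + 1) / 2."""
    return comb(n + 1, 2)


if __name__ == "__main__":
    # Toy sequence (made up for illustration): one G-box, one other E-box.
    print(find_e_boxes("ttCACGTGaaCAGCTGtt"))
    # The three dimeric configurations tested for PIF3 and PIF4.
    print(dimer_configurations(["PIF3", "PIF4"]))
    # Theoretical upper bound for the 147 family members.
    print(count_dimer_configurations(147))
```

For two monomers the enumeration gives the three configurations tested here (PIF3 homodimer, PIF4 homodimer, and the PIF3:PIF4 heterodimer); for 147 family members the same formula gives 10,878 conceivable pairs.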

Figures

**Figure 1.**
Multiple Sequence Alignment of the bHLH Domains of the 147 Members of the AtbHLH Protein Family. Each protein is identified by its PID number and AtbHLH number (Heim et al., 2003). The EN assigned in this study is based on the order in which the proteins are shown in this alignment. The scheme at top depicts the locations and boundaries of the basic, helix, and loop regions within the bHLH domain. The numbers below the scheme (1 to 61) indicate the position within the bHLH motif as defined in this study. For those proteins for which a name has been given, the name is provided after the PID number. The shading of the alignment presents identical residues in black, conserved residues in dark gray, and similar residues in light gray. Dots denote gaps. The Arabidopsis consensus motif at bottom is based on the residues with 50% conservation among the 147 proteins shown.
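As a rough illustration of how a consensus line like the one in Figure 1 could be derived, the sketch below takes the most common residue in each alignment column and keeps it only if it reaches a conservation threshold. This is an assumed procedure, not the method used in the study: the function name `consensus`, the "at least 50% of all sequences" reading of the threshold, and the toy alignment are all hypothetical. Gaps are written as dots, as in the figure.

```python
from collections import Counter


def consensus(aligned, threshold=0.5, gap="."):
    """Build a consensus string from equal-length aligned sequences.

    A column contributes its most common residue when that residue occurs
    in at least `threshold` of all sequences; otherwise "-" is emitted.
    """
    if not aligned:
        return ""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")

    total = len(aligned)
    out = []
    for column in zip(*aligned):
        # Ignore gap characters when counting residues in this column.
        counts = Counter(res for res in column if res != gap)
        if counts:
            residue, count = counts.most_common(1)[0]
            out.append(residue if count / total >= threshold else "-")
        else:
            out.append("-")
    return "".join(out)


if __name__ == "__main__":
    # Toy alignment (made up for illustration, not taken from Figure 1).
    toy = ["HNERR", "HSEKR", "HTAQR", "H.ELR"]
    print(consensus(toy))
```

With the toy alignment, columns 1, 3, and 5 pass the threshold and columns 2 and 4 do not.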

**Figure 2.**
Neighbor-Joining Phylogenetic Tree of the AtbHLH Domains Indicating the Predicted DNA Binding Activities and the Intron Distribution Pattern within the Domain. The unrooted tree, constructed using PAUP 4.0, summarizes the evolutionary relationships among the 147 members of the AtbHLH protein family. The proteins are named according to their PID numbers (see Figure 1 and Table 2). The tree was constructed using the amino acid sequence of the bHLH domain for each protein. The tree shows the 21 phylogenetic subfamilies (right column, numbered 1 to 21 and marked with different alternating tones of a gray background to make subfamily identification easier) with high predictive value (bootstrap support of 50 or greater). The internal nodes are not supported by the sampling method and do not necessarily give a true indication of the phylogenetic relationships between the different subfamilies of bHLH proteins. Functionally characterized AtbHLH proteins are indicated with arrows and their names (Table 2; see also supplemental Table 4 online). The tree shown has branch lengths that are not proportional to the distance between sequences. The alignment on which the tree is based is shown in Figure 1. The color code in the central column (Intron Pattern) indicates the numbers and positions of the introns localized in the bHLH domain of each protein. The colors correspond to the intron patterns shown in Figure 3. The color code in the left column (Predicted DNA binding Category) indicates the predicted DNA binding activity of each protein. Pink indicates putative G-box binders; blue indicates putative non-G-box binders; green indicates putative non-E-box binders (i.e., possible DNA binding capacity but no predicted recognition of an E-box); and yellow indicates putative non-DNA binders (see Table 3 for categories).

**Figure 3.**
Intron Distribution within the bHLH Domains of the AtbHLH Proteins. Scheme of the intron distribution patterns (color coded and designated A to I) within the bHLH domains of the AtbHLH proteins. Introns are indicated by triangles and numbered (1 to 3) based on those present in the bHLH region of PIF3, which is shown at top. When the position of the intron coincides with that found in PIF3, the intron number is given above the triangle. For patterns F, G, and H, no intron number above the triangle indicates that the location of the intron within the bHLH domain is different from that found in PIF3. The percentage of proteins with each pattern is given at right. The correlation of intron distribution patterns and phylogenetic subfamilies is provided in Figure 2 (central column, color coded), and the chromosomal distribution of intron patterns is provided in Figure 4 (colored ovals adjacent to each entry number).

**Figure 4.**
Chromosomal Locations, Intron Distribution Patterns, and Duplication Events for *AtbHLH* Genes. Deduced chromosomal positions of the *AtbHLH* genes are indicated by EN (assigned in Figure 1). Segmentally duplicated regions in the chromosomes (Chr I to V) are indicated by boxes of the same color (adapted from TIGR). The total number of *bHLH* genes per chromosome is indicated at the top of each chromosome in parentheses. The scale is in megabases (Mb) and is adapted from the scale available on the TIGR database (see Methods). The small colored ovals at left of the ENs indicate the intron distribution patterns within each gene. The color code corresponds to the intron patterns shown in Figure 3. Connecting lines (blue and pink) mark the specific cases in which there is a strong correlation between duplicated genomic regions and the presence of *bHLH* genes with both closely related predicted amino acid sequence (close ENs) and the same intron pattern. The blue lines link cases associated with apparent intrachromosomal duplications (see supplemental Figure 7B online), and the pink lines link cases associated with apparent interchromosomal duplications (for more details, see supplemental Figure 7C online).

**Figure 5.**
PIF4 Heterodimerizes with PIF3. **(A)** PIF3 and PIF4 interact in a yeast two-hybrid assay. The left panel shows interaction in a plate growth assay. The combination of constructs used in each section is indicated in the circle (middle) and at right. The right panel shows Miller units in a quantitative liquid β-galactosidase assay. GBD and GAD denote GAL4 DNA binding and activation domains, respectively. GAD:PIF4 denotes the GAL4 activation domain:PIF4 fusion protein, and GBD:PIF3 denotes the GAL4 DNA binding domain:PIF3 fusion protein. aa, amino acids; NLS, nuclear localization signal. **(B)** PIF3 and PIF4 interact in vitro. Full-length PIF3 or PIF4 cDNAs either alone or fused to GAD were used as templates for synthesizing the proteins for this coimmunoprecipitation assay. All proteins were synthesized as ³⁵S-Met–labeled products in a TnT reaction. PIF4:GAD, PIF4 fused at its C terminus to the GAL4 activation domain; GAD:PIF3, PIF3 fused at its C terminus to the GAL4 activation domain. **(C)** PIF3 and PIF4 bind to the G-box both as homodimers and as a PIF3:PIF4 heterodimer. PIF4:GAD and a truncated N308PIF3 clone were coexpressed in a TnT reaction, and 1 μL of this TnT mix was used for DNA binding. PIF4:GAD and N308PIF3 also were expressed in a TnT reaction separately and used to bind to the G-box DNA as homodimers. A total of 30,000 cpm of labeled probe was used in each lane. The binding conditions were as described by Huq and Quail (2002). pLUC control plasmid was translated in the TnT reaction and used as the TnT-only control. The samples were separated on a 5% gel, and the gels were dried and exposed to PhosphorImager (Molecular Dynamics, Sunnyvale, CA) or x-ray film for analysis. FP, free probe; mut, mutant; wt, wild type.

Comment in

Update on the basic helix-loop-helix transcription factor gene family in Arabidopsis thaliana.
Bailey PC, Martin C, Toledo-Ortiz G, Quail PH, Huq E, Heim MA, Jakoby M, Werber M, Weisshaar B. Plant Cell. 2003 Nov;15(11):2497-502. doi: 10.1105/tpc.151140. PMID: 14600211.
