Plant Physiol. 2010 Jul;153(3):1398-412.
doi: 10.1104/pp.110.153593. Epub 2010 May 14.

Genome-wide classification and evolutionary analysis of the bHLH family of transcription factors in Arabidopsis, poplar, rice, moss, and algae

Lorenzo Carretero-Paulet et al. Plant Physiol. 2010 Jul.

Abstract

Basic helix-loop-helix proteins (bHLHs) are found throughout the three eukaryotic kingdoms and constitute one of the largest families of transcription factors. A growing number of bHLH proteins have been functionally characterized in plants. However, some of these have not been previously classified. We present here an updated and comprehensive classification of the bHLHs encoded by the fully sequenced genomes of Arabidopsis (Arabidopsis thaliana), Populus trichocarpa, Oryza sativa, Physcomitrella patens, and five algae species. We define a plant bHLH consensus motif, which allowed the identification of novel, highly diverged atypical bHLHs. Using yeast two-hybrid assays, we confirm that (1) a highly diverged bHLH has retained protein interaction activity and (2) the two most conserved positions in the consensus play an essential role in dimerization. Phylogenetic analysis permitted classification of the 638 bHLH genes identified into 32 subfamilies. Evolutionary and functional relationships within subfamilies are supported by intron patterns, predicted DNA-binding motifs, and the architecture of conserved protein motifs. Our analyses reveal the origin and evolutionary diversification of plant bHLHs through differential expansions, domain shuffling, and extensive sequence divergence. At the functional level, this would translate into different subfamilies evolving specific DNA-binding and protein interaction activities as well as differential transcriptional regulatory roles. Our results suggest a role for bHLH proteins in generating plant phenotypic diversity and provide a solid framework for further investigations into their roles in the transcriptional regulation of key growth and developmental processes.

Figures

Figure 1.
Plant and animal bHLH consensus. Alignment of the plant and animal bHLH consensus used as predictive motifs. The plant consensus is based on an alignment of plant bHLHs and contains positions conserved in more than 50% of the sequences. In such positions, amino acids conserved in more than 10% of the sequences were also included. The animal consensus is based on Atchley et al. (1999). Shown at the bottom are the boundaries of the different regions of the bHLH domain.
Figure 2.
Yeast two-hybrid analysis of AtPAR1 protein interaction activities. A, Homodimerization activity of wild-type AtPAR1. B, Homodimerization activity of two mutated versions of AtPAR1, L1mut (Leu-27Glu) and L2mut (Leu-73Lys). SD-LT refers to the selective medium for transformed yeast cells, and SD-AHLT refers to the selective medium to perform the growth assay indicative of protein-protein interaction. Numbers refer to the combinations of BD and AD yeast constructs used in each section, as indicated in the right panels. All transformations within a section were done simultaneously. Cotransformations were repeated at least twice with identical results.
Figure 3.
Phylogenetic relationships, intron pattern, DNA-binding motifs, and architecture of conserved protein motifs in 32 plant bHLH subfamilies. A, ML tree of 638 plant bHLH proteins (for the full representation of the tree, see Supplemental Fig. S2). The tree has been rooted using the single representative from C. merolae. Subfamilies are represented collapsed as triangles (except for subfamilies 5, 12, and 24), with depth and width proportional to sequence divergence and size, respectively. Subfamilies supported by bootstrap values greater than 50 in NJ or MP analysis are colored black. Subfamilies 5, 12, and 24, highlighted with gray shading, were ambiguously retrieved in NJ, MP, and BA trees. Orphan genes are represented as single lines. The tree is drawn to scale, with branch lengths proportional to evolutionary distances between nodes. The scale bar indicates the estimated number of amino acid replacements per site. B, Summary of information on the 32 plant bHLH subfamilies. Predicted DNA-binding motifs are as follows: I, E non G; II, G binder; III, non E binder; IV, E-box; V, G-box; VI, non DNA binder. For intron pattern designations, see Figure 5. C, Architecture of conserved protein motifs. Motifs are graphically represented as white boxes drawn to scale for a representative plant bHLH protein of each subfamily. Motifs matching regions of the bHLH domain are colored gray.
Figure 4.
Evolution of bHLH gene family size in plants. Estimates of bHLH gene family size in the MRCA of examined plant species are represented at the corresponding nodes of a tree depicting their evolutionary relationships. Numbers correspond to minimum and maximum estimates. Branch lengths are proportional to evolutionary divergence time, according to previous estimates (Chaw et al., 2004; Yoon et al., 2004; Tuskan et al., 2006; Merchant et al., 2007; Rensing et al., 2008). The scale bar represents millions of years ago. The number of bHLH genes (subfamilies) identified in extant species is indicated for Arabidopsis (At), poplar (Pt), rice (Os), moss (Pp), four chlorophyte species (Ch), and C. merolae (Cm).
Figure 5.
Intron patterns within the bHLH domains. Alignment of bHLH domains representative of 11 intron patterns, named from a to l. The ? indicates nine additional gene-specific intron patterns. Locations of introns are indicated by triangles, and the number within the triangle corresponds to the intron phase. The number of bHLHs displaying each pattern in Arabidopsis (At), poplar (Pt), rice (Os), moss (Pp), and algae is given in the table to the right of the alignment.
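
The consensus rule stated in the Figure 1 caption (keep positions conserved in more than 50% of the sequences, and at those positions also list amino acids found in more than 10% of the sequences) can be sketched in Python. This is a minimal illustration, not the authors' pipeline: the function name `consensus`, the toy alignment, and the reading of "conserved in more than 50%" as "the most frequent residue occurs in more than 50% of all sequences" are assumptions made here.

```python
from collections import Counter


def consensus(alignment, pos_threshold=0.5, aa_threshold=0.1, gap="-"):
    """Return one set of residues per alignment column.

    A column counts as conserved when its most frequent residue occurs in
    more than `pos_threshold` of the sequences (assumed interpretation).
    For a conserved column, every residue present in more than
    `aa_threshold` of the sequences is reported; other columns give an
    empty set.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    n = len(alignment)
    result = []
    for col in range(length):
        # Count residues in this column, ignoring gap characters.
        counts = Counter(seq[col] for seq in alignment if seq[col] != gap)
        if counts and counts.most_common(1)[0][1] / n > pos_threshold:
            result.append({aa for aa, c in counts.items() if c / n > aa_threshold})
        else:
            result.append(set())
    return result


# Hypothetical four-sequence toy alignment, not real bHLH data.
toy_alignment = ["RERL", "RERI", "KEKL", "REAV"]
toy_consensus = consensus(toy_alignment)
```

In the toy alignment, the first column is conserved (R in three of four sequences) and also reports K, the second column reports only E, and the last two columns are left empty because no residue exceeds 50%.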
