Comparative Study
BMC Plant Biol. 2012 Jul 9;12:106.
doi: 10.1186/1471-2229-12-106.

Genome-wide analysis of the MYB transcription factor superfamily in soybean

Hai Du et al. BMC Plant Biol.

Abstract

Background: The MYB superfamily constitutes one of the most abundant groups of transcription factors described in plants. Nevertheless, their functions appear to be highly diverse and remain rather unclear. To date, no genome-wide characterization of this gene family has been conducted in a legume species. Here we report the first genome-wide analysis of the whole MYB superfamily in a legume species, soybean (Glycine max), including the gene structures, phylogeny, chromosome locations, conserved motifs, and expression patterns, as well as a comparative genomic analysis with Arabidopsis.

Results: A total of 244 R2R3-MYB genes were identified and classified into 48 subfamilies based on a comparative phylogenetic analysis with their putative orthologs, which revealed both gene loss and duplication events. The phylogenetic analysis showed that most characterized MYB genes with similar functions cluster in the same subfamily; together with the identification of orthologs by synteny analysis, this strongly indicates functional conservation among subgroups of MYB genes. The phylogenetic relationships of each subgroup of MYB genes were well supported by the highly conserved intron/exon structures and by motifs outside the MYB domain. Analysis of the ratio of nonsynonymous to synonymous nucleotide substitution rates (dN/dS) showed that the soybean MYB DNA-binding domain is under strong negative selection. The chromosomal distribution pattern strongly indicated that genome-wide segmental and tandem duplications contributed to the expansion of soybean MYB genes. In addition, we found that ~4% of soybean R2R3-MYB genes had undergone alternative splicing events, producing multiple transcripts from a single gene, which illustrates the high complexity of transcriptome regulation. Comparative expression profile analysis of R2R3-MYB genes in soybean and Arabidopsis revealed that MYB genes play both conserved and varied roles in plants, the latter indicating functional divergence.

Conclusions: In this study we identified the largest MYB gene family known in plants to date. Our findings indicate that members of this large gene family may be involved in different plant biological processes, some of which may relate to legume-specific nodulation. Our comparative genomics analysis provides a solid foundation for future functional dissection of this gene family.

Figures

Figure 1
R2 and R3 MYB repeats are highly conserved across all R2R3-MYB proteins in the soybean genome. The sequence logos of the R2 (a) and R3 (b) MYB repeats are based on full-length alignments of all soybean R2R3-MYB domains. Multiple alignment analysis of 244 typical soybean R2R3-MYB domains was performed with ClustalW (for the full alignment, see Additional file 2). The bit score indicates the information content for each position in the sequence. Asterisks indicate the conserved tryptophan residues (Trp) in the MYB domain.
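As a rough illustration of the bit score mentioned in the caption, the sketch below computes the information content of each alignment column as log2(20) minus the Shannon entropy of that column. This is a simplified assumption about how such scores are usually defined: it ignores gaps, omits the small-sample correction that logo tools commonly apply, and uses a toy alignment rather than the soybean data. The function names are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = 20  # alphabet size assumed for protein sequences


def column_information(column):
    """Information content (bits) of one alignment column, ignoring gaps."""
    residues = [r for r in column if r != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    # Shannon entropy of the observed residue frequencies
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    # Maximum possible entropy minus observed entropy
    return math.log2(AMINO_ACIDS) - entropy


def alignment_information(sequences):
    """Per-column information content for equal-length aligned sequences."""
    return [column_information(col) for col in zip(*sequences)]


# Toy alignment (made up for illustration): the first column is fully
# conserved, as the tryptophan positions are in the MYB repeats.
toy_alignment = ["WKA", "WRA", "WKG", "WRS"]
bits = alignment_information(toy_alignment)
```

A fully conserved column reaches the maximum of log2(20), about 4.32 bits, while more variable columns score lower.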
Figure 2
Phylogenetic relationships and subgroup designations in MYB proteins from soybean (Gm), Arabidopsis (At) and other plants. (a) Neighbor-joining tree representing relationships among 252 MYB proteins from soybean and 132 MYB proteins from Arabidopsis, including five 3R-MYB proteins from Arabidopsis and six 3R-MYB proteins from soybean. The proteins are clustered into 47 subgroups, which are designated with a subgroup number (e.g., C1) and marked with different alternating tones of a gray background to facilitate subfamily identification with high predictive value. The numbers beside the branches represent bootstrap support values (>50%) from 1000 replications. Sixteen proteins did not fit well into clusters. Colored circles indicate the corresponding intron distribution patterns, as shown in Figure 3. (b) Structure of MYB genes in soybean and Arabidopsis. Exon(s) are indicated by green boxes, MYB domain(s) by red boxes, untranslated region(s) by blue boxes, and spaces between the colored boxes correspond to introns. The sizes of exons and introns can be estimated using the horizontal scale bar. (c) Architecture of conserved protein motifs in 47 subfamilies. The motifs on the right were detected using MEME and are graphically represented as white boxes drawn to scale for a representative plant MYB protein of each subfamily. (d) Expression patterns of MYB genes in soybean and Arabidopsis in different organs. R, root; L, leaf; F, flower; S, seed; N, legume-specific nodulation. In this expression pattern analysis, the highest values among the expression values of the four organs published in the AtGenExpress and SoySeq databases were selected.
Figure 3
Schematic diagram of intron distribution patterns within the MYB domains of soybean MYB (GmMYB) proteins. Alignment of MYB domains is representative of 14 intron patterns, designated a to n. Locations of introns are indicated by white triangles, and the number within each triangle indicates the splicing phases of the MYB domain sequences: 0 refers to phase 0; 1 to phase 1; and 2 to phase 2. The number of GmMYB proteins with each pattern is presented on the right. The correlation of intron distribution patterns and phylogenetic subfamilies is provided in Figure 2 and Additional file 3.
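The splicing phases in the caption follow the usual convention: an intron lying between two codons is phase 0, one after the first nucleotide of a codon is phase 1, and one after the second nucleotide is phase 2. A minimal sketch of that convention follows. It assumes the coding-sequence length of each exon is known; the function name and the example lengths are hypothetical and not taken from the soybean gene models.

```python
from itertools import accumulate


def intron_phases(exon_cds_lengths):
    """Phase of each intron, given the coding length (nt) of each exon in order.

    The phase is the number of coding nucleotides upstream of the intron,
    modulo 3. A gene with n coding exons has n - 1 introns.
    """
    if any(length <= 0 for length in exon_cds_lengths):
        raise ValueError("exon coding lengths must be positive")
    # Cumulative coding length at the end of every exon except the last
    upstream = list(accumulate(exon_cds_lengths))[:-1]
    return [n % 3 for n in upstream]


# Hypothetical three-exon gene: introns after 133 and 263 coding nucleotides
phases = intron_phases([133, 130, 500])
```

Here the first intron interrupts a codon after its first nucleotide (phase 1) and the second after its second nucleotide (phase 2).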
Figure 4
Chromosomal locations, region duplication, and predicted cluster for soybean MYB genes. Chromosomal positions of the MYB genes in soybean are mapped on the basis of JGI soybean Genome version 7.0. The chromosome number is indicated above each chromosome. The scale is in megabases (Mb). The number below the chromosome name indicates the length. The phylogenetic category of each gene (Figure 2) is indicated by the subgroup number. Each pair of duplicated MYB genes is connected with a red line. Connecting lines mark the specific cases in which there is a strong correlation between duplicated genomic regions and the presence of MYB genes with closely related predicted amino acid sequences. Colored boxes indicate groups of gene clusters with paralogous and syntenic genes on the chromosomes. Yellow bars on the chromosomes and blue numbers beside the bars indicate the 24 predicted duplication regions. The green and white bars on the chromosomes indicate the centromeres and pericentromeres, respectively.
