Plant Cell. 2003 Jul;15(7):1538-51.
doi: 10.1105/tpc.011544.

Molecular and phylogenetic analyses of the complete MADS-box transcription factor family in Arabidopsis: new openings to the MADS world

Lucie Parenicová et al. Plant Cell. 2003 Jul.

Abstract

MADS-box transcription factors are key regulators of several plant development processes. Analysis of the complete Arabidopsis genome sequence revealed 107 genes encoding MADS-box proteins, of which 84% are of unknown function. Here, we provide a complete overview of this family, describing the gene structure, gene expression, genome localization, protein motif organization, and phylogenetic relationship of each member. We have divided this transcription factor family into five groups (named MIKC, Malpha, Mbeta, Mgamma, and Mdelta) based on the phylogenetic relationships of the conserved MADS-box domain. This study provides a solid base for functional genomics studies into this important family of plant regulatory genes, including the poorly characterized group of M-type MADS-box proteins. MADS-box genes also constitute an excellent system with which to study the evolution of complex gene families in higher plants.
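As a quick check of the figures in the abstract, the 84% share can be converted back into a gene count. The sketch below is an illustration only: it assumes the published percentage was rounded to the nearest whole number, and the variable names are not taken from the paper.

```python
# Which whole-number gene counts are consistent with "84% of 107 genes"?
# Assumption: the reported percentage was rounded to the nearest integer.
TOTAL_GENES = 107
REPORTED_PERCENT = 84

# Keep every count n whose share of the total rounds to the reported value.
consistent_counts = [
    n for n in range(TOTAL_GENES + 1)
    if round(100 * n / TOTAL_GENES) == REPORTED_PERCENT
]

unknown = consistent_counts[0]
characterized = TOTAL_GENES - unknown
print(consistent_counts, characterized)
```

Under that rounding assumption, only a count of 90 genes of unknown function matches the abstract, which would leave 17 genes with some functional characterization at the time.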

Figures

Figure 1.
Phylogenetic Analysis of MADS-Box–Containing Proteins of Arabidopsis. Scheme depicting the relationships of major clades of MADS domain–containing proteins in Arabidopsis. The topology was estimated by analysis of the MADS-box (58 unambiguously aligned residues) using the program MrBayes (Huelsenbeck, 2000). Bootstrap proportions were derived from quartet-puzzling bootstrap analysis of 39 representative sequences drawn from across the clades recovered by the initial Bayesian analysis (see Methods).
Figure 2.
An Analytical View of the Mα Group of the Arabidopsis MADS-Box Gene Family. The following parts are shown from left to right. Protein maximum likelihood tree: The tree was constructed in the Bayesian framework using MrBayes software (Huelsenbeck, 2000) under the JTT substitution model with a γ distribution to accommodate differences of substitution rates between sites. Bootstrap proportions were calculated using programs within the PHYLIP package (Felsenstein, 1989). Expression pattern: The gene expression has been determined by RT-PCR using pairs of gene-specific primers. A positive signal is indicated by a colored box for the following tissues: brown for roots (R), green for rosette leaves (L), yellow for inflorescences (I), and red for siliques (S). The white box indicates that no expression could be detected. Gene structure: The gene structure is presented by blue exon(s) and spaces between the blue boxes correspond to introns. The sizes of exons and introns can be estimated using the vertical lines. Protein structure: The search for the common motifs shared among the MADS-box proteins of each group was done with MEME (see Methods). The output of the analysis is schematically represented here. Each colored box represents a new motif. A white box present in an otherwise continuing sequence of colored boxes means a deletion of an amino acid sequence at the specific position. Black bars represent an amino acid sequence not showing any significant homology to other amino acid sequences within the group of proteins. The length of the motif can be estimated using the scale at top. aa, amino acids.
Figure 3.
An Analytical View of the Mβ Group of the Arabidopsis MADS-Box Gene Family. The following parts are shown from left to right. Protein maximum likelihood tree: The tree was constructed as described in Figure 2. Expression pattern: The gene expression has been determined by RT-PCR using pairs of gene-specific primers. A positive signal is indicated by a colored box for the following tissues: brown for roots (R), green for rosette leaves (L), yellow for inflorescences (I), and red for siliques (S). The white box indicates that no expression could be detected. Gene structure: The gene structure is presented by blue exon(s) and spaces between the blue boxes correspond to introns. The sizes of exons and introns can be estimated using the vertical lines. Protein structure: Each colored box represents a new motif. A white box present in an otherwise continuing sequence of colored boxes means a deletion of an amino acid sequence at the specific position. Black bars represent an amino acid sequence not showing any significant homology to other amino acid sequences within the group of proteins. The length of the motif can be estimated using the scale at top. aa, amino acids.
Figure 4.
An Analytical View of the Mγ Group of the Arabidopsis MADS-Box Gene Family. The following parts are shown from left to right. Protein maximum likelihood tree: The tree was constructed as described in Figure 2. Expression pattern: The gene expression has been determined by RT-PCR using pairs of gene-specific primers. A positive signal is indicated by a colored box for the following tissues: brown for roots (R), green for rosette leaves (L), yellow for inflorescences (I), and red for siliques (S). The white box indicates that no expression could be detected. Gene structure: The gene structure is presented by blue exon(s) and spaces between the blue boxes correspond to introns. The sizes of exons and introns can be estimated using the vertical lines. Protein structure: Each colored box represents a new motif. A white box present in an otherwise continuing sequence of colored boxes means a deletion of an amino acid sequence at the specific position. Black bars represent an amino acid sequence not showing any significant homology to other amino acid sequences within the group of proteins. The length of the motif can be estimated using the scale at top. aa, amino acids.
Figure 5.
An Analytical View of the Mδ Group of the Arabidopsis MADS-Box Gene Family. The following parts are shown from left to right. Protein maximum likelihood tree: The tree was constructed as described in Figure 2. Expression pattern: The gene expression has been determined by RT-PCR using pairs of gene-specific primers. A positive signal is indicated by a colored box for the following tissues: brown for roots (R), green for rosette leaves (L), yellow for inflorescences (I), and red for siliques (S). The white box indicates that no expression could be detected. Gene structure: The gene structure is presented by blue exon(s) and spaces between the blue boxes correspond to introns. The sizes of exons and introns can be estimated using the vertical lines. Protein structure: Each colored box represents a new motif. A white box present in an otherwise continuing sequence of colored boxes means a deletion of an amino acid sequence at the specific position. Black bars represent an amino acid sequence not showing any significant homology to other amino acid sequences within the group of proteins. The length of the motif can be estimated using the scale at top. The AGL33 gene and protein structure is depicted together with this group because it shows the highest amino acid sequence similarity of the MADS-box with the Mδ group. aa, amino acids.
Figure 6.
An Analytical View of the MIKC Group of the Arabidopsis MADS-Box Gene Family. The following parts are shown from left to right. Protein maximum likelihood tree: The tree was constructed as described in Figure 2. Expression pattern: The gene expression has been determined by RT-PCR using pairs of gene-specific primers. A positive signal is indicated by a colored box for the following tissues: brown for roots (R), green for rosette leaves (L), yellow for inflorescences (I), and red for siliques (S). The white box indicates that no expression could be detected. The detection method for the expression of genes published previously (#) is marked as follows: §, in situ hybridization; ϕ, RNA gel blot analysis; ξ, RT-PCR (see supplemental data online). Gene structure: The gene structure is presented by blue exon(s) and spaces between the blue boxes correspond to introns. The sizes of exons and introns can be estimated using the vertical lines. Protein structure: Each colored box represents a new motif. A white box present in an otherwise continuing sequence of colored boxes means a deletion of an amino acid sequence at the specific position. Black bars represent an amino acid sequence not showing any significant homology to other amino acid sequences within the group of proteins. The length of the motif can be estimated using the scale at top. aa, amino acids.
Figure 7.
Relationships between Arabidopsis and Rice MADS-Box Proteins. Phylogenetic analysis of 58 conserved amino acid residues from the MADS-box domain in representative sequences from Arabidopsis and rice. The tree was constructed using the program MrBayes (Huelsenbeck, 2000), and the support values shown are Bayesian posterior probabilities. Branches with <50% support have been collapsed to give polytomies. Rice proteins are indicated in blue. Detailed information about the rice sequences used in this analysis is given in the supplemental data online.
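The collapsing of weakly supported branches described for Figure 7 can be sketched as follows. This is a minimal illustration, not the procedure the authors used: the `Node` class, the `collapse_low_support` and `newick` helpers, the support values, and the leaf names are all hypothetical placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A tree node; `support` is the support for the branch leading to it."""
    name: str = ""
    support: Optional[float] = None  # None for leaves and for the root
    children: List["Node"] = field(default_factory=list)


def collapse_low_support(node: Node, threshold: float = 0.5) -> Node:
    """Return a copy of the tree in which every internal branch whose
    support is below `threshold` is removed, so that its children attach
    directly to the parent and form a polytomy."""
    new_children: List[Node] = []
    for child in node.children:
        collapsed = collapse_low_support(child, threshold)
        weak = (
            bool(collapsed.children)
            and collapsed.support is not None
            and collapsed.support < threshold
        )
        if weak:
            # Drop the weak branch and promote its children.
            new_children.extend(collapsed.children)
        else:
            new_children.append(collapsed)
    return Node(node.name, node.support, new_children)


def newick(node: Node) -> str:
    """Render the topology only (no branch lengths or support values)."""
    if not node.children:
        return node.name
    return "(" + ",".join(newick(c) for c in node.children) + ")"


# Hypothetical example: one clade at 42% support, one at 97%.
tree = Node(children=[
    Node(support=0.42, children=[Node("seqA"), Node("seqB")]),
    Node(support=0.97, children=[Node("seqC"), Node("seqD")]),
])
print(newick(tree))
print(newick(collapse_low_support(tree)))
```

In this toy tree the 42% branch is removed, so `seqA` and `seqB` join the root in a polytomy while the well-supported `seqC`/`seqD` clade is kept.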
Figure 8.
RNA Gel Blot Analysis of New MADS-Box Gene Expression. Gene expression was analyzed using 5 μg of total RNA isolated from wild-type Arabidopsis plants grown under long-day conditions. Expression was detected for two genes: AGL30 (A) and AGL80 (B). As a loading control, the blots were reprobed with ACTIN fragments from Arabidopsis (bottom gels) (see Methods). Arrows indicate the sizes of the bands of the hybridizing ladder. M, 1-kb DNA ladder (Invitrogen); L, rosette leaf; I, inflorescence; S, silique; 0, silique at 0 days after pollination.
Figure 9.
AGL80 and AGL104 Expression Analyzed by in Situ Hybridization. (A) Arabidopsis stage-9 flower. AGL80 expression is detected in early postmeiotic microspores. (B) Arabidopsis stage-12 to -13 flower. AGL80 expression is observed in the transmitting tract and the nucellus. (C) Section hybridized with AGL104 antisense RNA. In an early stage-3 flower, AGL104 expression is detected inside of whorl 1, whereas in older flowers, it is detected in young developing anthers, in petals, and within carpels (septum and developing ovules). Bars = 50 μm.
Figure 10.
Distribution of the Members of the MADS-Box Gene Family in the Arabidopsis Genome. MADS-box genes are plotted according to their sequence positions along the five chromosomes. Genes located in close proximity to one another cannot be plotted individually and are listed according to their relative positions. Genes from the five groups are represented by different colors (MIKC, blue; Mα, pink; Mβ, green; Mγ, red; and Mδ, orange). Genes present in chromosome segments affected by large duplications are boxed together. Each box corresponds to a single chromosome fragment. Asterisks indicate that there is a related gene in the duplicated segment. Tandemly repeated genes (closely related genes that flank each other directly) are joined by thick black lines. Closely related genes separated by a maximum of three unrelated genes are joined by thick blue lines.
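The two linkage categories in the Figure 10 legend can be expressed as a simple rule on gene order. The sketch below is an assumption-laden illustration: it takes as given that the two genes are already known to be closely related and that every gene between them is unrelated, and the function name, index convention, and "dispersed" label are hypothetical rather than from the paper.

```python
def classify_related_pair(index_a: int, index_b: int, max_gap: int = 3) -> str:
    """Classify two closely related genes on the same chromosome.

    index_a, index_b: positions of the genes in the ordered list of all
    genes on that chromosome (hypothetical 0-based ranks, not base pairs).
    max_gap: largest number of intervening unrelated genes still counted
    as "nearby" (three in the figure legend).
    """
    if index_a == index_b:
        raise ValueError("the two genes must be distinct")
    intervening = abs(index_a - index_b) - 1
    if intervening == 0:
        return "tandem"      # flank each other directly (black line)
    if intervening <= max_gap:
        return "nearby"      # up to three genes in between (blue line)
    return "dispersed"       # farther apart; not joined in the figure


print(classify_related_pair(4, 5))
print(classify_related_pair(2, 6))
print(classify_related_pair(1, 6))
```

Ranks along the chromosome are used rather than physical distance because the legend defines both categories by the number of intervening genes.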

