Nucleic Acids Res. 2003 Aug 1;31(15):4401-9.
doi: 10.1093/nar/gkg642.

Structural diversification and neo-functionalization during floral MADS-box gene evolution by C-terminal frameshift mutations

Michiel Vandenbussche et al. Nucleic Acids Res. 2003.

Abstract

Frameshift mutations generally result in loss-of-function changes since they drastically alter the protein sequence downstream of the frameshift site, besides creating premature stop codons. Here we present data suggesting that frameshift mutations in the C-terminal domain of specific ancestral MADS-box genes may have contributed to the structural and functional divergence of the MADS-box gene family. We have identified putative frameshift mutations in the conserved C-terminal motifs of the B-function DEF/AP3 subfamily, the A-function SQUA/AP1 subfamily and the E-function AGL2 subfamily, which are all involved in the specification of organ identity during flower development. The newly evolved C-terminal motifs are highly conserved, suggesting a de novo generation of functionality. Interestingly, since the new C-terminal motifs in the A- and B-function subfamilies are only found in higher eudicotyledonous flowering plants, the emergence of these two C-terminal changes coincides with the origin of a highly standardized floral structure. We speculate that the frameshift mutations described here are examples of co-evolution of the different components of a single transcription factor complex. 3' terminal frameshift mutations might provide an important but so far unrecognized mechanism to generate novel functional C-terminal motifs instrumental to the functional diversification of transcription factor families.

Figures

Figure 1
Alignment of paleoAP3 and euAP3 C-terminal motifs present within the DEF/AP3 subfamily. Although protein sequences belonging to the DEF/AP3 subfamily display extensive homology almost along their entire length (not shown), two lineages can be distinguished on the basis of their completely different C-terminal motifs (columns indicated with paleoAP3 and euAP3 motifs). In contrast, the cDNA fragments encoding the conserved motifs align very well (right column) upon the introduction of a gap of eight base pairs in the coding sequences of paleoAP3 lineage members. The euAP3 motif, which is uniquely present in DEF/AP3 subfamily members isolated from higher eudicots, may thus have originated by a frameshift mutation caused by the eight base pair insertion (indicated by a double headed arrow) into a paleoAP3 ancestral gene. This is illustrated by the second reading frame translation of paleoAP3 members (indicated with 2nd reading frame), which resembles the euAP3 motif. For details on the 3rd reading frame of the euAP3 motif, we refer to the text. A full set of analyzed sequences is presented in the Supplementary Material.
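The link between the eight base pair insertion and the "2nd reading frame" can be checked with simple modular arithmetic. The sketch below is an illustration only: the function name `ancestral_frame_offset` and the zero-based offset convention are assumptions made here, not notation from the paper. It assumes the gene is read from offset 0 and returns the offset at which the ancestral sequence would have to be read to reproduce the codons downstream of an indel.

```python
def ancestral_frame_offset(indel_len: int) -> int:
    """Zero-based frame offset (0, 1 or 2) in the ANCESTRAL sequence that
    matches how the mutated gene reads the bases downstream of an indel.

    indel_len > 0 means an insertion of that many bases,
    indel_len < 0 means a deletion.
    Assumption: the gene's original reading frame starts at offset 0.
    """
    # After inserting n bases, ancestral base j sits at position j + n,
    # so codons start where (j + n) % 3 == 0, i.e. j % 3 == (-n) % 3.
    return (-indel_len) % 3


for n in (1, 3, 8, -1):
    print(f"indel of {n:+d} bp -> ancestral frame offset {ancestral_frame_offset(n)}")
```

Under these assumptions, an insertion of 8 bp gives offset 1, which would correspond to a second reading frame if frames are counted from one, while an in-frame 3 bp insertion gives offset 0 and leaves the downstream protein sequence unchanged.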
Figure 2
Neighbor-joining tree of the MIKC type MADS-box gene family. The Neighbor-joining tree has been constructed using the MIK domains of a representative subset of 97 sequences from the total collection of available plant MIKC type MADS-box sequences (see Supplementary Material). These 97 sequences have been selected as follows: subclasses within subfamilies were determined based on the presence of deviating but conserved C-terminal motifs. For each subclass, one to three representative sequences from each major plant group (when available) were selected. The tree was rooted with two MIKC type MADS-box genes from the moss Physcomitrella patens and the clubmoss Lycopodium annotinum. To assess support for the inferred relationships, 1000 bootstrap samples were generated. In a final step, we mapped C-terminal conserved epitopes on the tree. Local bootstrap probabilities are indicated for branches supported with more than 60%. Asterisks behind protein motifs represent stop codons. Motifs not terminating with an asterisk are followed by a variable number of non-conserved residues (not shown). A two-letter code preceding the gene names as found in the database indicates the species involved. For each subfamily, the total number of analyzed sequences and different species is indicated in parentheses (no. sequences/no. species). Species names and taxa are indicated as follows.
Angiosperms, higher eudicots (open circles with inner filled circles): Am: Antirrhinum majus; At: Arabidopsis thaliana; Hm: Hydrangea macrophylla; Le: Lycopersicon esculentum; Md: Malus domestica; Ph: Petunia hybrida.
Angiosperms, basal eudicots (open circles): De: Dicentra eximia; Pn: Papaver nudicaule; Rf: Ranunculus ficaria; Sc: Sanguinaria canadensis.
Angiosperms, monocotyledons (filled circles): Hv: Hordeum vulgare; Lr: Lilium regale; Lt: Lolium temulentum; Os: Oryza sativa; Ta: Triticum aestivum; Zm: Zea mays.
Angiosperms, others: Mp: Magnolia praecocissima (Magnoliales) (open squares); Cf: Calycanthus floridus (Laurales) (open square with inner filled square).
Gymnosperms (filled triangles): Pa: Picea abies (Coniferales); Pr: Pinus radiata (Coniferales); Gg: Gnetum gnemon (Gnetales); Ce: Cycas edentata (Cycadales).
Outgroup: La: Lycopodium annotinum (Lycopodiophyta) (filled star); Pp: Physcomitrella patens (Bryophyta) (plus sign).
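The bootstrap step mentioned in this caption can be illustrated with a minimal sketch of column resampling. Everything here is hypothetical: the toy alignment, the sequence names and the function `bootstrap_alignment` are invented for illustration, and the sketch does not reproduce the software or settings used for the published tree. Only the resampling is shown; in a full analysis a tree would be built from each replicate and branch support counted across replicates.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate: alignment columns drawn with
    replacement, keeping the same number of columns as the input."""
    n_cols = len(next(iter(alignment.values())))
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}


# Toy protein alignment (hypothetical sequences of equal length).
toy = {"geneA": "MGRGKVEL", "geneB": "MGRGKIEI", "geneC": "MARGKVQL"}

rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_alignment(toy, rng) for _ in range(1000)]
print(len(replicates), "replicates of", len(replicates[0]["geneA"]), "columns each")
```

Resampling whole columns keeps the residues of all sequences at a site together, which is why each replicate is still a valid alignment.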
Figure 3
Alignment of paleoAP1 and euAP1 C-terminal motifs present within the SQUA/AP1 subfamily. Within the SQUA/AP1 subfamily, two distinct lineages (euAP1 and paleoAP1 lineages) can be distinguished, each displaying highly conserved but completely different C-terminal motifs (columns indicated with paleoAP1 and euAP1 motifs). Representatives of both lineages have been isolated from a number of higher eudicot species, while magnoliid dicot and monocot species appear to yield only the paleoAP1 type. Although these two types of C-terminal motifs are totally unrelated at the protein level, the cDNA fragments encoding these conserved motifs align surprisingly well (right column). This suggests that the euAP1 motif may have originated by a frameshift mutation in a paleoAP1 ancestral gene at a position upstream of the paleoAP1 motif. To illustrate this, we have shown frameshift translations of paleoAP1 members (column indicated with 2nd reading frame) and of euAP1 members (column indicated with 3rd reading frame), which resemble the euAP1 motif and the ancestral paleoAP1 motif, respectively. A full set of analyzed sequences is presented in the Supplementary Material.
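The reciprocal relationship described in this caption, where a shifted-frame translation of one lineage resembles the motif of the other, can be reproduced on a toy example. The sequences below are invented for illustration and are not taken from any MADS-box gene; the helper `translate` and the zero-based `frame` argument are assumptions of this sketch. The codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, frame=0):
    """Translate a DNA string starting at a zero-based frame offset.
    Stop codons are shown as '*'; trailing incomplete codons are dropped."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


ancestral = "ATGGCCAAAGGGTTTTAA"               # toy "ancestral" coding sequence
derived = ancestral[:6] + "C" + ancestral[6:]  # same sequence with a 1 bp insertion

print(translate(ancestral))      # MAKGF*   ancestral C-terminus
print(translate(derived))        # MAQRVL   new C-terminus after the frameshift
print(translate(ancestral, 2))   # GQRVL    shifted ancestral frame resembles the new motif
print(translate(derived, 1))     # WPKGF*   shifted derived frame resembles the ancestral motif
```

In this toy case the two protein C-termini are unrelated, while the underlying nucleotide sequences differ by a single base, which mirrors the observation that the cDNA fragments align well even though the protein motifs do not.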
Figure 4
Alignment of C-terminal motifs of monocot OsMADS1 and ZMM7 type AGL2 like subfamily members. In monocot species, we have identified three distinct types of AGL2 like subfamily members, each displaying different C-terminal motifs (Fig. 2). Here we show part of the C-terminal domain alignment of OsMADS1 and ZMM7 type sequences. Both types have an internal motif in common (indicated with common internal AGL2 motif), while their C-termini have fully diverged at the protein level. Sequences belonging to the OsMADS1 type can be further subdivided into two classes: a short version terminating with a ZMM3 motif, and a longer version with a C-terminal extension terminating with a short conserved OsMADS1 motif. As in the previous cases, we found that the cDNA fragments encoding the ZMM3 motif of the OsMADS1 type align with those encoding the ZMM7 motif by introducing a gap representing a frameshift mutation in the OsMADS1 type sequences. The alignment of the cDNA fragments encoding these ZMM3 and ZMM7 motifs is shown on the right and the 2nd reading frame translation of the ZMM3 motif is shown below the ZMM7 motif.
Figure 5
Model for the generation of novel C-terminal motifs within the MADS-box gene family. After duplication of an ancestral gene X, the Y copy accumulates mutations in the C-terminal domain, while retaining the essential MIK domain. Insertions or deletions will cause a frameshift in the coding sequence. Rarely, these frameshift mutations may yield novel functional motifs that consequently will be conserved. In cases where the novel motif is recruited from poorly conserved regions (e.g. Y 2–4) in the ancestral sequence, the sequence relation with the ancestral gene X will become unclear after a period of independent evolution. In the Y copy, new motifs may be added downstream of the ancestral motif as an extra feature, with retention of the ancestral motif which in this case becomes internal (e.g. Y3); or with subsequent loss of the ancestral motif (all other cases).