Nucleic Acids Res. 2006;34(22):6505-20.
doi: 10.1093/nar/gkl888. Epub 2006 Nov 27.

The natural history of the WRKY-GCM1 zinc fingers and the relationship between transcription factors and transposons

M Madan Babu et al. Nucleic Acids Res. 2006.

Abstract

WRKY and GCM1 are metal-chelating DNA-binding domains (DBDs) which share a four-stranded fold. Using sensitive sequence searches, we show that this WRKY-GCM1 fold is also shared by the FLYWCH Zn-finger domain and the DBDs of two classes of Mutator-like element (MULE) transposases. We present evidence that they share a stabilizing core, which suggests a possible origin from a BED finger-like intermediate that was in turn ultimately derived from a C2H2 Zn-finger domain. Through a systematic study of the phyletic pattern, we show that this WRKY-GCM1 superfamily is a widespread eukaryote-specific group of transcription factors (TFs). We identified several new members across diverse eukaryotic lineages, including potential TFs in animals, fungi and Entamoeba. By integrating sequence, structure, gene expression and transcriptional network data, we present evidence that at least two major global regulators belonging to this superfamily in Saccharomyces cerevisiae (Rcs1p and Aft2p) have evolved from transposons, and attained the status of transcription regulatory hubs in the recent course of ascomycete yeast evolution. In plants, we show that the lineage-specific expansion of WRKY-GCM1 domain proteins acquired functional diversity mainly through expression divergence rather than by protein sequence divergence. We also use the WRKY-GCM1 superfamily as an example to illustrate the importance of transposons in the emergence of new TFs in different lineages.


Figures

Figure 1
Topology diagram and cartoon representation of Zn-chelating DBDs. (a) WRKY domain from the plant TF WRKY4 (Arabidopsis thaliana, PDB:1wj2), which is primarily expressed in the leaf, root and seed. (b) The DBD from Glial Cell Missing 1 (Mus musculus, PDB:1odh) protein. The Zn-ribbon module inserted between the two conserved cysteines in the WRKY–GCM1 domain (shown in red) facilitates the binding of another Zn atom which is coordinated through the four conserved cysteines in this module. (c) The set of conserved intramolecular interactions which stabilize the fold in the classical WRKY proteins and the GCM1 protein. Lines represent interactions between the amino acids; metal-chelating residues are shown in red; residue positions which participate in critical stabilizing interactions are shown in purple; + represents the position which contacts the backbone of the DNA; X represents the position which contacts the base. (d) The BED finger protein Zbed1x (Homo sapiens, PDB: 2CT5) and (e) the classical C2H2 Zn-finger domain from the RAG1 protein (PDB:1rmd). The first strand (and equivalent strands from the other structures) containing the WRKY motif in WRKY4 is shown in yellow. The two strands which house the conserved cysteine that participates in coordinating the Zn atom are shown in green. The last secondary structural element containing the pair of histidine residues is shown in blue.
Figure 2
Multiple sequence alignment of the WRKY domains. Proteins are denoted by their gene names, species abbreviations and GenBank identifier (gi) numbers. The secondary structure derived from the average solution structure of WRKY4 is shown above the alignment, where E represents a β-strand. Residues involved in contacting DNA in the structure of the WRKY domain in the GCM protein (PDB: 1odh) are shown below the alignment. Positions which contact the DNA are shown below the secondary structure profile. ‘b’ represents a position which contacts the DNA backbone and ‘&’ marks positions which contact the base. Conserved interactions which are critical for stabilizing the fold are shown at the bottom of the alignment. The coloring reflects the conservation profile at 80% consensus. (A) GCM-type WRKY–GCM1 domains from mammals and Drosophila. Note the large insertion between strand 2 and strand 3, which normally contains a copy of the evolutionarily mobile Zn-ribbon module (see Figure 1). (B) Representative members of the classical WRKY family seen in the TFs of plants, Dictyostelium and Giardia lamblia. Members in this family do not show any major insertion between the conserved cysteines and typically contain a WRKY motif in the first strand. (C) The HxC family of WRKY–GCM1 domains. (D) WRKY–GCM1 domain of the insert-containing type. (E) FLYWCH-type WRKY domains seen primarily in animals. The coloring scheme and consensus abbreviations are as follows: h, hydrophobic (h: ACFILMVWY) and a, aromatic (a: FWY) residues shaded yellow; b, big (LIYERFQKMW) residues shaded gray; s, small (AGSVCDN) residues colored green; and p, polar (STEDKRNQHC) residues colored magenta.
Species abbreviations are as follows: Afum: Aspergillus fumigatus; Agos: Ashbya gossypii; Amel: Apis mellifera; Anid: Aspergillus nidulans; Atha: Arabidopsis thaliana; Calb: Candida albicans; Cbri: Caenorhabditis briggsae; Cele: Caenorhabditis elegans; Cglo: Chaetomium globosum; Cimm: Coccidioides immitis; Cneo: Cryptococcus neoformans; Cint: Ciona intestinalis; Ddis: Dictyostelium discoideum; Dmel: Drosophila melanogaster; Ecun: Encephalitozoon cuniculi; Ehis: Entamoeba histolytica; Foxy: Fusarium oxysporum; Ggal: Gallus gallus; Glam: Giardia lamblia; Gzea: Gibberella zeae; Hsap: Homo sapiens; Klac: Kluyveromyces lactis; Mgri: Magnaporthe grisea; Mmus: Mus musculus; Ncra: Neurospora crassa; Scer: Saccharomyces cerevisiae; Sjap: Schistosoma japonicum; Spur: Strongylocentrotus purpuratus; Tcas: Tribolium castaneum; Umay: Ustilago maydis; Ylip: Yarrowia lipolytica.
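The 80% consensus scheme described in the Figure 2 caption can be illustrated with a minimal Python sketch. The residue classes are taken from the caption; everything else is an assumption made for illustration, not the authors' procedure: the function names `column_consensus` and `alignment_consensus`, the rule that an identical residue is reported before any class, the order in which classes are tried (smallest class first), the treatment of gaps as non-matching, and the toy alignment.

```python
# Residue classes as listed in the Figure 2 caption.
# The ordering (smallest class first) is an assumption of this sketch.
RESIDUE_CLASSES = [
    ("a", set("FWY")),         # aromatic
    ("s", set("AGSVCDN")),     # small
    ("h", set("ACFILMVWY")),   # hydrophobic
    ("p", set("STEDKRNQHC")),  # polar
    ("b", set("LIYERFQKMW")),  # big
]


def column_consensus(column, threshold=0.8):
    """Return a consensus symbol for one alignment column.

    A single residue present in at least `threshold` of the rows is
    reported as itself; otherwise the first residue class reaching the
    threshold is reported; otherwise '.' is returned. Gap characters
    count as non-matching rows (an assumption).
    """
    residues = [r.upper() for r in column]
    total = len(residues)
    if total == 0:
        return "."
    # Check for a single conserved residue first.
    for residue in sorted(set(residues)):
        if residue.isalpha() and residues.count(residue) / total >= threshold:
            return residue
    # Fall back to residue classes, most specific first.
    for symbol, members in RESIDUE_CLASSES:
        matches = sum(1 for r in residues if r in members)
        if matches / total >= threshold:
            return symbol
    return "."


def alignment_consensus(rows, threshold=0.8):
    """Build a consensus string for equal-length aligned sequences."""
    return "".join(column_consensus(col, threshold) for col in zip(*rows))


if __name__ == "__main__":
    # Hypothetical toy alignment, not taken from the paper.
    toy = ["WRKYG", "WRKYG", "WKKYG", "WRKFA", "WRKWS"]
    print(alignment_consensus(toy))
```

In the toy alignment, the fourth column mixes Y, F and W, so no single residue reaches 80% but the aromatic class does; the fifth column is likewise summarized by the small class.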
Figure 3
Domain architectures of proteins that contain the WRKY–GCM1 domain in the different lineages. A representative member for each distinct architectural class is denoted by its gene name, species abbreviation and GenBank identifier (gi) number. Species abbreviations are as follows. Afum: Aspergillus fumigatus; Anid: Aspergillus nidulans; Atha: Arabidopsis thaliana; Cele: Caenorhabditis elegans; Cglo: Chaetomium globosum; Cint: Ciona intestinalis; Ddis: Dictyostelium discoideum; Dmel: Drosophila melanogaster; Ecun: Encephalitozoon cuniculi; Ehis: Entamoeba histolytica; Glam: Giardia lamblia; Hsap: Homo sapiens; Mtru: Medicago truncatula; Ntab: Nicotiana tabacum; Osat: Oryza sativa; Scer: Saccharomyces cerevisiae; Spur: Strongylocentrotus purpuratus; Umay: Ustilago maydis; Ylip: Yarrowia lipolytica. Asterisk denotes distinct domain architectures seen within a single species in the lineage.
Figure 4
(a) Evolutionary relationship between the members of the insert-containing WRKY–GCM1 domain family. Red circle represents known and potential TFs. Gray circle represents members which are transposases. A blue circle in the internal nodes of the tree indicates strong bootstrap support (>80%). (b) Relationship between the Rcs1p and Aft2p homologs in the different fungal genomes. The tree has been rooted with the Ustilago maydis protein as the out-group. Arrowheads denote points where a potential gene duplication event occurred. A blue circle in the internal nodes of the tree indicates strong bootstrap support (>80%). (c) A section of the transcriptional regulatory network showing the target genes for the WRKY domain-containing yeast TFs, Rcs1p (YGL071W) and Aft2p (YPL202C). The TFs are shown as green circles, and the target genes are shown as yellow circles. A line represents a direct transcriptional regulatory interaction between the TF and the target gene.
Figure 5
Gene expression profile for the WRKY domain-containing proteins from Arabidopsis thaliana across (a) different developmental stages and organs. The samples are ordered according to the organs (root, leaf, apex, flowers, floral organs and seeds; asterisk denotes pollen within the floral organs category), and progressively from embryogenesis to senescence. The WRKY domain-containing genes, shown on the right as rows, have been ordered according to their evolutionary relationship as inferred from the similarity between their sequences. The neighbor-joining tree was obtained using the distances calculated according to the JTT distance matrix in the MEGA package. Boxes denote similarly expressed tissue-specific clusters of organs and genes. (b) Different light exposures with two time points for each condition. The samples are ordered according to the dark/light source (continuous darkness, continuous blue light, continuous far-red light, continuous red light, continuous white light, pulse of red light, pulse of UV-A light and pulse of UV-A/B light), one after 45 min and another after 4 h of exposure. The WRKY domain-containing genes are ordered in the same way. Boxes denote clusters of genes which show high expression for a given light condition or in darkness. The gene expression data were obtained from Schmid et al. (29), and the expression matrix was generated using matrix2png.
