Plant Physiol. 2004 Apr;134(4):1718-32.
doi: 10.1104/pp.103.037788.

The GATA family of transcription factors in Arabidopsis and rice

José C Reyes et al. Plant Physiol. 2004 Apr.

Abstract

GATA transcription factors are a group of DNA-binding proteins broadly distributed in eukaryotes. The DNA-binding domain of GATA factors is a class IV zinc finger motif of the form CX(2)CX(17-20)CX(2)C followed by a basic region. In plants, GATA DNA motifs have been implicated in light-dependent and nitrate-dependent control of transcription. Herein, we show that the Arabidopsis and rice (Oryza sativa) genomes contain 29 and 28 loci, respectively, that encode putative GATA factors. A phylogenetic analysis of the 57 GATA factor-encoding genes, together with a study of their intron-exon structure, indicates the existence of seven subfamilies of GATA genes. Some of these subfamilies are represented in both species, but others are exclusive to one of them. In addition to the GATA zinc finger motif, polypeptides of the different subfamilies are characterized by the presence of additional domains such as an acidic domain, a CCT (CONSTANS, CO-like, and TOC1) domain, or a transposase-like domain also found in FAR1 and FHY3. Subfamily VI comprises genes that encode putative bi-zinc finger polypeptides, also found in metazoans and fungi, and a tri-zinc finger protein, which has not been previously reported in eukaryotes. The phylogeny of the GATA zinc finger motif, excluding flanking regions, revealed four classes of GATA zinc fingers, three of them containing 18 residues in the zinc finger loop and one containing a 20-residue loop. Our results support multiple models of evolution of the GATA gene family in plants, including gene duplication and exon shuffling.
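The motif CX(2)CX(17-20)CX(2)C given in the abstract can be read as a pattern over a protein sequence: a Cys, any two residues, a Cys, a loop of 17 to 20 residues, then Cys, any two residues, and a final Cys. The following is a minimal sketch of how such a pattern might be scanned for with a regular expression. It is not the method used by the authors. The function name `find_gata_fingers` and the toy sequences are hypothetical and purely synthetic, and the sketch ignores the basic region that follows the finger.

```python
import re

# Pattern taken from the abstract: C-X(2)-C-X(17-20)-C-X(2)-C, X = any residue.
# The loop between the two Cys pairs is captured so its length can be reported.
GATA_FINGER = re.compile(r"C.{2}C(.{17,20})C.{2}C")


def find_gata_fingers(protein):
    """Return (start, loop_length, matched_text) for each non-overlapping match.

    Hypothetical helper for illustration only; a real search would also
    check the basic region and use curated domain models.
    """
    return [
        (m.start(), len(m.group(1)), m.group(0))
        for m in GATA_FINGER.finditer(protein)
    ]


# Synthetic toy sequences (assumptions, not real GATA proteins).
toy_18 = "MK" + "CAAC" + "G" * 18 + "CAAC" + "RK"  # 18-residue loop
toy_20 = "MK" + "CAAC" + "G" * 20 + "CAAC" + "RK"  # 20-residue loop
toy_16 = "MK" + "CAAC" + "G" * 16 + "CAAC" + "RK"  # loop too short to match

for name, seq in [("toy_18", toy_18), ("toy_20", toy_20), ("toy_16", toy_16)]:
    print(name, find_gata_fingers(seq))
```

The reported loop length is the quantity the abstract uses to distinguish zinc finger classes (18 versus 20 residues); a pattern match alone does not establish that a protein is a GATA factor.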

Figures

Figure 1.
Phylogenetic analysis of Arabidopsis GATA genes. A, Neighbor-Joining tree of full-length amino acid sequences from Arabidopsis GATA genes. Bootstrap values from 1,000 replicates are shown. The scale bar corresponds to 0.1 estimated amino acid substitutions per site. B, Protein domain organization of the corresponding polypeptides. C, Exon-intron structure of the corresponding genes. The position of the nucleotide sequence that encodes the GATA zinc finger is depicted in black.
Figure 2.
Alignment of the CCT domains from Arabidopsis and rice GATA factors. The CCT domains of Arabidopsis TOC1 (AF272039) and CO (X94937) are also included. Residues identical in at least 8 of the 10 sequences are shaded in black.
Figure 3.
Chromosomal positions of Arabidopsis GATA genes. Subfamily I genes are depicted in red, subfamily II genes are shown in green, subfamily III genes are shown in blue, and subfamily IV genes are shown in black. Colored bands connect corresponding duplicated segments that contain GATA genes. Numbers next to the bands indicate the estimated age (in millions of years) of the duplication according to Simillion et al., 2002. Centromeres are marked in red. Numbers below the genes correspond to the nucleotide chromosomal coordinates of the gene. The scale is in megabases.
Figure 4.
Phylogenetic analysis of rice GATA genes. A, Neighbor-Joining tree of full-length amino acid sequences from rice GATA genes. Bootstrap values from 1,000 replicates are shown. The scale bar corresponds to 0.1 estimated amino acid substitutions per site. B, Protein domain organization of the corresponding polypeptides. C, Exon-intron structure of the corresponding genes. The position of the nucleotide sequence that encodes the GATA zinc finger is depicted in black.
Figure 5.
Amino acid sequence alignment of Arabidopsis and rice GATA-like zinc finger domains. We aligned the 55-amino acid region of the cGATA1 sequence (residues 162–216), which contains all sites that physically interact with DNA, to the corresponding regions of other GATA domains. When two zinc fingers are present in the same polypeptide, the N-finger is denoted by -N and the C-finger is denoted by -C. In the case of OsGATA24, which has three fingers, the different domains are numbered from the amino to the carboxy terminus. Five nonplant zinc fingers are also included: cGATA1-N, cGATA1-C, hGATA5-N, hGATA5-C, and AreA. Residues conserved in all GATA motifs or in most of the plant GATA domains are highlighted in yellow. Residues specifically conserved in Class A, B, C, or D zinc fingers are highlighted in red, green, blue, or pink, respectively. Conservative changes were defined as those that have a value higher than +2 in the BLOSUM62 scoring matrix (that is, the following amino acid changes were considered conservative: E-D, R-K, L-I, V-I, Y-F, Y-H, and Y-W). The bottom part of the figure shows, for reference, the secondary structure elements corresponding to the indicated amino acids in the structure of the cGATA1 C-finger domain (Omichinski et al., 1993).
Figure 6.
Phylogenetic tree of the amino acid sequences of Arabidopsis and rice GATA-like zinc finger domains. The tree was inferred by the Neighbor-Joining method from the alignment shown in Figure 5. The deduced sequence of the At3g17660 gene was used as the outgroup. The scale bar corresponds to 0.1 estimated amino acid substitutions per site.
Figure 7.
Phylogenetic tree of GATA-like zinc finger domains from plant proteins and from representative metazoan and fungal proteins. After alignment of 87 GATA zinc finger amino acid sequences from Arabidopsis, rice, and other eukaryotes, a Neighbor-Joining tree was constructed, using the deduced sequence of the Arabidopsis At3g17660 gene as an outgroup. The triangles represent the clades comprising all Class A, B, C, and D sequences. Names of the proteins are followed by the taxon name and the number of residues in the zinc finger loop. The scale bar corresponds to 0.1 estimated amino acid substitutions per site.
