BMC Evol Biol. 2010 Feb 19;10:53.

doi: 10.1186/1471-2148-10-53.

Characterization of the tandem CWCH2 sequence motif: a hallmark of inter-zinc finger interactions

Minoru Hatayama et al. BMC Evol Biol. 2010.

Authors

Minoru Hatayama¹, Jun Aruga

Affiliation

¹ Laboratory for Behavioral and Developmental Disorders, RIKEN Brain Science Institute, Wako-shi, Saitama 351-0198, Japan.

PMID: 20167128
PMCID: PMC2837044
DOI: 10.1186/1471-2148-10-53

Abstract

Background: The C2H2 zinc finger (ZF) domain is widely conserved among eukaryotic proteins. In Zic/Gli/Zap1 C2H2 ZF proteins, the two N-terminal ZFs form a single structural unit by sharing a hydrophobic core. This structural unit defines a new motif comprised of two tryptophan side chains at the center of the hydrophobic core. Because each tryptophan residue is located between the two cysteine residues of the C2H2 motif, we have named this structure the tandem CWCH2 (tCWCH2) motif.
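The motif definition above (a tryptophan between the two cysteines of each of two adjacent C2H2 fingers) can be sketched as a sequence pattern. This is a minimal illustration, not the authors' search procedure: the spacing bounds are assumptions loosely modelled on the general C2H2 pattern, the 1-30 residue linker bound is hypothetical, and the function name `find_tcwch2` and the toy sequences are invented for the example.

```python
import re

# One CWCH2 finger: C ... W ... C ... H ... H, with the tryptophan lying
# between the two cysteines. All spacing bounds are illustrative assumptions.
CWCH2 = r"C[^CW]{0,4}W[^C]{0,4}C.{12}H.{3,5}H"

# Two CWCH2 fingers joined by a linker of 1-30 residues (hypothetical bound).
TANDEM_CWCH2 = re.compile(f"({CWCH2})(.{{1,30}}?)({CWCH2})")


def find_tcwch2(seq):
    """Return (start, finger1, linker, finger2) for each tandem CWCH2 hit."""
    return [(m.start(), m.group(1), m.group(2), m.group(3))
            for m in TANDEM_CWCH2.finditer(seq)]


# Toy sequences (synthetic, not real proteins).
toy_finger = "C" + "A" + "W" + "AA" + "C" + "A" * 12 + "H" + "AAA" + "H"
toy_linker = "GSGSGSGS"
toy_protein = "MK" + toy_finger + toy_linker + toy_finger + "LL"

# A classical C2H2 finger pair without tryptophans should not be reported.
classical = "C" + "AA" + "C" + "A" * 12 + "H" + "AAA" + "H"
control = classical + "TGEKP" + classical

hits = find_tcwch2(toy_protein)
no_hits = find_tcwch2(control)
```

A pattern like this only captures the defining C-W-C arrangement; real tCWCH2 sequences can carry extra sequences within a finger, so a fixed-spacing regex would miss some of them.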

Results: Here, we characterized 587 tCWCH2-containing genes using data derived from public databases. We categorized genes into 11 classes including Zic/Gli/Glis, Arid2/Rsc9, PacC, Mizf, Aebp2, Zap1/ZafA, Fungl, Zfp106, Twincl, Clr1, and Fungl-4ZF, based on sequence similarity, domain organization, and functional similarities. tCWCH2 motifs are mostly found in organisms belonging to the Opisthokonta (metazoa, fungi, and choanoflagellates) and Amoebozoa (amoeba, Dictyostelium discoideum). By comparison, the C2H2 ZF motif is distributed widely among the eukaryotes. The structure and organization of the tCWCH2 motif, its phylogenetic distribution, and molecular phylogenetic analysis suggest that prototypical tCWCH2 genes existed in the Opisthokonta ancestor. Within-group or between-group comparisons of the tCWCH2 amino acid sequence identified three additional sequence features (site-specific amino acid frequencies, longer linker sequence between two C2H2 ZFs, and frequent extra-sequences within C2H2 ZF motifs).

Conclusion: These features suggest that the tCWCH2 motif is a specialized motif involved in inter-zinc finger interactions.

Figures

**Figure 1**
**Structures of tCWCH2 sequence motifs from three different proteins**. (A) Superimposition of the 3D structures of ZIC3, GLI1, and Zap1 tCWCH2 is shown in stereo view. Backbones of the protein structures are indicated by the flat ribbon model. The side chains of two conserved tryptophan residues in tCWCH2 are indicated by the stick model. Red = ZIC3 (PDBID: 2RPC); Blue = GLI1 (PDBID: 2GLI); Green = Zap1 (PDBID: 1ZW8). (B) Amino acid sequence alignment of the tCWCH2 regions. Zinc-chelating cysteine and histidine residues are shown in gray boxes and conserved tryptophan residues are shown in white letters with black boxes. In (A) and (B), the gray lines with arrowheads indicate the sequence between the two CWCH2 motifs (linker sequence).

**Figure 2**
**Domain structure and function of tCWCH2-containing proteins**. The 11 gene classes are listed in descending order of the number of representative genes in each class. A *Monosiga* gene and two *Dictyostelium* genes, which do not belong to these classes, are indicated. Gene name is indicated at the left of each row (open box = C2H2 ZF; gray box = CWCH2; gray boxes linked with thick lines = tCWCH2; open box with curved ends = other domains). Representative genes and their amino acid length are indicated on the left and references are shown on the right. Hs, *Homo sapiens*; Dm, *Drosophila melanogaster*; Sc, *Saccharomyces cerevisiae*; Afu, *Aspergillus fumigatus*; Spo, *Schizosaccharomyces pombe*; Ro, *Rhizopus oryzae*; Mb, *Monosiga brevicollis*; Dd, *Dictyostelium discoideum*.

**Figure 3**
**Phylogenetic tree of tCWCH2**. (A) Tree indicating similarities among the tCWCH2 gene classes. Statistical analysis was performed using three ZFs (tCWCH2 and a ZF in the C-terminal flanking region). (B) Sub-tree of the Zic/Gli/Glis class genes. The alignment for this analysis is shown in Additional file 11. BI, ML, and NJ analyses were carried out to construct the molecular phylogenetic trees (Additional files 12, 13 and 14). The tree pattern is based on the BI tree. Scores for each branch indicate the statistical support values obtained with each phylogenetic tree construction method {BI (posterior probability)/ML (bootstrap value)/NJ (bootstrap value)}. The absence of a score (-) indicates a branch that was not supported by the corresponding method.

**Figure 4**
**Distribution of tCWCH2-containing genes in a eukaryotic phylogenetic tree**. Tree pattern (gray curved lines) is based on [78-80]. The distribution of tCWCH2 sequences is indicated by the black curved line. The distribution of each tCWCH2 gene class is indicated by a colored area. The *Monosiga* tCWCH2 sequence may be derived from the common ancestor of Arid2/Rsc9 (See Results).

**Figure 5**
**Sequence conservation among the classes of tCWCH2 sequence motifs**. Amino acid sequences were aligned and consensus sequences were generated (see Methods). Mizf ZF1-2, Mizf ZF3-4, and Zap1/ZafA ZF1-2 and ZF3-4 are indicated separately. The PDOC00028 C2H2 consensus is shown at the bottom as a general C2H2 consensus sequence (indicated by the tandem repeat of the consensus sequence). The short insertion sequences listed below were initially located at the sites indicated by the colored arrowheads in the alignments described above, but were separated to allow more comprehensive analyses. These sequences represent the longer linker sequences and the insertion of extra sequences described in the Results section. To achieve this analysis we omitted the four sequences AAWT01013938 (*Schmidtea mediterranea* Aebp2), CAG05504 (*Tetraodon nigroviridis* Gli), XP_001602003 (*Nasonia vitripennis* Gli), and XP_785526 (*Strongylocentrotus purpuratus* Gli) because these sequences contain exceptionally divergent sequences that disrupt the alignments.

**Figure 6**
**ϕ1 and ϕ2 positions of tCWCH2**. (A) Stereo view of the tCWCH2 sequence motif. A wire model of the main chains is shown with ZIC3, GLI1, and Zap1 in green and TFIIIA and Zif268 in red. The side chains of tryptophan residues in tCWCH2 and those in the ϕ1, ϕ2 residues are indicated by cylinders. ZIC3, GLI1, and Zap1 are gray cylinders and the others are red cylinders. The corresponding positions (ϕ1, ϕ2) are indicated by red boxes in Figure 5. (B) Frequency of amino acid appearance in ϕ1, ϕ2, and those of general C2H2 (PDOC00028) is shown. Valine, leucine, and isoleucine appear frequently at both positions and methionine frequently appears in ϕ2. This table summarizes all of the tCWCH2 sequences. Highlighted numbers indicate the preferential residues for ϕ1 and ϕ2 in tCWCH2.
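The per-position residue frequencies summarized in panel (B) amount to counting residues in one column of an alignment. The sketch below shows that counting step only; the function name `column_frequencies`, the toy alignment, and the column index are assumptions for illustration, since the actual ϕ1 and ϕ2 columns are defined by the alignment in Figure 5.

```python
from collections import Counter


def column_frequencies(aligned, col):
    """Relative frequency of each residue at one alignment column.

    Gap characters ("-") are ignored. `aligned` is a list of equal-length
    aligned sequences; `col` is a 0-based column index.
    """
    residues = [seq[col] for seq in aligned if seq[col] != "-"]
    total = len(residues)
    return {aa: n / total for aa, n in Counter(residues).items()}


# Toy alignment (synthetic); column 1 stands in for a hydrophobic position.
toy_alignment = ["CVW", "CLW", "CVW", "C-W"]
freqs = column_frequencies(toy_alignment, 1)
```

Running the same function on a general C2H2 alignment and comparing the two dictionaries would give the kind of side-by-side table the caption describes.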

**Figure 7**
**Comparison of linker length between the tCWCH2 sequence motif and general C2H2 ZF**. The bars indicate the mean lengths of the linker sequence between C2H2 motifs in the indicated groups. We defined the linker length as the number of amino acids between the last histidine of a ZF and the first cysteine of the C-terminally flanking ZF. The PDOC00028 (general C2H2) alignment contained 12326 sequences and we found 10025 linkers in it. The general C2H2 sequences were further divided into two groups based on the presence or absence of "TGE(K/R)P". Number of sequences in each group: tCWCH2, 614; general C2H2, 10025; "TGE(K/R)P", 4640; "non-TGE(K/R)P", 5385. Error bar, standard deviation. *, P < 1 × 10^-100 in Mann-Whitney U-test.
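The linker-length definition in this caption can be illustrated with a short sketch: find consecutive fingers, take the residues between the last histidine of one and the first cysteine of the next, and split linkers by whether they contain TGE(K/R)P. The simplified finger pattern, the function names, and the toy sequence are assumptions for the example and are not taken from the paper's pipeline.

```python
import re
from statistics import mean

# Simplified C2H2 finger pattern; the spacing is an assumption loosely
# modelled on the general C2H2 consensus, not the authors' exact criteria.
C2H2 = re.compile(r"C.{2,4}C.{12}H.{3,5}H")
TGEKP = re.compile(r"TGE[KR]P")


def linkers(seq):
    """Residues between the last His of each finger and the first Cys of the next."""
    spans = [m.span() for m in C2H2.finditer(seq)]
    return [seq[end:nxt] for (_, end), (nxt, _) in zip(spans, spans[1:])]


def mean_linker_lengths(linker_list):
    """Mean linker length for the TGE(K/R)P and non-TGE(K/R)P groups."""
    groups = {"TGE(K/R)P": [], "non-TGE(K/R)P": []}
    for linker in linker_list:
        key = "TGE(K/R)P" if TGEKP.search(linker) else "non-TGE(K/R)P"
        groups[key].append(len(linker))
    return {k: mean(v) for k, v in groups.items() if v}


# Toy protein (synthetic): three fingers, one canonical and one long linker.
finger = "C" + "AA" + "C" + "A" * 12 + "H" + "AAA" + "H"
toy = finger + "TGEKP" + finger + "GSGSGSGSGS" + finger

toy_linkers = linkers(toy)
toy_means = mean_linker_lengths(toy_linkers)
```

With real data, the two lists of linker lengths could then be compared with a rank-based test such as `scipy.stats.mannwhitneyu`, which is the kind of test the caption reports.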

**Figure 8**
**Evolution of the tCWCH2 domain**. We propose that the tCWCH2 patterns were generated from classical C2H2 sequences concurrently with the acquisition of the additional sequence features.
