Comparative Study
Plant Physiol. 2003 Jun;132(2):907-25.
doi: 10.1104/pp.102.013722. Epub 2003 May 1.

Comparative analysis of SET domain proteins in maize and Arabidopsis reveals multiple duplications preceding the divergence of monocots and dicots

Nathan M Springer et al. Plant Physiol. 2003 Jun.

Abstract

Histone proteins play a central role in chromatin packaging, and modification of histones is associated with chromatin accessibility. SET domain [Su(var)3-9, Enhancer-of-zeste, Trithorax] proteins are one class of proteins that have been implicated in regulating gene expression through histone methylation. The relationships of 22 SET domain proteins from maize (Zea mays) and 32 SET domain proteins from Arabidopsis were evaluated by phylogenetic analysis and domain organization. Our analysis reveals five classes of SET domain proteins in plants that can be further divided into 19 orthology groups. In some cases, such as the Enhancer of zeste-like and trithorax-like proteins, plants and animals contain homologous proteins with a similar organization of domains outside of the SET domain. However, a majority of plant SET domain proteins do not have an animal homolog with similar domain organization, suggesting that plants have unique mechanisms to establish and maintain chromatin states. Although the domains present in plant and animal SET domain proteins often differ, the domains found in the plant proteins have been generally implicated in protein-protein interactions, indicating that most SET domain proteins operate in complexes. Combined analysis of the maize and Arabidopsis SET domain proteins reveals that duplication of SET domain proteins in plants is extensive and has occurred via multiple mechanisms that preceded the divergence of monocots and dicots.

Figures

Figure 1.
Proteins containing an interrupted SET domain were aligned using ClustalW, and the alignment was analyzed by parsimony using PHYLIP. The resulting phylogenetic tree is displayed with all bootstrap values >70% indicated at the nodes. The domain organization of representative plant proteins and of the most closely related animal protein was investigated using National Center for Biotechnology Information Conserved Domain Database (NCBI-CD) and SMART searches. All proteins are displayed as scaled schematic diagrams with the N terminus at the left. Shaded boxes within the protein schematics indicate recognizable domains. The accession numbers for the sequences used in the alignment are Q8R5A0-Mm (AAH23119), P97443-Mm (BAB26947), BOP-R-Mm (NP_081464), Q8SYH2-Dm (AAL49177), O46040-Dm (O46040), Q960X1-Dm (AAK93223), MGI:191333-Mm (XP_134310), Q9VVV8-Dm (AAF49199), AAN09278-Dm (AAN09278), ScSET5 (P38890), and ScSet6 (NP_015160).
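The ">70%" labeling rule used in these tree figures can be illustrated with a minimal sketch. It assumes bootstrap support is the percentage of replicate trees that recover a given clade; the function name, the clade names, and the replicate counts below are invented for illustration and are not taken from the paper, which does not state the number of replicates here.

```python
def supported_nodes(clade_counts, n_replicates, threshold=70.0):
    """Return {clade: percent support} for clades strictly above the threshold.

    clade_counts maps a clade (a frozenset of taxon names) to the number of
    bootstrap replicate trees in which that clade appears.
    """
    labeled = {}
    for clade, count in clade_counts.items():
        percent = 100.0 * count / n_replicates
        if percent > threshold:  # strictly greater than, matching ">70%"
            labeled[clade] = percent
    return labeled


# Hypothetical counts from 100 replicates (illustrative values only).
counts = {
    frozenset({"A", "B"}): 95,
    frozenset({"A", "B", "C"}): 70,
    frozenset({"D", "E"}): 42,
}
labels = supported_nodes(counts, n_replicates=100)
print(labels)
```

Under this reading, a clade recovered in exactly 70% of replicates would not be labeled, because the captions specify values greater than 70%.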
Figure 2.
The SET domains from maize and Arabidopsis SET domain proteins were aligned with the SET domain of yeast, D. melanogaster, and mouse proteins using ClustalW (see Supplementary Fig. 1 at www.plantphysiol.org). All bootstrap values >70% are indicated at the nodes. The accession numbers for the plant SET domain sequences are shown in Tables I and II. The D. melanogaster proteins used for this alignment were E(Z) (AAC46462), CG4976 (AAF56762), CG17396 (AAF45425), ASH1 (AAF49140), MES-4 (AAK84931), TRX (AAF55041), TRR (AAF45684), G9a-like (AAF45487), and SU(VAR3-9) (CAB93768). The mouse proteins used for the alignment are EZH1 (AAC50778), EZH2 (Q61188), NSD1 (AAC40182), ASH1 (AAK26242), HRX (AAA62593), MLL3-like (AAK70214), G9a (AAC84164), MLL2-like (BAB27589), Q91Z33 (AAH10250), SET7 (Q9NQR1), SUV39h1 (AF193862), and Suv39h2 (AAG09134). The yeast sequences used for the alignment are ScSET1 (AAB68867), ScSET2 (NP_012367), ScSET3 (NP_012954), and ScSET4 (NP_012430).
Figure 3.
Class II SET domain proteins. The SET domain sequences from the plant class II SET domain proteins and several well-characterized animal class II SET domain proteins were aligned using ClustalW. The relationships of these sequences were investigated using PHYLIP with bootstrap analysis. All bootstrap values >70% are shown. The circles indicate the putative orthology groups. The Arabidopsis proteins are indicated by italic text, and the maize proteins are indicated by bold text. The domain organization of the plant proteins was investigated using NCBI-CD and SMART searches. All proteins are displayed as scaled schematic diagrams with the N terminus at the left. Shaded boxes within the protein schematics indicate recognizable domains. Due to their longer lengths, the animal NSD1 and ASH1 protein schematics are truncated at the N terminus; the sequence not shown does not contain any recognizable domains.
Figure 4.
Class III SET domain proteins. The SET domains of the plant class III SET domain proteins and several related animal proteins were aligned using ClustalW. The relationship of these sequences was examined using PHYLIP, and a parsimonious tree is shown with bootstrap values >70%. Circles are used to indicate putative maize-Arabidopsis orthology groups. The Arabidopsis proteins are indicated by italic text, and the maize proteins are indicated by bold text. Several maize proteins, which are currently only partially sequenced, were placed within the orthology group that they are most closely related to. The domain organization of the class III SET domain proteins was analyzed by NCBI-CD and SMART searches. Schematic diagrams show the domain organization of these proteins with the N terminus on the left side. For several of the longer proteins, a region of the protein that did not contain any recognizable domains was truncated.
Figure 5.
Class IV SET domain proteins. The amino acid sequences of the class IV SET domain proteins were aligned using ClustalW. This alignment was shaded using Boxshade such that identical amino acids are shaded black, and conserved residues are shaded in gray. The locations of the PHD (Λ) and SET (*) domains are indicated above the alignment. Schematic diagrams of these proteins are shown below the alignment.
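The black/gray shading convention can be sketched as a per-column classification of an alignment. This is a simplified illustration and not Boxshade's actual algorithm or settings: it assumes a column is "identical" only when every sequence has the same residue, and "conserved" when all residues fall into one of the similarity groups defined below, which are a hypothetical choice. The toy sequences are invented.

```python
# Hypothetical amino acid similarity groups (an assumption for illustration).
SIMILAR_GROUPS = [
    set("ILVM"), set("FYW"), set("KRH"),
    set("DE"), set("ST"), set("NQ"),
]


def shade_column(column):
    """Classify one alignment column as 'black', 'gray', or 'none'."""
    residues = set(column)
    if "-" in residues:          # any gap: leave the column unshaded
        return "none"
    if len(residues) == 1:       # the same residue in every sequence
        return "black"
    if any(residues <= group for group in SIMILAR_GROUPS):
        return "gray"            # all residues share one similarity group
    return "none"


def shade_alignment(sequences):
    """Classify every column of equal-length aligned sequences."""
    return [shade_column(col) for col in zip(*sequences)]


# Toy alignment of three invented four-residue sequences.
toy = ["MKLD", "MRIE", "MKVA"]
shades = shade_alignment(toy)
print(shades)
```

A threshold-based variant, in which a column is shaded when only a fraction of the sequences agree, would follow the same structure with a count in place of the set comparison.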
Figure 6.
Class V SET domain proteins. The SET domain of all class V SET domain proteins was aligned using ClustalW. The relationship of these sequences was examined using PHYLIP, and a parsimonious tree is shown with bootstrap values >70%. Circles are used to indicate putative maize/Arabidopsis orthology groups. The Arabidopsis proteins are indicated by italic text, and the maize proteins are indicated by bold text. Several maize proteins, which are only partially sequenced, are placed within the orthology group that they are most closely related to. We searched for recognizable domains in these proteins using NCBI-CD and SMART searches. Schematic diagrams indicate the domain organization for each of the full-length proteins, with the N terminus on the left.
Figure 7.
Maize contains intron-less class V SET domain genes. PCR was used to test for the presence of introns in the sequence of several class V maize genes, Sdg101, Sdg103, Sdg104, Sdg105, Sdg111, and Sdg118. A segment of the coding region for each of these genes was amplified from B73 genomic DNA and 10-d-old B73 seedling cDNA.
Figure 8.
Expression patterns of maize SET genes. PCR was performed on cDNA from eight different tissue sources to test for expression of maize SET genes. The sequence amplified is indicated next to the image of the gel, and the source of the RNA for each lane is indicated above the pictures (1, blank; 2, endosperm [11 d after pollination {DAP}]; 3, whole kernel [3 DAP]; 4, whole kernel [11 DAP]; 5, 10-d-old seedling; 6, root tips; 7, immature leaf; 8, mature leaf; 9, meiotic tassel).
