Comparative Study
BMC Genomics. 2006 Aug 9;7:200.
doi: 10.1186/1471-2164-7-200.

Cross genome comparisons of serine proteases in Arabidopsis and rice

Lokesh P Tripathi et al. BMC Genomics. 2006.

Abstract

Background: Serine proteases are one of the largest groups of proteolytic enzymes found across all kingdoms of life and are associated with several essential physiological pathways. The availability of Arabidopsis thaliana and rice (Oryza sativa) genome sequences has permitted the identification and comparison of the repertoire of serine protease-like proteins in the two plant species.

Results: Despite the difference in genome size between Arabidopsis and rice, we identified a very similar number of serine protease-like proteins in the two plant species (206 and 222, respectively). Nearly 40% of these sequences were identified as potential orthologues. Atypical members could be identified in the plant genomes for the Deg, Clp, Lon and rhomboid proteases, and species-specific members were observed for the highly populated subtilisin and serine carboxypeptidase families, suggesting multiple lateral gene transfers. DegP proteases, prolyl oligopeptidases, Clp proteases and rhomboids share a significantly higher percentage of orthology between the two genomes, indicating that substantial evolutionary divergence was established prior to speciation. Single-domain architectures and paralogues for several putative subtilisins, serine carboxypeptidases and rhomboids suggest that they may have been recruited for additional roles in secondary metabolism with spatial and temporal regulation. The analysis reveals some domain architectures unique to either or both of the plant species and some inactive proteases, as in the rhomboid and Clp families, which could be involved in chaperone function.

Conclusion: The systematic analysis of the serine protease-like proteins in the two plant species has provided some insight into the possible functional associations of previously uncharacterised serine protease-like proteins. Further investigation of these aspects may prove beneficial in our understanding of similar processes in commercially significant crop plant species.

Figures

Figure 1
Unrooted N-J tree computed from multiple sequence alignments of Arabidopsis (red) and rice (blue) subtilisin domains. Subtilisin-like protease domains were aligned using the ClustalW [95] program and the alignments were exported to the Phylip package [96] to construct the Neighbor-Joining tree (see methods). The colors and circles represent different evolutionary clades identified in the analysis (see text for details). Clade I is shown in purple, Clade II in orange, Clade III in green, Clade IV in brown and Clade V in yellow. For clarity, bootstrap values were replaced with symbols representing bootstrap percentages >50%. Bootstrap values of 50–60% are represented by an asterisk, values of 60–80% by circles and values >80% by rectangles. Gene names correspond to those in Additional files 1 and 2. For brevity, rice gene names have been shortened to OsXXg##### instead of LOC_OsXXg#####, where XX refers to chromosome 1–12 and ##### is a five-digit number assigned to each gene. A few species-specific gene clusters were also identified in the analysis (see text for details).
Figure 2
Domain architectures identified in Arabidopsis and rice serine protease-like proteins. At- Arabidopsis thaliana; Os- Rice (Oryza sativa); B- Bacteria; A- Archaea; E- Eukaryota; Sxx- Serine protease family Sxx domain, where Sxx refers to the serine protease family as per MEROPS [5] classification (see text for details). PDZ- PDZ domain (Pfam [37] accession: PF00595); PA- Protease associated domain (Pfam [37] accession: PF02225); SUB N- Subtilisin N-terminal region (Pfam [37] accession: PF05922); DUF1034- Domain of unknown function (Pfam [37] accession: PF06280); Arf- ADP-ribosylation factor family (Pfam [37] accession: PF00025); C2- C2 domain (Pfam [37] accession: PF00168); zf-CCHC- Zinc knuckle (Pfam [37] accession: PF00098); rve- Integrase core domain (Pfam [37] accession: PF00665); Extensin 2- Extensin-like region (Pfam [37] accession: PF04554); S9 N- Prolyl oligopeptidase, N-terminal beta-propeller domain (Pfam [37] accession: PF02897); PD40- WD40-like beta propeller repeat (Pfam [37] accession: PF07676); DPPIV N- Dipeptidyl peptidase (DPP IV) N-terminal region (Pfam [37] accession: PF00930); Transposase 21- Transposase family tnp2 (Pfam [37] accession: PF02992); Retrotrans gag- Retrotransposon gag protein (Pfam [37] accession: PF03732); ABC1- ABC1 family (Pfam [37] accession: PF03109); LON- ATP-dependent protease La (LON) domain (Pfam [37] accession: PF02190); AAA- ATPase family associated with various cellular activities (Pfam [37] accession: PF00004); UBA- UBA/TS-N (ubiquitin associated) domain (Pfam [37] accession: PF00627); zf-RanBP- Zinc finger in Ran binding protein and others (Pfam [37] accession: PF00641).
Figure 3
Unrooted N-J tree computed from multiple sequence alignments of Arabidopsis (red) and rice (blue) prolyl oligopeptidase domains. Prolyl oligopeptidase-like domains were aligned using the ClustalW [95] program and the alignments were exported to the Phylip package [96] to construct the Neighbor-Joining tree (see methods). The colors and circles represent the two evolutionary clades identified in the analysis (see text for details). Clade I is shown in orange and Clade II in green. For clarity, bootstrap values were replaced with symbols representing bootstrap percentages >50%. Bootstrap values of 50–60% are represented by an asterisk, values of 60–80% by circles and values >80% by rectangles. Gene names correspond to those in Additional files 1 and 2. For brevity, rice gene names have been shortened to OsXXg##### instead of LOC_OsXXg#####, where XX refers to chromosome 1–12 and ##### is a five-digit number assigned to each gene. Where possible, subfamily assignments are indicated in parentheses below the gene name (see Figure SF3 and text for details).
Figure 4
Multiple sequence alignment of the Clp protease domain region of the annotated Arabidopsis and rice Clp protease-like proteins. The catalytic triad residues are indicated. Gene names correspond to those in Additional files 1 and 2. For brevity, rice gene names have been shortened to OsXXg##### instead of LOC_OsXXg#####, where XX refers to chromosome 1–12 and ##### is a five-digit number assigned to each gene. Several gene products that carry a mutation in one or more catalytic triad residues can be seen here (see text for details).
Figure 5
Unrooted N-J tree computed from multiple sequence alignments of Arabidopsis (red) and rice (blue) Clp protease domains. Clp protease-like domains were aligned using the ClustalW [95] program and the alignments were exported to the Phylip package [96] to construct the Neighbor-Joining tree (see methods). The colors and circles represent different evolutionary clades identified in the analysis (see text for details). Clade I is shown in orange, while Clades II-VIII are shaded in black. For clarity, bootstrap values were replaced with symbols representing bootstrap percentages >50%. Bootstrap values of 50–60% are represented by an asterisk, values of 60–80% by circles and values >80% by rectangles. Gene names correspond to those in Additional files 1 and 2. For brevity, rice gene names have been shortened to OsXXg##### instead of LOC_OsXXg#####, where XX refers to chromosome 1–12 and ##### is a five-digit number assigned to each gene.
Figure 6
Multiple sequence alignment of the Type I SPase domain region of the annotated Arabidopsis and rice Type I SPase-like proteins. The catalytic dyad residues are indicated. Gene names correspond to those in Additional files 1 and 2. For brevity, rice gene names have been shortened to OsXXg##### instead of LOC_OsXXg#####, where XX refers to chromosome 1–12 and ##### is a five-digit number assigned to each gene. The variations in the second residue (K/H) of the catalytic dyad can be seen here (see text for details).
Figure 7
Unrooted N-J tree computed from multiple sequence alignments of Arabidopsis (red) and rice (blue) family S28 protease domains. S28-like protease domains were aligned using the ClustalW [95] program and the alignments were exported to the Phylip package [96] to construct the Neighbor-Joining tree (see methods). The colors represent the two evolutionary clades identified in the analysis (see text for details). Clade I is shown in orange and Clade II in green. For clarity, bootstrap values were replaced with symbols representing bootstrap percentages >50%. Bootstrap values of 50–60% are represented by an asterisk, values of 60–80% by circles and values >80% by rectangles. Gene names correspond to those in Additional files 1 and 2. For brevity, rice gene names have been shortened to OsXXg##### instead of LOC_OsXXg#####, where XX refers to chromosome 1–12 and ##### is a five-digit number assigned to each gene.
Figure 8
Unrooted N-J tree computed from multiple sequence alignments of Arabidopsis (red) and rice (blue) rhomboid protease domains. Rhomboid protease-like domains were aligned using the ClustalW [95] program and the alignments were exported to the Phylip package [96] to construct the Neighbor-Joining tree (see methods). The colors and circles represent different evolutionary clades identified in the analysis (see text for details). Clade I is shown in orange, Clade II in brown, Clade III in green and Clade IV in purple. Clades V-VIII are shaded in black. For clarity, bootstrap values were replaced with symbols representing bootstrap percentages >50%. Bootstrap values of 50–60% are represented by an asterisk, values of 60–80% by circles and values >80% by rectangles. Gene names correspond to those in Additional files 1 and 2. For brevity, rice gene names have been shortened to OsXXg##### instead of LOC_OsXXg#####, where XX refers to chromosome 1–12 and ##### is a five-digit number assigned to each gene.
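The tree legends above share two labelling conventions: rice locus names are shortened from LOC_OsXXg##### to OsXXg#####, and bootstrap percentages are replaced by symbols. The following is a minimal Python sketch of those two conventions, not the authors' own code. The function names are hypothetical, the example identifiers are placeholders rather than genes from the study, and the handling of boundary values (exactly 50%, 60% or 80%) is an assumption, since the legends do not say which symbol applies at the edges.

```python
import re
from typing import Optional

# Assumed shape of a full rice locus name: "LOC_Os", a two-digit
# chromosome number (01-12), "g", then a five-digit gene number.
_RICE_LOCUS = re.compile(r"^LOC_(Os(?:0[1-9]|1[0-2])g\d{5})$")


def shorten_rice_gene_name(name: str) -> str:
    """Return OsXXg##### for LOC_OsXXg#####; leave other names unchanged."""
    match = _RICE_LOCUS.match(name)
    return match.group(1) if match else name


def bootstrap_symbol(percent: float) -> Optional[str]:
    """Map a bootstrap percentage to the symbol class used in the legends.

    Assumed boundaries: >80 rectangle, 60-80 inclusive circle,
    above 50 and below 60 asterisk, anything else unlabelled (None).
    """
    if percent > 80:
        return "rectangle"
    if percent >= 60:
        return "circle"
    if percent > 50:
        return "asterisk"
    return None


if __name__ == "__main__":
    # Placeholder identifiers, for illustration only.
    for label in ("LOC_Os01g12345", "At1g01010"):
        print(label, "->", shorten_rice_gene_name(label))
    for value in (45, 55, 70, 95):
        print(value, "->", bootstrap_symbol(value))
```

Names that do not match the assumed rice pattern, such as Arabidopsis identifiers, pass through unchanged, so the same function can be applied to every tip label of a tree.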

References

    1. Callis J. Regulation of Protein Degradation. Plant Cell. 1995;7:845–857. doi: 10.1105/tpc.7.7.845.
    2. Schaller A. A cut above the rest: the regulatory function of plant proteases. Planta. 2004;220:183–197. doi: 10.1007/s00425-004-1407-2.
    3. Barrett AJ, Rawlings ND. Families and clans of serine peptidases. Arch Biochem Biophys. 1995;318:247–250. doi: 10.1006/abbi.1995.1227.
    4. Rawlings ND, Barrett AJ. Dipeptidyl-peptidase II is related to lysosomal Pro-X carboxypeptidase. Biochim Biophys Acta. 1996;1298:1–3.
    5. Rawlings ND, Morton FR, Barrett AJ. MEROPS: the peptidase database. Nucleic Acids Res. 2006;34:D270–2. doi: 10.1093/nar/gkj089.
