Comparative Study
Genome Biol. 2004;5(8):R53.
doi: 10.1186/gb-2004-5-8-r53. Epub 2004 Jul 15.

Phylogenetic profiling of the Arabidopsis thaliana proteome: what proteins distinguish plants from other organisms?

Rodrigo A Gutiérrez et al. Genome Biol. 2004.

Abstract

Background: The availability of the complete genome sequence of Arabidopsis thaliana together with those of other organisms provides an opportunity to decipher the genetic factors that define plant form and function. To begin this task, we have classified the nuclear protein-coding genes of Arabidopsis thaliana on the basis of their pattern of sequence similarity to organisms across the three domains of life.

Results: We identified 3,848 Arabidopsis proteins that are likely to be found solely within the plant lineage. More than half of these plant-specific proteins are of unknown function, emphasizing the general lack of knowledge of processes unique to plants. Plant-specific proteins that are membrane-associated and/or targeted to the mitochondria or chloroplasts are the most poorly characterized. Analyses of microarray data indicate that genes coding for plant-specific proteins, but not evolutionarily conserved proteins, are more likely to be expressed in an organ-specific manner. A large proportion (13%) of plant-specific proteins are transcription factors, whereas other basic cellular processes are under-represented, suggesting that evolution of plant-specific control of gene expression contributed to making plants different from other eukaryotes.

Conclusions: We identified and characterized the Arabidopsis proteins that are most likely to be plant-specific. Our results provide a genome-wide assessment that supports the hypothesis that evolution of higher plant complexity and diversity is related to the evolution of regulatory mechanisms. Because proteins that are unique to the green plant lineage will not be studied in other model systems, they should be attractive priorities for future studies.

Figures

Figure 1
Identification of plant-specific proteins. (a) Classification of Arabidopsis proteins based on their pattern of sequence similarity to other organisms. The 27,288 Arabidopsis proteins were classified on the basis of their phylogenetic profiles (PP). Each PP recorded whether similar sequences were found or not found in the protein sets from the following organisms: Homo sapiens, Rattus norvegicus, Drosophila melanogaster, Caenorhabditis elegans, Mus musculus, Schizosaccharomyces pombe, Saccharomyces cerevisiae, a combined set of 88 species of Bacteria, and a combined set of 16 species of Archaea. Not drawn to scale. (b) Identification of putative plant-specific proteins. The Arabidopsis proteins that lack similarity to any other organism (7,868 proteins represented in the black circle in (a)) were compared against sequences in the expressed sequence tag (EST) database of Arabidopsis and 13 other plant species. A total of 3,848 Arabidopsis proteins were identified as plant specific because they showed sequence similarity to proteins in the Arabidopsis EST database and to proteins in EST databases of at least four other plant species (at E-value ≤ 10⁻¹⁰). In addition, 892 other Arabidopsis proteins show similarity to the Arabidopsis and one to three other plant EST databases, 2,691 Arabidopsis proteins exhibit similarity to sequences in the Arabidopsis EST database only, and 437 lack similarity to any sequence in the EST databases used.
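The classification rule in panel (b) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function name `classify_by_est_hits`, the input layout (a mapping from species name to the best E-value found in that species' EST database), the placeholder species names, and the category labels are all assumptions made for the example. Only the E-value cutoff, the four-species threshold, and the four counts come from the caption.

```python
E_VALUE_CUTOFF = 1e-10          # cutoff stated in the caption
MIN_OTHER_PLANT_SPECIES = 4     # "at least four other plant species"
ARABIDOPSIS = "Arabidopsis thaliana"


def classify_by_est_hits(best_evalue_by_species):
    """Assign one protein to a panel (b) category.

    best_evalue_by_species is a hypothetical input layout: it maps a plant
    species name to the best E-value of any hit in that species' EST
    database. Species with no hit are simply absent from the mapping.
    """
    hits = {sp for sp, e in best_evalue_by_species.items() if e <= E_VALUE_CUTOFF}
    others = len(hits - {ARABIDOPSIS})
    if not hits:
        return "no EST similarity"
    if ARABIDOPSIS not in hits:
        # The caption does not describe this case, so it is flagged, not guessed.
        return "not described in caption"
    if others >= MIN_OTHER_PLANT_SPECIES:
        return "plant-specific"
    if others >= 1:
        return "Arabidopsis plus 1-3 other plants"
    return "Arabidopsis EST only"


# Counts reported in the caption. They should add up to the 7,868 proteins
# that lack similarity to any non-plant organism.
CAPTION_COUNTS = {
    "plant-specific": 3848,
    "Arabidopsis plus 1-3 other plants": 892,
    "Arabidopsis EST only": 2691,
    "no EST similarity": 437,
}
assert sum(CAPTION_COUNTS.values()) == 7868

# Usage with made-up E-values and placeholder species names.
example = {ARABIDOPSIS: 1e-40, "plant_A": 1e-25, "plant_B": 1e-12,
           "plant_C": 1e-11, "plant_D": 1e-30, "plant_E": 1e-3}
print(classify_by_est_hits(example))
```

In the usage example, `plant_E` falls above the cutoff and is ignored, so the protein is counted as having hits in Arabidopsis and four other species.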
Figure 2
Arabidopsis genes encoding plant-specific proteins exhibit preferential expression in organs. (a) Heat map showing the 600 plant-specific genes that exhibited differential expression in at least one microarray experiment comparing RNA samples from different plant organs. Microarray experiments were obtained from the Stanford Microarray Database. The mean was calculated for the replicates. Organ preferential expression was defined as a twofold or higher ratio in the comparison. Gene expression is shown as log2(ratio). The bar at the top right indicates the magnitude of change. Green indicates induction and red indicates repression of gene expression. Ref, reference sample; see Materials and methods for details. (b) For all organ comparisons the number of differentially expressed genes in the plant-specific category was significantly higher than the number of differentially expressed genes that are not plant specific. Statistical significance was calculated using the chi-square test for contingency tables.
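The comparison in panel (b) can be illustrated with a chi-square test on a 2×2 contingency table (plant-specific versus other genes, differentially expressed versus not). The sketch below is a plain Pearson chi-square without continuity correction; the caption does not say whether a correction was applied, so that choice is an assumption. The function name `chi_square_2x2` and the counts in the usage example are hypothetical and are not taken from the paper.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for [[a, b], [c, d]].

    Returns (statistic, p_value) with one degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must have a non-zero total")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For one degree of freedom the chi-square survival function
    # equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts for one organ comparison (NOT from the paper):
#                      differential   not differential
# plant-specific             60             340
# not plant-specific        300            3700
stat, p_value = chi_square_2x2(60, 340, 300, 3700)
print(f"chi-square = {stat:.2f}, p = {p_value:.2e}")
```

With these made-up counts, 15% of plant-specific genes are differentially expressed against 7.5% of the others, and the test rejects independence at conventional thresholds.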
