BMC Evol Biol. 2012 Jan 17;12:6.
doi: 10.1186/1471-2148-12-6.

Evolutionary dynamics of protein domain architecture in plants

Xue-Cheng Zhang et al. BMC Evol Biol.

Abstract

Background: Protein domains are the structural, functional and evolutionary units of the protein. Protein domain architectures are the linear arrangements of domain(s) in individual proteins. Although the evolutionary history of protein domain architecture has been extensively studied in microorganisms, the evolutionary dynamics of domain architecture in the plant kingdom remains largely undefined. To address this question, we analyzed the lineage-based protein domain architecture content in 14 completed green plant genomes.

Results: Our analyses show that all 14 plant genomes maintain similar distributions of species-specific, single-domain, and multi-domain architectures. Approximately 65% of plant domain architectures are universally present in all plant lineages, while the remaining architectures are lineage-specific. There are clear examples of both the loss and the gain of specific protein architectures in higher plants. There has been a dynamic, lineage-wise expansion of domain architectures during plant evolution. The data suggest that this expansion can be largely explained by changes in nuclear ploidy resulting from rounds of whole genome duplication. Indeed, the number of unique domain architectures decreased when the genomes were normalized to a presumed ancestral genome that had not undergone whole genome duplication.

Conclusions: Our data show the conservation of universal domain architectures in all available plant genomes, indicating the presence of an evolutionarily conserved, core set of protein components. However, the occurrence of lineage-specific domain architectures indicates that domain architecture diversity has been maintained beyond these core components in plant genomes. Although several features of genome-wide domain architecture content are conserved in plants, the data clearly demonstrate lineage-wise, progressive changes and expansions of individual protein domain architectures, reinforcing the notion that plant genomes have undergone dynamic evolution.

Figures

Figure 1
Plant genomes maintain homogeneous distributions of protein domain architectures. The categories of protein domain architectures, labeled above each histogram and below each probability plot, are overall predicted (leftmost panel), species-unique (second panel from the left), and single-domain, double-domain, triple-domain, and four-or-more-domain architectures (the four panels on the right). The numbers are mean ± standard error. The x-axis of both the upper and lower panels shows the proportion of Pfam-predicted domain architectures per genome in each category. The upper panels show the frequency distributions of the percentages of these categories of domain architectures. The lower panels are probability plots (5% significance level) of the percentages of these categories of domain architectures across plant species. The y-axis shows the probability distribution relative to the mean value. AD is the value of the Anderson-Darling normality test statistic. Note that the proportions of architectures with four or more domains do not follow a normal distribution, as evidenced by the associated p-value in the probability plot.
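The per-genome category proportions described in this caption can be illustrated with a minimal sketch. It assumes that an architecture is the ordered tuple of domain names in one protein, that repeated domains count toward the domain total, and that each distinct architecture is counted once per genome; these counting rules, the function names, and the toy genome are illustrative assumptions, not the authors' pipeline.

```python
from collections import Counter

CATEGORIES = ("single", "double", "triple", "four_or_more")


def category(arch):
    """Map an architecture (tuple of domain names) to its domain-count category."""
    return {1: "single", 2: "double", 3: "triple"}.get(len(arch), "four_or_more")


def category_proportions(proteins):
    """Return the share of unique architectures in each category.

    proteins: dict mapping a protein id to its ordered list of domain names.
    Proteins without any predicted domain are ignored (an assumption).
    """
    unique = {tuple(domains) for domains in proteins.values() if domains}
    if not unique:
        return {c: 0.0 for c in CATEGORIES}
    counts = Counter(category(arch) for arch in unique)
    return {c: counts.get(c, 0) / len(unique) for c in CATEGORIES}


# Hypothetical toy genome; the domain names are only placeholders.
toy_genome = {
    "p1": ["WD40"],
    "p2": ["WD40"],                                # same architecture as p1
    "p3": ["F-box", "WD40"],
    "p4": ["WD40", "F-box"],                       # order differs, so distinct
    "p5": ["Pkinase", "LRR_1", "LRR_1", "LRR_1"],  # four domains
}
proportions = category_proportions(toy_genome)
print(proportions)
```

Computing these proportions for each genome gives one value per category per species, which is the kind of sample the histograms and normality tests in the figure summarize.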
Figure 2
Evolutionary dynamics of domain architectures reflected by the presence and absence of architectures in plant lineages. Differentially colored boxes represent the presence of an architecture in individual lineages or lineage combinations. Domain architecture patterns are defined by lineages or lineage combinations. Pattern A represents algal architectures; B, bryophyte and lycophyte architectures, or early diverging architectures; ABCD, universal architectures; BCD, land plant architectures; CD, angiosperm architectures; C, monocot architectures; and D, dicot architectures. "Overall" denotes the raw architectures, without exclusion of architectures that are less commonly represented in each lineage, and "prevalent" denotes architectures present in the majority of species in each lineage, i.e., at least three of the five algal species, both P. patens and S. moellendorffii, two of the three monocot species, and three of the four dicot species. Architectures containing the WD-40 domain are included as a representative example to illustrate the dynamic changes in plant lineages. Numbers before the slash are collective counts of architectures in individual categories. Numbers after the slash denote the percentages of architectures in individual categories.
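The pattern labels in this caption can be expressed as a small classification rule. The sketch below uses the "prevalent" thresholds stated in the caption; treating "overall" as presence in at least one species of a lineage is an assumption, as are the function and variable names.

```python
# Minimum number of species per lineage for an architecture to be "prevalent",
# taken from the caption: 3 of 5 algae (A), 2 of 2 early diverging plants (B),
# 2 of 3 monocots (C), 3 of 4 dicots (D).
PREVALENT_MIN = {"A": 3, "B": 2, "C": 2, "D": 3}


def lineage_pattern(species_counts, prevalent=True):
    """Return a pattern label such as 'ABCD', 'BCD', or 'C'.

    species_counts: dict mapping lineage letter to the number of species in
    that lineage whose genome contains the architecture.
    prevalent=False applies the assumed 'overall' rule (at least one species).
    """
    label = ""
    for lineage in "ABCD":
        threshold = PREVALENT_MIN[lineage] if prevalent else 1
        if species_counts.get(lineage, 0) >= threshold:
            label += lineage
    return label


# Hypothetical architecture found in 1 alga, both early diverging species,
# 2 monocots, and 3 dicots.
counts = {"A": 1, "B": 2, "C": 2, "D": 3}
print(lineage_pattern(counts))                   # prevalent rule
print(lineage_pattern(counts, prevalent=False))  # assumed overall rule
```

Under the prevalent rule this example is labeled as a land plant architecture (BCD), while the looser overall rule labels it universal (ABCD), which shows how the two columns of the figure can disagree for the same architecture.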
Figure 3
Lineage-wise architecture expansion in plants. Pairwise comparisons of the genomic dosages of architectures were made between lineages; colored boxes represent significant expansion of architectures. Patterns with more than 25 architectures are shown in red and those with fewer than 25 in light orange. The numbers denote the counts and percentages of architectures of each pattern that have undergone significant expansion. Only patterns with an incidence higher than 1% are shown.
