The relationship between domain duplication and recombination
- PMID: 15663950
- DOI: 10.1016/j.jmb.2004.11.050
Abstract
Protein domains represent the basic evolutionary units that form proteins. Domain duplication and shuffling by recombination are probably the most important forces driving protein evolution and hence the complexity of the proteome. While the duplication of whole genes as well as domain-encoding exons increases the abundance of domains in the proteome, domain shuffling increases versatility, i.e. the number of distinct contexts in which a domain can occur. Here, we describe a comprehensive, genome-wide analysis of the relationship between these two processes. We observe a strong and robust correlation between domain versatility and abundance: domains that occur more often also have many different combination partners. This supports the view that domain recombination occurs in a random way. However, we do not observe all the different combinations that are expected from a simple random recombination scenario, and this is due to frequent duplication of specific domain combinations. When we simulate the evolution of the protein repertoire considering stochastic recombination of domains followed by extensive duplication of the combinations, we approximate the observed data well. Our analyses are consistent with a stochastic process that governs domain recombination and thus protein divergence with respect to domains within a polypeptide chain. At the same time, they support a scenario in which domain combinations are formed only once during the evolution of the protein repertoire, and are then duplicated to various extents. The extent of duplication of different combinations varies widely and, in nature, will depend on selection for the domain combination based on its function. Some of the pair-wise domain combinations that are highly duplicated also recur frequently with other partner domains, and thus represent evolutionary units larger than single protein domains, which we term "supra-domains".
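The scenario summarized above (stochastic recombination of domains, each combination formed only once, then duplicated to varying extents) can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the authors' simulation: the function names, the 1/rank recombination weights, the Pareto-distributed copy numbers, the exclusion of self-pairs, and all parameter values are hypothetical choices made only for illustration.

```python
import math
import random
from collections import defaultdict


def simulate_repertoire(n_domains=200, n_events=3000, alpha=1.5, seed=0):
    """Toy model: form each domain pair at most once, then duplicate it."""
    rng = random.Random(seed)
    domains = list(range(n_domains))
    # Assumption: domain families differ in how often they recombine;
    # the 1/rank weighting is an arbitrary heavy-tailed choice.
    weights = [1.0 / (rank + 1) for rank in domains]

    copies = {}  # ordered pair (N-terminal, C-terminal) -> copy number
    for _ in range(n_events):
        a, b = rng.choices(domains, weights=weights, k=2)
        if a == b or (a, b) in copies:
            continue  # skip self-pairs; a combination is formed only once
        # Assumption: the extent of duplication is heavy-tailed (Pareto), minimum 1.
        copies[(a, b)] = int(rng.paretovariate(alpha))
    return copies


def summarize(copies):
    """Return per-domain abundance (total copies) and versatility (distinct partners)."""
    abundance = defaultdict(int)
    partners = defaultdict(set)
    for (a, b), n in copies.items():
        abundance[a] += n
        abundance[b] += n
        partners[a].add(b)
        partners[b].add(a)
    versatility = {d: len(p) for d, p in partners.items()}
    return dict(abundance), versatility


def log_pearson(xs, ys):
    """Pearson correlation of log10-transformed positive values."""
    lx = [math.log10(x) for x in xs]
    ly = [math.log10(y) for y in ys]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    cov = sum((x - mx) * (y - my) for x, y in zip(lx, ly))
    var_x = sum((x - mx) ** 2 for x in lx)
    var_y = sum((y - my) ** 2 for y in ly)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    copies = simulate_repertoire()
    abundance, versatility = summarize(copies)
    ds = sorted(abundance)
    r = log_pearson([abundance[d] for d in ds], [versatility[d] for d in ds])
    print(f"{len(copies)} distinct combinations, log-log r = {r:.2f}")
```

In this sketch, abundance counts every duplicated copy of a combination, while versatility counts only distinct partner domains, so duplication raises abundance without adding new combinations. That is the qualitative gap between observed and expected combinations that the abstract attributes to duplication of specific domain pairs.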