A comprehensive evolutionary classification of proteins encoded in complete eukaryotic genomes
- PMID: 14759257
- PMCID: PMC395751
- DOI: 10.1186/gb-2004-5-2-r7
Abstract
Background: Sequencing the genomes of multiple, taxonomically diverse eukaryotes enables in-depth comparative-genomic analysis, which is expected to help reconstruct ancestral eukaryotic genomes and major events in eukaryotic evolution, and to support functional predictions for currently uncharacterized conserved genes.
Results: We examined functional and evolutionary patterns in the recently constructed set of 5,873 clusters of predicted orthologs (eukaryotic orthologous groups or KOGs) from seven eukaryotic genomes: Caenorhabditis elegans, Drosophila melanogaster, Homo sapiens, Arabidopsis thaliana, Saccharomyces cerevisiae, Schizosaccharomyces pombe and Encephalitozoon cuniculi. Conservation of KOGs through the phyletic range of eukaryotes strongly correlates with their functions and with the effect of gene knockout on the organism's viability. The approximately 40% of KOGs that are represented in six or seven species are enriched in proteins responsible for housekeeping functions, particularly translation and RNA processing. These conserved KOGs are often essential for survival and might approximate the minimal set of essential eukaryotic genes. The 131 single-member, pan-eukaryotic KOGs we identified were examined in detail. For around 20 that remained uncharacterized, functions were predicted by in-depth sequence analysis and examination of genomic context. Nearly all these proteins are subunits of known or predicted multiprotein complexes, in agreement with the balance hypothesis of evolution of gene copy number. Other KOGs show a variety of phyletic patterns, which points to major contributions of lineage-specific gene loss and the 'invention' of genes new to eukaryotic evolution. Examination of the sets of KOGs lost in individual lineages reveals co-elimination of functionally connected genes. Parsimonious scenarios of eukaryotic genome evolution and gene sets for ancestral eukaryotic forms were reconstructed. The gene set of the last common ancestor of the crown group consists of 3,413 KOGs and largely includes proteins involved in genome replication and expression, and central metabolism. 
Only 44% of the KOGs, mostly from the reconstructed gene set of the last common ancestor of the crown group, have detectable homologs in prokaryotes; the remainder apparently evolved via duplication with divergence and invention of new genes.
Conclusions: The KOG analysis reveals a conserved core of largely essential eukaryotic genes as well as major diversification and innovation associated with evolution of eukaryotic genomes. The results provide quantitative support for major trends of eukaryotic evolution noticed previously at the qualitative level and a basis for detailed reconstruction of evolution of eukaryotic genomes and biology of ancestral forms.
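The notions of a phyletic pattern and of a parsimonious evolutionary scenario used in the abstract can be illustrated with a minimal sketch. It assumes a Dollo-style model (each KOG is gained once and can only be lost afterwards); it is not claimed to be the authors' reconstruction procedure. The species abbreviations, the tree topology, and all function names are hypothetical and chosen only for illustration.

```python
from typing import FrozenSet, Iterable, Union

# A tree is either a species name (leaf) or a tuple of subtrees.
Tree = Union[str, tuple]

# Hypothetical rooted species tree for the seven genomes.
# The topology is an illustrative assumption, not taken from the paper.
TREE: Tree = ((((("Cel", "Dme"), "Hsa"), ("Sce", "Spo")), "Ath"), "Ecu")


def leaves(tree: Tree) -> FrozenSet[str]:
    """Return the set of species under a tree node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def dollo_origin(tree: Tree, present: Iterable[str]) -> Tree:
    """Smallest clade containing every species in which the KOG is present.

    Under the single-gain assumption, the gain is placed at the root of
    this clade.
    """
    present = frozenset(present)
    if not present or not present <= leaves(tree):
        raise ValueError("pattern must be a non-empty subset of the tree's species")
    if not isinstance(tree, str):
        for child in tree:
            if present <= leaves(child):
                return dollo_origin(child, present)
    return tree


def _losses(tree: Tree, present: FrozenSet[str]) -> int:
    # A subtree with no member species counts as one loss event.
    if not (leaves(tree) & present):
        return 1
    if isinstance(tree, str):
        return 0
    return sum(_losses(child, present) for child in tree)


def dollo_losses(tree: Tree, present: Iterable[str]) -> int:
    """Minimum number of losses needed to explain the pattern after one gain."""
    present = frozenset(present)
    return _losses(dollo_origin(tree, present), present)


def is_widely_conserved(present: Iterable[str], minimum: int = 6) -> bool:
    """True if the KOG is represented in at least `minimum` species."""
    return len(frozenset(present)) >= minimum
```

For example, a KOG present only in `Cel`, `Dme`, `Hsa` and `Ath` would be placed at the ancestor of the six-species clade that excludes `Ecu`, with a single loss in the `Sce`/`Spo` lineage. Under this model, the gene set of an ancestor is the collection of KOGs whose inferred origin is at or above that node.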