Millions of years of evolution preserved: a comprehensive catalog of the processed pseudogenes in the human genome
- PMID: 14656962
- PMCID: PMC403796
- DOI: 10.1101/gr.1429003
Abstract
Processed pseudogenes were created by reverse transcription of mRNAs; they provide snapshots of ancient genes that existed in the genome millions of years ago. To find them in the present-day human genome, we developed a pipeline using features such as intron-absence, frame-disruption, polyadenylation, and truncation. This has enabled us to identify approximately 8000 processed pseudogenes in recent genome drafts (distributed from http://pseudogene.org). Overall, processed pseudogenes are very similar to their closest corresponding human gene, being 94% complete in coding regions, with sequence similarity of 75% for amino acids and 86% for nucleotides. Their chromosomal distribution appears random and dispersed, with the numbers on chromosomes proportional to chromosome length, suggesting sustained "bombardment" over evolution. However, the distribution does vary with GC content: processed pseudogenes occur mostly in regions of intermediate GC content. This is similar to Alus but contrasts with functional genes and L1 repeats. Pseudogenes, moreover, have age profiles similar to Alus. The number of pseudogenes associated with a given gene follows a power-law relationship, with a few genes giving rise to many pseudogenes and most giving rise to few. The prevalence of processed pseudogenes agrees well with germ-line gene expression. Highly expressed ribosomal proteins account for approximately 20% of the total. Other notable sources include cyclophilin-A, keratin, GAPDH, and cytochrome c.
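The abstract names four features used by the pipeline (intron-absence, frame-disruption, polyadenylation, and truncation) but does not say how they are combined. As a rough illustration only, the sketch below shows one way such features might be collected for a candidate hit. The `Candidate` record, its field names, the `full_length` cutoff, and the `min_features` rule are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """Hypothetical record for a genomic hit aligned to a known gene."""
    name: str
    n_introns: int              # introns retained within the aligned region
    has_frame_disruption: bool  # frameshift or premature stop codon present
    has_polya_tail: bool        # poly-A tract found near the 3' end
    coding_fraction: float      # fraction of parent coding region covered (0..1)


def pseudogene_evidence(c: Candidate, full_length: float = 0.95) -> List[str]:
    """List which of the four features named in the abstract the hit shows.

    The full_length cutoff is an assumed value chosen for illustration.
    """
    evidence = []
    if c.n_introns == 0:
        evidence.append("intron-absence")
    if c.has_frame_disruption:
        evidence.append("frame-disruption")
    if c.has_polya_tail:
        evidence.append("polyadenylation")
    if c.coding_fraction < full_length:
        evidence.append("truncation")
    return evidence


def looks_processed(c: Candidate, min_features: int = 2) -> bool:
    """Assumed rule: require intron-absence plus at least one other feature."""
    evidence = pseudogene_evidence(c)
    return "intron-absence" in evidence and len(evidence) >= min_features
```

Under these assumed rules, an intronless hit with a frameshift would be flagged, while a hit that still contains introns would not, whatever its other features. A real pipeline would derive each field from sequence alignments rather than take them as given.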