Patterns of nucleotide substitution, insertion and deletion in the human genome inferred from pseudogenes
- PMID: 12954770
- PMCID: PMC203328
- DOI: 10.1093/nar/gkg745
Abstract
Nucleotide substitution, insertion and deletion (indel) events are the major driving forces that have shaped genomes. Using the recently identified human ribosomal protein (RP) pseudogene sequences, we have thoroughly studied DNA mutation patterns in the human genome. We analyzed a total of 1726 processed RP pseudogene sequences, comprising more than 700 000 bases. To distinguish sequence changes that occurred in the functional genes during evolution from those that occurred in pseudogenes after they were fixed in the genome, we used only pseudogene sequences originating from parts of RP genes that are identical in human and mouse. Overall, we found that nucleotide transitions are more common than transversions, by roughly a factor of two. Moreover, the substitution rates among the 12 possible nucleotide pairs are not homogeneous, as they are affected by the type of immediately neighboring nucleotides and the overall local G+C content. Finally, our dataset is large enough to contain many indels, allowing, for the first time, a statistically robust analysis of these events. Overall, we found that deletions are about three times more common than insertions (3740 versus 1291). The frequencies of both these events follow characteristic power-law behavior associated with the size of the indel. However, unexpectedly, the frequency of 3 bp deletions (in contrast to 3 bp insertions) violates this trend, being considerably higher than that of 2 bp deletions. The possible biological implications of such a 3 bp bias are discussed.
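The transition/transversion distinction and the deletion-to-insertion ratio quoted in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names and the toy aligned sequences are hypothetical and chosen only for illustration. The only figures taken from the abstract are the indel counts (3740 deletions, 1291 insertions).

```python
# Illustrative sketch only; function names and toy sequences are assumptions,
# not part of the published analysis.

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def classify_substitution(ref, alt):
    """Return 'transition' (purine<->purine or pyrimidine<->pyrimidine)
    or 'transversion' (purine<->pyrimidine) for a single-base change."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES:
        raise ValueError("expected bases from A, C, G, T")
    if ref == alt:
        raise ValueError("not a substitution: bases are identical")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def count_substitutions(gene_seq, pseudogene_seq):
    """Count transitions and transversions between two aligned sequences of
    equal length, skipping matches, gaps and ambiguous characters."""
    counts = {"transition": 0, "transversion": 0}
    for ref, alt in zip(gene_seq.upper(), pseudogene_seq.upper()):
        if ref == alt or ref not in BASES or alt not in BASES:
            continue
        counts[classify_substitution(ref, alt)] += 1
    return counts


# Hypothetical toy alignment (functional gene vs. pseudogene).
toy_counts = count_substitutions("ACGTACGT", "ATGTGCGA")

# Indel counts reported in the abstract.
deletions, insertions = 3740, 1291
deletion_to_insertion_ratio = deletions / insertions  # about 2.9

print(toy_counts)
print(round(deletion_to_insertion_ratio, 2))
```

In the toy alignment, C>T and A>G are transitions and T>A is a transversion. A real analysis would start from aligned gene and pseudogene sequences and would also need to account for neighboring bases and local G+C content, as the abstract notes.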