Even small SNP clusters are non-randomly distributed: is this evidence of mutational non-independence?

William Amos¹

PMID: 20071383
PMCID: PMC2871933
DOI: 10.1098/rspb.2009.1757

William Amos. Proc Biol Sci. 2010 May 7;277(1686):1443-9. doi: 10.1098/rspb.2009.1757. Epub 2010 Jan 13.

Affiliation

¹ Department of Zoology, Cambridge University, Downing Street, Cambridge CB2 3EJ, UK.

Abstract

Single nucleotide polymorphisms (SNPs) are distributed highly non-randomly in the human genome through a variety of processes from ascertainment biases (i.e. the preferential development of SNPs around interesting genes) to the action of mutation hotspots and natural selection. However, with more systematic SNP development, one might expect an increasing proportion of SNPs to be distributed more or less randomly. Here, I test this null hypothesis using stochastic simulations and compare this output with that of an alternative hypothesis that mutations are more likely to occur near existing SNPs, a possibility suggested both by molecular studies of meiotic mismatch repair in yeast and by data showing that SNPs cluster around heterozygous deletions. A purely Poisson process generates SNP clusters that differ from equivalent data from human chromosome 1 in both the frequency of different-sized clusters and the SNP density within each cluster, even for small clusters of just four or five SNPs, while clusters on the X chromosome differ from those on the autosomes. In contrast, modest levels of mutational non-independence generate a reasonable fit to the real data for both cluster frequency and density, and also exhibit the evolutionary transience noted for 'mutation hotspots'. Mutational non-independence therefore provides an interesting new hypothesis that appears capable of explaining the distribution of SNPs in the human genome.
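The random-placement null hypothesis described above can be illustrated with a minimal sketch. It is not the author's simulation: the abstract does not give the cluster definition, so the rule used here (consecutive SNPs no more than `max_gap_bp` apart belong to one cluster) and the 200 bp threshold are assumptions, as are the function names and the 1 Mb sequence length. SNPs are placed uniformly at random at a density of one per 700 bases, which approximates a Poisson process.

```python
import random
from collections import Counter

def simulate_random_snps(length_bp, mean_spacing_bp, rng):
    """Place SNPs independently and uniformly (approximates a Poisson process)."""
    n_snps = length_bp // mean_spacing_bp
    return sorted(rng.sample(range(length_bp), n_snps))

def cluster_sizes(positions, max_gap_bp):
    """Tally cluster sizes; a cluster is a run of sorted SNPs whose
    consecutive gaps are all <= max_gap_bp (an assumed definition)."""
    sizes = Counter()
    if not positions:
        return sizes
    run = 1
    for prev, cur in zip(positions, positions[1:]):
        if cur - prev <= max_gap_bp:
            run += 1
        else:
            sizes[run] += 1
            run = 1
    sizes[run] += 1  # close the final run
    return sizes

rng = random.Random(1)  # fixed seed so the sketch is repeatable
positions = simulate_random_snps(1_000_000, 700, rng)
sizes = cluster_sizes(positions, max_gap_bp=200)  # 200 bp is a hypothetical threshold
for size in sorted(sizes):
    print(size, sizes[size])
```

Comparing a tally like this against the same tally computed on real SNP positions is the kind of test the abstract describes; the threshold would need to match whatever cluster definition the full paper uses.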

Figures

**Figure 1.**
How the frequencies and densities of SNP clusters vary with cluster size under different mutation models. Data series are: large black circles, data from human chromosome 1; large white circles, simulated randomly occurring mutations; smaller symbols, simulated mutation non-independence in which mutations are more likely to occur in regions of size 0.5 kb (grey triangles), 1 kb (white squares), 2 kb (grey diamonds) or 5 kb (crosses) around any existing heterozygous site. Simulated data are culled from 100 replicate simulations, accepting only those in which the terminal overall SNP density was one SNP every 700 ± 25 bases, yielding approximately 60 runs per set of conditions. Frequencies are normalized to the size of human chromosome 1. For full details of the simulations see §2. (a) How the frequencies of different cluster sizes vary; (b) how cluster density varies for the same data.
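As a rough illustration of what "more likely to occur in regions around an existing site" could mean, the toy sketch below places each new mutation near a previously placed SNP with some probability, and uniformly otherwise. This is a deliberate simplification and not the paper's method: it has no population, drift, or heterozygosity, and the parameter `p_near`, the function name, and the reading of "region size" as a window centred on the existing SNP are all assumptions.

```python
import random

def place_mutations(length_bp, n_mutations, window_bp, p_near, rng):
    """Toy non-independence model (assumed, not the author's simulation).

    With probability p_near a new mutation falls inside a window of
    window_bp centred on a randomly chosen existing SNP; otherwise it
    falls uniformly along the sequence. Requires n_mutations <= length_bp.
    """
    placed = []    # SNPs in order of arrival, for random choice
    seen = set()   # fast duplicate check
    half = window_bp // 2
    while len(placed) < n_mutations:
        if placed and rng.random() < p_near:
            centre = rng.choice(placed)
            pos = rng.randint(centre - half, centre + half)
        else:
            pos = rng.randrange(length_bp)
        # discard positions off the sequence or already occupied
        if 0 <= pos < length_bp and pos not in seen:
            seen.add(pos)
            placed.append(pos)
    return sorted(placed)

rng = random.Random(7)
# 1 kb window, one of the region sizes in the caption; p_near is hypothetical
snps = place_mutations(200_000, 285, window_bp=1000, p_near=0.3, rng=rng)
```

Setting `p_near=0` recovers independent uniform placement, which makes the two models easy to compare under the same cluster tally.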

**Figure 2.**
How the frequencies and densities of SNP clusters vary with cluster size between the X chromosome and the autosomes, with simulated random data for comparison. Data series are: large white circles, simulated randomly occurring mutations; large black circles, the X chromosome; grey diamonds, mean value for the autosomes, calculated for each chromosome separately and then averaged. Error bars are one standard error of the mean. Simulated data are culled from 100 replicate simulations, accepting only those in which the terminal overall SNP density was one SNP every 1220 ± 100 bases, yielding approximately 40 runs. For full details of the simulations see §2. (a) How the frequencies of different cluster sizes vary; (b) how cluster density varies for the same data. At all but six cluster sizes above four, the X-chromosome density is lower than for the autosomes.
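The replicate filter mentioned in the captions (keep a run only if its final SNP density is within a tolerance of the target spacing) can be sketched as follows. The function name and signature are hypothetical, and the 20 Mb length in the usage lines is simply the chromosome length quoted in the Figure 3 caption.

```python
def accept_run(n_snps, length_bp, target_spacing_bp, tolerance_bp):
    """Return True if the mean spacing (bases per SNP) of a finished run
    lies within target_spacing_bp +/- tolerance_bp."""
    if n_snps <= 0:
        return False
    mean_spacing = length_bp / n_snps
    return abs(mean_spacing - target_spacing_bp) <= tolerance_bp

# One SNP every 700 +/- 25 bases (Figure 1) on a 20 Mb chromosome
print(accept_run(28_571, 20_000_000, 700, 25))
# One SNP every 1220 +/- 100 bases (Figure 2)
print(accept_run(16_393, 20_000_000, 1220, 100))
```

Filtering on the terminal density means the accepted runs are compared with the real data at matched overall SNP density, so differences in cluster frequency are not just a by-product of having more or fewer SNPs.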

**Figure 3.**
How the locations of SNP clusters vary over time. A single simulation was run for 80 000 generations. Every 4N = 4000 generations, the locations of clusters containing 25 or more SNPs were recorded and plotted on a separate line. Every individual carries a single pair of chromosomes, each 20 Mb long. For maximum comparability, SNP density was held as close as possible to one SNP every 700 bases by varying the number of individuals on which SNPs were ascertained. A more or less random pattern is seen, with little or no evidence of prolonged stability over time.

Cited by

Correlated and geographically predictable Neanderthal and Denisovan legacies are difficult to reconcile with a simple model based on inter-breeding.
Amos W. R Soc Open Sci. 2021 Jun 16;8(6):201229. doi: 10.1098/rsos.201229. PMID: 34150310. Free PMC article.
Intact Transition Epitope Mapping-Force Differences between Original and Unusual Residues (ITEM-FOUR).
Röwer C, Ortmann C, Neamtu A, El-Kased RF, Glocker MO. Biomolecules. 2023 Jan 16;13(1):187. doi: 10.3390/biom13010187. PMID: 36671572. Free PMC article.
An Integrative Phenotype-Genotype Approach Using Phenotypic Characteristics from the UAE National Diabetes Study Identifies HSD17B12 as a Candidate Gene for Obesity and Type 2 Diabetes.
Hachim MY, Aljaibeji H, Hamoudi RA, Hachim IY, Elemam NM, Mohammed AK, Salehi A, Taneera J, Sulaiman N. Genes (Basel). 2020 Apr 23;11(4):461. doi: 10.3390/genes11040461. PMID: 32340285. Free PMC article.
Allelic clustering and ancestry-dependent frequencies of rs6232, rs6234, and rs6235 PCSK1 SNPs in a Northern Ontario population sample.
Sirois F, Kaefer N, Currie KA, Chrétien M, Nkongolo KK, Mbikay M. J Community Genet. 2012 Oct;3(4):319-22. doi: 10.1007/s12687-012-0081-5. Epub 2012 Feb 4. PMID: 22307923. Free PMC article.
Effect of Hybridization on Somatic Mutations and Genomic Rearrangements in Plants.
Bashir T, Chandra Mishra R, Hasan MM, Mohanta TK, Bae H. Int J Mol Sci. 2018 Nov 27;19(12):3758. doi: 10.3390/ijms19123758. PMID: 30486351. Free PMC article. Review.
