Proc Biol Sci. 2010 May 7;277(1686):1443-9.
doi: 10.1098/rspb.2009.1757. Epub 2010 Jan 13.

Even small SNP clusters are non-randomly distributed: is this evidence of mutational non-independence?

William Amos. Proc Biol Sci. 2010.

Abstract

Single nucleotide polymorphisms (SNPs) are distributed highly non-randomly in the human genome through a variety of processes from ascertainment biases (i.e. the preferential development of SNPs around interesting genes) to the action of mutation hotspots and natural selection. However, with more systematic SNP development, one might expect an increasing proportion of SNPs to be distributed more or less randomly. Here, I test this null hypothesis using stochastic simulations and compare this output with that of an alternative hypothesis that mutations are more likely to occur near existing SNPs, a possibility suggested both by molecular studies of meiotic mismatch repair in yeast and by data showing that SNPs cluster around heterozygous deletions. A purely Poisson process generates SNP clusters that differ from equivalent data from human chromosome 1 in both the frequency of different-sized clusters and the SNP density within each cluster, even for small clusters of just four or five SNPs, while clusters on the X chromosome differ from those on the autosomes. In contrast, modest levels of mutational non-independence generate a reasonable fit to the real data for both cluster frequency and density, and also exhibit the evolutionary transience noted for 'mutation hotspots'. Mutational non-independence therefore provides an interesting new hypothesis that appears capable of explaining the distribution of SNPs in the human genome.
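The null model in the abstract can be illustrated with a minimal sketch. It is not the author's simulation code: the abstract does not define how a cluster is delimited, so the `max_gap_bp` rule below (consecutive SNPs no more than a fixed distance apart belong to the same cluster) is an assumption, as are all function and parameter names. Placing a fixed number of SNPs uniformly is used here as a close stand-in for a Poisson process.

```python
import random
from collections import Counter


def simulate_random_snps(length_bp, mean_spacing_bp, rng):
    """Null model: place SNPs independently and uniformly along the sequence.

    A fixed number of distinct positions is drawn, which approximates a
    Poisson process with one SNP per `mean_spacing_bp` bases on average.
    """
    n_snps = length_bp // mean_spacing_bp
    return sorted(rng.sample(range(length_bp), n_snps))


def find_clusters(positions, max_gap_bp):
    """Group sorted positions into clusters (assumed rule, not from the paper):
    a new cluster starts whenever the gap to the previous SNP exceeds max_gap_bp."""
    clusters = []
    current = []
    for pos in positions:
        if current and pos - current[-1] > max_gap_bp:
            clusters.append(current)
            current = []
        current.append(pos)
    if current:
        clusters.append(current)
    return clusters


def cluster_summary(clusters):
    """Return {cluster size: (count, mean density in SNPs per base)}.

    Isolated SNPs (size 1) are skipped. Density for a given size is the total
    number of SNPs in clusters of that size divided by their total span.
    """
    counts = Counter()
    spans = Counter()
    for cluster in clusters:
        size = len(cluster)
        if size < 2:
            continue
        counts[size] += 1
        spans[size] += cluster[-1] - cluster[0] + 1
    return {size: (counts[size], size * counts[size] / spans[size])
            for size in counts}
```

Calling `cluster_summary(find_clusters(simulate_random_snps(20_000_000, 700, random.Random(1)), 200))` gives the kind of frequency and density table that would then be compared against observed data; the 200-base gap and the seed are arbitrary illustrative choices.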

Figures

Figure 1.
How the frequencies and densities of SNP clusters vary with cluster size under different mutation models. Data series are: large black circles, data from human chromosome 1; large white circles, simulated randomly occurring mutations; smaller symbols, simulated mutation non-independence in which mutations are more likely to occur in regions of size 0.5 kb (grey triangles), 1 kb (white squares), 2 kb (grey diamonds) or 5 kb (crosses) around any existing heterozygous site. Simulated data are culled from 100 replicate simulations, accepting only those in which the terminal overall SNP density was one SNP every 700 ± 25 bases, yielding approximately 60 runs per set of conditions. Frequencies are normalized to the size of human chromosome 1. For full details of the simulations, see §2. (a) How the frequencies of different cluster sizes vary; (b) how cluster density varies for the same data.
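As a rough illustration of the two ideas in this caption, the sketch below places mutations preferentially near existing SNPs and applies the density acceptance rule. It is a heavily simplified stand-in, not the simulation described in §2: it works on a single sequence with no population, drift or heterozygosity tracking, and the parameter `p_near` (the chance that a new mutation falls inside a window around an existing SNP) is a hypothetical knob introduced only for this example.

```python
import random


def simulate_clustered_snps(length_bp, n_snps, window_bp, p_near, rng):
    """Place n_snps distinct SNPs; with probability p_near (hypothetical) each
    new one lands within a window of window_bp centred on an existing SNP,
    otherwise it lands uniformly at random."""
    if n_snps > length_bp:
        raise ValueError("cannot place more distinct SNPs than there are bases")
    occupied = set()
    placed = []  # list copy of `occupied` so a random anchor can be chosen
    half = window_bp // 2
    while len(placed) < n_snps:
        if placed and rng.random() < p_near:
            anchor = rng.choice(placed)
            pos = anchor + rng.randint(-half, half)
        else:
            pos = rng.randrange(length_bp)
        # Reject positions off the sequence or already carrying a SNP.
        if 0 <= pos < length_bp and pos not in occupied:
            occupied.add(pos)
            placed.append(pos)
    return sorted(placed)


def accept_replicate(n_snps, length_bp, target_spacing_bp=700, tolerance_bp=25):
    """Acceptance rule from the caption: keep a run only if its overall
    density is one SNP every target_spacing_bp +/- tolerance_bp bases."""
    if n_snps == 0:
        return False
    spacing = length_bp / n_snps
    return abs(spacing - target_spacing_bp) <= tolerance_bp
```

In this sketch the SNP count is fixed in advance, so `accept_replicate` only matters when the number of SNPs is itself an outcome of the run, as it is in a population-level simulation where final density varies between replicates.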
Figure 2.
How the frequencies and densities of SNP clusters vary with cluster size between the X chromosome and the autosomes, with simulated random data for comparison. Data series are: large white circles, simulated randomly occurring mutations; large black circles, the X chromosome; grey diamonds, mean value for the autosomes, calculated for each chromosome separately and then averaged. Error bars are one standard error of the mean. Simulated data are culled from 100 replicate simulations, accepting only those in which the terminal overall SNP density was one SNP every 1220 ± 100 bases, yielding approximately 40 runs. For full details of the simulations see §2. (a) How the frequencies of different cluster sizes vary; (b) how cluster density varies for the same data. At all but six cluster sizes above four, the X-chromosome density is lower than for the autosomes.
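The autosomal series in this caption is a mean over per-chromosome values with error bars of one standard error of the mean. A minimal sketch of that calculation follows; the function name is an assumption and any input numbers passed to it are placeholders, not data from the paper.

```python
import math
import statistics


def mean_and_sem(values):
    """Return (mean, standard error of the mean) for per-chromosome values.

    The standard error uses the sample standard deviation (n - 1 in the
    denominator) divided by the square root of the number of chromosomes.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two chromosomes to estimate a standard error")
    mean = statistics.fmean(values)
    sem = statistics.stdev(values) / math.sqrt(n)
    return mean, sem
```

Applied once per cluster size to the 22 autosomal values, this yields one plotted point and one error bar for each size.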
Figure 3.
How the locations of SNP clusters vary over time. A single simulation was run for 80 000 generations. Every 4N = 4000 generations, the locations of clusters containing 25 or more SNPs were recorded and plotted on a separate line. Every individual carries a single pair of chromosomes, each 20 Mb long. For maximum comparability, SNP density was held as close as possible to one SNP every 700 bases by varying the number of individuals on which SNPs were ascertained. The pattern is more or less random, with little or no evidence of prolonged stability over time.
