BMC Genomics. 2006 Apr 3;7:66.
doi: 10.1186/1471-2164-7-66.

Periodicity of SNP distribution around transcription start sites

Koichiro Higasa et al. BMC Genomics.

Abstract

Background: Several million single nucleotide polymorphisms (SNPs) have already been collected and deposited in public databases. They are important resources, not only as markers for identifying disease-associated genes but also for understanding the mechanisms that underlie genome diversification.

Results: A spectrum analysis of the SNP density distribution in the genomic regions around transcription start sites (TSSs) revealed a remarkable periodicity of 146 nucleotides. This periodicity was observed in regions associated with CpG islands (CGIs), but not in regions without CpG islands (nonCGIs). An analysis of the sequence divergence between humans and chimpanzees in the same genomic regions also revealed a similar periodic pattern in CGI regions. The occurrences of mono- or di-nucleotide sequences in these regions did not show such a periodicity, indicating that an interpretation of this periodicity based solely on sequence-dependent susceptibility to mutation is highly unlikely.
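To illustrate how a spectrum analysis can expose a periodicity of this kind, the following is a minimal Python sketch, not the authors' pipeline. It assumes NumPy is available and uses a synthetic per-position density profile in place of real SNP data; the function name `dominant_period` and all signal parameters are hypothetical.

```python
import numpy as np

def dominant_period(density):
    """Return the period (in positions) with the highest spectral power."""
    density = np.asarray(density, dtype=float)
    centered = density - density.mean()             # remove the constant offset
    power = np.abs(np.fft.rfft(centered)) ** 2      # squared magnitude of each coefficient
    freqs = np.fft.rfftfreq(centered.size, d=1.0)   # cycles per position
    peak = np.argmax(power[1:]) + 1                 # skip the zero-frequency bin
    return 1.0 / freqs[peak]

# Synthetic stand-in for an SNP density profile (assumption, not real data):
# a weak 146-position oscillation buried in random noise.
rng = np.random.default_rng(0)
positions = np.arange(146 * 20)                     # 20 full cycles
oscillation = 0.05 * np.sin(2 * np.pi * positions / 146)
density = 1.0 + oscillation + rng.normal(0.0, 0.1, positions.size)

period = dominant_period(density)
print(round(period))
```

The oscillation here is smaller than the noise at any single position, yet it concentrates into one frequency bin after the transform, which is why a spectrum can reveal a pattern that is hard to see in the raw profile.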

Conclusion: The periodic patterns of nucleotide variability suggest the locations of nucleosomes that are phased at TSSs, and can be viewed as the genetic footprint of a chromatin state that has been maintained throughout mammalian evolutionary history. The results suggest the possible involvement of nucleosome structure in promoter function, and also a fundamental functional/structural difference between the two promoter classes, i.e., those with and without CGIs.

Figures

Figure 1
Distribution of SNPs around TSSs. The density of validated SNPs (no. of vSNPs per gene) at positions relative to the TSSs of 10,171 genes is shown (gray). Noise filtering was performed using an FFT. After the SNP density data were transformed to the frequency domain by means of an FFT, a one-sided low-pass Hanning filter was applied to components with periods below 50 nucleotides. The denoised curve was obtained by the inverse FFT of the filtered array (magenta).
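The denoising steps in this caption (forward FFT, low-pass taper, inverse FFT) can be sketched as follows. This is one possible reading under stated assumptions, not the authors' code: it assumes NumPy, treats the "one-sided Hanning filter" as the falling half of a Hann window applied in the frequency domain, and takes the cutoff to be a period of 50 positions. The function name `hann_lowpass` and the synthetic data are hypothetical.

```python
import numpy as np

def hann_lowpass(density, min_period=50.0):
    """Denoise a 1-D profile by tapering its spectrum with a half Hann window.

    Assumption: components with periods shorter than `min_period` positions
    are removed, and the retained ones are weighted by the falling half of a
    Hann window (1 at zero frequency, 0 at the cutoff).
    """
    density = np.asarray(density, dtype=float)
    n = density.size
    spectrum = np.fft.rfft(density)                 # to the frequency domain
    freqs = np.fft.rfftfreq(n, d=1.0)               # cycles per position
    cutoff = 1.0 / min_period
    weights = np.where(freqs < cutoff,
                       0.5 * (1.0 + np.cos(np.pi * freqs / cutoff)),
                       0.0)
    return np.fft.irfft(spectrum * weights, n=n)    # back to the position domain

# Synthetic profile (assumption, not real data): 146-position oscillation plus noise.
rng = np.random.default_rng(0)
positions = np.arange(146 * 20)
clean = 1.0 + 0.05 * np.sin(2 * np.pi * positions / 146)
noisy = clean + rng.normal(0.0, 0.1, positions.size)
denoised = hann_lowpass(noisy)
```

Because the taper weight at zero frequency is 1, the mean of the profile is preserved. Components between the cutoff and zero frequency are attenuated smoothly, so a 146-position oscillation is kept but somewhat reduced in amplitude under this particular taper.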
Figure 2
Spectrum analysis by fast Fourier transform (FFT). Spectra of the distributions of SNP density (A, C, and E) and nucleotide divergence between humans and chimpanzees (B, D, and F) for three TSS categories: all TSSs (A and B), CGI-TSSs (C and D), and nonCGI-TSSs (E and F). The side view and the sectional view at a periodicity of 146 nucleotides of the FFT diagrams are shown on the left and top of the diagram panels, respectively. The magenta and red lines are the means and the 99% confidence intervals of the power values, respectively. The numbers of sequences analyzed are 10,171 (A), 6,329 (C), and 3,842 (E). The diagrams and their side views for SNP density (A, C, and E) are dynamically colored according to the Z-scores, while those for divergence (B, D, and F) are colored according to the power in arbitrary units, which are the squares of the coefficients of the trigonometric polynomials in the FFT. The color range for SNP density goes from blue to red, corresponding to Z-scores of 0 to 25. The color range for divergence corresponds to power values of 0 to 3 (a.u.).
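As a rough illustration of the two quantities named in this caption, power as a squared Fourier coefficient and a Z-score for that power, here is a hedged Python sketch. The caption does not say how the Z-scores were derived, so the position-shuffling null model below is purely an assumption for illustration, as are the function names `power_at_period` and `shuffle_z_score` and the synthetic profiles. NumPy is assumed.

```python
import numpy as np

def power_at_period(profile, period):
    """Squared magnitude of the Fourier coefficient closest to `period`."""
    profile = np.asarray(profile, dtype=float)
    coeffs = np.fft.rfft(profile - profile.mean())
    k = int(round(profile.size / period))           # index of the nearest frequency bin
    return np.abs(coeffs[k]) ** 2

def shuffle_z_score(profile, period, n_shuffles=200, seed=0):
    """Z-score of the observed power against position-shuffled profiles.

    Assumption: shuffling positions serves as the null model here only for
    illustration; the source does not describe its null distribution.
    """
    rng = np.random.default_rng(seed)
    observed = power_at_period(profile, period)
    null = np.array([power_at_period(rng.permutation(profile), period)
                     for _ in range(n_shuffles)])
    return (observed - null.mean()) / null.std(ddof=1)

# Synthetic profiles (assumption, not real data): one periodic, one pure noise.
rng = np.random.default_rng(1)
positions = np.arange(146 * 20)
noise = rng.normal(0.0, 0.1, positions.size)
periodic = 1.0 + 0.05 * np.sin(2 * np.pi * positions / 146) + noise
flat = 1.0 + noise

z_periodic = shuffle_z_score(periodic, 146)
z_flat = shuffle_z_score(flat, 146)
```

Shuffling keeps the values of a profile but destroys their order, so any power that survives shuffling reflects noise rather than position-dependent structure. Under this null, the periodic profile should score far higher than the flat one.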
Figure 3
Co-localization of CpG islands and the 146-nucleotide periodicity. Occupancy of CpG islands (solid line, scale on the left) and the power of the 146-nucleotide periodicity of SNP density (dashed line, scale on the right) around the TSSs are shown. a.u., power in arbitrary units.

