Nucleic Acids Res. 2006 Jul 1;34(Web Server issue):W692-5.
doi: 10.1093/nar/gkl234.

CorGen--measuring and generating long-range correlations for DNA sequence analysis

Philipp W Messer et al. Nucleic Acids Res. 2006.

Abstract

CorGen is a web server that measures long-range correlations in the base composition of DNA and generates random sequences with the same correlation parameters. Long-range correlations are characterized by a power-law decay of the autocorrelation function of the GC-content. The widespread presence of such correlations in eukaryotic genomes calls for their incorporation into accurate null models of eukaryotic DNA in computational biology. For example, the score statistics of sequence alignment and the performance of motif-finding algorithms are significantly affected by the presence of genomic long-range correlations. We use expansion-randomization dynamics to efficiently generate the correlated random sequences. The server is available at http://corgen.molgen.mpg.de.
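To make the measured quantity concrete, the following is a minimal sketch of an autocorrelation estimate for GC-content. It assumes the common definition C(r) = <x_i x_(i+r)> - <x>^2, where x_i is 1 for G or C and 0 for A or T; the abstract does not state the exact estimator CorGen uses, and the function names here are hypothetical.

```python
def gc_indicator(seq):
    """Map a DNA string to 1 (G or C) / 0 (A or T); other symbols are skipped."""
    return [1 if base in "GC" else 0 for base in seq.upper() if base in "ACGT"]


def gc_autocorrelation(seq, r):
    """Estimate C(r) = <x_i * x_(i+r)> - <x>^2 for the GC indicator x.

    Assumed definition for illustration, not necessarily CorGen's estimator.
    """
    x = gc_indicator(seq)
    n_pairs = len(x) - r
    if r < 1 or n_pairs <= 0:
        raise ValueError("r must be at least 1 and smaller than the sequence length")
    mean = sum(x) / len(x)
    pair_mean = sum(x[i] * x[i + r] for i in range(n_pairs)) / n_pairs
    return pair_mean - mean * mean


# Toy input: a strictly alternating sequence is anticorrelated at odd
# distances and correlated at even distances.
toy = "GA" * 50
for r in (1, 2, 3):
    print(r, gc_autocorrelation(toy, r))
```

For a genomic sequence, evaluating this function over a range of distances r gives the curve whose power-law decay defines a long-range correlation. The pure-Python loop is quadratic in practice for many distances, so a real analysis of megabase sequences would need a faster method.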

Figures

Figure 1
CorGen analysis of a 1 Mb region on human chromosome 22. The two plots in the top part show the measured GC-profile (left) and correlation function (right) of the chromosomal region. In the double-logarithmic correlation graph, power-law correlations C(r) ∝ r^(−α) show up as a straight line with slope −α. The fit was performed in the range 10 < r < 10 000, and the obtained parameters are α = 0.359 and C(10) = 0.0234 (green line). A corresponding random sequence of length 1 Mb with the measured long-range correlation parameters and the average GC-content of the query sequence has been generated and can be downloaded by the user. Its composition profile and correlation function are shown in the two plots at the bottom.
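The straight-line fit in the double-logarithmic plot can be sketched as an ordinary least-squares regression of log C(r) on log r, which is an assumption: the caption does not say which fitting procedure CorGen applies. The function name `fit_power_law` is hypothetical, and the data below are synthetic points constructed from the caption's reported parameters, not measurements.

```python
import math


def fit_power_law(distances, correlations):
    """Fit C(r) ~ prefactor * r**(-alpha) by least squares in log-log space.

    Returns (alpha, prefactor). Points with non-positive r or C are skipped
    because their logarithm is undefined.
    """
    pts = [(math.log(r), math.log(c))
           for r, c in zip(distances, correlations) if r > 0 and c > 0]
    if len(pts) < 2:
        raise ValueError("need at least two points with positive r and C")
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    if sxx == 0:
        raise ValueError("need at least two distinct distances")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    slope = sxy / sxx                      # slope of the log-log line is -alpha
    intercept = mean_y - slope * mean_x    # log of the prefactor
    return -slope, math.exp(intercept)


# Synthetic check: exact power law with alpha = 0.359 and C(10) = 0.0234,
# sampled inside the fitting range 10 < r < 10 000.
rs = [11, 30, 100, 300, 1000, 3000, 9999]
amplitude = 0.0234 * 10 ** 0.359
cs = [amplitude * r ** (-0.359) for r in rs]
alpha, prefactor = fit_power_law(rs, cs)
print(alpha, prefactor * 10 ** (-alpha))   # recovered alpha and C(10)
```

On real data the correlation estimates are noisy and can be negative at some distances, so skipping non-positive values biases the fit; a careful analysis would treat that case explicitly.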
