Nucleic Acids Res. 2006 Jul 1;34(Web Server issue):W686-91.
doi: 10.1093/nar/gkl040.

GC-Profile: a web-based tool for visualizing and analyzing the variation of GC content in genomic sequences

Feng Gao et al. Nucleic Acids Res.

Abstract

Understanding the evolution, structure and function of genomes requires knowledge of the general compositional features of DNA sequences. A new segmentation algorithm, based on the quadratic divergence, has been put forward to partition a given genome or DNA sequence into compositionally distinct domains. With the aid of the cumulative GC profile technique, the distribution of segmentation points can be displayed intuitively. We have combined the two methods into GC-Profile, an interactive web-based software system that can be used to segment prokaryotic and eukaryotic genomes. GC-Profile provides a quantitative and qualitative view of genome organization. Based on the results obtained, the relationships between the G+C content and other genomic features, such as the distributions of genes and CpG islands, can be examined visually. The results suggest that GC-Profile is an appropriate starting point for analyzing the isochore structure of higher eukaryotic genomes, and an intuitive tool for identifying genomic islands in prokaryotic genomes. GC-Profile is freely available at http://tubic.tju.edu.cn/GC-Profile/. In addition, precompiled binaries, together with examples and documentation, can be freely downloaded for local execution.
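The cumulative GC profile mentioned in the abstract can be illustrated with a minimal sketch. It assumes the profile is built from the running count z_n = (A_n + T_n) - (C_n + G_n), detrended by a straight line k*n and then negated, so that the curve rises through GC-rich stretches and falls through AT-rich ones. The function name is hypothetical, and taking k as the overall slope z_N/N is a simplifying assumption; the published tool may determine k differently.

```python
def negative_cumulative_gc_profile(seq):
    """Return the detrended, negated cumulative profile for n = 0..N.

    Assumption: z_n counts +1 for A/T and -1 for G/C; other symbols
    (for example N) leave the running value unchanged.
    """
    seq = seq.upper()
    z = [0]
    for base in seq:
        if base in "AT":
            z.append(z[-1] + 1)
        elif base in "GC":
            z.append(z[-1] - 1)
        else:
            z.append(z[-1])

    # Assumed detrending: subtract the line through the first and last points.
    k = z[-1] / len(seq) if seq else 0.0
    return [-(z_n - k * n) for n, z_n in enumerate(z)]


if __name__ == "__main__":
    profile = negative_cumulative_gc_profile("ATATGCGC")
    # The turning point of the curve marks the AT-rich / GC-rich boundary.
    print(profile.index(min(profile)))
```

In this toy sequence the curve falls over the AT-rich half and rises over the GC-rich half, so a change of slope marks a change in G+C content.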


Figures

Figure 1
An example of output pages of GC-Profile when the input is the sequence of chicken chromosome 28. (A) Coordinates, sizes and G + C contents of the segmented domains as an HTML table. (B) Number, coordinates, segmentation strength, segmentation times and segmented contig of the segmentation points as an HTML table. (C) The negative cumulative GC profile for chicken chromosome 28 marked with the segmentation points obtained. The lower plot shows the distributions of the G + C content and CpG islands along chicken chromosome 28. The G + C content is calculated for the domains segmented at t0 = 300. Here, the halting parameter t calculated for each segmentation point is also referred to as the segmentation strength, which is defined based on the quadratic divergence instead of the Jensen–Shannon divergence.
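The role of the halting parameter described in this caption can be sketched as a recursive segmentation that stops once the best available split is weaker than a threshold t0. This is an illustration only: the strength used here is a length-weighted squared difference in G+C fraction, a stand-in for the quadratic divergence, which is not defined in this record. All function names are hypothetical, and the t0 scale is not comparable to the t0 = 300 quoted above.

```python
def gc_flags(seq):
    """Map a sequence to 1 for G/C and 0 for anything else."""
    return [1 if base in "GC" else 0 for base in seq.upper()]


def best_split(flags, lo, hi):
    """Return (strength, position) of the strongest split of flags[lo:hi].

    Assumed strength: n1*(p1 - p)**2 + n2*(p2 - p)**2, where p, p1 and p2
    are the G+C fractions of the whole, left and right parts.
    """
    n = hi - lo
    total = sum(flags[lo:hi])
    p = total / n
    best = (0.0, None)
    left = 0
    for i in range(lo + 1, hi):
        left += flags[i - 1]
        n1, n2 = i - lo, hi - i
        p1, p2 = left / n1, (total - left) / n2
        strength = n1 * (p1 - p) ** 2 + n2 * (p2 - p) ** 2
        if strength > best[0]:
            best = (strength, i)
    return best


def segment(flags, t0, lo=0, hi=None):
    """Recursively split flags[lo:hi]; halt when the best strength is below t0."""
    if hi is None:
        hi = len(flags)
    if hi - lo < 2:
        return []
    strength, pos = best_split(flags, lo, hi)
    if pos is None or strength < t0:
        return []
    return segment(flags, t0, lo, pos) + [pos] + segment(flags, t0, pos, hi)


if __name__ == "__main__":
    print(segment(gc_flags("ATATATATGCGCGCGC"), t0=1.0))
```

A larger t0 accepts only stronger compositional boundaries and therefore yields fewer, longer domains. The sketch rescans each subsequence at every level, so it is suited to short examples and not to whole chromosomes.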
Figure 2
The negative cumulative GC profile for V. vulnificus CMCP6 chromosome I marked with the segmentation points obtained. Three regions of low GC content, at 357 145 to 394 176 bp, 2 432 023 to 2 603 700 bp and 3 250 386 to 3 281 945 bp, are recognized as genomic islands. The segmentation points are obtained at t0 = 100. Horizontally transferred genes from HGT-DB are also mapped onto the negative cumulative GC profile. All three regions contain clusters of horizontally transferred genes, which strongly suggests that these regions are horizontally transferred genomic islands.
