BMC Bioinformatics. 2009 Feb 20;10:65.
doi: 10.1186/1471-2105-10-65.

CpG islands or CpG clusters: how to identify functional GC-rich regions in a genome?

Leng Han et al. BMC Bioinformatics. 2009.

Abstract

Background: CpG islands (CGIs), clusters of CpG dinucleotides in GC-rich regions, are often located in the 5' end of genes and considered gene markers. Hackenberg et al. (2006) recently developed a new algorithm, CpGcluster, which uses a completely different mathematical approach from previous traditional algorithms. Their evaluation suggests that CpGcluster provides a much more efficient approach to detecting functional clusters or islands of CpGs.

Results: We systematically compared CpGcluster with the traditional algorithm by Takai and Jones (2002). Our comparisons of (1) the number of islands versus the number of genes in a genome, (2) the distribution of islands in different genomic regions, (3) island length, (4) the distance between two neighboring islands, and (5) methylation status suggest that Takai and Jones' algorithm is overall more appropriate for identifying promoter-associated islands of CpGs in vertebrate genomes.

Conclusion: The generation of genome sequence and DNA methylation data is expected to accelerate greatly. The findings of this study are broadly useful for gene feature analysis and epigenomics, including gene prediction and methylation chip design in different genomes.
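
As an illustration of the window-based approach the abstract refers to, the sketch below checks a single sequence against the thresholds commonly attributed to Takai and Jones (2002): length of at least 500 bp, GC content of at least 55%, and an observed/expected CpG ratio of at least 0.65. These thresholds are not stated in the abstract and are an assumption here, as are the function names `cpg_stats` and `meets_criteria`. This is a minimal check of one candidate region, not the published sliding-window search and not CpGcluster.

```python
def cpg_stats(seq):
    """Return (GC fraction, observed/expected CpG ratio) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n if n else 0.0
    # Obs/Exp CpG = (number of CpG * N) / (number of C * number of G)
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp


def meets_criteria(seq, min_len=500, min_gc=0.55, min_oe=0.65):
    """Check one candidate region against island thresholds.

    The defaults are the assumed Takai and Jones (2002) values; they are
    parameters so that other criteria can be substituted.
    """
    gc_fraction, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_fraction >= min_gc and obs_exp >= min_oe


if __name__ == "__main__":
    candidate = "CG" * 250  # toy 500 bp sequence, for illustration only
    print(cpg_stats(candidate))
    print(meets_criteria(candidate))
```

Keeping the thresholds as parameters makes it straightforward to compare stricter and looser island definitions on the same regions.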

Figures

Figure 1
Length distribution of CGIs or CGCs in the human genome. (A) CGIs versus CGCs. (B) For CGCs, promoter regions versus intergenic regions.

Figure 2
Multiple short CGCs embedded in one CGI in the promoter region. Dark box: CGCs identified by CpGcluster. Grey box: CGI identified by Takai and Jones' algorithm. The length of each CGC is labeled below the dark box and the distance between two neighboring CGCs is above the line. The transcription start site (TSS) is marked by an arrow. (A) CAP1. (B) ADAM33.

Figure 3
Distribution of distance between two neighboring CGCs in the promoter region of a gene.

References

    1. Bird AP. CpG islands as gene markers in the vertebrate nucleus. Trends Genet. 1987;3:342–347. doi: 10.1016/0168-9525(87)90294-0.
    2. Gardiner-Garden M, Frommer M. CpG islands in vertebrate genomes. J Mol Biol. 1987;196:261–282. doi: 10.1016/0022-2836(87)90689-9.
    3. Takai D, Jones PA. Comprehensive analysis of CpG islands in human chromosomes 21 and 22. Proc Natl Acad Sci USA. 2002;99:3740–3745. doi: 10.1073/pnas.052410099.
    4. Hackenberg M, Previti C, Luque-Escamilla PL, Carpena P, Martinez-Aroza J, Oliver JL. CpGcluster: a distance-based algorithm for CpG-island detection. BMC Bioinformatics. 2006;7:446. doi: 10.1186/1471-2105-7-446.
    5. Han L, Su B, Li WH, Zhao Z. CpG island density and its correlations with genomic features in mammalian genomes. Genome Biol. 2008;9:R79. doi: 10.1186/gb-2008-9-5-r79.