PLoS Biol. 2008 Jan;6(1):e22.
doi: 10.1371/journal.pbio.0060022.

A novel CpG island set identifies tissue-specific methylation at developmental gene loci

Robert Illingworth et al. PLoS Biol. 2008 Jan.

Abstract

CpG islands (CGIs) are dense clusters of CpG sequences that punctuate the CpG-deficient human genome and associate with many gene promoters. As CGIs also differ from bulk chromosomal DNA by their frequent lack of cytosine methylation, we devised a CGI enrichment method based on nonmethylated CpG affinity chromatography. The resulting library was sequenced to define a novel human blood CGI set that includes many that are not detected by current algorithms. Approximately half of CGIs were associated with annotated gene transcription start sites, the remainder being intra- or intergenic. Using an array representing over 17,000 CGIs, we established that 6%-8% of CGIs are methylated in genomic DNA of human blood, brain, muscle, and spleen. Inter- and intragenic CGIs are preferentially susceptible to methylation. CGIs showing tissue-specific methylation were overrepresented at numerous genetic loci that are essential for development, including HOX and PAX family members. The findings enable a comprehensive analysis of the roles played by CGI methylation in normal and diseased human tissues.

Conflict of interest statement

Competing interests. The authors have declared that no competing interests exist.

Figures

Figure 1. The Immobilised CXXC Domain Specifically Retains DNA Containing Clusters of Nonmethylated CpGs
(A) EMSA showing the CXXC complex with a DNA probe containing 27 nonmethylated CpG sites. Nonmethylated probe DNA (CG11) or methylated probe (MeCG11) was incubated with 0, 250, 500, 1,000, or 2,000 ng of recombinant CXXC protein. (B) A typical elution profile of bulk genomic DNA (blue line) from a CXXC affinity chromatography column. Genomic DNA (100 μg) was applied to the CXXC affinity matrix (see Methods) in low salt (0.1 M NaCl) and eluted with a gradient of increasing NaCl (red line; see text). Eighteen fractions were interrogated by PCR (blue lines). The bracket above indicates fractions that were found to contain nonmethylated CGIs. (C) Elution of specific CGI sequences of known methylation status. Methylated CGIs (NYESO and MAO in females) coelute with bulk genomic DNA (see bracket) whereas nonmethylated CGIs (P48 and MAO) elute at high NaCl concentration.
Figure 2. A Library of DNA Sequences that Bind Tightly to the CXXC Column Represents a Comprehensive Set of CGIs
(A and B) Plots of fragment length versus G+C content (A) and CpG[o/e] (B) for 28,013 unique MseI inserts. Fragments shorter than 512 bp with a G+C content ≤50% and a CpG[o/e] ≤0.6 (grey dots) were filtered out as contamination. The dashed line indicates the base composition (A) and CpG[o/e] (B) of bulk genomic DNA. (C) A filtered insert set representing 17,387 CGIs shows a discrete distribution that is distant from bulk genomic DNA (black dot). (D) Three random chromosomal regions showing CGI sequences mapped by ENSEMBL (green bars). Also shown are CGIs predicted by the NCBI-strict and NCBI-relaxed algorithms (blue bars). The directions of transcription of coding sequences (yellow bars) are arrowed. Numbered CGIs (1–4) represent sequences not detected by the NCBI-strict algorithm. (E) CpG maps of the four CGI clones not predicted by NCBI-strict. Transcription start sites in examples 1, 3, and 4 are indicated by arrows. Sequenced MseI fragments are denoted by dashed lines and CpG sites by vertical black strokes. (F) The distribution of cloned CGIs (red strokes) on human chromosomes. The number of CGIs on each chromosome is shown (right) and centromeres are denoted by blue dots.
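The two sequence statistics plotted in Figure 2 can be illustrated with a short Python sketch. It assumes the widely used Gardiner-Garden and Frommer definition of CpG[o/e] (number of CpG dinucleotides × sequence length, divided by number of C × number of G), which the caption itself does not spell out. The function names are hypothetical, and the filter reads the caption's thresholds as "at most" and combines the three conditions with a logical AND, as the caption wording suggests; it is not the authors' actual pipeline.

```python
def gc_content(seq: str) -> float:
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """CpG observed/expected ratio.

    Assumed definition (Gardiner-Garden and Frommer):
        (number of CpG * length) / (number of C * number of G)
    Returns 0.0 when the sequence has no C or no G.
    """
    seq = seq.upper()
    n_c = seq.count("C")
    n_g = seq.count("G")
    if n_c == 0 or n_g == 0:
        return 0.0
    n_cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    return n_cpg * len(seq) / (n_c * n_g)


def is_filtered_out(seq: str) -> bool:
    """Apply the Figure 2 contamination filter as read from the caption.

    Assumption: a fragment is discarded only if it is shorter than 512 bp
    AND has G+C content <= 50% AND CpG[o/e] <= 0.6.
    """
    return len(seq) < 512 and gc_content(seq) <= 0.50 and cpg_obs_exp(seq) <= 0.6


if __name__ == "__main__":
    for fragment in ("CGCGCGCG", "ATATATAT"):
        print(fragment, gc_content(fragment), cpg_obs_exp(fragment),
              is_filtered_out(fragment))
```

Real MseI fragments would be read from sequence files rather than typed inline; the toy strings here only show how the two statistics separate CpG-rich from AT-rich DNA.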
Figure 3. Use of an Arrayed CGI Library to Detect CGI Methylation in Human Blood DNA
(A) Schematic showing isolation of densely methylated CGIs using MBD affinity purification based on reference [20]. Open and filled circles represent nonmethylated and methylated CpG sites, respectively. (B) Examples of retention of known methylated CGIs by MBD affinity chromatography. Methylated XIST and NYESO CGIs elute at high salt concentration, whereas nonmethylated P48 and female XIST co-elute with bulk genomic DNA (blue line) at low salt concentration (red line). (C) M values (log2[MBD/Input]) >1.5 (dashed vertical arrow) denote DNA fragments enriched by MAP. M values are plotted against the ratio of fragment abundance in the MAP probe versus input DNA as determined by quantitative PCR. Error bars represent ± standard deviation. (D–F) MAP CGI array hybridization identifies CGIs that are methylated on the inactive X chromosome. (D) Probes isolated by MAP from male and female whole blood DNA detected female-specific CGI methylation. (E) CGIs on the X chromosome (red dots) often showed female-specific methylation. (F) CGIs on Chromosome 16 (red dots) were indistinguishably methylated between male and female. (G and H) Confirmation of methylated CGIs by bisulfite genomic sequencing. CGI clones I1387 (G) and I9112 (H) are nonmethylated and methylated, respectively, as predicted by the microarray data. Open and filled circles represent nonmethylated and methylated CpG sites, respectively. The genomic locus including annotated transcripts and CpG maps (vertical strokes) are shown above each profile. Each column represents products of amplification by a single primer pair (brackets below CpG map). Each line corresponds to a sequenced DNA strand. Red bars indicate the location of the MseI fragment cloned in the CGI library. (I) The CGI array distinguishes genes inactivated on the X chromosome (inactive) from genes that escape inactivation (escaping). CGIs associated with inactivated genes (n = 103) show significantly higher M values than CGIs at escaping genes (n = 14; KS test: p = 1.2 × 10^−5).
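The M value used in Figure 3C is defined in the caption as log2[MBD/Input], with values above 1.5 taken to mark fragments enriched by MAP. A minimal Python sketch of that arithmetic follows. The function and constant names are hypothetical, and the inputs are assumed to be positive, already-normalised array intensities; the caption does not describe the normalisation, so none is attempted here.

```python
import math

# Threshold stated in the Figure 3C caption: M > 1.5 denotes enrichment by MAP.
M_THRESHOLD = 1.5


def m_value(mbd_signal: float, input_signal: float) -> float:
    """Return M = log2(MBD / Input) for one array feature.

    Assumption: both signals are positive, normalised intensities.
    """
    if mbd_signal <= 0 or input_signal <= 0:
        raise ValueError("signals must be positive to take a log ratio")
    return math.log2(mbd_signal / input_signal)


def is_enriched(m: float, threshold: float = M_THRESHOLD) -> bool:
    """True when the M value lies strictly above the enrichment threshold."""
    return m > threshold


if __name__ == "__main__":
    # An 8-fold excess of MBD-bound signal over input gives M = 3.
    m = m_value(8.0, 1.0)
    print(m, is_enriched(m))
```

An M value of 1.5 corresponds to a roughly 2.8-fold excess of MBD-bound signal over input (2^1.5 ≈ 2.83), which makes the threshold easier to relate to the quantitative PCR ratios plotted in the same panel.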
Figure 4. Tissue-Specific CGI Methylation in a Panel of Human Tissues
(A) Examples of pairwise comparisons using MAP CGI probes derived from blood, brain, muscle, and spleen. Broken red lines indicate threshold M values used to determine differential CGI methylation. (B) Frequencies of methylated CGIs in blood, brain, muscle, and spleen. The following categories are represented: CGIs methylated in all tested tissues (black); CGIs methylated in more than one tissue tested but not all (green); CGIs methylated in one tissue only (blue); CGIs methylated in one tissue tested but unclassified in other tissues (white). (C) Somatically methylated CGIs display a very small but significant reduction in CpG[o/e] (0.75) relative to the whole CGI set (0.77; n = 1,657 and 12,661, Wilcoxon rank test: p-value: 1.022e−11). The histogram shows the CpG[o/e] profile for the total CGI set (white bars) overlaid with the CpG[o/e] profile for methylated CGIs (red line). (D–G) Confirmation of candidate CGIs showing evidence of tissue-specific methylation by bisulfite genomic sequencing. Layout is as for Figure 3G.
Figure 5. Tissue, Cell-Type, and Individual-Specific CGI Methylation at Developmental Gene Loci
(A–B and E–F) Bisulfite genomic sequencing confirmed tissue-specific CGI methylation associated with the developmental genes OSR1 (A) and PAX6 (B). Multiple CGIs (red boxes) span the HOXC (C) and PAX6 (D) gene loci. Plots of the MAP-CGI array profiles for blood, brain, muscle, and spleen identify tissue-specific CGI methylation (vertical black bars extending above M = 1.5). Gray bars extending downwards below M = 1.5 (broken blue line) represent nonmethylated CGIs. The region of PAX6 analysed by bisulfite genomic sequencing (see Figure 5B) is indicated (asterisk in panel D). Tick marks on the y-axis are spaced at intervals of 1 M value unit. Coding sequences are diagrammed as yellow bars. (E) Individual-specific CGI methylation internal to the HOXC cluster in muscle DNA. (F) Cell type–specific methylation is seen at the SEC31B promoter CGI in monocytes and granulocytes derived from whole human blood. Bisulfite genomic sequencing results (A–B and E–F) are diagrammed as in Figure 3G.

References

    1. Li E, Bestor TH, Jaenisch R. Targeted mutation of the DNA methyltransferase gene results in embryonic lethality. Cell. 1992;69:915–926.
    2. Okano M, Bell DW, Haber DA, Li E. DNA methyltransferases Dnmt3a and Dnmt3b are essential for de novo methylation and mammalian development. Cell. 1999;99:247–257.
    3. Bird A. DNA methylation patterns and epigenetic memory. Genes Dev. 2002;16:6–21.
    4. Bird A, Taggart M, Frommer M, Miller OJ, Macleod D. A fraction of the mouse genome that is derived from islands of non-methylated, CpG-rich DNA. Cell. 1985;40:91–99.
    5. Stein R, Razin A, Cedar H. In vitro methylation of the hamster adenine phosphoribosyltransferase gene inhibits its expression in mouse L cells. Proc Natl Acad Sci U S A. 1982;79:3418–3422.
