Epigenomics. 2012 Dec;4(6):605-21.
doi: 10.2217/epi.12.59.

MBD-seq as a cost-effective approach for methylome-wide association studies: demonstration in 1500 case–control samples


Karolina A Aberg et al. Epigenomics. 2012 Dec.

Abstract

Aim: We studied the use of methyl-CpG binding domain (MBD) protein-enriched genome sequencing (MBD-seq) as a cost-effective screening tool for methylome-wide association studies (MWAS).

Materials & methods: Because MBD-seq has not yet been applied on a large scale, we first developed and tested a pipeline for data processing using 1500 schizophrenia cases and controls plus 75 technical replicates with an average of 68 million reads per sample. This involved the use of technical replicates to optimize quality control for multi- and duplicate-reads, an in silico experiment to identify CpGs in loci with alignment problems, CpG coverage calculations based on multiparametric estimates of the fragment size distribution, a two-stage adaptive algorithm to combine data from correlated adjacent CpG sites, principal component analyses to control for confounders and new software tailored to handle the large data set.
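The coverage step above can be illustrated with a minimal sketch. The abstract says only that CpG coverage was calculated from estimates of the fragment size distribution. The weighting used here is an assumption, not the authors' published method: each read contributes the probability that its fragment is long enough to reach the CpG. The names `survival_function` and `expected_coverage`, the `(start, strand)` read representation, and the toy distribution are all hypothetical.

```python
def survival_function(length_pmf):
    """Return a list surv where surv[d] = P(fragment length > d).

    length_pmf is a dict {fragment_length_bp: probability}; it stands in for
    an estimated fragment size distribution (hypothetical input format).
    """
    max_len = max(length_pmf)
    surv = [0.0] * (max_len + 1)
    tail = 0.0
    # Walk from the longest length down, accumulating the upper tail.
    for d in range(max_len, -1, -1):
        surv[d] = tail
        tail += length_pmf.get(d, 0.0)
    return surv


def expected_coverage(cpg_pos, reads, surv):
    """Expected number of fragments covering cpg_pos.

    reads is an iterable of (five_prime_position, strand) tuples. A fragment
    is assumed to extend from the read's 5' end in the direction of its
    strand, with a length drawn from the fragment size distribution.
    """
    total = 0.0
    for start, strand in reads:
        # Distance from the 5' end of the fragment to the CpG, along the strand.
        dist = cpg_pos - start if strand == "+" else start - cpg_pos
        if 0 <= dist < len(surv):
            total += surv[dist]
    return total


if __name__ == "__main__":
    # Toy distribution: half the fragments are 100 bp, half are 200 bp.
    surv = survival_function({100: 0.5, 200: 0.5})
    reads = [(1000, "+"), (1300, "-"), (900, "+")]
    print(expected_coverage(1050, reads, surv))
```

In this sketch a read that starts close to the CpG counts almost fully, while a read that starts farther away is down-weighted by the share of fragments long enough to span the gap.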

Results: We replicated MWAS findings in independent samples using a different technology that provided single base resolution. In an MWAS of age-related methylation changes, one of our top findings was a previously reported robust association involving GRIA2. Our results also suggested that owing to the many confounding effects, a considerable challenge in MWAS is to identify those effects that are informative about disease processes.

Conclusion: This study showed the potential of MBD-seq as a cost-effective tool in large-scale disease studies.


Figures

Figure 1. Overview of the data processing pipeline and key results at each step
FDR: False-discovery rate; MBD-seq: Methyl-CpG binding domain sequencing; MWAS: Methylome-wide association studies; PC: Principal component; PCA: Principal component analysis; QC: Quality control; QV: Quality value.
Figure 2. Correlations calculated for 73 duplicates after different quality control procedures for duplicate- and multi-reads.
Figure 3. CpG density versus ‘raw’ coverage
For CpG density a simple ‘coupling’ factor was calculated by counting the number of CpGs within ±100 bp. The vertical lines indicate the relative frequencies of CpGs with that specific density.
Figure 4. Distribution of in silico coverage for CpGs in loci overlapping and not overlapping with RepeatMasker.
Figure 5. Size of stage 2 blocks across biological features
The mean, median, SD and 99th percentile of the stage 2 block size, in base pairs, are given for all blocks and for blocks overlapping with CpG islands, CpG shores, regions marked by Rep. Mask, genes, exons, introns, 5′-UTRs, 3′-UTRs, regions within 8 kb upstream of transcriptional start sites corresponding to potential promoter regions (promoter) and conserved regions (cons). The percentages of blocks overlapping with each biological feature are given in the legend on the x-axis. Note that a single block can overlap with multiple biological features. Rep. Mask: RepeatMasker; SD: Standard deviation.
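The ‘coupling’ factor described in the Figure 3 caption (the number of CpGs within ±100 bp) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: positions are 0-based offsets of the C in each CG dinucleotide, the focal CpG counts toward its own total, and the function names are hypothetical.

```python
from bisect import bisect_left, bisect_right


def cpg_positions(sequence):
    """Return sorted 0-based positions of the C in every CG dinucleotide."""
    seq = sequence.upper()
    positions = []
    i = seq.find("CG")
    while i != -1:
        positions.append(i)
        i = seq.find("CG", i + 1)
    return positions


def coupling_factor(positions, window=100):
    """For each CpG, count the CpGs within +/- window bp.

    positions must be sorted. The focal CpG is included in its own count,
    which is an assumption; subtract 1 to exclude it.
    """
    return [
        bisect_right(positions, p + window) - bisect_left(positions, p - window)
        for p in positions
    ]


if __name__ == "__main__":
    # Two CpGs close together, then one isolated CpG more than 100 bp away.
    toy = "CG" + "A" * 50 + "CG" + "A" * 200 + "CG"
    pos = cpg_positions(toy)
    print(pos, coupling_factor(pos))
```

Binary search over the sorted positions keeps the cost per CpG logarithmic, which matters when the same count is needed for every CpG in a genome.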

