Nucleic Acids Res. 2012 Jul;40(13):e100.
doi: 10.1093/nar/gks275. Epub 2012 Mar 28.

A mostly traditional approach improves alignment of bisulfite-converted DNA

Martin C Frith et al. Nucleic Acids Res. 2012 Jul.

Abstract

Cytosines in genomic DNA are sometimes methylated. This affects many biological processes and diseases. The standard way of measuring methylation is to use bisulfite, which converts unmethylated cytosines to thymines, then sequence the DNA and compare it to a reference genome sequence. We describe a method for the critical step of aligning the DNA reads to the correct genomic locations. Our method builds on classic alignment techniques, including likelihood-ratio scores and spaced seeds. In a realistic benchmark, our method has a better combination of sensitivity, specificity and speed than nine other high-throughput bisulfite aligners. This study enables more accurate and rational analysis of DNA methylation. It also illustrates how to adapt general-purpose alignment methods to a special case with distorted base patterns: this should be informative for other special cases such as ancient DNA and AT-rich genomes.
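As an illustration of the measurement principle described above, the following minimal Python sketch simulates bisulfite conversion of a read and then infers methylation by comparing the converted read with the reference. It is not the alignment method of the paper. The function names `bisulfite_convert` and `call_methylation` are hypothetical, and the sketch assumes a forward-strand read that is already aligned to the reference without gaps or sequencing errors.

```python
def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Replace every unmethylated C with T (forward strand only, an assumption)."""
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )


def call_methylation(reference, read):
    """For each reference C, report True if the read kept C, False if it shows T.

    Assumes the read is aligned to the reference position by position.
    Other bases at a reference C (for example sequencing errors) are skipped.
    """
    calls = {}
    for i, (ref_base, read_base) in enumerate(zip(reference, read)):
        if ref_base == "C":
            if read_base == "C":
                calls[i] = True    # protected from conversion: methylated
            elif read_base == "T":
                calls[i] = False   # converted: unmethylated
    return calls


reference = "ACGTCCGA"                           # toy sequence, not real data
read = bisulfite_convert(reference, {1})         # cytosine at index 1 is methylated
print(read)                                      # ACGTTTGA
print(call_methylation(reference, read))         # {1: True, 4: False, 5: False}
```

The sketch also shows why alignment is hard: after conversion most C positions in a read look like T, so a general-purpose aligner sees many apparent mismatches unless its scoring treats a reference C paired with a read T as expected.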

Figures

Figure 1.
Distribution of sequence quality (phred) scores for the two datasets. (Phred score = −10 × log10(error probability).) Each dataset contains 1 million DNA reads of length 87 (A) or 85 (B).
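The phred definition in the Figure 1 caption can be made concrete with a short Python sketch. The helper names `phred_to_error_prob` and `error_prob_to_phred` are hypothetical and are introduced only for illustration.

```python
import math


def phred_to_error_prob(q):
    """Invert the phred definition: error probability = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Phred score = -10 * log10(error probability); p must be in (0, 1]."""
    return -10 * math.log10(p)


# A phred score of 20 corresponds to about a 1% chance that the base is wrong,
# and a score of 30 to about 0.1%.
for q in (10, 20, 30):
    print(q, phred_to_error_prob(q))
```

Each increase of 10 in the phred score therefore means a tenfold lower error probability.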
Figure 2.
Accuracy of various methods for aligning bisulfite-converted DNA reads to the reference genome, for datasets (A) and (B). The sensitivity is the percentage of total reads that were correctly aligned. The error rate is the percentage of aligned reads that were wrongly aligned. For Last and Novoalign, each line shows the effect of varying the max mismap probability.
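The two accuracy measures defined in the Figure 2 caption differ in their denominators: sensitivity is taken over all reads, whereas the error rate is taken over aligned reads only. A minimal Python sketch follows. The function name `alignment_accuracy` and the read counts are hypothetical and are not results from the paper.

```python
def alignment_accuracy(total_reads, correctly_aligned, wrongly_aligned):
    """Return (sensitivity %, error rate %) as defined in the caption.

    sensitivity = correctly aligned reads / total reads
    error rate  = wrongly aligned reads / all aligned reads
    """
    aligned = correctly_aligned + wrongly_aligned
    sensitivity = 100.0 * correctly_aligned / total_reads
    # With no aligned reads the error rate is undefined; 0.0 is a convention here.
    error_rate = 100.0 * wrongly_aligned / aligned if aligned else 0.0
    return sensitivity, error_rate


# Made-up counts for illustration: 1 million reads, 900,000 aligned correctly,
# 9,000 aligned wrongly, the rest left unaligned.
sensitivity, error_rate = alignment_accuracy(1_000_000, 900_000, 9_000)
print(f"sensitivity = {sensitivity:.2f}%, error rate = {error_rate:.2f}%")
```

Under these definitions, an aligner that leaves uncertain reads unaligned can lower its error rate at the cost of sensitivity, which is the trade-off traced by varying the max mismap probability.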
Figure 3.
Run times of various methods for aligning bisulfite-converted DNA reads to the reference genome.
Figure 4.
Accuracy of Last, with various parameter settings, for aligning bisulfite-converted DNA reads to the reference genome, for datasets (A) and (B). The black lines in this figure are identical to the solid black lines in Figure 2.
Figure 5.
Accuracy of various methods for aligning bisulfite-converted DNA reads to recent duplications in the genome. The upper row shows results for dataset (A), and the lower row for dataset (B). The left-most column shows results for L1P elements, the middle column for segmental duplications and the right-most column for AluY elements. The sensitivity is the percentage of reads from within duplicated regions that were correctly aligned. The error rate is the percentage of reads aligned within duplicated regions that were wrongly aligned. Each symbol and line refers to a different alignment method: see the key in Figure 2.
