Nat Commun. 2015 Jun 15;6:7438.
doi: 10.1038/ncomms8438.

Single molecule-level detection and long read-based phasing of epigenetic variations in bacterial methylomes

John Beaulaurier et al. Nat Commun. 2015.

Abstract

Beyond its role in host defense, bacterial DNA methylation also plays important roles in the regulation of gene expression, virulence and antibiotic resistance. Bacterial cells in a clonal population can generate epigenetic heterogeneity to increase population-level phenotypic plasticity. Single molecule, real-time (SMRT) sequencing enables the detection of N6-methyladenine and N4-methylcytosine, two major types of DNA modifications comprising the bacterial methylome. However, existing SMRT sequencing-based methods for studying bacterial methylomes rely on a population-level consensus that lacks the single-cell resolution required to observe epigenetic heterogeneity. Here, we present SMALR (single-molecule modification analysis of long reads), a novel framework for single molecule-level detection and phasing of DNA methylation. Using seven bacterial strains, we show that SMALR yields significantly improved resolution and reveals distinct types of epigenetic heterogeneity. SMALR is a powerful new tool that enables de novo detection of epigenetic heterogeneity and empowers investigation of its functions in bacterial populations.

Conflict of interest statement

E.E.S. is on the scientific advisory board of Pacific Biosciences.

Figures

Figure 1. SMALR methods for methylation detection in SMRT reads.
Schematic illustrating the general approaches of both the existing and two proposed SMALR methods for detecting DNA methylation in SMRT sequencing reads. (a) A single SMRT sequencing molecule (short DNA insert+adapters) and the subreads that are produced during sequencing. (b) The existing methylation detection method is based on a molecule-aggregated, single-nucleotide (AggSN) score. For a given strand and genomic position, the IPD values from all the subreads aligning to that strand and position are aggregated together across all molecules to infer the presence of a consensus methylated base. (c) The proposed single molecule, single nucleotide (SMSN) method for detecting DNA methylation relies instead on separate consideration of subreads from different molecules. The SMSN scores are calculated for each molecule, strand and genomic position. (d) A single SMRT sequencing molecule (long DNA insert+adapters) with a single long subread and the proposed single molecule, pooled (SMP) approach for assessing MTase activity. This approach pools together IPD values from multiple motif sites along the length of a single long subread.
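
The three aggregation schemes in Figure 1 differ only in how IPD values are grouped before scoring. The following is a minimal sketch of that grouping, not the SMALR implementation: the record layout (molecule ID, strand, position, IPD), the use of a mean of natural-log IPDs as the score, and the function names are all assumptions made for illustration.

```python
from collections import defaultdict
from math import log
from statistics import mean

# Hypothetical input: one tuple per subread observation,
# (molecule_id, strand, position, ipd). The layout is an assumption.

def agg_sn(observations):
    """AggSN: pool IPDs from all molecules for each (strand, position)."""
    groups = defaultdict(list)
    for _mol, strand, pos, ipd in observations:
        groups[(strand, pos)].append(log(ipd))
    return {key: mean(vals) for key, vals in groups.items()}

def smsn(observations):
    """SMSN: keep molecules separate; one score per (molecule, strand, position)."""
    groups = defaultdict(list)
    for mol, strand, pos, ipd in observations:
        groups[(mol, strand, pos)].append(log(ipd))
    return {key: mean(vals) for key, vals in groups.items()}

def smp(observations, motif_sites):
    """SMP: one score per molecule, pooling IPDs over all motif sites it covers.

    motif_sites is a set of (strand, position) pairs for the motif of interest.
    """
    groups = defaultdict(list)
    for mol, strand, pos, ipd in observations:
        if (strand, pos) in motif_sites:
            groups[mol].append(log(ipd))
    return {mol: mean(vals) for mol, vals in groups.items()}
```

With two molecules covering the same site, one with consistently long IPDs and one with short IPDs, `agg_sn` returns a single intermediate value, whereas `smsn` returns two distinct scores. That separation is what makes heterogeneity visible at the single-molecule level.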
Figure 2. Performance of SMSN level detection of DNA methylation.
Multiple metrics showing the performance of the proposed single molecule, single nucleotide (SMSN) detection method. (a) Performance of the approach for detecting 6mA modifications in the 5′-CTGCAG motif of E. coli C227 using three thresholds for minimum single-molecule coverage (covSM). (b) Distribution of the aggregate, single nucleotide (AggSN) and SMSN methylation scores for the partially non-methylated 5′-RGATCY motif in C. salexigens. The bimodal distribution of the SMSN scores enables the accurate and objective estimation of this fraction. (c) Accuracy of SMSN-enabled estimations of the methylated fraction (using covSM≥10) for the 5′-CTGCAG motif of E. coli C227 at various levels of genomic-sequencing coverage.
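
Because the SMSN scores are bimodal, the methylated fraction can be estimated by counting scores on each side of a cutoff between the two modes, after discarding molecules with too little single-molecule coverage. The sketch below illustrates that idea under stated assumptions: the function name, the dictionary inputs, and the default cutoff of 1.0 are hypothetical, and only the covSM≥10 filter comes from the caption.

```python
def methylated_fraction(scores, coverage, min_cov=10, threshold=1.0):
    """Estimate the methylated fraction from per-molecule, per-site scores.

    scores:    dict mapping a site key to its SMSN-like score
    coverage:  dict mapping the same key to its single-molecule coverage (covSM)
    min_cov:   minimum covSM for a score to be counted (caption uses >= 10)
    threshold: assumed cutoff between the non-methylated and methylated modes
    """
    kept = [s for key, s in scores.items() if coverage.get(key, 0) >= min_cov]
    if not kept:
        return None  # nothing passes the coverage filter
    return sum(s > threshold for s in kept) / len(kept)
```

In practice the cutoff would be chosen from the observed distribution (for example, the trough between the two peaks) rather than fixed in advance.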
Figure 3. SMSN score distributions reveal epigenetic heterogeneity.
(a) Single molecule, single nucleotide (SMSN) score distributions for multiple bacterium-motif pairs (and the genome-wide motif count, N, of each motif) that exhibit near-complete methylation, along with a non-methylated motif for comparison. (b) SMSN score distributions for multiple bacterium-motif pairs that display significant non-methylated fractions. The H. pylori J99 motifs show minor variation in the SMSN scores associated with each peak due to subtle differences in the chemistry version used for SMRT sequencing of the native and WGA samples. (c) SMSN interrogation of 5′-GANTC methylation at five genomic positions (columns) in a synchronized C. crescentus culture during a single round of DNA replication. Five time points (minutes post-synchronization; rows) provide snapshots of the bidirectional progression of the replication forks from the origin of replication (Cori) to the terminus (Ter). Grey wedges in the chromosome schematics show the 200-kb genomic regions where the SMSN scores are queried for each time point. Two regions are on either side of the Cori: (i) Cori−0.1 Mbp and (ii) Cori+0.1 Mbp. Another two are halfway between Cori and Ter: (iii) Cori−1 Mbp and (iv) Cori+1 Mbp. The final region covers the terminus: (v) Ter. Light (hemimethylated) to dark (fully methylated) colour shading in the schematic illustrates the approximate position of the replication fork at each time point. The bimodal distributions of approximate SMSN scores (Methods) reveal the progressive hemimethylation of 5′-GANTC sites following the passage of the replication forks. Hemimethylated sites cannot transition back to full methylation until the MTase gene, ccrM, is transcribed, which does not occur until late in the replication process.
Figure 4. SMP score distributions reveal distinct types of epigenetic heterogeneity.
(a) Single molecule, pooled (SMP) distribution for H. pylori J99 motif 5′-GATC and its corresponding IPD-shuffled control. The identical unimodal distributions suggest a fully active MTase (as expected). (b) SMP distributions for H. pylori J99 motif 5′-GWCAY and its corresponding WGA control. The major peak around SMP≈0 and minor peak around SMP≈2 suggest that the mostly inactive MTase targeting 5′-GWCAY, M.Hpy99XXI, is methylating 5′-GWCAY in a small fraction of cells. Methylated molecules with SMP scores>2 have an FDR<0.2%. (c) SMP distributions of 5′-TCAN6TRG/5′-CYAN6TGA in H. pylori J99 and its corresponding IPD-shuffled control. The major peak around SMP≈2 and minor peak around SMP≈0 indicate that the normally active MTase, Hpy99XXII, is inactive in a small fraction of cells. Non-methylated molecules with SMP scores<0 have an FDR<1.3%. (d) High-accuracy sequencing with Illumina MiSeq and read-level analysis of insertion/deletion calls show significant variation in the lengths of two specific homopolymers in the coding sequences of M.Hpy99XXI and S.Hpy99XXII. The high percentage of deletions in these two genes stands apart from the deletion rates found in five other C/G homopolymers from H. pylori J99 and E. coli K12, suggesting that this is not simply due to lower sequencing accuracy in homopolymer regions. (e) SMP distributions of 5′-TCNNGA in H. pylori J99 and its corresponding IPD-shuffled control. The SMP scores suggest MTase behaviour similar to that of Hpy99XXII. Non-methylated molecules with SMP scores<0 have an FDR<1.6%. (f) SMP distribution for the C. salexigens motif 5′-RGATCY. The major peak near SMP≈0.9 indicates that the IPDs sampled for each molecule reflect a mixture of both non-methylated (IPD≈0) and methylated (IPD≈2) motif sites, suggesting stochastic methylation as the primary source of epigenetic heterogeneity for this motif.
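
The FDR values quoted for the SMP thresholds relate calls in the native sample to calls in a control at the same cutoff. The caption does not give the exact procedure, so the sketch below uses one common empirical estimate, the ratio of the control call rate to the native call rate, as an assumption; the function name and arguments are hypothetical.

```python
def empirical_fdr(native, control, threshold, above=True):
    """Empirical FDR for calls beyond a score threshold (assumed estimator).

    native, control: non-empty sequences of per-molecule scores
    threshold:       score cutoff for making a call
    above:           True to call scores > threshold (e.g. methylated),
                     False to call scores < threshold (e.g. non-methylated)
    """
    def call_rate(scores):
        if above:
            hits = sum(s > threshold for s in scores)
        else:
            hits = sum(s < threshold for s in scores)
        return hits / len(scores)

    native_rate = call_rate(native)
    if native_rate == 0:
        return None  # no native calls, so the FDR is undefined
    # Cap at 1.0: the control can exceed the native rate by chance.
    return min(1.0, call_rate(control) / native_rate)
```

Using call rates rather than raw counts keeps the estimate comparable when the native and control samples contain different numbers of molecules.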

