Nucleic Acids Res. 2021 Nov 8;49(19):e113.
doi: 10.1093/nar/gkab705.

Rapid identification of methylase specificity (RIMS-seq) jointly identifies methylated motifs and generates shotgun sequencing of bacterial genomes

Chloé Baum et al. Nucleic Acids Res.

Abstract

DNA methylation is widespread amongst eukaryotes and prokaryotes, where it modulates gene expression and confers viral resistance. 5-Methylcytosine (m5C) methylation has been described in the genomes of a large fraction of bacterial species as part of restriction-modification systems, each composed of a methyltransferase and a cognate restriction enzyme. Methylases are site-specific, and their target sequences vary across organisms. High-throughput methods such as bisulfite sequencing can identify m5C at base resolution but require specialized library preparations, and single-molecule, real-time (SMRT) sequencing usually misses m5C. Here, we present a new method called RIMS-seq (rapid identification of methylase specificity) to simultaneously sequence bacterial genomes and determine m5C methylase specificities using a simple experimental protocol that closely resembles the DNA-seq protocol for Illumina. Importantly, the resulting sequencing quality is identical to DNA-seq, enabling RIMS-seq to substitute for standard sequencing of bacterial genomes. Applied to bacteria and synthetic mixed communities, RIMS-seq reveals new methylase specificities, supporting routine study of m5C methylation while sequencing new genomes.

Figures

Figure 1.
(A) Principle of RIMS-seq. Deamination of cytidine leads to a blocking damage, while deamination of m5C leads to a mutagenic C to T damage that is present only on the first read (R1) of paired-end reads in standard Illumina sequencing. Thus, an increase of C to T errors in R1 in specific contexts is indicative of m5C. (B) The workflow of RIMS-seq is equivalent to a regular library preparation for Illumina DNA-seq with an extra step of limited alkaline deamination at 60°C. This step can be done immediately after adaptor ligation and does not require additional cleaning steps. (C) Fraction of C to T variants in XP12 (m5C) at all positions in the reads for R1 and R2 after 0 min (DNA-seq), 10 min, 30 min, 60 min, 2 h, 3 h, 5 h and 14 h of heat-alkaline treatment. The C to T imbalance between R1 and R2 is indicative of deamination of m5C and increases with heat-alkaline treatment time. (D) Correlation between the C to T fold increase in R1 compared to R2 and treatment time (r² = 0.998).
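The R1 versus R2 comparison described in this caption can be illustrated with a minimal sketch. It assumes the reads have already been aligned and reduced to one tuple per sequenced cytosine; the tuple layout, the function name `c_to_t_imbalance`, and the toy counts are hypothetical and are not taken from the RIMS-seq pipeline.

```python
from collections import defaultdict


def c_to_t_imbalance(observations):
    """Per-context C to T rates in R1 and R2, plus the R1/R2 fold change.

    observations: iterable of (read, context, is_c_to_t) tuples, where
    read is "R1" or "R2", context is the sequence around the reference C
    (for example a 5-mer such as "CCAGG"), and is_c_to_t says whether the
    read shows a T at that position. This input format is an assumption
    made for illustration only.
    """
    # counts[context][read] = [number of C to T variants, total observations]
    counts = defaultdict(lambda: {"R1": [0, 0], "R2": [0, 0]})
    for read, context, is_c_to_t in observations:
        cell = counts[context][read]
        cell[0] += int(is_c_to_t)
        cell[1] += 1

    result = {}
    for context, per_read in counts.items():
        rates = {
            read: (variants / total if total else 0.0)
            for read, (variants, total) in per_read.items()
        }
        # The fold change is undefined when R2 shows no C to T variants.
        fold = rates["R1"] / rates["R2"] if rates["R2"] > 0 else None
        result[context] = {"R1": rates["R1"], "R2": rates["R2"], "fold": fold}
    return result


# Toy data: a context with more C to T variants in R1 than in R2.
toy = (
    [("R1", "CCAGG", True)] * 10 + [("R1", "CCAGG", False)] * 90
    + [("R2", "CCAGG", True)] * 1 + [("R2", "CCAGG", False)] * 99
)
print(c_to_t_imbalance(toy)["CCAGG"])
```

A context whose fold change stands well above that of other contexts would be a candidate methylated motif under the principle stated in panel A. A real analysis would also need a statistical test and a filter for low-quality bases, both omitted here.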
Figure 2.
(A) Bar plots representing the number of C to T read variants for K12 in R1 and R2 after different heat/alkaline treatment times. Colors represent duplicate experiments. (B) Circular bar plots representing the percentage of C to T read variants in all NCNNN contexts (with N = A, T, C or G) for Read 1 (R1, left) and Read 2 (R2, right) in DNA-seq performed on K12 (yellow bars), RIMS-seq (3H) performed on BL21 (green) and RIMS-seq (3H) performed on K12 (dark blue). Red asterisks denote CCWGG contexts, with W being either A or T. (C) Proportion of C to T read variants in CCWGG (red) or CCWGG (green) contexts compared to other NCNNN or CNNNN contexts for R1 and R2 in K12 and BL21. The C to T read variants in CCWGG and CCWGG motifs represent less than 2% of all variants except in K12 (R1 only) after the 10 min, 1 h and 3 h treatments, where the CCWGG motifs represent 4.1%, 22.5% and 32.6% of all C to T read variants, respectively. The increase of C to T read variants in the CCWGG context is therefore specific to R1 in the K12 strain. (D) Visualization of the statistically significant differences in position-specific nucleotide compositions around C to T variants in R1 compared to R2 using Two Sample Logo (21) for the K12 sample subjected to (from top to bottom) 3 h, 1 h, 10 min and 0 min of heat-alkaline treatment.
Figure 3.
De novo discovery of methylase specificity using RIMS-seq. (A) Description of the RIMS-seq motif analysis pipeline. First, C to T read variants are identified in Read 1 and Read 2 separately. Then, the MosDI program searches for overrepresented motifs. Once a motif is found, the pipeline is repeated until no more motifs are found, enabling identification of multiple methylase specificities in an organism. (B) Assembly statistics obtained using the sequence from the standard DNA-seq (+3H, left) and RIMS-seq (right). Visualization using the assembly-stats program (https://github.com/rjchallis/assembly-stats). The corresponding table with the statistical values is available in the supplementary material (Supplementary Table S2). (C) Fractions of C to T read variants in CGCG (yellow) or GATC (green) contexts compared to other contexts for R1 and R2 in Acinetobacter calcoaceticus ATCC 49823 using the assembled or the reference genome. The increase of C to T read variants in these contexts is similar when using either the assembled or the reference genome.
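The "find a motif, then repeat until none is left" loop in panel A can be sketched as follows. This is not the authors' pipeline and it does not call the motif search program named in the caption. It swaps in a plain k-mer enrichment score (R1 windows against R2 windows) as a stand-in. The function name, thresholds, and toy windows are all hypothetical.

```python
from collections import Counter


def find_motifs_iteratively(foreground, background, k=4,
                            min_fold=3.0, min_count=5, max_rounds=10):
    """Greedy loop: pick the most enriched k-mer, drop its windows, repeat.

    foreground: sequence windows around C to T variants seen in R1
    background: sequence windows around C to T variants seen in R2
    Both inputs are assumed for illustration; the enrichment score is a
    simple stand-in, not the statistic used by a real motif finder.
    """
    def kmer_counts(windows):
        counts = Counter()
        for w in windows:
            # Count each k-mer at most once per window.
            counts.update({w[i:i + k] for i in range(len(w) - k + 1)})
        return counts

    fg = list(foreground)
    bg_counts = kmer_counts(background)
    n_bg = len(background)
    motifs = []

    for _ in range(max_rounds):
        if not fg:
            break
        fg_counts = kmer_counts(fg)
        best, best_fold = None, min_fold
        # Sorted iteration keeps tie-breaking deterministic.
        for kmer, count in sorted(fg_counts.items()):
            if count < min_count:
                continue
            fg_frac = count / len(fg)
            # Add-one smoothing avoids division by zero for unseen k-mers.
            bg_frac = (bg_counts.get(kmer, 0) + 1) / (n_bg + 1)
            fold = fg_frac / bg_frac
            if fold > best_fold:
                best, best_fold = kmer, fold
        if best is None:
            break  # no remaining k-mer passes the thresholds
        motifs.append(best)
        # Remove windows explained by this motif before the next round.
        fg = [w for w in fg if best not in w]
    return motifs


# Toy windows containing two enriched k-mers with varying flanks.
r1_windows = (["ACGCGT"] * 4 + ["TCGCGA"] * 4
              + ["TGATCA"] * 3 + ["AGATCT"] * 3 + ["AAAAAA"] * 2)
r2_windows = ["AAAAAA"] * 10
print(find_motifs_iteratively(r1_windows, r2_windows))
```

Removing the windows that contain a reported motif before the next round is what lets a weaker second specificity surface once the dominant one has been accounted for.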
Figure 4.
C to T error profile in GCGC (canonical recognition site), ACGC, TCGC, CCGC and GCGT contexts in R1 reads (orange) and R2 reads (red) for RIMS-seq (upper panel) and DNA-seq(+3H) (lower panel). (A) Recombinant HhaI methylase expressed in E. coli. (B) Native HhaI methylase expressed in Haemophilus parahaemolyticus. An elevation of C to T read variants in R1 can be observed in the context of GCGC for both the recombinant and native HhaI genomic DNA, and in the context of ACGC only for DNA from the recombinant but not the native HhaI.
Figure 5.
(A) Bacterial abundance in the ATCC gut microbiome calculated from bisulfite-seq data (left) and RIMS-seq (right), normalized to DNA-seq(+3H). The normalized abundance is plotted relative to the GC content of each bacterium. (B) Methylation levels in Acinetobacter johnsonii (ATCC skin microbiome). The methylation level was calculated for cytosine positions in the context of ACGT (yellow) and randomly selected positions in other contexts (blue). These bisulfite-seq data suggest some sites are methylated in the context of ACGT, but they are not fully methylated. (C) Methylation level in Streptococcus mitis (ATCC skin microbiome) calculated from bisulfite-seq data. The methylation level was calculated for cytosine positions in the context of ACGT and GCNGC (yellow) as well as for randomly selected positions in other contexts (blue). (D) Methylation level in Helicobacter pylori (ATCC gut microbiome) calculated from bisulfite-seq data. The methylation level was calculated for cytosine positions in the context of GCGC and CCTC (yellow) as well as for randomly selected positions in other contexts (blue).

References

    1. Loenen W.A.M., Dryden D.T.F., Raleigh E.A., Wilson G.G., Murray N.E. Highlights of the DNA cutters: a short history of the restriction enzymes. Nucleic Acids Res. 2014; 42:3–19.
    2. Blow M.J., Clark T.A., Daum C.G., Deutschbauer A.M., Fomenkov A., Fries R., Froula J., Kang D.D., Malmstrom R.R., Morgan R.D. et al. The epigenomic landscape of prokaryotes. PLoS Genet. 2016; 12:e1005854.
    3. Beaulaurier J., Schadt E.E., Fang G. Deciphering bacterial epigenomes using modern sequencing technologies. Nat. Rev. Genet. 2019; 20:157–172.
    4. Roberts R.J., Vincze T., Posfai J., Macelis D. REBASE—a database for DNA restriction and modification: enzymes, genes and genomes. Nucleic Acids Res. 2015; 43:D298–D299.
    5. Flusberg B.A., Webster D.R., Lee J.H., Travers K.J., Olivares E.C., Clark T.A., Korlach J., Turner S.W. Direct detection of DNA methylation during single-molecule, real-time sequencing. Nat. Methods. 2010; 7:461–465.
