Angew Chem Int Ed Engl. 2024 Dec 20;63(52):e202418500.
doi: 10.1002/anie.202418500. Epub 2024 Nov 25.

Robust Bisulfite-Free Single-Molecule Real-Time Sequencing of Methyldeoxycytidine Based on a Novel hpTet3 Enzyme

Hanife Sahin et al. Angew Chem Int Ed Engl.

Abstract

In addition to the four canonical nucleosides dA, dG, dC and T, genomic DNA contains the additional base 5-methyldeoxycytidine (mdC). The presence of this methylated cytidine nucleoside in promoter regions or gene bodies significantly affects the transcriptional activity of the corresponding gene. Consequently, the methylation patterns of genes are crucial for either silencing or activating genes. Sequencing the positions of mdC in the genome is therefore of paramount importance for early cancer diagnostics as it helps determine incorrect gene expression. Currently, the bisulfite method is the gold standard for mdC-sequencing. However, this method has the drawback that the majority of the input DNA is degraded during the bisulfite treatment. Additionally, bisulfite sequencing is prone to errors. Here, we report a benign, bisulfite-free mdC sequencing method termed EMox-seq, which is based on third-generation single-molecule SMRT sequencing. The foundation of this technology is a new Tet3 enzyme that efficiently oxidizes mdCs to 5-carboxycytidine (cadC). In turn, cadC provides an excellent readout by SMRT sequencing using specially trained AI-based algorithms.

Keywords: Epigenetics; SMRTseq; Tet enzymes; methyldeoxycytidine; recurrent neural network.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
a) Depiction of the bisulfite sequencing method and b) of the mdC to cadC oxidation chemistry that forms the basis of the sequencing method described in this publication. c) Depiction of the SMRT sequencing concept with a circular sequencing DNA template bound to the polymerase and fluorescently labeled triphosphates. d) Fluorescence signal during SMRT sequencing recorded by the Sequel IIe system. The y-axis shows the fluorescence intensity, the x-axis the elapsed time. The kinetics of nucleotide incorporation by the polymerase can be described by two time constants: the inter-pulse duration (IPD), the time between two light pulses (nucleotide incorporations), and the pulse width (PW), the time the polymerase needs to form the phosphodiester bond. The presence of non-canonical bases increases IPD and PW.
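The two kinetic quantities named in the caption of Figure 1d can be illustrated with a minimal sketch. It assumes each fluorescence pulse is given by a start and an end time in seconds; the function name `pulse_kinetics` and the input layout are hypothetical and do not reflect the Sequel IIe data format or any PacBio API.

```python
from typing import List, Sequence, Tuple


def pulse_kinetics(
    starts: Sequence[float], ends: Sequence[float]
) -> Tuple[List[float], List[float]]:
    """Return (IPD list, PW list) for a series of fluorescence pulses.

    Assumption for illustration: pulse i begins at starts[i] and
    finishes at ends[i], with pulses listed in chronological order.
    """
    if len(starts) != len(ends):
        raise ValueError("starts and ends must have the same length")

    # Pulse width (PW): duration of each light pulse.
    pw = [end - start for start, end in zip(starts, ends)]

    # Inter-pulse duration (IPD): dark time between the end of one
    # pulse and the start of the next, so there is one fewer IPD than pulses.
    ipd = [starts[i] - ends[i - 1] for i in range(1, len(starts))]
    return ipd, pw


# Hypothetical timings for three consecutive incorporations.
ipd, pw = pulse_kinetics([0.0, 1.0, 3.0], [0.5, 1.25, 3.5])
print(ipd)  # [0.5, 1.75]
print(pw)   # [0.5, 0.25, 0.5]
```

In this toy example the longer second IPD is the kind of delay that, per the caption, a non-canonical base in the template would be expected to cause.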
Figure 2
Purification of hpTet3. a) Depiction of the domain structure of hpTet3 in comparison to mouse Tet3. b) SDS‐PAGE of the purified protein after two steps (original image in Supporting Information Figure S2b).
Figure 3
Oxidation of genomic DNA with hpTet3. a) M.SssI-methylated λDNA (dam, dcm) was digested, isotope standards were added (see Supporting Information for the complete list) and the mixture was analyzed by isotope dilution triple quadrupole mass spectrometry, giving quantitative data for all nucleosides (left panel). The quantitative analysis of the nucleoside composition of genomic DNA was repeated after treatment of the genomic DNA with hpTet3 (right panel). b) This analysis was performed on various human and mouse genomes with natural methylation levels. Bars represent the mean and error bars the standard deviation of biological replicates (n=3).
Figure 4
Application of hpTet3 for SMRT sequencing. a) Sequencing and model training workflow: dcm- and dam-negative DNA from lambda phage was sequenced using the Sequel IIe system, HiFi reads were aligned, and IPD and PW values (features) were extracted using ccsmeth. With ccsmeth, we trained a 5mdC-detecting model as well as a 5cadC-detecting model. b) Mean IPD and PW values (z-score normalized) extracted by ccsmeth across a 21-mer centered on a given CpG for unmodified λDNA (LMD-dC, grey), methylated λDNA (LMD-mdC, red) and carboxylated λDNA (LMD-cadC, blue). c) Methylation frequency of the λ genome using three analysis methods. The left graph shows Bisulfite-Seq data for LMD-mdC and LMD-dC. The middle graph shows the methylation frequency (PacBio's CCS) of LMD-mdC and LMD-dC using a custom-trained mdC model. The right graph shows the methylation frequency (PacBio's CCS) of M.SssI-methylated, hpTet3-oxidized λDNA (LMD-cadC) and LMD-dC with a custom-trained cadC model.
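The feature preparation mentioned in panel b (z-score normalization and a 21-mer window centered on a CpG) can be sketched as follows. This is an illustration under stated assumptions, not the ccsmeth implementation: the helper names `zscore` and `centered_window` are hypothetical, normalization is assumed to be per read using the population standard deviation, and windows that run past the read ends are simply skipped.

```python
from statistics import mean, pstdev
from typing import List, Optional, Sequence


def zscore(values: Sequence[float]) -> List[float]:
    """Z-score normalize kinetic values (e.g. IPD or PW) of one read."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        # Constant signal: every value equals the mean.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def centered_window(
    values: Sequence[float], center: int, k: int = 21
) -> Optional[List[float]]:
    """Return the k values centered on index `center`, or None if the
    window would extend past either end of the read."""
    if k % 2 == 0:
        raise ValueError("k must be odd so the window has a center")
    half = k // 2
    if center - half < 0 or center + half >= len(values):
        return None
    return list(values[center - half : center + half + 1])


# Hypothetical read of 30 IPD values with a CpG cytosine at index 15.
ipd_values = [float(i % 5) for i in range(30)]
features = centered_window(zscore(ipd_values), center=15)
print(len(features))  # 21
```

Averaging such windows over all CpG sites of a sample would give per-position mean profiles of the kind plotted in panel b.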
