Chem Sci. 2024 May 20;15(26):10073-10083.
doi: 10.1039/d4sc00930d. eCollection 2024 Jul 3.

Simultaneous detection of 5-methylcytosine and 5-hydroxymethylcytosine at specific genomic loci by engineered deaminase-assisted sequencing

Neng-Bin Xie et al. Chem Sci. 2024.

Abstract

Cytosine modifications, particularly 5-methylcytosine (5mC) and 5-hydroxymethylcytosine (5hmC), play crucial roles in numerous biological processes. Current analytical methods are often limited to detecting either 5mC or 5hmC separately, or only the sum of both modifications. The ability to simultaneously detect C, 5mC, and 5hmC at the same genomic locations with precise stoichiometry is highly desirable. Herein, we introduce a method termed engineered deaminase-assisted sequencing (EDA-seq) for the simultaneous quantification of C, 5mC, and 5hmC at the same genomic sites. EDA-seq utilizes an engineered protein, eA3A-M5, derived from human APOBEC3A (A3A). eA3A-M5 exhibits distinct deamination capabilities for C, 5mC, and 5hmC. In EDA-seq, C undergoes complete deamination and is sequenced as T; 5mC is partially deaminated, resulting in a mixed readout of T and C; and 5hmC remains undeaminated and is read as C. Consequently, the proportion of T readouts (P_T) reflects the combined contributions of C and 5mC, with the 5mC contribution scaled by the deamination rate of 5mC (R_5mC). By determining the R_5mC and P_T values, we can deduce the precise levels of C, 5mC, and 5hmC at particular genomic locations. We successfully used EDA-seq to simultaneously measure C, 5mC, and 5hmC at specific loci within human lung cancer tissue and its normal counterpart. The results from EDA-seq showed strong concordance with those obtained from the combined application of BS-seq and ACE-seq. EDA-seq eliminates the need for bisulfite treatment, DNA oxidation, or glycosylation, and uniquely enables simultaneous quantification of C, 5mC, and 5hmC at the same genomic locations.
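
The relationship between the proportion of T readouts and the underlying base fractions can be written out explicitly. The following is a sketch based only on the description above; the symbols f_C, f_5mC, and f_5hmC (the fractions of each base at one cytosine site) are notation assumed here and are not taken from the paper.

```latex
\begin{align}
P_T &= f_{\mathrm{C}} + R_{\mathrm{5mC}}\, f_{\mathrm{5mC}}, \\
1 &= f_{\mathrm{C}} + f_{\mathrm{5mC}} + f_{\mathrm{5hmC}}.
\end{align}
```

A single measurement of the T proportion therefore gives two equations for three unknowns. Under this reading, one further measurement taken at a different 5mC deamination rate is needed to fix all three fractions.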

Conflict of interest statement

The authors declare no competing financial interest.

Figures

Fig. 1. Principle of EDA-seq. (A) C is deaminated by eA3A-M5 to form U, which pairs with A; 5mC is partially deaminated by eA3A-M5, so it pairs partly with A and partly with G; 5hmC is not deaminated by eA3A-M5 and still pairs with G. (B) In EDA-seq, upon treatment with eA3A-M5, C is completely deaminated and read as T during sequencing; 5mC is partially deaminated and read as both C and T; 5hmC is not deaminated and is read as C. The proportion of T readouts at a specific cytosine site equals the proportion of C plus the proportion of 5mC multiplied by the deamination rate of 5mC. The P_T1 and P_T2 values are determined from the sequencing results of the analyzed DNA treated with eA3A-M5 for different times. The R_5mC1 and R_5mC2 values are obtained from the sequencing results of the 5mC spike-in DNA. The proportions of C, 5mC and 5hmC are then obtained according to eqn (5)–(7).
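
The relation in panel B, measured at two treatment times, gives a pair of linear equations that can be solved directly. The sketch below shows one way to do that under stated assumptions: the function and argument names are hypothetical, all inputs are assumed to be fractions between 0 and 1, and the algebra is derived from the caption rather than copied from eqn (5)–(7), which are not reproduced here. The input values in the usage lines are made up for illustration and are not data from the paper.

```python
def eda_seq_fractions(p_t1, p_t2, r1, r2):
    """Estimate (C, 5mC, 5hmC) fractions at one cytosine site.

    Assumed model (from the caption, not the paper's equations):
        p_t1 = f_c + r1 * f_5mc
        p_t2 = f_c + r2 * f_5mc
        f_c + f_5mc + f_5hmc = 1
    p_t1, p_t2: T-readout proportions at the two treatment times.
    r1, r2: 5mC deamination rates from the spike-in at the same times.
    """
    if abs(r1 - r2) < 1e-12:
        # Identical rates make the two equations redundant.
        raise ValueError("r1 and r2 must differ to separate C from 5mC")
    # Subtracting the two equations eliminates f_c.
    f_5mc = (p_t1 - p_t2) / (r1 - r2)
    # Back-substitute into the first equation.
    f_c = p_t1 - r1 * f_5mc
    # Whatever is neither C nor 5mC is attributed to 5hmC.
    f_5hmc = 1.0 - f_c - f_5mc
    return f_c, f_5mc, f_5hmc


# Hypothetical inputs, chosen only to illustrate the arithmetic.
f_c, f_5mc, f_5hmc = eda_seq_fractions(0.5, 0.7, 0.25, 0.75)
print(round(f_c, 3), round(f_5mc, 3), round(f_5hmc, 3))
```

Because the 5mC estimate divides by the difference between the two rates, treatment times whose rates are close together would amplify sequencing noise in this formulation.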
Fig. 2. Assessment of eA3A-M5 specificity for C, 5mC, and 5hmC across various sequence contexts. (A) The amino acid compositions of wtA3A and engineered A3A mutants (eA3A-v1 to eA3A-M5). (B) Sanger sequencing results for DNA substrates containing C, 5mC, and 5hmC (DNA-C1, DNA-5mC1, and DNA-5hmC1, respectively) as analyzed by EDA-seq. eA3A-M5 converted all the C sites to U, which were subsequently read as T during sequencing; 5mC sites underwent partial deamination, resulting in a mixed read of C and T; 5hmC sites remained unaltered by eA3A-M5 and continued to be read as C.
Fig. 3. Quantitative assessment of the 5mC level at cytosine sites containing C and 5mC by EDA-seq. (A) DNA-C2 and DNA-5mC2 were mixed at varying ratios, with DNA-5mC2 ranging from 0% to 100%. A 0.1% DNA-5mC2 spike-in was introduced to determine the deamination rate of 5mC. (B) Sanger sequencing results of TC sites in the DNA-C2 and DNA-5mC2 mixture, along with the DNA-5mC2 spike-in. (C) Linear regression analysis of the measured proportion of 5mC at different sequence contexts against the corresponding theoretical proportion of 5mC at those sequence contexts.
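
When a site carries only C and 5mC, as in the mixtures of panel A, a single measurement plus the spike-in rate should be enough to estimate the 5mC level. The sketch below assumes the proportion of T readouts equals the C fraction plus the 5mC fraction multiplied by the 5mC deamination rate, with the C fraction equal to one minus the 5mC fraction. The function name is hypothetical, and the deamination rate used in the round-trip check is a made-up value, not one reported in the paper.

```python
def fraction_5mc_binary(p_t, r_5mc):
    """Estimate the 5mC fraction at a site containing only C and 5mC.

    Assumed model: p_t = (1 - f_5mc) + r_5mc * f_5mc,
    which rearranges to f_5mc = (1 - p_t) / (1 - r_5mc).
    """
    if not 0.0 <= r_5mc < 1.0:
        # At r_5mc == 1, 5mC would read entirely as T, like C.
        raise ValueError("r_5mc must be in [0, 1)")
    return (1.0 - p_t) / (1.0 - r_5mc)


# Round-trip check with a hypothetical deamination rate.
rate = 0.4
for expected in (0.0, 0.25, 0.5, 0.75, 1.0):
    simulated_p_t = (1.0 - expected) + rate * expected
    recovered = fraction_5mc_binary(simulated_p_t, rate)
    print(expected, round(recovered, 6))
```

The estimate becomes less stable as the deamination rate approaches 1, because C and 5mC then produce nearly the same readout.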
Fig. 4. Quantitative assessment of C, 5mC, and 5hmC levels at cytosine sites containing C, 5mC, and 5hmC. (A) DNA-C2, DNA-5mC2, and DNA-5hmC2 were mixed at proportions of 30%, 30%, and 40%, respectively. A 0.1% DNA-5mC2 spike-in was included in the mixture to determine the deamination rate of 5mC. (B) Sanger sequencing results of TC sites in the DNA-C2, DNA-5mC2, and DNA-5hmC2 mixture, as well as the DNA-5mC2 spike-in, after treatment with eA3A-M5 for varying durations. (C) The proportions of C, 5mC, and 5hmC at different sequence contexts measured by EDA-seq. Dashed lines indicate the theoretical proportions of C, 5mC, and 5hmC.
Fig. 5. Quantitative assessment of C, 5mC, and 5hmC at specific genomic loci from lung cancer tissue and adjacent normal tissue using EDA-seq and combined BS-seq and ACE-seq. (A) Schematic illustration of the quantitative detection of C, 5mC, and 5hmC at specific genomic loci by EDA-seq. THRA 5mC spike-in and ALS2CL 5mC spike-in were added to determine the deamination rate of 5mC. (B) Sanger sequencing results of the chr17.38222379 site from normal lung tissue and corresponding lung cancer tissue, along with the THRA 5mC spike-in treated with eA3A-M5 for varying durations. (C) Sanger sequencing results of the chr17.38222379 site from normal lung tissue and corresponding lung cancer tissue using BS-seq and ACE-seq. (D) The proportions of C, 5mC, and 5hmC at the chr17.38222379 site detected by EDA-seq and the combined BS-seq and ACE-seq. (E) Sanger sequencing results of the chr3.46713993 site from normal lung tissue and corresponding lung cancer tissue, along with the ALS2CL 5mC spike-in treated with eA3A-M5 for varying durations. (F) Sanger sequencing results of the chr3.46713993 site from normal lung tissue and corresponding lung cancer tissue using BS-seq and ACE-seq. (G) The proportions of C, 5mC, and 5hmC at the chr3.46713993 site detected by EDA-seq and the combined BS-seq and ACE-seq.

