LightCpG: a multi-view CpG sites detection on single-cell whole genome sequence data
- PMID: 31014252
- PMCID: PMC6480911
- DOI: 10.1186/s12864-019-5654-9
Erratum in
- Correction to: LightCpG: a multi-view CpG sites detection on single-cell whole genome sequence data. BMC Genomics. 2019 May 13;20(1):365. doi: 10.1186/s12864-019-5742-x. PMID: 31084602.
Abstract
Background: DNA methylation plays an important role in multiple biological processes that are closely related to human health. Studying DNA methylation can provide insight into the mechanisms behind human health and can also aid the assessment of human health status. However, the available sequencing technology is limited by incomplete CpG coverage. It is therefore crucial to find an efficient and convenient method capable of distinguishing between the states of CpG sites. Previous studies on identifying the methylation states of CpG sites in single cells evaluated only sequence information or structural information.
Results: In this paper, we propose a novel model, LightCpG, which combines positional features with sequence and structural features to provide information on the CpG sites at two stages. We then used the LightGBM model to train the CpG site identification, and further used sample extraction and merged features to reduce the training time. Our results indicate that our method achieves outstanding performance in recognizing DNA methylation. The average AUC values of our method on the 25 human hepatocellular carcinoma (HCC) cell datasets and the six human hepatoblastoma-derived (HepG2) cell datasets were 0.9616 and 0.9213, respectively. Moreover, the average training times for our method on the HCC and HepG2 datasets were 8.3 and 5.06 s, respectively. Furthermore, the computational complexity of our model was much lower than that of other available methods that detect the methylation states of CpG sites.
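The per-cell AUC values averaged above can be illustrated with a small sketch. It is not the authors' evaluation code: the function names `roc_auc` and `average_auc` and the toy labels and scores are assumptions made for illustration. It computes AUC with the standard rank-based (Mann-Whitney) formula and then averages it over cells, assuming each cell supplies binary methylation labels (1 = methylated) and predicted scores.

```python
from statistics import mean


def roc_auc(labels, scores):
    """AUC via the Mann-Whitney U statistic, using average ranks for ties."""
    pairs = sorted(zip(scores, labels))  # ascending by predicted score
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the end of the group of tied scores starting at i.
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied group
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC is undefined unless both classes are present")

    pos_rank_sum = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    u = pos_rank_sum - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


def average_auc(per_cell):
    """Mean AUC over cells; per_cell is an iterable of (labels, scores)."""
    return mean(roc_auc(labels, scores) for labels, scores in per_cell)


# Toy data for two hypothetical cells (not taken from the paper).
cells = [
    ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]),
    ([0, 1], [0.2, 0.9]),
]
print(average_auc(cells))
```

Averaging per-cell AUCs weights every cell equally regardless of how many CpG sites it covers, which is one common convention; whether the paper used exactly this convention is an assumption here.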
Conclusions: In summary, LightCpG is an accurate model for identifying the DNA methylation status of CpG sites in single cells. Furthermore, the three types of feature extraction methods and the two strategies used in LightCpG are helpful for other prediction problems.
Keywords: DNA methylation; LightGBM; Positional features; Sequence features; Structural features.
Conflict of interest statement
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.