Int J Mol Sci. 2022 Mar 9;23(6):2959.
doi: 10.3390/ijms23062959.

Tissue-Specific Methylation Biosignatures for Monitoring Diseases: An In Silico Approach

Makrina Karaglani et al. Int J Mol Sci. 2022.

Abstract

Tissue-specific gene methylation events are key to the pathogenesis of several diseases and can be utilized for diagnosis and monitoring. Here, we established an in silico pipeline that analyzes high-throughput methylome datasets to identify specific methylation fingerprints in three pathological entities of major burden: breast cancer (BrCa), osteoarthritis (OA) and diabetes mellitus (DM). Differential methylation analysis compared tissues/cells related to each pathology with different types of healthy tissues, revealing Differentially Methylated Genes (DMGs). High-performing biosignatures with few features were built with automated machine learning, including: (1) a five-gene biosignature discriminating BrCa tissue from healthy tissues (AUC 0.987 and precision 0.987), (2) three equivalent OA cartilage-specific biosignatures containing four genes each (AUC 0.978 and precision 0.986) and (3) a four-gene pancreatic β-cell-specific biosignature (AUC 0.984 and precision 0.995). Next, the BrCa biosignature was validated on an independent ccfDNA dataset, showing an AUC and precision of 1.000 and verifying the biosignature's applicability in liquid biopsy. Functional and protein interaction prediction analysis revealed that most of the DMGs identified are involved in pathways known to be related to the studied diseases or point to new ones. Overall, our data-driven approach contributes to the maximum exploitation of high-throughput methylome readings, helping to establish specific disease profiles to be applied in clinical practice and to understand human pathology.

Keywords: breast cancer; diabetes; liquid biopsy; machine learning; methylation; microarrays; model; osteoarthritis.
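
The abstract reports each biosignature's performance as an AUC and a precision. As a minimal sketch of what those two numbers measure, the following Python computes both from class labels and predicted probabilities. The function names, the 0.5 decision threshold and the toy data are assumptions made for illustration; this is not the AutoML software or evaluation protocol used in the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive sample has the
    higher score. Tied pairs count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def precision(labels, scores, threshold=0.5):
    """Fraction of samples predicted positive (score >= threshold)
    that are truly positive. The threshold is an assumed default."""
    predicted_pos = [y for y, s in zip(labels, scores) if s >= threshold]
    if not predicted_pos:
        raise ValueError("no sample was predicted positive")
    return sum(predicted_pos) / len(predicted_pos)


# Toy data (hypothetical): 1 = disease tissue, 0 = healthy tissue.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]

auc_value = roc_auc(labels, scores)
precision_value = precision(labels, scores)
print(f"AUC = {auc_value:.3f}, precision = {precision_value:.3f}")
```

An AUC of 1.000, as reported for the ccfDNA validation set, means that every disease sample received a higher score than every healthy sample. Precision, unlike AUC, depends on the chosen decision threshold.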

Conflict of interest statement

I.T. is the CEO and founder of JADBio.

Figures

Figure 1
Differential methylation analysis comparing BrCa and healthy tissues. Gene ontology analysis of the top 400 DMGs in the aspects of (A) biological process, (B) cellular component and (C) molecular function analysis. (D) Heatmap plot of top 100 DMGs comparing BrCa and healthy tissues. Abbreviations: BrCa = breast cancer, DMGs = differentially methylated genes.
Figure 2
BrCa-specific methylation biosignature built using AutoML. (A) ROC curves of training (blue line) and validation (green line) models. (B) Supervised PCA plot (i.e., only considering the selected relevant biomarkers) presents separation between BrCa (blue) and healthy tissues (green) within the training group. (C) Out-of-sample probability density plot (i.e., probability predictions when samples were not used for training) depicts discrete distributions among studied classes of the training group. (D) PCA plot presents separation between BrCa (blue) and healthy tissues (green) within the validation group. (E) ROC curves of training (blue line) and external validation (green line) models and (F) PCA plot presents separation between BrCa ccfDNA (blue) and healthy ccfDNA (green) within the external validation group. Abbreviations: BrCa = breast cancer, ROC = receiver operating characteristic, PCA = principal component analysis.
Figure 3
Differential methylation analysis comparing OA and healthy tissues. Gene ontology analysis of top 400 DMGs in the aspects of (A) biological process, (B) cellular component and (C) molecular function analysis. (D) Heatmap plot of top 100 DMGs comparing OA and healthy tissues. Abbreviations: OA = osteoarthritis, DMGs = differentially methylated genes.
Figure 4
OA-specific methylation biosignature built using AutoML. (A) ROC curves of training (blue line) and validation (green line) models. (B) Supervised PCA plot (i.e., only considering the selected relevant biomarkers) presents separation between OA (blue) and non-OA healthy tissues (green) within the training group. (C) Out-of-sample probability density plot (i.e., probability predictions when samples were not used for training) depicts discrete distributions among studied classes of the training group. (D) PCA plot presents separation between OA (blue) and non-OA healthy tissues (green) within the validation group. Abbreviations: OA = osteoarthritis, ROC = receiver operating characteristic, PCA = principal component analysis.
Figure 5
Differential methylation analysis comparing pancreatic β-cells and other tissues. Gene ontology analysis of 66 DMGs in the aspects of (A) biological process and (B) molecular function analysis. (C) Heatmap plot of 66 DMGs comparing pancreatic β-cells and other healthy tissues. Abbreviations: DMGs = differentially methylated genes.
Figure 6
Pancreatic β-cell-specific methylation biosignature built using AutoML. (A) ROC curve of model. (B) UMAP plot shows separation between pancreatic β-cells (blue) and other tissues (green). (C) Supervised PCA plot (i.e., only considering the selected relevant biomarkers) presents separation between pancreatic β-cells (blue) and other tissues (green). (D) Out-of-sample probability density plot (i.e., probability predictions when samples were not used for training) depicts discrete distributions among studied classes. Abbreviations: ROC = receiver operating characteristic, PCA = principal component analysis, UMAP = uniform manifold approximation and projection.
Figure 7
Study workflow. Abbreviations: DMGs = differentially methylated genes, GEO = Gene Expression Omnibus. Created with BioRender.com, accessed on 20 July 2021.
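
The differential methylation step that opens the workflow, and that yields the "top 100" and "top 400" DMGs shown in the figures, can be illustrated with a simplified ranking. The sketch below orders genes by the absolute difference in mean methylation (beta value) between two groups. The gene names, group labels and data are hypothetical, and the ranking criterion is an assumption chosen for brevity: it is not the statistical method used by the authors, and a real analysis would apply a proper statistical test with multiple-testing correction.

```python
from statistics import mean


def rank_by_delta_beta(beta, groups, case="case", control="control"):
    """Rank genes by |mean(case) - mean(control)| of their beta values.

    beta   : dict mapping gene name -> list of beta values (0..1),
             one per sample, in the same order as `groups`
    groups : list of group labels, one per sample
    Returns a list of (gene, delta) tuples, largest |delta| first.
    """
    ranked = []
    for gene, values in beta.items():
        case_vals = [v for v, g in zip(values, groups) if g == case]
        ctrl_vals = [v for v, g in zip(values, groups) if g == control]
        if not case_vals or not ctrl_vals:
            raise ValueError("both groups need at least one sample")
        ranked.append((gene, mean(case_vals) - mean(ctrl_vals)))
    ranked.sort(key=lambda item: abs(item[1]), reverse=True)
    return ranked


# Hypothetical data: two disease samples followed by two healthy samples.
groups = ["case", "case", "control", "control"]
beta = {
    "GENE_A": [0.80, 0.90, 0.20, 0.10],  # hypermethylated in cases
    "GENE_B": [0.50, 0.52, 0.49, 0.51],  # essentially unchanged
    "GENE_C": [0.10, 0.20, 0.60, 0.70],  # hypomethylated in cases
}

top = rank_by_delta_beta(beta, groups)
for gene, delta in top:
    print(f"{gene}: delta beta = {delta:+.2f}")
```

Taking the first N entries of the ranked list gives a "top N" gene set of the kind the figure captions refer to.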
