Medicine (Baltimore). 2024 Nov 8;103(45):e40484.
doi: 10.1097/MD.0000000000040484.

m6A-related genes and their role in Parkinson's disease: Insights from machine learning and consensus clustering

Jing Yan et al. Medicine (Baltimore).

Abstract

Parkinson disease (PD) is a chronic neurological disorder characterized primarily by a deficiency of dopamine in the brain. In recent years, numerous studies have highlighted the substantial influence of RNA N6-methyladenosine (m6A) regulators on various biological processes. Nevertheless, the specific contribution of m6A-related genes to the development and progression of PD remains uncertain.
In this study, we performed a differential analysis of the GSE8397 dataset from the Gene Expression Omnibus database to select important m6A-related genes, screened candidate genes with a random forest model to predict the risk of PD, and built a nomogram model from those candidates. Using a consensus clustering method, we divided PD into m6A clusters based on the selected significant m6A-related genes, and we then performed immune cell infiltration analysis to compare the clusters.
The differential analysis identified 11 important m6A-related genes. The random forest model selected 4 candidate genes: YTH Domain Containing 2, heterogeneous nuclear ribonucleoprotein C, leucine-rich pentatricopeptide repeat motif containing protein, and insulin-like growth factor binding protein-3. A nomogram model was built from these 4 genes, and decision curve analysis indicated that patients can benefit from it. Consensus clustering divided PD into 2 m6A clusters (cluster A and cluster B), and immune cell infiltration analysis revealed that the 2 clusters exhibit distinct immune phenotypes.
In conclusion, m6A-related genes play a significant role in the development of PD, and our study on m6A clustering may guide personalized treatment strategies for PD in the future.
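
The consensus clustering step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which software or settings the authors used, so everything below is an assumption made for illustration: the data are synthetic (two artificial groups of samples with 11 features standing in for gene expression values), the base clusterer is a plain k-means, and the names make_toy_data, kmeans, and consensus_matrix are hypothetical. The idea shown is only the general one: cluster many random subsamples and record how often each pair of samples lands in the same cluster.

```python
import random


def make_toy_data(n_per_group=10, n_genes=11, seed=1):
    """Synthetic stand-in for expression data: two well-separated sample groups."""
    rng = random.Random(seed)
    low = [[rng.gauss(0.0, 0.3) for _ in range(n_genes)] for _ in range(n_per_group)]
    high = [[rng.gauss(3.0, 0.3) for _ in range(n_genes)] for _ in range(n_per_group)]
    return low + high


def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Move each center to the mean of its members (empty clusters stay put).
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def consensus_matrix(data, k, n_resamples=50, frac=0.8, seed=0):
    """Fraction of resamples in which each co-sampled pair shares a cluster."""
    rng = random.Random(seed)
    n = len(data)
    together = [[0] * n for _ in range(n)]
    sampled = [[0] * n for _ in range(n)]
    for _ in range(n_resamples):
        idx = rng.sample(range(n), int(frac * n))
        labels = kmeans([data[i] for i in idx], k, rng)
        for a, i in enumerate(idx):
            for b, j in enumerate(idx):
                sampled[i][j] += 1
                together[i][j] += int(labels[a] == labels[b])
    return [
        [together[i][j] / sampled[i][j] if sampled[i][j] else 0.0 for j in range(n)]
        for i in range(n)
    ]


data = make_toy_data()
M = consensus_matrix(data, k=2)
print("consensus, two samples from the same toy group:", M[0][1])
print("consensus, samples from different toy groups:", M[0][-1])
```

Consensus values near 1 for pairs within a group and near 0 for pairs across groups indicate a stable partition at that k. Repeating the computation for several values of k and comparing the resulting matrices is the kind of comparison that consensus matrices for k = 2 to 5 and CDF curves are typically used for.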

Conflict of interest statement

The authors have no funding or conflicts of interest to disclose.

Figures

Figure 1.
Distribution of RNA N6-methyladenosine (m6A)-related genes in Parkinson disease (PD). (A) Volcano plot of the differential genes identified in the GSE8397 (PD) dataset. (B) Boxplots of differential expression of m6A-related genes between control samples and PD samples. (C) Heatmap of the expression of 11 m6A-related genes in control and PD samples. (D) Location of the 11 m6A-related genes on the chromosomes. (E) Correlation analysis of the 11 m6A-related genes. *P < .05, **P < .01, and ***P < .001.
Figure 2.
RF and SVM machine learning model construction. (A) Box plots showing the distribution of residuals for the RF and SVM models. (B) Inverse cumulative distribution of residuals for the RF and SVM models. (C) ROC curves showing the accuracy of the RF and SVM models. (D) Random forest tree results. (E) Importance scores for disease-characterizing genes. RF = random forest, ROC = receiver operating characteristic, SVM = support vector machine.
Figure 3.
Construction of the nomogram model. (A) Nomogram model based on the 4 candidate m6A-related genes. (B) Calibration curve of the nomogram model. (C) DCA of the nomogram model. (D) Clinical impact curves assessing the clinical impact of the nomogram model. DCA = decision curve analysis.
Figure 4.
Consensus clustering of 11 m6A-related genes in Parkinson disease. (A) Consensus clustering matrix when k = 2. (B) Consensus clustering matrix when k = 3. (C) Consensus clustering matrix when k = 4. (D) Consensus clustering matrix when k = 5. (E) Representative CDF curves. (F) Box plots of differential expression of 11 m6A-related genes in cluster A and cluster B. (G) Heatmap of the expression of 11 m6A-related genes in cluster A and cluster B. (H) Principal component analysis of cluster A and cluster B. (I) KEGG enrichment analysis of 35 DEGs. *P < .05, **P < .01, and ***P < .001. CDF = cumulative distribution function, DEGs = differentially expressed genes, KEGG = Kyoto Encyclopedia of Genes and Genomes, m6A = N6-methyladenosine.
Figure 5.
Analysis of immune cell infiltration by m6A subtypes. (A) Differential immune cell infiltration between cluster A and cluster B. (B) Correlation analysis between infiltrating immune cells and 11 m6A-related genes. (C) Difference in the abundance of infiltrating immune cells between high and low HNRNPA2B1 expression groups. *P < .05, **P < .01, and ***P < .001. HNRNPA2B1 = heterogeneous nuclear ribonucleoprotein A2/B1, m6A = N6-methyladenosine.
Figure 6.
Consensus clustering of 35 m6A-related DEGs in Parkinson disease. (A–D) Consensus matrix of 35 m6A-related DEGs with k = 2–5. (E) Expression heatmap of 35 m6A-related DEGs in gene cluster A and gene cluster B. (F) Boxplot of differential expression of 11 m6A-related genes in gene cluster A and gene cluster B. (G) Differential analysis of immune cell infiltration of gene cluster A and gene cluster B. *P < .05, **P < .01, and ***P < .001. DEGs = differentially expressed genes, m6A = N6-methyladenosine.
Figure 7.
Independent external dataset validation results. (A) ROC results of GSE22491. (B) ROC results of GSE28894. ROC = receiver operating characteristic.
