Zhong Nan Da Xue Xue Bao Yi Xue Ban. 2024 Feb 28;49(2):207-219.
doi: 10.11817/j.issn.1672-7347.2024.230307.

Screening of key immune-related gene in Parkinson's disease based on WGCNA and machine learning

[Article in English, Chinese]
Yiming Huang et al. Zhong Nan Da Xue Xue Bao Yi Xue Ban.

Abstract

Objectives: Abnormal activation of the immune system and inflammation play a crucial role in the pathogenesis of Parkinson's disease. However, how specific immune-related genes contribute to the onset and progression of the disease is not yet fully understood. This study aims to screen for key immune-related genes in Parkinson's disease using weighted gene co-expression network analysis (WGCNA) and machine learning.

Methods: Gene chip data were downloaded from the Gene Expression Omnibus (GEO) database, and WGCNA was used to identify important gene modules related to Parkinson's disease. Genes from the important modules were exported, and a Venn diagram of these Parkinson's disease-related genes and immune-related genes was drawn to obtain the immune-related genes of Parkinson's disease. Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were used to characterize the functions of the immune-related genes and the signaling pathways involved. Immune cell infiltration analysis was performed using the CIBERSORT package in R. The immune-related genes of Parkinson's disease were then screened further using a bioinformatics method and 3 machine learning methods [least absolute shrinkage and selection operator (LASSO) regression, random forest (RF), and support vector machine (SVM)]. A Venn diagram of the differentially expressed genes selected by the 4 methods was drawn, and the gene in their intersection was taken as the hub node (hub) gene. The downstream proteins of the Parkinson's disease hub gene were identified through the STRING database, and a protein-protein interaction network diagram was drawn.
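
The last screening step in the Methods, taking the intersection of the gene sets returned by the four methods, can be sketched as follows. This is a minimal Python illustration rather than the authors' R workflow. The per-method gene lists are hypothetical placeholders, chosen only so that the intersection matches the hub gene reported in the abstract, and `find_hub_genes` is an assumed helper name.

```python
from functools import reduce


def find_hub_genes(selections):
    """Return the sorted genes selected by every method in `selections`.

    `selections` maps a method name to an iterable of gene symbols.
    """
    gene_sets = [set(genes) for genes in selections.values()]
    if not gene_sets:
        return []
    # A hub gene must appear in the output of all methods (set intersection).
    common = reduce(set.intersection, gene_sets)
    return sorted(common)


# Hypothetical per-method selections (NOT the study's actual gene lists).
selections = {
    "limma": ["ANK1", "CD47", "ICAM2", "S100A9"],
    "LASSO": ["ANK1", "CD47", "HSPA6"],
    "RF": ["ANK1", "ICAM2", "HSPA6", "MICB"],
    "SVM": ["ANK1", "S100A9", "MICB"],
}

hub_genes = find_hub_genes(selections)
print(hub_genes)
```

Requiring agreement across all methods is a conservative choice: a gene that any single method fails to select is excluded, which is why such an intersection can shrink to a single gene.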

Results: A total of 218 immune-related genes associated with Parkinson's disease were identified, including 45 upregulated genes and 50 downregulated genes. Enrichment analysis showed that the 218 genes were mainly enriched in pathways for the immune system response to foreign substances and for viral infection. Immune infiltration analysis showed that the infiltration percentages of CD4+ T cells, NK cells, CD8+ T cells, and B cells were higher in samples from patients with Parkinson's disease, and that resting NK cells and resting CD4+ T cells showed significant infiltration in these samples. ANK1 was identified as the hub gene. Protein-protein interaction network analysis showed that the 11 downstream proteins of ANK1 mainly participate in signal transduction, iron homeostasis regulation, and immune system activation.

Conclusions: This study identifies ANK1 as a key immune-related gene in Parkinson's disease using WGCNA and machine learning methods, suggesting its potential as a candidate therapeutic target for Parkinson's disease.

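Objective: Abnormal activation of the immune system and inflammatory responses play an important role in the pathogenesis of Parkinson's disease. However, the specific roles and mechanisms of key immune-related genes in the onset and progression of Parkinson's disease remain poorly understood. This study aims to screen for key immune-related genes in Parkinson's disease using weighted gene co-expression network analysis (WGCNA) and machine learning.

Methods: Gene chip data were downloaded from the Gene Expression Omnibus (GEO) database, and WGCNA was used to identify important gene modules associated with Parkinson's disease. The genes in the important modules were exported, and a Venn diagram of the important Parkinson's disease-related genes and immune-related genes was drawn to obtain the immune-related genes of Parkinson's disease. Gene ontology (GO) analysis and the Kyoto Encyclopedia of Genes and Genomes (KEGG) were used for in-depth analysis of the functions of the immune-related genes and the signaling pathways in which they participate. Immune cell infiltration analysis was performed with the CIBERSORT package in R. A bioinformatics method and 3 machine learning methods [least absolute shrinkage and selection operator (LASSO) regression, random forest (RF), and support vector machine (SVM)] were used to further screen the immune-related genes of Parkinson's disease. A Venn diagram of the differentially expressed genes selected by the 4 methods was drawn, and the intersecting gene was taken as the hub node (hub) gene. The downstream proteins of the Parkinson's disease hub gene were searched in the STRING database, and a protein-protein interaction network diagram was drawn.

Results: Among the genes in the important Parkinson's disease modules, 218 immune-related genes were identified, of which 45 were upregulated and 50 were downregulated. Enrichment analysis showed that the 218 genes were mainly enriched in pathways for the immune system response to foreign substances and for viral infection. Immune infiltration analysis showed that the infiltration percentages of CD4+ T cells, NK cells, CD8+ T cells, and B cells were higher in samples from patients with Parkinson's disease, and that resting NK cells and resting CD4+ T cells showed significant infiltration in these samples. The hub gene selected by the 4 methods was ANK1. Protein-protein interaction network analysis of the intersecting gene showed that the 11 proteins translated and expressed by the ANK1 gene mainly participate in signal transduction, iron homeostasis regulation, and immune system activation.

Conclusion: Using WGCNA and machine learning methods, ANK1 was identified as a key immune-related gene in Parkinson's disease; this gene may become a candidate target for the diagnosis and treatment of Parkinson's disease.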

Keywords: ANK1 gene; Parkinson’s disease; immunity; machine learning; weighted gene co-expression network analysis.

Conflict of interest statement

The authors declare no conflicts of interest.

Figures

Figure 1. Optimal soft threshold analysis diagram. A: The ordinate is the intra-module connectivity, or correlation, of each gene within the module, also known as the internal connectivity of the gene (R²=0.85). B: The ordinate is the mean connectivity (249.07), the degree of correlation between a gene and the other genes in the network.
Figure 2. Gene module cluster analysis.
Figure 3. Heat map of the correlation between gene modules and Parkinson's disease status.
Figure 4. Venn diagram for screening Parkinson's disease-related and immune-related genes.
Figure 5. GO enrichment: biological function. GO: Gene ontology.
Figure 6. GO enrichment: cellular component. GO: Gene ontology.
Figure 7. GO enrichment: molecular function. GO: Gene ontology; RAGE: Receptor for advanced glycation end products.
Figure 8. KEGG pathway analysis. KEGG: Kyoto Encyclopedia of Genes and Genomes.
Figure 9. Immune cell infiltration analysis. A: Infiltration levels of common immune cells in different samples. GSM184354 to GSM184362, GSM503950 to GSM503957, and GSM1192691 to GSM1192698 are normal samples; GSM184363 to GSM184378, GSM503958 to GSM503967, and GSM1192704 to GSM1192718 are disease samples. B: Heatmap of correlations between different types of immune cells. C: Infiltration results for different types of immune cells in normal samples and in samples from patients with Parkinson's disease.
Figure 10. Differential expression analysis of turquoise module genes.
Figure 11. LASSO regression. A: The abscissa is the logarithm of lambda. The larger the lambda, the more model coefficients shrink to zero. A total of 11 genes are selected, as indicated by the number of lines. B: The abscissa is the number of non-zero coefficients in the model. As lambda increases, the number of non-zero coefficients decreases, and the optimal solution contains 11 genes.
Figure 12. Random forest feature importance. CD47: Cluster of differentiation 47; HSPA6: Heat shock protein family A (HSP70) member 6; ICAM2: Intercellular adhesion molecule 2; GPR27: G protein-coupled receptor 27; MICB: MHC class I polypeptide-related sequence B; SIRPA: Signal regulatory protein alpha; MICA: MHC class I polypeptide-related sequence A; ANK1: Ankyrin 1; EMP3: Epithelial membrane protein 3; CXCR2: C-X-C chemokine receptor 2; TIMM13: Translocase of inner mitochondrial membrane 13 homolog; L1CAM: L1 cell adhesion molecule; APOBEC3A: Apolipoprotein B mRNA editing enzyme catalytic subunit 3A; BZW2: Basic leucine zipper and W2 domains 2; PDGFRL: Platelet-derived growth factor receptor-like; CCNA1: Cyclin A1; S100A9: S100 calcium binding protein A9; METRNL: Meteorin like, glial cell differentiation regulator; SIGLEC10: Sialic acid binding Ig like lectin 10; HLA-E: Human leukocyte antigen E; IncMSE: Increase in mean squared error.
Figure 13. Venn diagram of the genes jointly selected by the 4 methods. LASSO: Least absolute shrinkage and selection operator; RF: Random forest; SVM: Support vector machine; Limma: Linear models for microarray data.
Figure 14. ANK1 downstream protein interaction network and betweenness centrality ranking. A: The 11 downstream proteins of the ANK1 gene. B: The protein interaction network diagram, showing the connections between the proteins. ANK1: Ankyrin 1; ANK2: Ankyrin 2; OBSCN: Obscurin; TTN: Titin; CD44: Cluster of differentiation 44; ITGAM: Integrin subunit alpha M; SPTB: Spectrin beta, erythrocytic; NFASC: Neurofascin; SPTA1: Spectrin alpha, erythrocytic 1; RHAG: Rh-associated glycoprotein; SLC4A1: Solute carrier family 4 member 1.
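
The Figure 11 caption notes that a larger lambda drives more LASSO coefficients to zero. In the special case of an orthonormal design matrix, the LASSO estimate of each coefficient is the soft-thresholded least-squares estimate, which makes this behavior easy to see. The sketch below assumes that special case and uses made-up coefficient values; it is not the model fitted in the study, and `soft_threshold` and `count_nonzero` are illustrative helper names.

```python
def soft_threshold(coef, lam):
    """LASSO estimate for one coefficient under an orthonormal design."""
    magnitude = max(abs(coef) - lam, 0.0)
    return magnitude if coef >= 0 else -magnitude


def count_nonzero(ols_coefs, lam):
    """Count the coefficients that survive shrinkage at penalty `lam`."""
    return sum(1 for c in ols_coefs if soft_threshold(c, lam) != 0.0)


# Hypothetical least-squares coefficients for six genes (illustrative only).
ols_coefs = [2.5, -1.8, 0.9, -0.4, 0.15, 0.05]

for lam in (0.0, 0.1, 0.5, 1.0, 2.0, 3.0):
    print(f"lambda={lam:<4} non-zero coefficients={count_nonzero(ols_coefs, lam)}")
```

In practice lambda is usually chosen by cross-validation, and the genes whose coefficients remain non-zero at the chosen value form the selected set.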
