Brain Sci. 2023 Jan 20;13(2):175.
doi: 10.3390/brainsci13020175.

Parkinson's Disease Gene Biomarkers Screened by the LASSO and SVM Algorithms


Yiwen Bao et al. Brain Sci. 2023.

Abstract

Parkinson's disease (PD) is a common progressive neurodegenerative disorder. Several lines of evidence suggest that peripheral immune cells may infiltrate the substantia nigra, which may be essential for PD. Our study uses machine learning (ML) to screen for potential PD genetic biomarkers. Gene expression profiles were obtained from the Gene Expression Omnibus (GEO). Differentially expressed genes (DEGs) were selected for enrichment analysis. A protein-protein interaction (PPI) network was built with the STRING database (Search Tool for the Retrieval of Interacting Genes), and two ML approaches, namely least absolute shrinkage and selection operator (LASSO) and support vector machine recursive feature elimination (SVM-RFE), were employed to identify candidate genes. An external validation dataset was then used to test the expression levels and diagnostic value of the candidate biomarkers. Diagnostic performance was assessed with receiver operating characteristic (ROC) curves. The deconvolution tool CIBERSORT was employed to estimate the composition of immune cells, and correlation analyses were performed on the training dataset. Twenty-seven DEGs were identified between the PD and control samples. The enrichment analysis showed a close association with inflammatory and immune-associated diseases. The LASSO and SVM-RFE algorithms screened eight and six characteristic genes, respectively. AGTR1, GBE1, TPBG, and HSPA6 were the overlapping hub genes and were strongly related to PD. The areas under the ROC curve (AUC) for AGTR1 (AUC = 0.933), GBE1 (AUC = 0.967), TPBG (AUC = 0.767), and HSPA6 (AUC = 0.633) suggested that these genes have good diagnostic value, and the genes were significantly associated with the degree of immune cell infiltration. AGTR1, GBE1, TPBG, and HSPA6 were identified as potential biomarkers for the diagnosis of PD and provide a novel viewpoint for further study of the immune mechanism and therapy of PD.
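The AUC values quoted in the abstract summarize how well a single gene's expression separates PD from control samples. As a minimal sketch of that idea (not the authors' pipeline), the AUC can be computed as the fraction of PD/control pairs in which the PD sample has the higher value, counting ties as one half. The function name `roc_auc` and the expression values below are hypothetical and are not data from the study.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a random positive sample scores
    higher than a random negative sample (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample from each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical expression values for one gene (NOT data from the study).
expression = [2.1, 2.4, 3.0, 3.3, 3.1, 3.6, 3.9, 4.2]
is_pd = [0, 0, 0, 0, 1, 1, 1, 1]  # 0 = control, 1 = PD

auc = roc_auc(expression, is_pd)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to chance-level separation and 1.0 to perfect separation. For a gene that is lower in PD than in controls, the scores would be negated before calling the function.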

Keywords: Parkinson’s disease; immune infiltrates; least absolute shrinkage and selection operator; support vector machine.


Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
Flowchart of this study.
Figure 2
PCA and DEGs in the substantia nigra of PD patients and normal controls. (A) PCA of the raw data before batch correction; (B) PCA after ComBat correction; (C) heatmap of the significant DEGs, in which the two colors denote distinct trends and a darker color denotes a more pronounced trend; and (D) volcano plot of the DEGs. Red and green denote upregulated and downregulated genes, respectively, while grey denotes no significant difference. PCA: principal component analysis; DEGs: differentially expressed genes.
Figure 3
Results of the enrichment analysis of the differentially expressed genes (DEGs). (A) Gene Ontology (GO) enrichment analysis, where the x-axis refers to the generation, and the y-axis refers to the significantly enriched GO terms of the modules. (B) Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment shown as a Circos plot. Each column in the outermost circle corresponds to a KEGG pathway. The second circle represents the number of genes contained in each pathway; the redder the color, the more significant the enrichment of DEGs. The third circle represents the number of DEGs enriched. The innermost circle represents the proportion of DEGs among the genes of the enriched pathway. (C) Disease Ontology (DO) enrichment analysis, where the x-axis refers to the gene count, and the y-axis refers to the enriched diseases.
Figure 4
PPI network construction and the two subcluster modules extracted by MCODE. (A) The interaction network among the proteins encoded by the DEGs (25 nodes and 96 edges). Each node refers to a protein, while each edge refers to a protein–protein interaction between two nodes. (B) Subcluster module 1 was extracted by MCODE and consisted of 10 nodes and 41 edges; MCODE score = 9.111. (C) Subcluster module 2 consisted of 5 nodes and 8 edges; MCODE score = 4.
Figure 5
LASSO and SVM-RFE jointly screened and verified the characteristic gene biomarkers. (A) Eight genes were extracted as PD gene biomarkers with the LASSO algorithm; (B) six genes were extracted as PD gene biomarkers with the SVM-RFE algorithm; (C) Venn diagram indicating the four genes shared between LASSO and SVM-RFE. LASSO: least absolute shrinkage and selection operator; SVM-RFE: support vector machine recursive feature elimination.
Figure 6
Expression levels of the four genes in the verification dataset GSE20164 for the substantia nigra samples of the control and PD groups. (A) AGTR1: p = 0.014; (B) GBE1: p = 0.002; (C) TPBG: p = 0.15; (D) HSPA6: p = 0.28. p < 0.05 denotes statistical significance. AGTR1: angiotensin II type 1 receptor; GBE1: glycogen branching enzyme; TPBG: trophoblast glycoprotein; and HSPA6: heat shock 70-kDa protein 6.
Figure 7
The ROC curves of the four genes in the validation dataset. (A) AGTR1: AUC = 0.933; (B) GBE1: AUC = 0.967; (C) TPBG: AUC = 0.767; and (D) HSPA6: AUC = 0.633. AGTR1: angiotensin II type 1 receptor; GBE1: glycogen branching enzyme; TPBG: trophoblast glycoprotein; and HSPA6: heat shock 70-kDa protein 6.
Figure 8
Analysis of the infiltrating immune cells. (A) Comparison of the proportions of 22 immune cell types between the control group and the PD group. The x-axis refers to immune cells, and the y-axis refers to the relative percentage. (B) Differences in immune cell infiltration. Blue and red represent the control group and the PD group, respectively. The x-axis represents the type of immune cells, and the y-axis represents the fraction. p < 0.05 denotes statistical significance (memory B cells, M2 macrophages, and activated dendritic cells show significantly different infiltration). (C) Correlations among the infiltrating immune cells. The x- and y-axes represent the immune cell types; red refers to a positive correlation, and blue refers to a negative correlation. A darker color represents a stronger association.
Figure 9
Immune cell infiltration correlations of the four selected genes. (A) Lollipop plot of the correlation between AGTR1 and immune cells; (B,C) scatter plots of the significant correlations between AGTR1 and immune cells (M2 macrophages: R = −0.46, p = 0.0073; monocytes: R = −0.53, p = 0.0017); (D) lollipop plot of the correlation between GBE1 and immune cells; (E,F) scatter plots of the significant correlations between GBE1 and immune cells (T cells CD4 memory resting: R = −0.35, p = 0.046; monocytes: R = −0.38, p = 0.029); (G) lollipop plot of the correlation between TPBG and immune cells; (H) scatter plot of the significant correlation between TPBG and monocytes (R = −0.46, p = 0.007); (I) lollipop plot of the correlation between HSPA6 and immune cells; (J) scatter plot of the significant correlation between HSPA6 and plasma cells (R = −0.45, p = 0.0089). In the right-hand column of the lollipop plots, p-values < 0.05 are shown in red to indicate statistical significance. AGTR1: angiotensin II type 1 receptor; GBE1: glycogen branching enzyme; TPBG: trophoblast glycoprotein; and HSPA6: heat shock 70-kDa protein 6.
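The R values in Figure 9 relate a gene's expression to the estimated fraction of an immune cell type across samples. The captions do not say which correlation coefficient was used, so the following is only a sketch that assumes a Spearman rank correlation, implemented as a Pearson correlation on average ranks. The helper names and the sample values are hypothetical and are not data from the study, and no p-value is computed here.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-sample values (NOT data from the study).
gene_expression = [5.2, 4.8, 6.1, 5.9, 4.1, 6.5]
monocyte_fraction = [0.08, 0.10, 0.05, 0.04, 0.12, 0.02]

rho = spearman(gene_expression, monocyte_fraction)
print(f"R = {rho:.2f}")
```

A negative coefficient, as in the caption's monocyte results, means that samples with higher expression of the gene tend to have a lower estimated fraction of that cell type.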


