Sci Rep. 2022 Aug 17;12(1):13963.
doi: 10.1038/s41598-022-18273-x.

Identification of key candidate genes for IgA nephropathy using machine learning and statistics based bioinformatics models

Md Al Mehedi Hasan et al. Sci Rep.

Abstract

Immunoglobulin A nephropathy (IgAN) is a kidney disease caused by the accumulation of IgA deposits in the kidneys, which causes inflammation and damage to the kidney tissues. Bioinformatics approaches are widely used to predict novel candidate genes and pathways associated with IgAN. However, the molecular mechanisms and causes of IgAN development and progression have not yet been clearly explored. Therefore, the present study aimed to identify key candidate genes for IgAN using machine learning (ML) and statistics-based bioinformatics models. First, differentially expressed genes (DEGs) were identified using limma, and enrichment analysis was then performed on the DEGs using DAVID. A protein-protein interaction (PPI) network was constructed from the DEGs using STRING, and Cytoscape was used to determine hub genes based on connectivity, and hub modules and their associated genes based on MCODE scores. Furthermore, three ML-based algorithms, namely support vector machine (SVM), least absolute shrinkage and selection operator (LASSO), and partial least squares discriminant analysis (PLS-DA), were applied to identify the discriminative genes of IgAN from the DEGs. Finally, the key candidate genes (FOS, JUN, EGR1, FOSB, and DUSP1) were identified as the genes overlapping among the selected hub genes, the hub module genes, and the discriminative genes from SVM, LASSO, and PLS-DA. These genes can be used for the diagnosis and treatment of IgAN.
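The final step described in the abstract, keeping only the genes selected by every method, amounts to a set intersection. The following Python sketch illustrates that idea under stated assumptions: the per-method gene lists are invented for illustration, only the five symbols FOS, JUN, EGR1, FOSB, and DUSP1 come from the abstract, and the `GENE_*` names and the `overlapping_genes` helper are hypothetical. It is not the authors' implementation.

```python
# Hypothetical per-method selections. Only FOS, JUN, EGR1, FOSB, and DUSP1
# are taken from the abstract; every GENE_* symbol is a placeholder.
selections = {
    "hub_genes": {"FOS", "JUN", "EGR1", "FOSB", "DUSP1", "GENE_A"},
    "hub_module_genes": {"FOS", "JUN", "EGR1", "FOSB", "DUSP1", "GENE_B"},
    "svm": {"FOS", "JUN", "EGR1", "FOSB", "DUSP1", "GENE_A", "GENE_C"},
    "lasso": {"FOS", "JUN", "EGR1", "FOSB", "DUSP1", "GENE_D"},
    "plsda": {"FOS", "JUN", "EGR1", "FOSB", "DUSP1", "GENE_B", "GENE_E"},
}


def overlapping_genes(gene_sets):
    """Return the genes present in every input set, sorted alphabetically."""
    sets = [set(s) for s in gene_sets]
    if not sets:
        return []
    return sorted(set.intersection(*sets))


key_genes = overlapping_genes(selections.values())
print(key_genes)  # ['DUSP1', 'EGR1', 'FOS', 'FOSB', 'JUN']
```

A strict intersection is the most conservative way to combine the methods: a gene is dropped as soon as any one method fails to select it.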

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
Flowchart of data preparation, processing, analysis, and validation.
Figure 2
Identification and hierarchical clustering of DEGs for IgAN patients. (A) Volcano plot of DEGs, generated using the “ggplot2” package, version 3.3.6, in R (https://cran.r-project.org/package=ggplot2). Dodger blue represents down-regulated DEGs, gray represents non-significant genes, and firebrick represents up-regulated DEGs. (B) Heatmap of the DEGs for IgAN patients, generated using the “NMF” package, version 0.24.0, in R (https://cran.r-project.org/package=NMF). The horizontal axis shows the number of patients and the vertical axis shows DEGs.
Figure 3
(A) PPI network of DEGs, (B) Module 1, and (C) Module 2. All three panels were generated with Cytoscape 3.9.1 (www.cytoscape.org).
Figure 4
Classification accuracy of SVM for each gene.
Figure 5
Discriminative genes selected using the LASSO-based model with 10-fold cross-validation (CV): (A) Coefficient profile plot against the log(λ) sequence. (B) 32 discriminative genes were selected for IgAN. (C) Contribution of the 32 discriminative genes for IgAN patients.
Figure 6
PLS-DA for DEGs: (A) Component 1 vs. Component 2. The red points indicate IgAN patients and the green points indicate healthy controls. (B) Importance of the top 20 discriminative genes for IgAN.
Figure 7
Identification and PPI analysis of key hub genes for IgAN patients. (A) Identification of key candidate genes from hub module genes and the genes computed from cytoHubba, SVM, LASSO, and PLS-DA. (B) PPI analysis of the five key candidate genes.
Figure 8
Boxplots of the five key candidate genes, (A) FOS, (B) JUN, (C) EGR1, (D) FOSB, and (E) DUSP1, for IgAN patients, and (F) heatmap of the five key candidate genes in renal tissue samples, generated using the “NMF” package, version 0.24.0, in R (https://cran.r-project.org/package=NMF).
Figure 9
Validation of the five key candidate genes on the GSE116626 dataset using ROC curves, generated with the pROC package, version 1.18.0, in R (https://cran.r-project.org/package=pROC), and a heatmap. (A) FOS, (B) JUN, (C) EGR1, (D) FOSB, (E) DUSP1, and (F) heatmap of the five key candidate genes in renal tissue samples, generated using the “NMF” package, version 0.24.0, in R (https://cran.r-project.org/package=NMF). CI, confidence interval.
Figure 10
Validation of the five key candidate genes on the GSE35487 dataset using ROC curves, generated with the pROC package, version 1.18.0, in R (https://cran.r-project.org/package=pROC), and a heatmap. (A) FOS, (B) JUN, (C) EGR1, (D) FOSB, (E) DUSP1, and (F) heatmap of the five key candidate genes in renal tissue samples, generated using the “NMF” package, version 0.24.0, in R (https://cran.r-project.org/package=NMF).
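
The ROC-based validation in Figures 9 and 10 scores how well the expression of a single gene separates IgAN patients from controls. The Python sketch below shows one way to compute the area under the ROC curve (AUC) by hand, as the fraction of patient-control pairs in which the patient has the higher value. It is an illustrative stand-in, not the pROC code used for the figures; the expression values and the function names `roc_auc` and `discriminative_auc` are hypothetical.

```python
def roc_auc(case_values, control_values):
    """AUC as the share of (case, control) pairs where the case is higher.

    Ties count as half a win, which matches the rank-based
    (Mann-Whitney) definition of the AUC.
    """
    if not case_values or not control_values:
        raise ValueError("both groups need at least one value")
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


def discriminative_auc(case_values, control_values):
    """AUC regardless of direction (a gene may be higher or lower in cases)."""
    auc = roc_auc(case_values, control_values)
    return max(auc, 1.0 - auc)


# Hypothetical log-expression values for one gene (not from any dataset).
igan = [2.1, 2.4, 3.0, 3.3]
controls = [2.8, 3.5, 3.9, 4.2]

print(roc_auc(igan, controls))             # 0.125
print(discriminative_auc(igan, controls))  # 0.875
```

An AUC near 0.5 means the gene does not separate the groups. Values near 0 or 1 both indicate good separation, differing only in whether expression is lower or higher in patients, which is why the second helper reports the larger of `auc` and `1 - auc`.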
