Comput Struct Biotechnol J. 2023 Mar 22:21:2316-2331.
doi: 10.1016/j.csbj.2023.03.034. eCollection 2023.

Identification of potential diagnostic and prognostic biomarkers for sepsis based on machine learning

Li Ke et al. Comput Struct Biotechnol J.

Abstract

Background: To identify potential diagnostic and prognostic biomarkers of the early stage of sepsis.

Methods: The differentially expressed genes (DEGs) between sepsis and control transcriptomes were screened from GSE65682 and GSE134347 datasets. The candidate biomarkers were identified by the least absolute shrinkage and selection operator (LASSO) regression and support vector machine recursive feature elimination (SVM-RFE) analyses. The diagnostic and prognostic abilities of the markers were evaluated by plotting receiver operating characteristic (ROC) curves and Kaplan-Meier survival curves. Gene Set Enrichment Analysis (GSEA) and single-sample GSEA (ssGSEA) were performed to further elucidate the molecular mechanisms and immune-related processes. Finally, the potential biomarkers were validated in a septic mouse model by qRT-PCR and western blotting.
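The two evaluation steps named in the Methods, ROC analysis and Kaplan-Meier survival estimation, can be illustrated with a minimal standard-library sketch. This is not the authors' code: the function names `roc_auc` and `kaplan_meier` and all input values are hypothetical, chosen only to show how an AUC and a survival curve are computed from a single marker and from follow-up data.

```python
from collections import defaultdict


def roc_auc(scores, labels):
    """AUC as the probability that a random case outscores a random control.

    Ties count one half. This pairwise statistic equals the area under
    the empirical ROC curve.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct death time.

    events[i] is 1 for a death and 0 for a censored observation.
    """
    deaths = defaultdict(int)
    leaving = defaultdict(int)  # deaths plus censored subjects at each time
    for t, e in zip(times, events):
        leaving[t] += 1
        if e:
            deaths[t] += 1
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        at_risk -= leaving[t]
    return curve


# Toy, made-up values for illustration only (not data from the study).
expression = [8.1, 7.4, 6.9, 5.2, 4.8, 4.1]
is_sepsis = [1, 1, 1, 0, 0, 0]
print(roc_auc(expression, is_sepsis))  # 1.0: every case outscores every control

follow_up_days = [5, 8, 8, 12, 20, 28]
died = [1, 1, 0, 1, 0, 0]
print(kaplan_meier(follow_up_days, died))
```

In a prognostic comparison of the kind described, one such curve would be computed for the high-expression group and one for the low-expression group; a significance test between the curves (for example the log-rank test) is outside this sketch.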

Results: Eleven DEGs were identified between the sepsis and control samples: YOD1, GADD45A, BCL11B, IL1R2, UGCG, TLR5, S100A12, ITK, HP, CCR7 and C19orf59 (all AUC > 0.9). Furthermore, the survival analysis identified YOD1, GADD45A, BCL11B and IL1R2 as prognostic biomarkers of sepsis. According to GSEA, these four DEGs were significantly associated with immune-related processes. In addition, ssGSEA demonstrated a significant difference in the enriched immune cell populations between the sepsis and control groups (all P < 0.05). Moreover, YOD1, GADD45A and IL1R2 were upregulated, and BCL11B was downregulated, in the heart, liver, lungs and kidneys of the septic mouse model.

Conclusions: We identified four potential immune-related diagnostic and prognostic gene markers for sepsis that offer new insights into its underlying mechanisms.

Keywords: Biomarker; Diagnosis; Machine learning; Prognosis; Sepsis.

Conflict of interest statement

The authors declare that they have no conflicts of interest with the contents of this article.

Figures

Graphical abstract
Fig. 1
Identification of DEGs in sepsis. (A) Volcano plot of all DEGs in the sepsis and healthy control groups. Red points represent up-regulated mRNAs with P < 0.05 and log2FC > 2. Green points represent down-regulated mRNAs with P < 0.05 and log2FC < −2. Black points represent mRNAs that are not differentially expressed. (B) Heatmap of all DEGs. The horizontal axis represents the samples and the vertical axis represents the genes; red indicates increased gene expression and blue indicates decreased gene expression. (C) Optimal λ value identified by 10-fold cross-validation using the minimum (λ.min) and 1-standard-error (λ.1SE) criteria in the LASSO regression analysis. The two dashed lines mark λ.min and λ.1SE, and any λ between the two values is considered appropriate. λ.1SE is the largest λ whose cross-validation error stays within one standard error of the minimum error; it gives the simplest model, using fewer genes. λ.min is more accurate but uses a larger number of genes. (D) SVM-RFE algorithm. The horizontal axis represents the number of DEG variables and the vertical axis represents the cross-validation RMSE. The marked point is the number of DEGs that gives the optimal value. (E) Venn diagram of the overlapping genes selected by the LASSO and SVM-RFE algorithms.
Fig. 2
The model's interpretation. (A) Feature importance ranked using the mean decrease in impurity (MDI) method. (B) Feature importance ranked using permutation importance. (C) Importance ranking of the DEGs according to the mean absolute SHapley Additive exPlanations (SHAP) value, mean(|SHAP value|). (D) Importance ranking of the DEGs based on SHAP values. The higher the SHAP value of a feature, the higher the patient's risk of death. Red indicates a high feature value and blue indicates a low feature value.
Fig. 3
GO, KEGG and DO enrichment analysis. (A) GO enrichment analysis of the 11 DEGs. (B) KEGG enrichment analysis of the 11 DEGs. (C) DO enrichment analysis of the 11 DEGs.
Fig. 4
Expression analysis of the 11 candidate DEGs in the GSE65682 and GSE134347 datasets between sepsis and healthy control groups. The relative expression levels of (A) YOD1, (B) GADD45A, (C) BCL11B, (D) IL1R2, (E) UGCG, (F) TLR5, (G) S100A12, (H) ITK, (I) HP, (J) CCR7 and (K) C19orf59 mRNAs are shown.
Fig. 5
Diagnostic value of DEGs for sepsis in the GSE65682 and GSE134347 datasets. The ROC curves of (A) YOD1, (B) GADD45A, (C) BCL11B, (D) IL1R2, (E) UGCG, (F) TLR5, (G) S100A12, (H) ITK, (I) HP, (J) CCR7 and (K) C19orf59 are shown.
Fig. 6
Prognostic value of DEGs for patients in the GSE65682 dataset. The Kaplan-Meier survival curves of the high- and low-expression groups of (A) YOD1, (B) GADD45A, (C) BCL11B, (D) IL1R2, (E) UGCG, (F) TLR5, (G) S100A12, (H) ITK, (I) HP, (J) CCR7 and (K) C19orf59 are shown.
Fig. 7
Nomogram prediction model. (A) Nomogram to predict the sepsis rate based on the GSE65682 and GSE134347 datasets. (B) Calibration curve for the predictive ability of the nomogram.
Fig. 8
GSEA results of YOD1. (A) Severe infection. (B) Macroautophagy. (C) Immunological synapse. (D) Impaired antigen specific response. (E) Hematopoiesis mature cell. (F) T cell receptor and costimulatory signaling.
Fig. 9
GSEA results of GADD45A. (A) Antigen processing and presentation. (B) T cell receptor signaling pathway. (C) Primary immunodeficiency. (D) Intestinal immune network for IgA production. (E) Endoplasmic reticulum tubular network organization. (F) B cell proliferation.
Fig. 10
GSEA results of BCL11B. (A) O glycan biosynthesis. (B) Starch and sucrose metabolism. (C) Glycerophospholipid metabolism. (D) Allograft rejection. (E) T cell receptor signaling pathway. (F) Antigen processing and presentation.
Fig. 11
GSEA results of IL1R2. (A) T cell receptor signaling pathway. (B) Primary immunodeficiency. (C) Antigen processing and presentation. (D) Abnormal eosinophil. (E) Immunological. (F) T cell Differentiation.
Fig. 12
Comparison of immune cell proportions between healthy control and sepsis groups.
Fig. 13
Correlation between immune cells and the biomarker genes. Red represents a positive correlation and blue represents a negative correlation.
Fig. 14
Validation of biomarkers in a mouse model of sepsis. (A-D) YOD1, GADD45A, BCL11B and IL1R2 mRNA levels in the heart, liver, lungs and kidneys (n = 3, compared with the Mann-Whitney test); (E-H) YOD1, GADD45A, BCL11B and IL1R2 protein levels in the heart, liver, lung and kidney tissues (n = 6, *p < 0.05, **p < 0.01, ***p < 0.001).
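The λ.min and λ.1SE criteria mentioned in the Fig. 1 caption can be made concrete with a small sketch. It assumes the per-λ mean cross-validation error and its standard error are already available; the function name `select_lambdas` and the numbers below are hypothetical and are not taken from the study.

```python
def select_lambdas(lambdas, cv_mean, cv_se):
    """Return (lambda_min, lambda_1se) from per-lambda cross-validation results.

    lambda_min minimizes the mean cross-validation error.
    lambda_1se is the largest lambda whose mean error is no more than one
    standard error (taken at lambda_min) above that minimum, which gives a
    sparser model at a small cost in error.
    """
    if not lambdas or not (len(lambdas) == len(cv_mean) == len(cv_se)):
        raise ValueError("inputs must be non-empty and of equal length")
    best = min(range(len(lambdas)), key=lambda i: cv_mean[i])
    threshold = cv_mean[best] + cv_se[best]
    lambda_1se = max(
        lam for lam, err in zip(lambdas, cv_mean) if err <= threshold
    )
    return lambdas[best], lambda_1se


# Toy, made-up cross-validation summary for illustration only.
lambdas = [0.5, 0.2, 0.1, 0.05, 0.01]
cv_mean = [0.40, 0.22, 0.18, 0.17, 0.21]
cv_se = [0.03, 0.02, 0.02, 0.02, 0.02]
print(select_lambdas(lambdas, cv_mean, cv_se))  # (0.05, 0.1)
```

A larger λ means stronger shrinkage, so choosing λ.1SE rather than λ.min trades a slightly higher cross-validation error for fewer retained genes.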
