Ann Med. 2024 Dec;56(1):2409352.
doi: 10.1080/07853890.2024.2409352. Epub 2024 Sep 28.

Integrating cellular experiments, single-cell sequencing, and machine learning to identify endoplasmic reticulum stress biomarkers in idiopathic pulmonary fibrosis

Yi Liao et al. Ann Med. 2024 Dec.

Abstract

Background: Idiopathic Pulmonary Fibrosis (IPF) presents a severe respiratory challenge with a poor prognosis due to the lack of reliable biomarkers. Recent evidence suggests that Endoplasmic Reticulum Stress (ERS) may be associated with IPF pathogenesis. This study focuses on uncovering ERS-associated biomarkers for IPF.

Methods: Sequencing data from multiple datasets were analyzed using differential gene expression analysis and Weighted Gene Co-expression Network Analysis (WGCNA). ERS-related genes were extracted from the GeneCards database. Hub genes were identified through Protein-Protein Interaction (PPI) analysis. Diagnostic and prognostic models were developed using machine learning algorithms and evaluated on both training and validation sets. Additionally, Cell-type Identification by Estimating Relative Subsets of RNA Transcripts (CIBERSORT) and single-cell RNA sequencing were employed to identify potential IPF-related cells. The mechanisms underlying these findings were further investigated through in vitro experiments.

Results: Differentially expressed genes, WGCNA-identified blue module genes, and ERS-related genes extracted from the GeneCards database were intersected, and the resulting genes were used to construct diagnostic and prognostic models. Validation using multiple datasets indicated that both the diagnostic and prognostic models possess strong predictive capabilities. PPI analysis highlighted SPP1 as a potential hub gene in IPF. Moreover, M2 macrophages were found in higher quantities in the lung tissue of IPF patients, with a significant increase in SPP1-expressing M2 macrophages compared to the control group. In vitro experiments demonstrated that exogenous SPP1 inhibited the proliferation and migration of M2 macrophages and promoted apoptosis within a certain concentration range.

Conclusion: This study identifies ERS-related biomarkers in IPF, highlighting SPP1 and M2 macrophages. The resulting diagnostic and prognostic models offer strong predictive capabilities, unveiling new therapeutic avenues.

Keywords: Endoplasmic Reticulum Stress; Idiopathic pulmonary fibrosis; M2 macrophage; SPP1; machine-learning.
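The candidate-selection step described in the Results (intersecting differentially expressed genes, WGCNA module genes, and ERS-related genes, then ranking candidates by connectivity in a PPI network) can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function names are hypothetical, the gene symbols other than SPP1 are placeholders, and ranking by plain node degree is an assumption based on the Figure 3D caption.

```python
def intersect_candidates(deg_genes, module_genes, ers_genes):
    """Return genes present in all three lists, sorted alphabetically."""
    return sorted(set(deg_genes) & set(module_genes) & set(ers_genes))


def top_hubs_by_degree(edges, genes, k=5):
    """Rank candidate genes by degree in an undirected PPI edge list.

    Edges touching a gene outside `genes`, self-loops, and duplicate
    edges (in either orientation) are ignored.
    """
    degree = {g: 0 for g in genes}
    seen = set()
    for a, b in edges:
        pair = frozenset((a, b))
        if a == b or pair in seen:
            continue
        if a in degree and b in degree:
            seen.add(pair)
            degree[a] += 1
            degree[b] += 1
    # Highest degree first; ties broken alphabetically for determinism.
    return sorted(degree, key=lambda g: (-degree[g], g))[:k]


# Placeholder inputs (hypothetical, not data from the study).
deg = ["SPP1", "GENE_A", "GENE_B", "GENE_C"]
module = ["SPP1", "GENE_A", "GENE_C", "GENE_D"]
ers = ["SPP1", "GENE_A", "GENE_C", "GENE_E"]
ppi_edges = [
    ("SPP1", "GENE_A"),
    ("SPP1", "GENE_C"),
    ("GENE_A", "SPP1"),   # duplicate of the first edge, ignored
    ("GENE_C", "GENE_E"),  # GENE_E is not a candidate, ignored
]

candidates = intersect_candidates(deg, module, ers)
hubs = top_hubs_by_degree(ppi_edges, candidates, k=5)
print(candidates)
print(hubs)
```

In practice the PPI edges would come from an interaction database and the ranking could use other centrality measures; degree is used here only because it is the simplest one to show.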

Conflict of interest statement

No potential conflict of interest was reported by the author(s).

Figures

Figure 1.
Data preprocessing and identification of differentially expressed genes. The box plot and principal component analysis elucidate the overall gene expression profiles (A) before and (B) after the normalization process. These results substantiate the effective removal of batch effects. Volcano plot (C) and heatmap (D) display the differentially expressed genes, highlighting significant variations in gene expression.
Figure 2.
WGCNA results. (A) Dendrogram of sample clustering from the combined dataset alongside corresponding clinical information (indicating IPF or control group). (B) Determination of the soft-thresholding power in WGCNA, with the left side presenting scale-free index analysis for various soft-thresholding powers (β), and the right side depicting the analysis of mean connectivity for various soft-thresholding powers. (C) Dendrogram of genes based on clustering using the topological overlap matrix measure, with the color band displaying results obtained from automatic single-block analysis. (D) Heatmap illustrating the correlation between module eigengenes and clinical traits (IPF or healthy control group), with the blue module selected for further analysis. (E) Scatter plot of gene significance versus module membership in the blue module. (F) Venn diagram showing the intersection of genes among the three analyses. (G) Chromosomal locations of candidate hub genes. (H) Manhattan plot of candidate genes.
Figure 3.
Functional enrichment and immune infiltration analysis. (A) Disease ontology analysis of hub genes. (B) Gene ontology analysis of hub genes, covering biological processes, cellular components, and molecular functions. (C) Kyoto Encyclopedia of Genes and Genomes pathway analysis for core genes, identifying significant pathways involved. (D) The Protein-Protein Interaction network displaying the top five genes with the highest degree of connectivity and their interactions with other genes. (E) Differential analysis of immune cell infiltration between IPF and control group lung tissues, illustrating variations in immune cell presence.
Figure 4.
Development and validation of diagnostic models using a machine learning-integrated approach. (A) A suite of 113 diagnostic models was evaluated for accuracy across all training and test datasets. Receiver operating characteristic (ROC) curves and confusion matrices for the models were generated for the (B) GSE150910, (C) GSE32537, (D) GSE53845, (E) GSE92592, and (F) GSE110147 datasets, illustrating the performance and predictive validity of each diagnostic model in distinguishing IPF cases from controls.
Figure 5.
Development and validation of prognostic models using a machine learning-based integrative approach. (A) The C-index of each of 97 prognostic models was assessed across all training and test datasets. Kaplan-Meier curves and time-dependent ROC curves were generated for datasets (B) GSE70866, (C) GSE28221, and (D) GSE93606. (E) Comparison of the prognostic models with 15 published signatures.
Figure 6.
Single-cell sequencing results. (A) t-distributed Stochastic Neighbor Embedding (t-SNE) visualization of the 12 samples. (B) t-SNE plots comparing normal and IPF samples. (C) t-SNE representation of the 28 cell clusters. (D) Cell types were delineated based on marker gene profiles. (E) Heat map displaying the top 5 marker genes across 8 cell types. (F) Comparative analysis of 5 hub genes between IPF and control groups. (G) Feature plots illustrating the expression of the 5 hub genes across 8 cell types. (H) Differential comparison of the 5 hub genes among the 8 cell types, highlighting the variance in gene expression patterns.
Figure 7.
Cellular experiment results. (A) Growth curves were monitored using the Cell Counting Kit-8 (CCK-8) assay under various SPP1 concentration conditions. (B) Transwell experiments indicated the migratory capacity of cells at different SPP1 concentrations. (C) Results of 5-ethynyl-2’-deoxyuridine (EdU) incorporation under various SPP1 concentrations. (D) Apoptotic cells were identified using the Annexin V-APC/PI Double Staining Kit across different SPP1 concentration conditions.
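For readers unfamiliar with the two evaluation metrics named in the captions for Figures 4 and 5, the sketch below shows one common way to compute the area under the ROC curve (for a diagnostic score) and Harrell's concordance index (for a prognostic risk score with censored follow-up). It is an illustrative implementation under stated assumptions, not the code used in the study: the function names and the toy inputs are hypothetical, and pairs with tied survival times are simply treated as not comparable.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a case outscores a control.

    labels: 1 for case (e.g. IPF), 0 for control. Ties count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def harrell_c_index(times, events, risks):
    """Harrell's C: fraction of comparable pairs ranked correctly.

    A pair (i, j) is comparable when subject i had an observed event
    (events[i] is truthy) strictly before time j. The pair is concordant
    when the earlier-failing subject has the higher predicted risk.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy inputs (hypothetical, not data from the study).
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
c_index = harrell_c_index(
    times=[2, 4, 6, 8],
    events=[1, 1, 0, 1],
    risks=[0.9, 0.3, 0.5, 0.1],
)
print(auc, c_index)
```

Both metrics range from 0 to 1, with 0.5 corresponding to random ranking; the C-index reduces to the AUC when there is no censoring and the outcome is binary.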
