Multicenter Study
Respir Res. 2024 Oct 23;25(1):383.
doi: 10.1186/s12931-024-03015-6.

Clustering-aided prediction of outcomes in patients with idiopathic pulmonary fibrosis

Lijun Wang et al. Respir Res. 2024.

Abstract

Background: Blood biomarkers predictive of the progression of idiopathic pulmonary fibrosis (IPF) would be of value for research and clinical practice. We used data from the IPF-PRO Registry to investigate whether the addition of "omics" data to risk prediction models based on demographic and clinical characteristics improved prediction of the progression of IPF.

Methods: The IPF-PRO Registry enrolled patients with IPF at 46 sites across the US. Patients were followed prospectively. Median follow-up was 27.2 months. Prediction models for disease progression included omics data (proteins and microRNAs [miRNAs]), demographic factors and clinical factors, all assessed at enrollment. Data on proteins and miRNAs were included in the models either as raw values or based on clusters in various combinations. Least absolute shrinkage and selection operator (Lasso) Cox regression was applied for time-to-event composite outcomes and logistic regression with L1 penalty was applied for binary outcomes assessed at 1 year. Model performance was assessed using Harrell's C-index (for time-to-event outcomes) or area under the curve (for binary outcomes).
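The two performance metrics named in the Methods can be illustrated with a minimal sketch. This is not the authors' code: the function names `harrell_c_index` and `roc_auc` and the toy inputs are hypothetical, and the C-index version makes the simplifying assumption that pairs with tied follow-up times are skipped.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's C-index for right-censored time-to-event data.

    times: observed follow-up time per patient
    events: 1 if the event was observed, 0 if censored
    risk_scores: model output, higher meaning higher predicted risk
    Assumption for this sketch: pairs with tied times are skipped.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue
        # Let `a` be the patient with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # The pair is comparable only if the shorter time is an observed event.
        if not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def roc_auc(labels, scores):
    """AUC for a binary outcome: the probability that a randomly chosen
    positive case scores higher than a randomly chosen negative case."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only.
c = harrell_c_index([5, 8, 12, 3], [1, 0, 1, 1], [0.9, 0.4, 0.2, 0.5])
auc = roc_auc([1, 0, 1, 0], [0.8, 0.3, 0.4, 0.6])
```

For both metrics, 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking.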

Results: Data were analyzed from 231 patients. The models based on demographic and clinical factors, with or without omics data, were the top-performing models for prediction of all the time-to-event outcomes. Relative changes in average C-index after incorporating omics data into models based on demographic and clinical factors ranged from 1.7 to 3.2%. Of the blood biomarkers, surfactant protein-D, serine protease inhibitor A7 and matrix metalloproteinase-9 (MMP-9) were among the top predictors of the outcomes. For the binary outcomes, models based on demographics alone and models based on demographics plus omics data had similar performances. Of the blood biomarkers, CC motif chemokine 11, vascular cell adhesion protein-1, adiponectin, carcinoembryonic antigen and MMP-9 were the most important predictors of the binary outcomes.
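As a rough illustration of how a relative change in average C-index might be computed, the sketch below assumes the change is expressed as (new mean minus baseline mean) divided by baseline mean; the abstract does not state the exact formula. The function name `relative_change_pct` and the C-index values are hypothetical and are not taken from the study.

```python
from statistics import mean


def relative_change_pct(baseline_c, augmented_c):
    """Percent relative change in mean C-index between two models.

    baseline_c: C-indices of the demographic + clinical model (e.g. per resample)
    augmented_c: C-indices of the same model with omics data added
    """
    base = mean(baseline_c)
    return 100.0 * (mean(augmented_c) - base) / base


# Hypothetical values: a mean C-index of 0.70 rising to 0.721
# corresponds to a relative change of about 3%.
change = relative_change_pct([0.69, 0.71], [0.711, 0.731])
```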

Conclusions: We identified circulating protein and miRNA biomarkers associated with the progression of IPF. However, the integration of omics data into prediction models that included demographic and clinical factors did not materially improve the performance of the models.

Trial registration: ClinicalTrials.gov; No: NCT01915511; registered August 5, 2013; URL: www.clinicaltrials.gov.

Keywords: Biomarkers; Disease progression; Interstitial lung disease.

Conflict of interest statement

Lijun Wang, Peitao Wu, Yi Liu, and Divya C Patel are employees of Boehringer Ingelheim Pharmaceuticals, Inc. Thomas B Leonard was an employee of Boehringer Ingelheim Pharmaceuticals, Inc at the time that these analyses were planned and conducted. Hongyu Zhao has no competing interests other than the funding of this project by Boehringer Ingelheim Pharmaceuticals, Inc.

Figures

Fig. 1
Workflow for evaluating the performance of different models
Fig. 2
C-indices of models for (A) composite of time to death or lung transplant and (B) composite of time to death, lung transplant, or decline in forced vital capacity > 10%. The left panel shows models based on demographic and clinical characteristics with or without omics data. The middle panel shows models based on demographics with or without omics data. The right panel shows the model based on omics data alone. Clin, clinical; demo, demographics; lbl_miRNA, cluster label of miRNA; lbl_prot, cluster label of protein; raw_miRNA, raw values of miRNAs; raw_prot, raw values of proteins.
Fig. 3
Variable importance of demographic and omics variables for (A) composite outcome of death or lung transplant and (B) composite outcome of death, lung transplant, or decline in forced vital capacity (FVC) > 10%. Green bars denote proteins, blue bars denote demographic variables and the red bar denotes the cluster label of miRNA. Demo, demographics; lbl_miRNA, clustering label of miRNA; raw_prot, raw values of proteins
Fig. 4
Area under the curve (AUC) of models for absolute decline in forced vital capacity > 10% at 1 year. The left panel shows models based on demographic and clinical characteristics with or without omics data. The middle panel shows models based on demographics with or without omics data. The right panel shows the models based on omics data alone. Clin, clinical; demo, demographics; lbl_miRNA, cluster label of miRNA; lbl_prot, cluster label of protein; raw_miRNA, raw values of miRNAs; raw_prot, raw values of proteins
Fig. 5
Variable importance of demographic and omics variables for absolute decline in forced vital capacity > 10% at 1 year. Green bars denote proteins, blue bars denote demographic variables and the red bar denotes the cluster label of miRNA. Demo, demographics; lbl_miRNA, clustering label of miRNA; raw_prot, raw values of proteins
Fig. 6
Area under the curve (AUC) of models for disease progression (decline in forced vital capacity [FVC] % predicted > 10%, decline in diffusing capacity of the lungs for carbon monoxide [DLco] % predicted > 15%, death, or lung transplant) at 1 year. The left panel shows models based on demographics and clinical characteristics with or without omics data. The middle panel shows models based on demographics with or without omics data. The right panel shows the models based on omics data alone. Clin, clinical; demo, demographics; lbl_miRNA, cluster label of miRNA; lbl_prot, cluster label of protein; raw_miRNA, raw values of miRNAs; raw_prot, raw values of proteins
Fig. 7
Variable importance by selected frequency for disease progression (decline in forced vital capacity [FVC] % predicted > 10%, decline in diffusing capacity of the lungs for carbon monoxide [DLco] % predicted > 15%, death or lung transplant) at 1 year. Green bars denote proteins, blue bars denote demographic variables and the red bar denotes the cluster label of miRNA. Demo, demographics; lbl_miRNA, clustering label of miRNA; raw_prot, raw values of proteins
