Meta-Analysis
Eur Respir J. 2025 Jun 5;65(6):2401070.
doi: 10.1183/13993003.01070-2024. Print 2025 Jun.

Transcriptomics of interstitial lung disease: a systematic review and meta-analysis

Daniel He et al. Eur Respir J.

Abstract

Objective: Gene expression (transcriptomics) studies have revealed potential mechanisms of interstitial lung disease, yet sample sizes of studies are often limited and between-subtype comparisons are scarce. The aim of this study was to identify and validate consensus transcriptomic signatures of interstitial lung disease subtypes.

Methods: We performed a systematic review and meta-analysis of fibrotic interstitial lung disease transcriptomics studies using an individual participant data approach. We included studies examining bulk transcriptomics of human adult interstitial lung disease samples and excluded those focusing on individual cell populations. Patient-level data and expression matrices were extracted from 43 studies and integrated using a multivariable integrative algorithm to develop interstitial lung disease classification models.

Results: Using 1459 samples from 24 studies, we identified transcriptomic signatures for idiopathic pulmonary fibrosis, hypersensitivity pneumonitis, idiopathic nonspecific interstitial pneumonia and systemic sclerosis-associated interstitial lung disease against control samples, which were validated on 308 samples from eight studies (idiopathic pulmonary fibrosis area under the receiver operating characteristic curve (AUC) 0.99, 95% CI 0.99-1.00; hypersensitivity pneumonitis AUC 0.91, 95% CI 0.84-0.99; nonspecific interstitial pneumonia AUC 0.94, 95% CI 0.88-0.99; systemic sclerosis-associated interstitial lung disease AUC 0.98, 95% CI 0.93-1.00). Notably, the meta-analysis identified, for the first time, robust lung transcriptomic signatures that discriminate idiopathic pulmonary fibrosis (AUC 0.71, 95% CI 0.63-0.79) and hypersensitivity pneumonitis (AUC 0.76, 95% CI 0.63-0.89) from other fibrotic interstitial lung diseases. Unsupervised learning algorithms identified putative molecular endotypes of interstitial lung disease associated with decreased forced vital capacity and diffusing capacity of the lungs for carbon monoxide (both % predicted). Transcriptomic signatures reflected both cell-specific and disease-specific changes in gene expression.

Conclusion: We present the first systematic review and largest meta-analysis of fibrotic interstitial lung disease transcriptomics to date, identifying reproducible transcriptomic signatures with clinical relevance.

Conflict of interest statement

Conflict of interest: S.A. Guler reports grants from Boehringer Ingelheim, Roche, MSD and Janssen; payment or honoraria for lectures, presentations, manuscript writing or educational events and support for attending meetings from Boehringer Ingelheim; and participation on a data safety monitoring board or advisory board with Boehringer Ingelheim, MSD and Janssen. C.J. Ryerson reports grants from Boehringer Ingelheim; consulting fees from Boehringer Ingelheim, Pliant Therapeutics, AstraZeneca, Trevi Therapeutics and Veracyte; payment or honoraria for lectures, presentations, manuscript writing or educational events from Hoffmann-La Roche, Boehringer Ingelheim and Cipla; payment for expert testimony from Boehringer Ingelheim; and support for attending meetings from Cipla and Boehringer Ingelheim. The remaining authors have no potential conflicts of interest to disclose.

Figures

Overview of the systematic review and meta-analysis results. a) 8950 records were identified from medical literature and transcriptomics databases, of which 43 were included after full-text screening. b) Datasets were extracted and integrated to develop classification models differentiating between lung transcriptomics samples obtained from patients with interstitial lung disease (ILD) subtypes and healthy controls. Shared differentially expressed genes were identified across multiple ILD subtypes, some of which were due to differences in cell populations as determined by deconvolution analysis. c) Comparison of specific ILD subtypes against all other ILD subtypes identified subtype-specific gene signatures that were validated on external datasets and associated with fibrosis in idiopathic pulmonary fibrosis (IPF) and major histocompatibility complex (MHC) binding in hypersensitivity pneumonitis (HP). AT1: alveolar type 1 cell; AT2: alveolar type 2 cell; SSc-ILD: systemic sclerosis-associated interstitial lung disease; NSIP: nonspecific interstitial pneumonia.
FIGURE 1
Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) flowchart of study identification and screening. MEDLINE, Embase and the Gene Expression Omnibus (GEO) were examined for studies using transcriptomics to investigate samples from patients with interstitial lung disease (ILD).
FIGURE 2
Integration of lung transcriptomics datasets. Author-normalised datasets were subsetted with genes shared in >80% of datasets (a) and integrated using the Multivariate INTegrative (MINT) method. MINT-integrated samples projected into a common latent space via multi-group principal component (PC) analysis are shown in (b). RNA-seq: RNA sequencing.
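The gene-subsetting step described in this caption (keeping genes shared in >80% of datasets) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name `shared_genes`, the dataset names and the gene labels are all hypothetical, and each dataset is represented simply as a list of measured gene identifiers.

```python
from collections import Counter


def shared_genes(datasets, min_fraction=0.8):
    """Return the sorted genes measured in more than `min_fraction` of datasets.

    `datasets` maps a dataset name to the gene identifiers it measured.
    The 0.8 default mirrors the ">80% of datasets" rule in the caption;
    everything else here is an illustrative assumption.
    """
    if not datasets:
        raise ValueError("at least one dataset is required")
    # Count each gene once per dataset, even if it is listed twice.
    counts = Counter(g for genes in datasets.values() for g in set(genes))
    n = len(datasets)
    return sorted(g for g, c in counts.items() if c > min_fraction * n)


# Hypothetical toy input: six datasets with placeholder gene labels.
toy = {
    "study_1": ["GENE_A", "GENE_B", "GENE_C", "GENE_D"],
    "study_2": ["GENE_A", "GENE_B", "GENE_C", "GENE_D"],
    "study_3": ["GENE_A", "GENE_B", "GENE_C"],
    "study_4": ["GENE_A", "GENE_B", "GENE_C"],
    "study_5": ["GENE_A", "GENE_B"],
    "study_6": ["GENE_A"],
}
print(shared_genes(toy))  # ['GENE_A', 'GENE_B']
```

GENE_B is kept because it appears in five of six datasets (about 83%), while GENE_C appears in four of six (about 67%) and is dropped.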
FIGURE 3
Classification models for interstitial lung disease (ILD) subtypes against controls. Training set datasets were integrated and tuned to develop classification models, which were then validated on test set datasets. Projection of Multivariate INTegrative (MINT)-integrated training set samples into the space of the first two partial least squares components for a) idiopathic pulmonary fibrosis (IPF) (n=576 control, n=721 IPF), b) hypersensitivity pneumonitis (HP) (n=199 control, n=104 HP), c) nonspecific interstitial pneumonia (NSIP) (n=157 control, n=29 NSIP) and d) systemic sclerosis-associated ILD (SSc-ILD) (n=14 control, n=24 SSc-ILD) classification models. Inset: area under the receiver operating characteristic curve (AUC) performance of each ILD classification model on held-out test datasets (a: n=107 control, n=193 IPF; b: n=35 control, n=22 HP; c: n=45 control, n=20 NSIP; d: n=9 control, n=6 SSc-ILD). Detailed performance metrics for each dataset can be found in supplementary table S2.
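For readers unfamiliar with the AUC metric reported for the held-out test datasets, the sketch below computes it as the probability that a randomly chosen case scores higher than a randomly chosen control, and adds a percentile bootstrap for an approximate 95% CI. The source does not state how its confidence intervals were derived, so the bootstrap is an assumption made for illustration; the function names and toy scores are hypothetical.

```python
import random


def roc_auc(labels, scores):
    """AUC as the fraction of (case, control) pairs ranked correctly.

    Ties count as half a correct pair. `labels` uses 1 for cases and
    0 for controls.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=2000, seed=0):
    """Percentile bootstrap 95% CI for the AUC (an illustrative choice)."""
    rng = random.Random(seed)
    idx = range(len(labels))
    aucs = []
    for _ in range(n_boot):
        sample = [rng.choice(idx) for _ in idx]
        ys = [labels[i] for i in sample]
        if len(set(ys)) < 2:
            continue  # skip resamples that lost one class
        aucs.append(roc_auc(ys, [scores[i] for i in sample]))
    aucs.sort()
    m = len(aucs)
    return aucs[int(0.025 * m)], aucs[min(m - 1, int(0.975 * m))]


# Hypothetical classifier scores for two controls and two cases.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_labels, toy_scores))  # 0.75
```

The pairwise form is quadratic in sample size, which is acceptable at the test-set sizes listed in the caption; a rank-based formula would be preferable for much larger cohorts.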
FIGURE 4
Idiopathic pulmonary fibrosis (IPF) and hypersensitivity pneumonitis (HP) have unique transcriptomic signatures compared to other interstitial lung disease (ILD) subtypes. IPF-specific and HP-specific classification models were developed through comparison against all other ILD subtypes using Multivariate INTegrative (MINT)-integrated lung transcriptomics datasets. The training set consisted of 721 IPF, 104 HP, 29 nonspecific interstitial pneumonia (NSIP), 24 systemic sclerosis-associated ILD (SSc-ILD), 14 respiratory bronchiolitis-ILD (RB-ILD), 14 unknown fibrosis, six cryptogenic organising pneumonia (COP), four desquamative interstitial pneumonia (DIP), three rheumatoid arthritis-ILD (RA-ILD) and one connective tissue disease-ILD (CTD-ILD) samples, while the test set consisted of 193 IPF, 22 HP, 20 NSIP, six SSc-ILD, five mixed IPF-NSIP, three unknown fibrosis, two RB-ILD, two DIP, two CTD-ILD, one COP and one combined pulmonary fibrosis and emphysema samples. The figure shows the MINT-partial least squares projection of samples for classification models of IPF versus non-IPF ILD (a) and HP versus non-HP ILD (c), with corresponding performance on held-out test datasets (b, d). e) Upregulated gene ontology pathways using gene signatures identified in classification models for IPF against non-IPF ILD and HP against non-HP ILD. AUC: area under receiver operating characteristic curve.
FIGURE 5
Molecular endotypes identified through unsupervised learning are associated with cell types and lung function in interstitial lung disease (ILD). ComBat-corrected lung transcriptomics datasets containing healthy control, ILD and COPD (disease control) samples (from GSE47460) were analysed using biclustering algorithms. a) Heatmaps of genes found in select biclusters showing differences in gene expression between bicluster and non-bicluster samples. b) Proportion of samples within each subtype identified in each bicluster (e.g. 32.6% of idiopathic pulmonary fibrosis (IPF) samples are in the “M9-Fibrosis” bicluster). c) Comparison of reported forced vital capacity (FVC % predicted) in IPF samples within and outside of the “M13-Proliferation” bicluster. Significance was determined through linear regression analysis of bicluster membership against FVC %. HP: hypersensitivity pneumonitis; NSIP: nonspecific interstitial pneumonia; CTD-ILD: connective tissue disease interstitial lung disease; AT1: alveolar type 1; AT2: alveolar type 2; EMT: epithelial-to-mesenchymal transition. *: false discovery rate <0.10.
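The regression of FVC % predicted on bicluster membership mentioned in panel c can be illustrated with ordinary least squares on a 0/1 predictor, where the slope equals the difference in mean FVC % between samples inside and outside the bicluster. This is a sketch only: the FVC values are invented for illustration, the function name is hypothetical, and the significance testing and false discovery rate correction used in the study are not reproduced here.

```python
def ols_slope_intercept(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must vary (need samples in and out of the bicluster)")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data: 1 = sample belongs to the bicluster, 0 = it does not.
membership = [1, 1, 0, 0, 0]
fvc_pct = [60.0, 64.0, 80.0, 76.0, 78.0]  # invented FVC % predicted values

slope, intercept = ols_slope_intercept(membership, fvc_pct)
# With a binary predictor, the intercept is the mean FVC % outside the
# bicluster and the slope is the mean difference for bicluster members.
print(round(slope, 6), round(intercept, 6))  # -16.0 78.0
```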
FIGURE 6
Integrated gene signatures are driven by changes in cellular abundance and transcriptional activity. a) Estimate of differences between eigengene values (as determined by cellular deconvolution analysis) of interstitial lung disease (ILD) subtypes using integrated data. Between-subtype comparisons can be read as subtype 1 versus subtype 2, e.g. “IPF/Control” compares idiopathic pulmonary fibrosis (IPF) against Control samples, with dark green and beige dots representing cell types increased or decreased, respectively, in IPF. Listed bicluster subtypes represent comparisons between bicluster and non-bicluster samples. Blank spaces indicate comparisons that were not significant. b) Differences in single-cell RNA sequencing (scRNA-seq) cellular abundance in GSE122960 between indicated subtypes as determined by a binomial generalised linear model. c) Comparison of Multivariate INTegrative (MINT) gene signatures with differentially expressed gene (DEG) analysis of scRNA-seq data. Model weights of each gene identified in classification models were plotted on the y-axis against their log2 fold change values in scRNA-seq DEG analysis on the x-axis, and coloured if the gene was significantly different in scRNA-seq with the same direction of change as the MINT signature. Selected genes of interest are labelled. A full list of genes can be found in supplementary table S17. HP: hypersensitivity pneumonitis; NSIP: nonspecific interstitial pneumonia; SSc-ILD: systemic sclerosis-associated interstitial lung disease; AT1: alveolar type 1; AT2: alveolar type 2; EMT: epithelial-to-mesenchymal transition; NKT cell: natural killer T-cell.
