EBioMedicine. 2025 Mar:113:105609.
doi: 10.1016/j.ebiom.2025.105609. Epub 2025 Feb 25.

Identifying chronic obstructive pulmonary disease subtypes using multi-trait genetics

Andrey Ziyatdinov et al. EBioMedicine. 2025 Mar.

Abstract

Background: Chronic Obstructive Pulmonary Disease (COPD) has a broad spectrum of clinical characteristics. The aetiology of these differences is not well understood. The objective of this study is to assess whether respiratory genetic variants cluster by phenotype and associate with COPD heterogeneity.

Methods: We clustered genome-wide association studies of COPD, lung function, and asthma and phenotypes from the UK Biobank using non-negative matrix factorization. We constructed cluster-specific genetic risk scores and tested these scores for association with phenotypes in non-Hispanic white subjects in the COPDGene study.

Findings: We identified three clusters from 482 variants and 44 traits from genetic associations in 379,337 UK Biobank participants. Variants from asthma, COPD, and lung function were found in all three clusters. Clusters displayed varying effects on white blood cell counts, height, and body mass index (BMI)-related phenotypes in the UK Biobank. In the COPDGene cohort, cluster-specific genetic risk scores were associated with differences in steroid use, BMI, lymphocyte counts, and chronic bronchitis, as well as variations in gene and protein expression.

Interpretation: Our results suggest that multi-phenotype analysis of obstructive lung disease-related risk variants may identify genetically driven phenotypic patterns in COPD.

Funding: MHC was supported by R01HL149861, R01HL135142, R01HL137927, R01HL147148, and R01HL089856. HA and HJ were supported by ANR-20-CE36-0009-02 and ANR-16-CONV-0005. The COPDGene study (NCT00608764) is supported by grants from the NHLBI (U01HL089897 and U01HL089856), by NIH contract 75N92023D00011, and by the COPD Foundation through contributions made to an Industry Advisory Committee that has included AstraZeneca, Bayer Pharmaceuticals, Boehringer-Ingelheim, Genentech, GlaxoSmithKline, Novartis, Pfizer and Sunovion.

Keywords: COPD; Genetic epidemiology; Multitrait analysis; Pathways.

Conflict of interest statement

Declaration of interests: MHC has received grant support from Bayer and a consulting fee from Apogee. EKS has received grant support from Northpond Laboratories and Bayer. MM has received honoraria from the NY State Thoracic Society and the ATS. PJC has received support from Bayer and Sanofi, and consulting fees from Verona Pharma. MDT has received support from Orion Pharma. The remaining authors have nothing to declare.

Figures

Fig. 1
Study overview. We identified a set of 482 genetic variants associated with obstructive lung disorders and extracted association z-scores between those variants and 44 phenotypes from genome-wide association studies (GWAS) conducted in the UK Biobank. We factorized this association matrix into two matrices of trait and variant weights using the NMF approach, which clustered the variants into three groups. Those clusters were then used to derive three component genetic risk scores (GRSs), which were tested for association with phenotypes in the COPDGene cohort. The distribution of the GRSs across COPD cases was ultimately used to examine potential COPD disease axes.
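The factorization step described in this caption can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that signed z-scores are made non-negative by splitting each trait into a positive and a negative column, that a generic multiplicative-update NMF is used, and that each variant is assigned to the component with its largest weight. The function names (`split_signed`, `nmf`, `hard_clusters`) and the toy matrix are hypothetical.

```python
import random

def split_signed(z):
    """Map an n_variants x n_traits signed matrix to a non-negative
    n_variants x 2*n_traits matrix: [positive parts | negative parts]."""
    return [[v if v > 0 else 0.0 for v in row] +
            [-v if v < 0 else 0.0 for v in row] for row in z]

def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def nmf(x, k, n_iter=500, seed=0, eps=1e-9):
    """Approximate x (n x m, non-negative) as w (n x k) times h (k x m)
    using multiplicative updates for the squared-error objective."""
    rng = random.Random(seed)
    n, m = len(x), len(x[0])
    w = [[rng.random() + eps for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(k)]
    for _ in range(n_iter):
        # h <- h * (w^T x) / (w^T w h)
        wt = transpose(w)
        num, den = matmul(wt, x), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        # w <- w * (x h^T) / (w h h^T)
        ht = transpose(h)
        num, den = matmul(x, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return w, h

def hard_clusters(w):
    """Assign each variant (row of w) to the component with the largest weight."""
    return [max(range(len(row)), key=row.__getitem__) for row in w]

# Toy z-score matrix (4 variants x 2 traits), invented for illustration only.
z_scores = [[5.0, 0.0], [4.0, 0.0], [0.0, -6.0], [0.0, -5.0]]
x = split_signed(z_scores)
w, h = nmf(x, k=2)
clusters = hard_clusters(w)
```

In the toy matrix the first two variants act on the first trait and the last two act (negatively) on the second, so the two components are expected to separate them. A real analysis would use the 482 x 44 z-score matrix and would need a principled choice of the number of components.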
Fig. 2
Distribution of trait weights across the variant clusters in UK Biobank. Trait weights extracted from the non-negative matrix factorization (NMF) analysis are shown for the 15 traits with the largest contribution to the three clusters. For comparison purposes, weights were normalized to sum to one within each cluster (x-axis, in percent). Red bars correspond to weights derived from positive z-scores and blue bars to weights derived from negative z-scores, reflecting increasing and decreasing disease risk, respectively. Cluster 1 displayed high positive weights for inflammation-related phenotypes and blood cell counts. Cluster 2 had negative weights for traits linked to body composition and obesity. Cluster 3 displayed a large positive weight for height and negative weights for blood cell counts.
Fig. 3
Effects of GRSs on selected traits in the validation COPDGene dataset. Point estimates (effect sizes) and 95% confidence intervals for the association between COPDGene phenotypes and cluster-specific GRSs (GRS1–3; from dark blue to pink) and the unweighted GRS (GRS0; black). For comparison purposes, all GRSs were rescaled to unit variance. We selected traits representing different COPD phenotypic groups: demographics (height, weight), lung function (forced expiratory flow at 25–75% of forced vital capacity (FEF 25–75) and diffusing capacity for carbon monoxide (DLCO)), imaging (visual emphysema score (emphysema) and upper third/lower third emphysema ratio (emphysema ratio)), asthma-related traits (eosinophil count, steroid treatment), and comorbidities (coronary artery disease (CAD)).
Fig. 4
Contribution of cluster-specific GRSs. Out of 240 COPDGene phenotypes tested for association with genetic risk scores, 47 showed a statistically significant improvement of model fit (Phet) at an FDR of 0.1 when comparing the marginal GRS0 model against a full model including GRS0 and GRS1–3. The bar plots show the relative contribution of GRS1, GRS2, and GRS3, measured as z-scores derived from the full model, for these 47 phenotypes, highlighting which of the three GRSs conveys the improved fit. Phenotypes are ordered by Phet. Red dashed lines indicate the stringent Bonferroni significance threshold accounting for a total of 723 tests.
