PLoS Comput Biol. 2021 Jul 21;17(7):e1009144.
doi: 10.1371/journal.pcbi.1009144. eCollection 2021 Jul.

PEDF, a pleiotropic WTC-LI biomarker: Machine learning biomarker identification and validation

George Crowley et al. PLoS Comput Biol. 2021.

Abstract

Biomarkers predict World Trade Center-Lung Injury (WTC-LI); however, multicollinearity remains unaddressed in the serum cytokine, chemokine, and high-throughput platform datasets we use to phenotype WTC disease. To address this concern, we applied automated, machine-learning-based pruning of high-dimensional data and validated the biomarkers it identified. The parent cohort consisted of male, never-smoking firefighters with WTC-LI (FEV1, %Pred < lower limit of normal (LLN); n = 100) and controls (n = 127) whose biomarkers had been assessed. Cases and controls (n = 15/group) underwent untargeted metabolomics, and feature selection was then performed on metabolites, cytokines, chemokines, and clinical data. Cytokines, chemokines, and clinical biomarkers were validated in the non-overlapping portion of the parent cohort via binary logistic regression with 5-fold cross-validation. Random forests applied to metabolites (n = 580), clinical biomarkers (n = 5), and previously assayed cytokines and chemokines (n = 106) showed that the top 5% of biomarkers important to class separation included pigment epithelium-derived factor (PEDF), macrophage-derived chemokine (MDC), systolic blood pressure, macrophage inflammatory protein-4 (MIP-4), growth-regulated oncogene protein (GRO), monocyte chemoattractant protein-1 (MCP-1), apolipoprotein-AII (Apo-AII), cell membrane metabolites (sphingolipids, phospholipids), and branched-chain amino acids. Models validated via confounder-adjusted (age on 9/11, BMI, exposure, and pre-9/11 FEV1, %Pred) binary logistic regression had an AUCROC of 0.90 (0.84-0.96). Decreased PEDF and MIP-4 and increased Apo-AII were associated with increased odds of WTC-LI. Increased GRO and MCP-1, together with decreased MDC, were associated with decreased odds of WTC-LI. In conclusion, automated data pruning identified novel WTC-LI biomarkers; performance was validated in an independent cohort. One biomarker, PEDF, an antiangiogenic agent, is a novel, predictive biomarker of particulate-matter-related lung disease. Other biomarkers (GRO, MCP-1, MDC, and MIP-4) reveal immune cell involvement in WTC-LI pathogenesis. The findings of our automated biomarker identification warrant further investigation into these potential pharmacotherapy targets.
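The pruning step in the abstract ranks variables by how much classification accuracy falls when a variable is scrambled, then keeps the top 5%. The sketch below illustrates that idea only and is not the authors' pipeline: it swaps the random forest for a simple nearest-centroid classifier, scores accuracy on the training rows rather than out-of-bag samples, and runs on synthetic data. Every name in it (`make_toy_data`, `fit_centroids`, `mean_decrease_accuracy`, `top_fraction`) is hypothetical, and the rounding rule used for "top 5%" is an assumption.

```python
import math
import random


def make_toy_data(n_per_group=15, n_features=20, seed=1):
    """Synthetic cases/controls; only feature 0 differs between groups."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_group):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            row[0] += 8.0 * label  # the single informative feature
            X.append(row)
            y.append(label)
    return X, y


def fit_centroids(X, y):
    """Stand-in classifier (NOT a random forest): one mean vector per class."""
    centroids = {}
    for label in sorted(set(y)):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def accuracy(centroids, X, y):
    """Fraction of rows whose nearest centroid matches the true label."""
    hits = 0
    for x, t in zip(X, y):
        pred = min(centroids, key=lambda c: math.dist(centroids[c], x))
        hits += pred == t
    return hits / len(y)


def mean_decrease_accuracy(centroids, X, y, n_repeats=10, seed=0):
    """Average accuracy lost when each feature's column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(centroids, X, y)
    drops = []
    for j in range(len(X[0])):
        lost = 0.0
        for _ in range(n_repeats):
            column = [x[j] for x in X]
            rng.shuffle(column)
            X_perm = [x[:j] + [v] + x[j + 1:] for x, v in zip(X, column)]
            lost += baseline - accuracy(centroids, X_perm, y)
        drops.append(lost / n_repeats)
    return drops


def top_fraction(scores, fraction=0.05):
    """Indices of the highest-scoring features, keeping ceil(fraction * n)."""
    k = max(1, math.ceil(fraction * len(scores)))
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order[:k]


X, y = make_toy_data()
model = fit_centroids(X, y)
importance = mean_decrease_accuracy(model, X, y)
selected = top_fraction(importance, fraction=0.05)
print("selected feature indices:", selected)
```

Under the same assumed rounding rule, the 691 candidate variables in the abstract (580 metabolites, 5 clinical biomarkers, 106 cytokines and chemokines) would give a shortlist of `math.ceil(0.05 * 691)` = 35 variables; the abstract itself does not state the exact count.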

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Study Design.
From a parent cohort of WTC-LI cases (n = 96) and controls (n = 127), a metabolomics subcohort (n = 15/group) and a validation cohort (n = 114) were drawn as described.
Fig 2. Random Forests Variable Importance.
A. Mean decrease accuracy was used to determine and rank the top 5% of important metabolites, cytokines, chemokines, and clinical biomarkers. B. Agglomerative, Hierarchical Clustering identified 5 clusters of variables in the refined profile with similar patterns of expression in the metabolomics subcohort. C. PCA Loading Weights Plot visualizes clusters of variables based on intervariable correlations and provides an alternative-but-similar view of variable relationships in the metabolomics subcohort. Points are colored according to cluster membership.
Fig 3. Predictive performance of Final Model.
The AUCROC of the final, confounder-adjusted, multivariate binary logistic regression shows its predictive performance, alongside that of each constituent biomarker entered separately in its own confounder-adjusted binary logistic regression.
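The performance measure reported for the final model, the area under the ROC curve, equals the probability that a randomly chosen case receives a higher predicted score than a randomly chosen control. The sketch below computes it directly from that definition and also shows one simple way to deal subjects into five cross-validation folds. It is an illustration under stated assumptions: `roc_auc` and `k_fold_indices` are hypothetical helper names, the folds are not stratified by case status, and the scores are made-up numbers rather than study data.

```python
import random


def roc_auc(labels, scores):
    """AUC as the fraction of (case, control) pairs ranked correctly.

    Ties between a case score and a control score count as half a win.
    """
    cases = [s for s, t in zip(scores, labels) if t == 1]
    controls = [s for s, t in zip(scores, labels) if t == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for d in controls:
            if c > d:
                wins += 1.0
            elif c == d:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def k_fold_indices(n, k=5, seed=0):
    """Shuffle 0..n-1 and deal the indices into k near-equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


# Toy scores: three of the four (case, control) pairs are ordered correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75

# Five folds over a cohort of 114 subjects (the validation cohort size in Fig 1).
folds = k_fold_indices(114, k=5)
print([len(f) for f in folds])
```

In a cross-validated analysis, the model would be fitted on four folds and scored on the held-out fold, with `roc_auc` applied to the held-out predictions.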
