Clin Proteomics. 2025 Apr 7;22(1):11.
doi: 10.1186/s12014-025-09532-w.

An integrated proteomic classifier to distinguish benign from malignant pulmonary nodules

Bin Jia et al. Clin Proteomics.

Abstract

Background: Pulmonary nodules with diameters of 8-30 mm are common, and distinguishing benign from malignant nodules can greatly improve outcomes for patients with lung cancer. However, liquid-biopsy methods have yet to reach the sensitivity and specificity needed for clinical use.

Methods: We enrolled three cohorts totaling 185 patients diagnosed with benign (BE) or malignant (MA) pulmonary nodules. Using data-independent acquisition (DIA) mass spectrometry, we quantified the plasma proteome of these patients. We then performed logistic regression analysis to distinguish benign from malignant nodules, using cohort 1 as the discovery data set and cohorts 2 and 3 as independent validation data sets. We also developed a targeted multiple reaction monitoring (MRM) method to measure the concentrations of the six selected peptide markers in plasma samples.
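The discovery/validation design described in the Methods can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic, the six features are anonymous stand-ins for the protein intensities, and the helper names (`fit_logistic`, `predict_proba`, `make_cohort`), the learning rate, the number of epochs, and the 0.5 decision threshold are all assumptions chosen for illustration.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(w, b, x):
    """Predicted probability that a sample belongs to class 1 (malignant)."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

rng = random.Random(0)

def make_cohort(n_per_class):
    # Synthetic cohort (assumption): 6 features per sample, with class 1
    # shifted by +1 on every feature. Not real proteomic data.
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            X.append([rng.gauss(float(label), 1.0) for _ in range(6)])
            y.append(label)
    return X, y

# Fit on a "discovery" cohort, then score a separate "validation" cohort
# that played no part in fitting.
X_train, y_train = make_cohort(40)
X_valid, y_valid = make_cohort(30)
w, b = fit_logistic(X_train, y_train)
valid_probs = [predict_proba(w, b, x) for x in X_valid]
valid_pred = [int(p >= 0.5) for p in valid_probs]
accuracy = sum(p == t for p, t in zip(valid_pred, y_valid)) / len(y_valid)
print(f"validation accuracy on synthetic data: {accuracy:.2f}")
```

The point of the sketch is the separation of roles: the model parameters are estimated on one cohort only, and the other cohort is used purely for scoring.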

Results: We quantified a total of 451 plasma proteins, of which 15 were up-regulated and 5 were down-regulated in patients diagnosed with malignant nodules. Logistic regression identified a six-protein panel comprising APOA4, CD14, PFN1, APOB, PLA2G7, and IGFBP2 that classifies benign and malignant nodules with improved accuracy. In cohort 1, the area under the curve (AUC) for the training and testing sets reached 0.87 and 0.91, respectively. We achieved a sensitivity of 100%, specificity of 40%, positive predictive value (PPV) of 62.5%, and negative predictive value (NPV) of 100%. In the two independent cohorts, the six-biomarker panel showed a sensitivity, specificity, PPV, and NPV of 96.2%, 35%, 65.8%, and 87.5%, respectively, in cohort 2, and 91.4%, 54.2%, 74.4%, and 81.3%, respectively, in cohort 3. We used a targeted LC-MS/MS method to quantify the plasma concentrations of the six peptides and applied logistic regression to classify benign and malignant nodules; the AUCs for the training and testing sets were 0.758 and 0.751, respectively.
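For readers less familiar with the four reported metrics, the sketch below shows how each is derived from a confusion matrix. The function name `diagnostic_metrics` is hypothetical, and the counts are illustrative values chosen only because they reproduce the cohort 1 percentages; the abstract does not report the actual counts.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity, PPV, and NPV from confusion-matrix counts.

    tp: malignant nodules called malignant   fn: malignant nodules called benign
    tn: benign nodules called benign         fp: benign nodules called malignant
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of malignant nodules detected
        "specificity": tn / (tn + fp),  # share of benign nodules correctly cleared
        "ppv": tp / (tp + fp),          # share of malignant calls that are correct
        "npv": tn / (tn + fn),          # share of benign calls that are correct
    }

# Illustrative counts (an assumption, not the study's data): 10 malignant and
# 10 benign samples, with no malignant nodule missed.
metrics = diagnostic_metrics(tp=10, fp=6, tn=4, fn=0)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

With these counts, no malignant nodule is missed, so sensitivity and NPV are both 100%, while 6 of 10 benign nodules are flagged as malignant, which gives the 40% specificity and 62.5% PPV.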

Conclusions: Our study identified a panel of plasma protein biomarkers for distinguishing benign from malignant pulmonary nodules that is worth developing further into a clinically valuable assay.

Keywords: Biomarker; Classification; Lung cancer; Plasma; Pulmonary nodule.

Conflict of interest statement

Declarations. Ethics approval and consent to participate: All the studies mentioned in this article were approved by the Ethics Committee, and written informed consent was obtained from all participants. Consent for publication: Not applicable. Competing interests: The authors declare no competing interests.

Figures

Fig. 1
Study design. Flow chart showing the design of this study
Fig. 2
Quantitative proteomic analysis of plasma samples from patients diagnosed with pulmonary nodules. (A) Volcano plot showing differentially expressed proteins as blue (down) or red (up) circles. The x-axis shows the log2 fold change of plasma proteins between the malignant (N = 40) and benign (N = 40) nodule patient groups, and the y-axis shows the log10 of statistical significance values. (B) Heat map of 26 differentially expressed proteins between malignant (MA) patients and benign (BE) subjects. Protein intensities were log2-transformed. The colors of the protein names indicate the biological processes associated with these proteins. (C) Principal component analysis of plasma samples from cohort 1 using the plasma proteome expression data. (D) Gene Ontology (GO) analysis of differentially expressed proteins between patients and healthy controls
Fig. 3
Peptide-level intensity plot of the feature proteins selected for logistic regression analysis. Boxplots showing differential expression of representative peptides from the six proteins between patient groups confirmed as having either malignant (MA) or benign (BE) nodules. Cohort 1: discovery stage; cohort 2: validation 1; cohort 3: validation 2. Note that patients in cohorts 2 and 3 are from two different hospitals
Fig. 4
Logistic regression classification of benign and malignant nodules using cohort 1 as the discovery data set. (A) Principal component analysis of plasma samples using the 6 candidate proteins (APOA4, APOB, CD14, PFN1, PLA2G7, and IGFBP2). (B) ROC curves of a six-protein logistic regression classifier (APOA4, APOB, CD14, PFN1, PLA2G7, and IGFBP2) for distinguishing benign and malignant nodules. (C) Sensitivity, specificity, PPV, and NPV value distributions over the range of threshold values from 0 to 1. (D) Confusion matrix showing the classification results in cohort 1
Fig. 5
Logistic regression classification of benign and malignant nodules using cohorts 2 and 3 as independent validation data sets. (A) Principal component analysis of plasma samples from cohort 2 using the 6 candidate proteins (APOA4, APOB, CD14, PFN1, PLA2G7, and IGFBP2). (B) Principal component analysis of plasma samples from cohort 3 using the 6 candidate proteins. (C) ROC curves of the six-protein logistic regression classifier for distinguishing benign and malignant nodules in cohorts 2 and 3. (D) Confusion matrix showing the classification error in cohorts 2 and 3
Fig. 6
MRM quantification and logistic regression classification of benign and malignant nodule subjects. (A) Peptide-level intensity plot of the feature proteins selected for logistic regression. Boxplots showing differential expression of representative peptides from the six proteins between patient groups confirmed as having either malignant (MA) or benign (BE) nodules. (B) ROC curves of the six-peptide logistic regression classifier for distinguishing benign and malignant nodules. (C) Confusion matrix showing the classification results of the MRM assay
