Front Genet. 2025 Jun 4:16:1610284.
doi: 10.3389/fgene.2025.1610284. eCollection 2025.

MOLUNGN: a multi-omics graph neural network for biomarker discovery and accurate lung cancer classification


Daifeng Zhang et al. Front Genet. 2025.

Abstract

Introduction: Lung cancer remains a major global health burden because of its high morbidity and mortality. This study aimed to systematically integrate biomedical datasets, in particular traditional Chinese medicine (TCM)-associated multi-omics data, using deep-learning methods enhanced by graph attention mechanisms. We sought to investigate the molecular mechanisms underlying stage-wise lung cancer progression and to identify stage-specific biomarkers that support precise cancer stage classification.

Methods: We developed a novel multi-omics integrative model, named the Multi-Omics Lung Cancer Graph Network (MOLUNGN), based on Graph Attention Networks (GAT). Clinical datasets of non-small cell lung cancer (NSCLC), including lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC), were analyzed to create omics-specific feature matrices comprising mRNA expression, miRNA mutation profiles, and DNA methylation data. MOLUNGN incorporated omics-specific GAT modules (OSGAT) combined with a Multi-Omics View Correlation Discovery Network (MOVCDN), effectively capturing intra- and inter-omics correlations. This framework enabled comprehensive classification of clinical cases into precise cancer stages, alongside the extraction of stage-specific biomarkers.
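
The abstract does not describe the internals of the OSGAT modules. As a rough illustration of the general graph attention idea they build on, the following is a minimal single-head attention layer over a toy patient similarity graph. Every name here (`gat_layer`, `features`, `neighbors`, `W`, `a`) and every number is a hypothetical choice for illustration, not the authors' implementation, and the usual output nonlinearity is omitted for brevity.

```python
import math

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def matvec(W, x):
    """Project feature vector x with weight matrix W (one row per output dim)."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def gat_layer(features, neighbors, W, a):
    """Single-head graph attention layer (illustrative sketch only).

    features:  list of per-patient feature vectors
    neighbors: dict mapping node index -> list of neighbor indices
               (each list is assumed to include the node itself)
    W:         projection matrix, out_dim x in_dim
    a:         attention vector of length 2 * out_dim
    Returns (new_features, attention_weights).
    """
    h = [matvec(W, x) for x in features]
    out, attn = [], []
    for i in range(len(features)):
        # Unnormalized score e_ij = LeakyReLU(a . [h_i || h_j])
        scores = [
            leaky_relu(sum(w * v for w, v in zip(a, h[i] + h[j])))
            for j in neighbors[i]
        ]
        # Softmax over the neighborhood of node i
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        alpha = [e / total for e in exps]
        # Attention-weighted sum of projected neighbor features
        agg = [
            sum(al * h[j][d] for al, j in zip(alpha, neighbors[i]))
            for d in range(len(h[i]))
        ]
        out.append(agg)
        attn.append(alpha)
    return out, attn

# Hypothetical toy data: 3 patients, 2 features each.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
neighbors = {0: [0, 1], 1: [0, 1], 2: [2]}  # patient 2 has only a self-loop
W = [[1.0, 0.0], [0.0, 1.0]]
a = [0.5, -0.5, 0.5, -0.5]

out, attn = gat_layer(features, neighbors, W, a)
```

In this sketch the attention weights of each patient sum to one over its neighborhood, so a patient connected only to itself keeps its own projected features unchanged.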

Results: Evaluations on publicly available datasets showed that MOLUNGN outperformed existing methods. On the LUAD dataset, MOLUNGN achieved an accuracy (ACC) of 0.84, Recall_weighted of 0.84, F1_weighted of 0.83, and F1_macro of 0.82. On the LUSC dataset, the model scored higher still, with ACC of 0.86, Recall_weighted of 0.86, F1_weighted of 0.85, and F1_macro of 0.84. The model also identified stage-specific biomarkers with biological relevance to lung cancer progression, supporting robust gene-disease associations.
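
The four reported metrics can be computed from predicted and true stage labels as in the sketch below. The function name `stage_metrics` and the toy labels are hypothetical and unrelated to the study's data. For single-label multiclass problems, support-weighted recall is algebraically equal to accuracy, which is consistent with the ACC and Recall_weighted values coinciding above.

```python
from collections import Counter

def stage_metrics(y_true, y_pred):
    """Accuracy, support-weighted recall, support-weighted F1, and macro F1."""
    labels = sorted(set(y_true) | set(y_pred))
    n = len(y_true)
    support = Counter(y_true)  # number of true samples per class
    recall, f1 = {}, {}
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        recall[c] = tp / (tp + fn) if (tp + fn) else 0.0
        denom = 2 * tp + fp + fn
        f1[c] = 2 * tp / denom if denom else 0.0
    return {
        "acc": sum(t == p for t, p in zip(y_true, y_pred)) / n,
        "recall_weighted": sum(support[c] * recall[c] for c in labels) / n,
        "f1_weighted": sum(support[c] * f1[c] for c in labels) / n,
        "f1_macro": sum(f1.values()) / len(labels),
    }

# Hypothetical toy stage labels, for illustration only.
y_true = ["I", "I", "I", "II", "II", "III"]
y_pred = ["I", "I", "II", "II", "II", "I"]
metrics = stage_metrics(y_true, y_pred)
```

Macro F1 averages the per-class F1 scores without weighting by class size, so it is pulled down by poorly predicted rare stages more than the weighted variants are.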

Discussion: Our findings underscore the efficacy of MOLUNGN as an integrative framework for accurately classifying lung cancer stages and uncovering essential biomarkers. These biomarkers provide insight into the mechanisms of lung cancer progression and represent promising targets for future clinical validation. Integrating them into the TCM-target-disease network enriches the understanding of TCM therapeutic potential and lays a foundation for future precision medicine applications.

Keywords: GAT; MOLUNGN; lung cancer; multi-omics data integration; stage prediction.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

FIGURE 1
Illustration of MOLUNGN. MOLUNGN combines OSGAT for omics-specific feature learning and MOVCDN for multi-omics integration. For conciseness, a single patient is used as an example to demonstrate the MOVCDN component for multi-omics integration after omics-specific learning. Pre-processing is first performed on each omics data type to remove noise and redundant features. OSGAT learns class predictions from the omics features and the corresponding patient similarity network generated from the omics data. A cross-omics discovery tensor is calculated from the initial OSGAT predictions and forwarded to MOVCDN for the final prediction. MOLUNGN is an end-to-end model, and all networks are trained jointly.
FIGURE 2
Performance comparison of algorithms on the LUAD dataset. (A) Accuracy comparison. (B) Weighted recall comparison. (C) Weighted F1 comparison. (D) Macro F1 comparison.
FIGURE 3
Performance comparison of algorithms on the LUSC dataset. (A) Accuracy comparison. (B) Weighted recall comparison. (C) Weighted F1 comparison. (D) Macro F1 comparison.
FIGURE 4
(A) Ablation experiment on the LUAD dataset. (B) Ablation experiment on the LUSC dataset.
FIGURE 5
Partial display of the important biomarkers discovered during the progression and stage transitions of (A) LUAD and (B) LUSC.
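
The Figure 1 caption mentions a cross-omics discovery tensor computed from the initial per-omics predictions, but the abstract does not define it. One common construction in view-correlation-discovery designs (for example, the VCDN in MOGONET) is the outer product of the per-omics class probability vectors. The sketch below assumes that construction; the function name, variable names, and probabilities are hypothetical, and MOLUNGN's actual tensor may differ.

```python
def cross_omics_tensor(p_mrna, p_mirna, p_meth):
    """Outer product of three class-probability vectors (assumed construction).

    Entry [i][j][k] is p_mrna[i] * p_mirna[j] * p_meth[k], i.e. the joint
    weight given to the three omics views predicting classes i, j, and k.
    """
    return [[[a * b * c for c in p_meth] for b in p_mirna] for a in p_mrna]

def flatten(tensor):
    """Flatten the 3-D tensor into a vector for a downstream classifier."""
    return [v for plane in tensor for row in plane for v in row]

# Hypothetical initial predictions for one patient over three stage classes.
p_mrna = [0.7, 0.2, 0.1]
p_mirna = [0.5, 0.3, 0.2]
p_meth = [0.6, 0.3, 0.1]

tensor = cross_omics_tensor(p_mrna, p_mirna, p_meth)
vector = flatten(tensor)  # length 3 * 3 * 3 = 27
```

Under this assumption, the flattened vector exposes every combination of per-omics class predictions, so an integration network can learn which agreements and disagreements between omics views are informative.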
