iScience. 2022 Dec 30;26(2):105915.
doi: 10.1016/j.isci.2022.105915. eCollection 2023 Feb 17.

A subnetwork-based framework for prioritizing and evaluating prognostic gene modules from cancer transcriptome data


Biwei Cao et al. iScience. 2022.

Abstract

Cancer prognosis prediction is critical to clinical decision-making. The wide availability of transcriptome datasets now makes it possible to extract gene modules with promising prognostic value. However, biomarker identification is greatly challenged by tumor and patient heterogeneity. In this study, a framework of three subnetwork-based strategies is presented, incorporating hypothesis-driven, data-driven, and literature-based methods with informative visualization to prioritize candidate genes. By applying the proposed approaches to a head and neck squamous cell cancer (HNSCC) transcriptome dataset, we identified multiple HNSCC-specific gene modules with improved prognostic value and mechanistic information compared with standard gene panel selection methods. The proposed framework is general and can be applied to any type of omics data. Overall, the study demonstrates and supports the use of the subnetwork-based approach for distilling reliable and biologically meaningful prognostic factors.

Keywords: Bioinformatics; Cancer; Gene network.


Conflict of interest statement

C.H.C. has received honoraria from Sanofi, Merck, Brooklyn ImmunoTherapeutics, and Exelixis for serving on ad hoc scientific advisory boards. All other authors declare no conflict of interest.

Figures

Figure 1
Three subnetwork strategies for prioritizing prognostic genes. (A) The hypothesis-driven approach focuses on individual prognostic cancer mechanistic pathways. Starting from the genes involved in one mechanism, gene expression analysis that extracts subnetworks (e.g., exploratory graph analysis [EGA]) is applied to refine the prognostic signatures. (B) The data-driven strategy examines all genes simultaneously by their prognostic significance. A protein-protein interaction (PPI) network is then used as complementary confirmation to prioritize the top candidate genes. (C) The literature-based strategy combines the data-driven and hypothesis-driven approaches. The functional/cellular submodule of the candidate gene is first explored in the context of the anchor genes discovered from scRNA-seq. Both gene expression (GE) analysis and PPI network information are used in this approach.
Figure 2
Unsupervised gene communities detected with EGA in HNSCC. (A) Eight gene communities grouped by EGA based on the hypoxia gene signature. (B) p values of Cox models after -log10 transformation. The yellow bars represent the p values from the EGA-selected prognostic submodule, and the green bars represent the p values from all genes, across 7 hypoxia-enriched cancer types: uterine corpus endometrial carcinoma (UCEC), ovarian cancer (OV), lung squamous cell carcinoma (LUSC), HNSC, colon adenocarcinoma (COAD), cervical squamous cell carcinoma and endocervical adenocarcinoma (CESC), and bladder urothelial carcinoma (BLCA).
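The -log10 transformation mentioned for panel B can be illustrated with a short Python sketch. The p values below are hypothetical placeholders, not results from the study, and `neg_log10` is an illustrative helper name rather than anything used by the authors.

```python
import math


def neg_log10(p_values):
    """Map each label to -log10(p); smaller p values give larger scores."""
    return {label: -math.log10(p) for label, p in p_values.items()}


# Hypothetical p values for illustration only (not taken from the paper).
example_p = {"ega_submodule": 0.001, "all_genes": 0.05}
scores = neg_log10(example_p)

for label, score in scores.items():
    print(f"{label}: {score:.2f}")
```

On this scale a taller bar corresponds to a smaller p value, which is why the transformation is convenient for comparing the submodule against the full gene set.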
Figure 3
Gene prioritization results from the data-driven strategy workflow. (A) Top genes selected by Cox models for all patients and for the stratified HPV-negative patient subgroup, using both overall survival (OS) and progression-free survival (PFI) as outcomes. (B) The PPI network shows the key hub genes and the subnetworks around them. The hypoxia and immune gene hubs are highlighted with blue and red circles. (C) Kaplan-Meier curves for patients stratified by risk score quartiles with OS as the outcome. Patients with higher predicted risk scores (fourth quartile) had lower OS probabilities in the training dataset. (D) Kaplan-Meier curves for patients stratified by risk score quartiles with disease-free survival (DFS) as the outcome. Consistent with the trend shown in panel C, patients with higher predicted risk scores had lower DFS rates in the test dataset.
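The quartile stratification described for panels C and D can be sketched as follows. This is a minimal illustration that assumes each patient already has a predicted risk score; the patient identifiers, scores, and the `quartile_groups` helper are hypothetical, and the paper's own cut-point method is not stated in this caption.

```python
import bisect
import statistics


def quartile_groups(risk_scores):
    """Assign each patient to quartile 1-4 based on their risk score."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    cuts = statistics.quantiles(risk_scores.values(), n=4)
    return {
        patient: bisect.bisect_right(cuts, score) + 1
        for patient, score in risk_scores.items()
    }


# Hypothetical risk scores for eight patients (illustration only).
risk_scores = {f"p{i}": i / 10 for i in range(1, 9)}
groups = quartile_groups(risk_scores)
print(groups)
```

Each quartile group would then get its own Kaplan-Meier curve, with the fourth quartile holding the highest predicted risk.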
Figure 4
Gene subnetwork delineated by the literature-based strategy. (A) The "Spinglass" algorithm clusters the candidate gene into one of three compartments: immune, stroma, or tumor. As an example, the input gene "GZMA", highlighted with a red asterisk, was assigned to the immune community. (B) The PPI network of the representative anchor gene and "GZMA" based on the STRING database. (C) Overlapping (significant) prognostic genes in the TCGA and Gene Expression Omnibus (GEO) datasets. The yellow bars are the literature-based genes selected in both datasets, the gray bars represent the number of genes selected in one dataset, and the blue bars show the number of genes selected in neither dataset. (D) Gene enrichment analysis of genes assigned to the malignant group, including MSigDB Hallmark, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway, and gene ontology pathways (Gene Ontology Biological Process [GOBP], Gene Ontology Molecular Function [GOMF], and Gene Ontology Cellular Component [GOCC]).
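The both / one / neither tally in panel C amounts to a simple set comparison, sketched below. The gene labels and the two "significant" sets are hypothetical placeholders, not genes or results reported in the study, and `overlap_counts` is an illustrative helper name.

```python
def overlap_counts(candidates, significant_a, significant_b):
    """Count candidate genes significant in both, exactly one, or neither dataset."""
    counts = {"both": 0, "one": 0, "neither": 0}
    for gene in candidates:
        hits = (gene in significant_a) + (gene in significant_b)
        key = {2: "both", 1: "one", 0: "neither"}[hits]
        counts[key] += 1
    return counts


# Hypothetical gene labels and significance sets (illustration only).
candidates = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
significant_in_first = {"GENE_A", "GENE_B"}
significant_in_second = {"GENE_A", "GENE_C"}

counts = overlap_counts(candidates, significant_in_first, significant_in_second)
print(counts)
```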
