Nucleic Acids Res. 2017 Sep 19;45(16):e151.
doi: 10.1093/nar/gkx642.

De novo pathway-based biomarker identification

Nicolas Alcaraz et al. Nucleic Acids Res. 2017.

Abstract

Gene expression profiles have been extensively discussed as an aid to guide therapy by predicting disease outcome for patients suffering from complex diseases, such as cancer. However, prediction models built upon single-gene (SG) features show poor stability and performance on independent datasets. Attempts to mitigate these drawbacks have led to the development of network-based approaches that integrate pathway information to produce meta-gene (MG) features. So far, however, MG approaches have only dealt with the two-class problem of good versus poor outcome prediction. Stratifying patients based on their molecular subtypes can provide a detailed view of the disease and lead to more personalized therapies. We propose and discuss a novel MG approach based on de novo pathways, which for the first time are used as features in a multi-class setting to predict cancer subtypes. Comprehensive evaluation in a large cohort of breast cancer samples from The Cancer Genome Atlas (TCGA) revealed that MG models are considerably more stable than SG models, while also providing valuable insight into the cancer hallmarks that drive them. In addition, when tested on an independent benchmark non-TCGA dataset, MG features consistently outperformed SG models. We provide an easy-to-use web service at http://pathclass.compbio.sdu.dk where users can upload their own gene expression datasets from breast cancer studies and obtain the subtype predictions from all the classifiers.

Figures

Figure 1.
Workflow for de novo pathway-based classification of breast cancer subtypes. Datasets (A): the dataset is split into five folds, four for training and one for testing. Feature extraction (B): subtype-specific pathways are extracted by projecting the training sets on the input network and running KeyPathwayMiner (KPM). Feature scoring (C): the extracted pathways are scored using single-sample gene set enrichment analysis (ssGSEA) to produce a matrix of samples versus pathways. Feature reduction (D): pathways are clustered based on their Spearman’s correlation coefficient across all training samples, and the most representative feature of each cluster is selected for the next step. Feature selection (E): the best features are selected using random forests (RFs) with recursive feature elimination, and a final RF model is built with the selected features. Evaluation (F): the final model is used to predict the subtype labels of the test set. The process is repeated for all fold splits and for 10 repeats, recording the performance of each run.
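The feature reduction step (D) can be illustrated with a minimal sketch. The caption does not say how the clusters are formed or how the representative is chosen, so everything below is an assumption made for illustration: the greedy clustering, the 0.8 correlation threshold, the use of the first feature seen as the cluster representative, and the names `reduce_features`, `pathway_a`, `pathway_b` and `pathway_c`.

```python
from typing import Dict, List


def ranks(values: List[float]) -> List[float]:
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def spearman(x: List[float], y: List[float]) -> float:
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def reduce_features(scores: Dict[str, List[float]],
                    threshold: float = 0.8) -> List[str]:
    """Greedy reduction (hypothetical): keep a feature only if its
    correlation with every representative kept so far is below threshold."""
    representatives: List[str] = []
    for name, values in scores.items():
        if all(spearman(values, scores[rep]) < threshold
               for rep in representatives):
            representatives.append(name)
    return representatives


# Toy pathway scores across five training samples (made-up numbers).
scores = {
    "pathway_a": [0.1, 0.4, 0.3, 0.9, 0.7],
    "pathway_b": [0.2, 0.5, 0.4, 1.0, 0.8],  # same ordering as pathway_a
    "pathway_c": [0.9, 0.1, 0.8, 0.2, 0.3],
}
kept = reduce_features(scores)
print(kept)  # ['pathway_a', 'pathway_c']
```

In this toy input `pathway_b` ranks the samples exactly as `pathway_a` does, so it is folded into the same cluster and only one of the two is passed on to feature selection.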
Figure 2.
Prediction performance (F-score) for the different models within the TCGA cross-validation runs (A and C) and the final validation on the unseen ACES datasets (B and D) for gene expression. Top figures correspond to the overall performance and bottom figures to performance by class.
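As a rough illustration of the two views in this figure, the sketch below computes an F-score per class and an overall score for a multi-class prediction. The caption does not state how the overall score is averaged, so the unweighted (macro) mean used here is an assumption, and the labels `s1`, `s2` and `s3` are placeholders rather than the subtype names used in the study.

```python
from typing import Dict, List


def f_scores_by_class(y_true: List[str], y_pred: List[str]) -> Dict[str, float]:
    """F1 per class: 2*TP / (2*TP + FP + FN), or 0.0 if the class never occurs."""
    scores: Dict[str, float] = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f_score(y_true: List[str], y_pred: List[str]) -> float:
    """Unweighted mean of the per-class F-scores (assumed averaging scheme)."""
    per_class = f_scores_by_class(y_true, y_pred)
    return sum(per_class.values()) / len(per_class)


# Toy labels for five test samples (placeholders, not real data).
y_true = ["s1", "s1", "s2", "s3", "s3"]
y_pred = ["s1", "s2", "s2", "s3", "s3"]

per_class = f_scores_by_class(y_true, y_pred)
overall = macro_f_score(y_true, y_pred)
print(per_class)
print(round(overall, 4))  # 0.7778
```

Here one `s1` sample is misclassified as `s2`, which lowers the F-score of both classes to 2/3 while `s3` stays at 1.0.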
Figure 3.
Stability of MG features from a priori pathways inside the cross-validation evaluation scheme for the TCGA gene expression cohort. The Jaccard Index was computed for the selected features of each pair of folds.
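The stability measure named in this caption can be sketched as follows: the Jaccard index of two feature sets is the size of their intersection divided by the size of their union, computed here for every pair of folds. The function names, the feature names `p1` to `p4`, and the convention of returning 1.0 for two empty sets are assumptions for illustration.

```python
from itertools import combinations
from typing import Iterable, List, Sequence, Set


def jaccard(a: Iterable[str], b: Iterable[str]) -> float:
    """|A intersect B| / |A union B|; two empty sets count as identical (assumed)."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 1.0


def pairwise_jaccard(selections: Sequence[Set[str]]) -> List[float]:
    """Jaccard index for every unordered pair of folds, in input order."""
    return [jaccard(a, b) for a, b in combinations(selections, 2)]


# Features selected in three folds (toy example).
folds = [
    {"p1", "p2", "p3"},
    {"p2", "p3", "p4"},
    {"p1", "p2", "p3"},
]
stability = pairwise_jaccard(folds)
print(stability)  # [0.5, 1.0, 0.5]
```

Values close to 1 mean that nearly the same features were selected in both folds, and values close to 0 mean the selections barely overlap.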
Figure 4.
Stability of gene markers from de novo pathways inside the cross-validation evaluation scheme for the TCGA gene expression cohort. (A) The Jaccard Index was computed for the genes within the selected features for each pair of folds. (B) The number of genes that were selected in all runs.
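Counting the genes selected in all runs amounts to tallying how often each gene appears across runs and keeping those whose count equals the number of runs. The sketch below shows one way to do that; the function names and the gene identifiers `geneA` to `geneD` are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Sequence


def selection_counts(runs: Sequence[Iterable[str]]) -> Counter:
    """Number of runs in which each gene was selected (at most once per run)."""
    counts: Counter = Counter()
    for genes in runs:
        counts.update(set(genes))
    return counts


def selected_in_all(runs: Sequence[Iterable[str]]) -> List[str]:
    """Genes that appear in every run, sorted by name."""
    counts = selection_counts(runs)
    return sorted(g for g, c in counts.items() if c == len(runs))


# Genes inside the selected pathways for three runs (toy example).
runs = [
    {"geneA", "geneB", "geneC"},
    {"geneA", "geneC"},
    {"geneA", "geneD"},
]
counts = selection_counts(runs)
always = selected_in_all(runs)
print(always)  # ['geneA']
```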
Figure 5.
Cancer hallmark enrichment (Fisher’s exact test) for the top 100 most frequently selected genes within the TCGA gene expression cross-validation loop in (A) gene expression and (B) DNA methylation. The cancer hallmarks are: activating invasion and metastasis (AIM), avoiding immune destruction (AID), deregulating cellular energetics (DCE), enabling replicative immortality (ERI), evading growth suppressors (EGS), genome instability and mutation (GIM), inducing angiogenesis (IA), resisting cell death (RCD), sustaining proliferative signaling (SPS) and tumor-promoting inflammation (TPI). Bold horizontal line corresponds to P-value = 0.05.
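For a single hallmark, the enrichment test can be sketched as a one-sided Fisher's exact test, which is the upper tail of a hypergeometric distribution. The caption does not say whether the test was one- or two-sided or which background gene set was used, so the one-sided form, the function name and the toy counts below are assumptions.

```python
from math import comb


def fisher_enrichment_p(overlap: int, selected: int,
                        annotated: int, background: int) -> float:
    """One-sided Fisher's exact test for over-representation.

    Returns P(X >= overlap), where X is the number of hallmark-annotated
    genes found when drawing `selected` genes without replacement from a
    background of `background` genes, `annotated` of which carry the hallmark.
    """
    upper = min(selected, annotated)
    total = comb(background, selected)
    tail = sum(
        comb(annotated, i) * comb(background - annotated, selected - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Toy counts: 20 background genes, 5 annotated with the hallmark,
# 4 selected genes, 3 of which are annotated.
p_value = fisher_enrichment_p(overlap=3, selected=4, annotated=5, background=20)
print(round(p_value, 4))  # 0.032
print(p_value < 0.05)     # True
```

In the figure, a hallmark whose P-value falls below the 0.05 line would be read as enriched among the selected genes.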
