Network-based biomarkers enhance classical approaches to prognostic gene expression signatures

Rebecca L Barter, Sarah-Jane Schramm, Graham J Mann, Yee Hwa Yang

PMID: 25521200
PMCID: PMC4290694
DOI: 10.1186/1752-0509-8-S4-S5

BMC Syst Biol. 2014;8 Suppl 4(Suppl 4):S5. doi: 10.1186/1752-0509-8-S4-S5. Epub 2014 Dec 8.

Abstract

Background: Classical approaches to predicting patient clinical outcome from gene expression information are primarily based on differential expression of unrelated genes (single-gene approaches) or of genes related by, for example, biological pathway or function (gene-set approaches). Recently, network-based approaches that use interaction information between genes have emerged. An open problem is whether such approaches add value to the more traditional methods of signature modelling. We explored this question by comparing the most widely employed single-gene, gene-set, and network-based methods, using gene expression microarray data from two different cancers: melanoma and ovarian cancer. We considered two kinds of network approaches. The first identifies informative genes by combining gene expression with network connectivity information, the latter drawn from prior knowledge of protein-protein interactions. The second focuses on identifying informative sub-networks (small networks of interacting proteins, again drawn from prior knowledge networks). For all methods we performed 100 rounds of 5-fold cross-validation under 3 different classifiers (RF, SVM and DLDA). For network-based approaches, we considered two different protein-protein interaction networks. We quantified the resulting patterns of misclassification and discuss the value of each method for the ongoing development of prognostic biomarkers.
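The evaluation protocol described above (repeated rounds of 5-fold cross-validation) can be sketched as follows. This is a minimal illustration and not the authors' code: the function names, the toy data, and the nearest-centroid classifier are all assumptions introduced here, and the nearest-centroid rule is only a stand-in for the RF, SVM and DLDA classifiers. Any feature selection would belong inside `fit`, so that it is repeated on each training fold.

```python
import random

def repeated_kfold_error(samples, labels, fit, predict, rounds=100, k=5, seed=0):
    """Mean misclassification rate over `rounds` repetitions of k-fold CV."""
    rng = random.Random(seed)
    n = len(samples)
    round_errors = []
    for _ in range(rounds):
        order = list(range(n))
        rng.shuffle(order)                          # new random partition each round
        folds = [order[i::k] for i in range(k)]     # k disjoint test folds
        wrong = 0
        for test_idx in folds:
            held_out = set(test_idx)
            train_idx = [i for i in order if i not in held_out]
            model = fit([samples[i] for i in train_idx],
                        [labels[i] for i in train_idx])
            wrong += sum(predict(model, samples[i]) != labels[i] for i in test_idx)
        round_errors.append(wrong / n)              # every sample is tested once per round
    return sum(round_errors) / rounds, round_errors

# Stand-in classifier (assumption): assign the class with the nearest centroid.
def fit_centroids(X, y):
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def predict_centroid(centroids, x):
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

# Toy, well-separated data: 10 "GP" and 10 "PP" samples with 2 features each.
X = [[0.1 * i, 1.0] for i in range(10)] + [[5.0 + 0.1 * i, 1.0] for i in range(10)]
y = ["GP"] * 10 + ["PP"] * 10
mean_error, per_round = repeated_kfold_error(
    X, y, fit_centroids, predict_centroid, rounds=10, k=5, seed=1)
```

Returning the per-round errors alongside the mean keeps the spread across rounds available, which is what a box plot of error rates would display.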

Results: We found that single-gene, gene-set and network methods yielded similar error rates in melanoma and ovarian cancer data. Crucially, however, our novel and detailed patient-level analyses revealed that the different methods correctly classified different subsets of patients in each cohort. We also found that the network-based NetRank feature selection method was the most stable.

Conclusions: Next-generation methods of gene expression signature modelling harness data from external networks and are foreshadowed as a standard mode of analysis. But what do they add to traditional approaches? Our findings indicate there is value in the way in which different subspaces of the patient sample are captured differently among the various methods, highlighting the possibility of 'combination' classifiers capable of identifying which patients will be more accurately classified by one particular method over another. We have seen this clearly for the first time because of our in-depth analysis at the level of individual patients.

Figures

**Figure 1**
**Examples of informative features which differ between the PP class (red) and GP class (blue)**. These examples were obtained using the melanoma data set and the iRefWeb network. A) presents the differential expression of the TRAF3 gene (the x-axis corresponds to the samples and the y-axis corresponds to the expression values), B) presents the differential (median) expression of the CD19 gene-set which consists of 6 genes (the x-axis corresponds to the samples and the y-axis corresponds to the expression values), and C) presents the differential correlation of the DTX1 hub sub-network (for visual simplicity, we present the hub gene expression (x-axis) versus interactor gene expression (y-axis) for three of the five edges in the DTX1 hub sub-network and for 10 samples from each class). The hub-interactor correlations for each hub-interactor pair are presented. Image adapted from [6].
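The gene-set and hub sub-network features in panels B and C can be illustrated with a short sketch. It rests on assumptions: the gene names and expression values are made up, and the hub feature is taken here to be the difference in Pearson correlation between the two classes, which may not match the exact statistic used in the paper.

```python
from statistics import mean, median

def gene_set_score(expr, gene_set, sample):
    """Median expression of the gene-set members in one sample."""
    return median(expr[g][sample] for g in gene_set)

def pearson(xs, ys):
    """Pearson correlation; assumes neither input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def differential_correlation(expr, hub, interactor, gp_samples, pp_samples):
    """Hub-interactor correlation in the PP class minus that in the GP class."""
    def class_corr(samples):
        return pearson([expr[hub][s] for s in samples],
                       [expr[interactor][s] for s in samples])
    return class_corr(pp_samples) - class_corr(gp_samples)

# Hypothetical expression values: gene -> one value per sample (8 samples).
expr = {
    "HUB": [1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0],
    "INT": [2.0, 4.0, 6.0, 8.0, 8.0, 6.0, 4.0, 2.0],
    "G3":  [5.0, 5.0, 5.0, 5.0, 9.0, 9.0, 9.0, 9.0],
}
gp, pp = [0, 1, 2, 3], [4, 5, 6, 7]   # sample indices per class

set_score = gene_set_score(expr, ["HUB", "INT", "G3"], sample=0)
diff_corr = differential_correlation(expr, "HUB", "INT", gp, pp)
```

In this toy data the hub and interactor are perfectly correlated in the GP samples and perfectly anti-correlated in the PP samples, so the edge is informative even though the hub's own expression is identical in both classes.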

**Figure 2**
**Classification error rates**. The error rates (y-axis) obtained from 100 rounds of 5-fold cross-validation are presented for the RF classifier, the SVM classifier and the DLDA classifier, using the iRefWeb network with A) the melanoma data set and B) the ovarian cancer data set. The numbers in parentheses following the method names are the number of features selected in each cross-validation round.

**Figure 3**
**Class-specific classification error rates**. The GP (dotted line) and PP (solid line) error rates averaged over the 100 rounds of 5-fold cross validation for each method are presented for the iRefWeb network and the RF classifier, the SVM classifier and the DLDA classifier using A) the melanoma data set and B) the ovarian cancer data set.
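A class-specific error rate is the misclassification rate computed within each true class separately. The sketch below shows the calculation for a single set of predictions; the function name and the toy labels are assumptions for illustration only.

```python
def class_error_rates(true_labels, predicted_labels):
    """Misclassification rate within each true class."""
    totals, wrong = {}, {}
    for truth, pred in zip(true_labels, predicted_labels):
        totals[truth] = totals.get(truth, 0) + 1
        wrong[truth] = wrong.get(truth, 0) + (truth != pred)
    return {label: wrong[label] / totals[label] for label in totals}

# Toy labels: four GP patients and two PP patients.
truth = ["GP", "GP", "GP", "GP", "PP", "PP"]
preds = ["GP", "GP", "GP", "PP", "GP", "GP"]
rates = class_error_rates(truth, preds)
overall = sum(t != p for t, p in zip(truth, preds)) / len(truth)
```

Here the overall error is 0.5, but it splits into 0.25 for GP and 1.0 for PP, which is the kind of imbalance a per-class view is meant to expose.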

**Figure 4**
**Stability of the feature selection methods**. The number of selected features shared between pairs of feature lists over the 100 rounds of 5-fold cross-validation (thus over a total of 500 selected feature lists) for each of the single-gene, gene-set and network methods based on the iRefWeb PPI network for A) the melanoma data set and B) the ovarian cancer data set, plotted against the number of features selected.
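One way to compute this kind of stability measure is to count the features shared by every pair of selected feature lists. The sketch below averages those counts; the averaging step, the function name and the feature names are assumptions, since the caption does not say how the pairwise counts were summarised.

```python
from itertools import combinations

def mean_pairwise_overlap(feature_lists):
    """Average number of features shared by each pair of selected feature lists."""
    sets = [set(features) for features in feature_lists]
    overlaps = [len(a & b) for a, b in combinations(sets, 2)]
    return sum(overlaps) / len(overlaps)

# Three hypothetical feature lists, each of size 3 (one per CV fold).
selected = [["a", "b", "c"], ["a", "b", "d"], ["a", "e", "f"]]
stability = mean_pairwise_overlap(selected)
```

With 500 lists there are 124,750 pairs, so the full computation remains cheap. A value close to the list size indicates that nearly the same features are chosen in every fold.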

**Figure 5**
**Classification accuracy at the patient level**. A black cell corresponds to the patient being classified correctly in all 100 CV rounds, whereas a white cell corresponds to the patient being classified correctly in none of the 100 CV rounds, for the RF classifier, the SVM classifier and the DLDA classifier using A) the melanoma data set and B) the ovarian cancer data set, together with the iRefWeb PPI network. The rows are split into the single-gene method (the first row), the gene-set method (the second row) and the network-based methods (the last three rows). The tumor IDs are given on the x-axis, and the average error rate (taken over the 100 rounds of CV) is provided on the right-hand-side y-axis.
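The per-patient shading described above amounts to the fraction of CV rounds in which each patient was classified correctly. The following sketch computes that fraction and lists patients that one method usually classifies correctly and another usually does not. The tumor IDs, method labels, predictions and the 0.5 threshold are all hypothetical.

```python
def patient_level_accuracy(true_labels, predictions_by_round):
    """Fraction of CV rounds in which each patient was classified correctly."""
    n_rounds = len(predictions_by_round)
    return {
        pid: sum(preds[pid] == truth for preds in predictions_by_round) / n_rounds
        for pid, truth in true_labels.items()
    }

def complementary_patients(acc_a, acc_b, threshold=0.5):
    """Patients that one method usually gets right and the other usually gets wrong."""
    return sorted(p for p in acc_a
                  if (acc_a[p] > threshold) != (acc_b[p] > threshold))

# Hypothetical tumor IDs and predictions from two CV rounds for two methods.
truth = {"T1": "GP", "T2": "PP", "T3": "PP"}
rounds_a = [{"T1": "GP", "T2": "PP", "T3": "GP"},
            {"T1": "GP", "T2": "PP", "T3": "GP"}]
rounds_b = [{"T1": "PP", "T2": "PP", "T3": "PP"},
            {"T1": "PP", "T2": "PP", "T3": "PP"}]

acc_a = patient_level_accuracy(truth, rounds_a)
acc_b = patient_level_accuracy(truth, rounds_b)
differ = complementary_patients(acc_a, acc_b)
```

Both toy methods have the same overall error rate (one patient in three misclassified), yet they fail on different patients, which is the pattern that motivates the 'combination' classifiers mentioned in the Conclusions.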

References

1. Weigelt B, Baehner FL, Reis JS. The contribution of gene expression profiling to breast cancer classification, prognostication and prediction: a retrospective of the last decade. Journal of Pathology. 2010;220:263–280.
2. Harbeck N, Sotlar K, Wuerstlein R, Doisneau-Sixou S. Molecular and protein markers for clinical decision making in breast cancer: Today and tomorrow. Cancer Treatment Reviews. 2014;40:434–444. doi: 10.1016/j.ctrv.2013.09.014.
3. Timar J, Gyorffy B, Raso E. Gene signature of the metastatic potential of cutaneous melanoma: too much for too little? Clinical and Experimental Metastasis. 2010;27:371–387. doi: 10.1007/s10585-010-9307-2.
4. Sanz-Pamplona R, Berenguer A, Cordero D, Riccadonna S, Solé X, Crous-Bou M, Guinó E, Sanjuan X, Biondo S, Soriano A, et al. Clinical Value of Prognosis Gene Expression Signatures in Colorectal Cancer: A Systematic Review. PLoS ONE. 2012;7:e48877. doi: 10.1371/journal.pone.0048877.
5. Subramanian J, Simon R. Gene Expression-Based Prognostic Signatures in Lung Cancer: Ready for Clinical Use? J Natl Cancer Inst. 2010;102:464–474. doi: 10.1093/jnci/djq025.
