Identifying Individual-Cancer-Related Genes by Rebalancing the Training Samples

Bolin Chen, Xuequn Shang, Min Li, Jianxin Wang, Fang-Xiang Wu

PMID: 27093705
DOI: 10.1109/TNB.2016.2553119

Bolin Chen et al. IEEE Trans Nanobioscience. 2016 Jun;15(4):309-315. doi: 10.1109/TNB.2016.2553119. Epub 2016 Apr 12.

Abstract

The identification of individual-cancer-related genes is typically an imbalanced classification problem. The number of known cancer-related genes is far smaller than the number of genes whose status is unknown, which makes it very hard to obtain novel predictions from such imbalanced training samples. Conventional machine learning methods either detect only genes related to all cancers or rely on added clinical knowledge to circumvent this issue. In this study, we introduce a training sample rebalancing strategy that overcomes the issue by combining a two-step logistic regression with a random resampling method. The two-step logistic regression selects a set of genes that are related to all cancers, while the random resampling method further classifies those genes by their association with individual cancers. The imbalanced classification issue is circumvented by first randomly adding positive instances related to other cancers, and then excluding the unrelated predictions according to the overall performance in the following step. Numerical experiments show that the proposed resampling method is able to identify genes related to a given cancer even when the number of known genes related to that cancer is small. Under leave-one-out cross validation, the final predictions for all individual cancers achieve AUC values of around 0.93, which is very promising compared with existing methods.
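The rebalancing idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract gives no feature definitions, sample sizes, or exclusion rule, so the function names (`rebalanced_sample`, `ensemble_scores`), the synthetic two-dimensional features, the equal-size draw of unlabeled genes as negatives, and the averaging over rounds are all assumptions made for illustration. The sketch only shows how a small positive set for one cancer might be enlarged with genes known for other cancers before a plain logistic regression is fitted; the later step that excludes unrelated predictions is not reproduced.

```python
import math
import random


def predict_proba(w, b, x):
    """Logistic probability for one feature vector."""
    z = b + sum(wj * xj for wj, xj in zip(w, x))
    z = max(-30.0, min(30.0, z))  # clamp so exp() stays well-behaved
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, lr=0.5, epochs=200):
    """Fit plain logistic regression by batch gradient descent."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        errs = [predict_proba(w, b, xi) - yi for xi, yi in zip(X, y)]
        w = [wj - lr * sum(e * xi[j] for e, xi in zip(errs, X)) / n
             for j, wj in enumerate(w)]
        b -= lr * sum(errs) / n
    return w, b


def rebalanced_sample(target_pos, other_pos, unlabeled, rng):
    """Build one balanced training set for a single cancer (assumed scheme).

    Positives: known genes of the target cancer plus a random draw of genes
    known for other cancers. Negatives: an equally sized random draw of
    unlabeled genes (this negative subsampling is an assumption).
    """
    n_extra = min(len(other_pos), len(target_pos))
    positives = list(target_pos) + rng.sample(other_pos, n_extra)
    negatives = rng.sample(unlabeled, min(len(unlabeled), len(positives)))
    return positives, negatives


def ensemble_scores(features, target_pos, other_pos, unlabeled,
                    candidates, rounds=10, seed=0):
    """Average candidate scores over several rebalanced resampling rounds."""
    rng = random.Random(seed)
    totals = {g: 0.0 for g in candidates}
    for _ in range(rounds):
        pos, neg = rebalanced_sample(target_pos, other_pos, unlabeled, rng)
        X = [features[g] for g in pos + neg]
        y = [1] * len(pos) + [0] * len(neg)
        w, b = train_logistic(X, y)
        for g in candidates:
            totals[g] += predict_proba(w, b, features[g])
    return {g: s / rounds for g, s in totals.items()}


# Synthetic toy data (hypothetical): 5 target-cancer genes, 40 genes known
# for other cancers, and 500 unlabeled genes, each with two made-up features.
data_rng = random.Random(42)


def cloud(prefix, n, cx, cy):
    return {f"{prefix}{i}": [data_rng.gauss(cx, 0.5), data_rng.gauss(cy, 0.5)]
            for i in range(n)}


features = {}
features.update(cloud("target", 5, 2.0, 2.0))
features.update(cloud("other", 40, 2.0, -2.0))
features.update(cloud("unl", 500, -2.0, -2.0))
features["cand_a"] = [2.0, 2.0]    # resembles the target-cancer genes
features["cand_b"] = [-2.0, -2.0]  # resembles the unlabeled background

target_pos = [g for g in features if g.startswith("target")]
other_pos = [g for g in features if g.startswith("other")]
unlabeled = [g for g in features if g.startswith("unl")]

pos, neg = rebalanced_sample(target_pos, other_pos, unlabeled, random.Random(1))
scores = ensemble_scores(features, target_pos, other_pos, unlabeled,
                         ["cand_a", "cand_b"])
print(len(pos), len(neg), scores)
```

In this toy setting each round trains on 10 positives and 10 negatives instead of 5 positives against 500 unlabeled genes, which is the kind of imbalance the abstract describes. Averaging over rounds is one simple way to reduce the dependence on any single random draw.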

Cited by

Identifying Cancer genes by combining two-rounds RWR based on multiple biological data.
Zhang W, Lei X, Bian C. BMC Bioinformatics. 2019 Nov 25;20(Suppl 18):518. doi: 10.1186/s12859-019-3123-8. PMID: 31760937. Free PMC article.
Ensemble disease gene prediction by clinical sample-based networks.
Luo P, Tian LP, Chen B, Xiao Q, Wu FX. BMC Bioinformatics. 2020 Mar 11;21(Suppl 2):79. doi: 10.1186/s12859-020-3346-8. PMID: 32164526. Free PMC article.
Discovering mutated driver genes through a robust and sparse co-regularized matrix factorization framework with prior information from mRNA expression patterns and interaction network.
Xi J, Wang M, Li A. BMC Bioinformatics. 2018 Jun 5;19(1):214. doi: 10.1186/s12859-018-2218-y. PMID: 29871594. Free PMC article.
Identifying Disease-Gene Associations With Graph-Regularized Manifold Learning.
Luo P, Xiao Q, Wei PJ, Liao B, Wu FX. Front Genet. 2019 Apr 2;10:270. doi: 10.3389/fgene.2019.00270. eCollection 2019. PMID: 31001321. Free PMC article.
Predicting disease-related genes using integrated biomedical networks.
Peng J, Bai K, Shang X, Wang G, Xue H, Jin S, Cheng L, Wang Y, Chen J. BMC Genomics. 2017 Jan 25;18(Suppl 1):1043. doi: 10.1186/s12864-016-3263-4. PMID: 28198675. Free PMC article.
