Computational advances of tumor marker selection and sample classification in cancer proteomics
- PMID: 32802273
- PMCID: PMC7403885
- DOI: 10.1016/j.csbj.2020.07.009
Abstract
Cancer proteomics has become a powerful technique for characterizing the protein markers that drive malignant transformation, tracing proteome variation triggered by therapeutics, and discovering novel targets and drugs for the treatment of oncologic diseases. To facilitate cancer diagnosis and prognosis and to accelerate drug target discovery, a variety of methods for tumor marker identification and sample classification have been developed and successfully applied in cancer proteomic studies. This review describes the most recent advances in these approaches, together with their current applications in cancer-related studies. First, a number of popular feature selection methods are reviewed, with an objective evaluation of their advantages and disadvantages. Second, these methods are grouped into three major classes based on their underlying algorithms. Finally, a variety of sample separation algorithms are discussed. This review provides a comprehensive overview of advances in tumor marker identification and in the separation of patients, samples, and tissues, which may serve as guidance for researchers in cancer proteomics.
Keywords: ANN, Artificial Neural Network; ANOVA, Analysis of Variance; CFS, Correlation-based Feature Selection; Cancer proteomics; Computational methods; DAPC, Discriminant Analysis of Principal Components; DT, Decision Trees; EDA, Estimation of Distribution Algorithm; FC, Fold Change; GA, Genetic Algorithms; GR, Gain Ratio; HC, Hill Climbing; HCA, Hierarchical Cluster Analysis; IG, Information Gain; LDA, Linear Discriminant Analysis; LIMMA, Linear Models for Microarray Data; MBF, Markov Blanket Filter; MWW, Mann–Whitney–Wilcoxon test; OPLS-DA, Orthogonal Partial Least Squares Discriminant Analysis; PCA, Principal Component Analysis; PLS-DA, Partial Least Squares Discriminant Analysis; RF, Random Forest; RF-RFE, Random Forest with Recursive Feature Elimination; SA, Simulated Annealing; SAM, Significance Analysis of Microarrays; SBE, Sequential Backward Elimination; SFS, Sequential Forward Selection; SOM, Self-organizing Map; SU, Symmetrical Uncertainty; SVM, Support Vector Machine; SVM-RFE, Support Vector Machine with Recursive Feature Elimination; Sample classification; Tumor marker selection; sPLSDA, Sparse Partial Least Squares Discriminant Analysis; t-SNE, t-Distributed Stochastic Neighbor Embedding; χ2, Chi-square.
© 2020 The Authors.