Diagnostics (Basel). 2019 Dec 11;9(4):219.

doi: 10.3390/diagnostics9040219.

A Hierarchical Machine Learning Model to Discover Gleason Grade-Specific Biomarkers in Prostate Cancer

Osama Hamzeh¹, Abedalrhman Alkhateeb¹, Julia Zhuoran Zheng¹, Srinath Kandalam², Crystal Leung³, Govindaraja Atikukke⁴, Dora Cavallo-Medved², Nallasivam Palanisamy⁵, Luis Rueda¹

Affiliations

¹ School of Computer Science, University of Windsor, 401 Sunset Ave, Windsor, ON N9B 3P4, Canada.
² Department of Biomedical Sciences, University of Windsor, 401 Sunset Ave, Windsor, ON N9B 3P4, Canada.
³ Schulich School of Medicine and Dentistry, Western University, 1151 Richmond St, London, ON N6A 5C1, Canada.
⁴ ITOS Oncology Inc., 1453 Prince Rd, Ste: 4125, Windsor, ON N9C 3Z4, Canada.
⁵ Department of Urology, Henry Ford Health System, One Ford Place, Detroit, MI 48202, USA.

PMID: 31835700
PMCID: PMC6963340
DOI: 10.3390/diagnostics9040219

Osama Hamzeh et al. Diagnostics (Basel). 2019.


Abstract

(1) Background: Prostate cancer is one of the most common cancers affecting men in North America and worldwide. The Gleason score is a pathological grading system used to assess the potential aggressiveness of the disease in prostate tissue. Advancements in computing and next-generation sequencing technology now allow us to study the genomic profiles of patients in association with their different Gleason scores more accurately and effectively.
(2) Methods: In this study, we used a novel machine learning method to analyse gene expression of prostate tumours with different Gleason scores and to identify potential genetic biomarkers for each Gleason group. We obtained a publicly available RNA-Seq dataset of a cohort of 104 prostate cancer patients from the National Center for Biotechnology Information's (NCBI) Gene Expression Omnibus (GEO) repository, and categorised patients based on their Gleason scores to create a hierarchy of disease progression. A hierarchical model with standard classifiers in different Gleason groups, also known as nodes, was developed to identify and predict nodes based on their mRNA or gene expression. In each node, patient samples were analysed using class-imbalance handling and hybrid feature selection techniques to build the prediction model. The outcome of the analysis at each node was a set of genes that could differentiate that Gleason group from the remaining groups. To validate the proposed method, the set of identified genes was used to classify a second dataset of 499 prostate cancer patients collected from cBioPortal.
(3) Results: The method achieved an overall accuracy of 93.3% on the first dataset and was further validated with 87% accuracy on the second dataset. It also identified genes that were not previously reported as potential biomarkers for specific Gleason groups. In particular, PIAS3 was identified as a potential biomarker for Gleason score 4 + 3 = 7, and UBE2V2 for Gleason score 6.
(4) Insight: Previous reports show that the genes predicted by this newly proposed method strongly correlate with prostate cancer development and progression. Furthermore, pathway analysis shows that PIAS3 and UBE2V2 share similar protein interaction pathways, in particular the JAK/STAT signaling process.
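
The hierarchy of "one Gleason group versus the rest" nodes described in the Methods can be pictured with a small sketch. This is not the authors' implementation: the nearest-centroid rule stands in for the unspecified "standard classifiers", the group labels `G1` to `G3`, the node order, and the two-feature toy data are all hypothetical, and class-imbalance handling and feature selection are omitted.

```python
from statistics import fmean


def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [fmean(col) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit_cascade(samples, labels, order):
    """Fit one 'group vs. rest' node per group in `order` (except the last).

    Each node is trained only on samples not already claimed by an
    earlier node, which is what makes the model hierarchical.
    """
    nodes = []
    remaining = list(zip(samples, labels))
    for group in order[:-1]:
        pos = [x for x, y in remaining if y == group]
        rest = [x for x, y in remaining if y != group]
        nodes.append((group, centroid(pos), centroid(rest)))
        remaining = [(x, y) for x, y in remaining if y != group]
    return nodes, order[-1]  # the last group is the fallback leaf


def predict(cascade, x):
    """Walk the nodes in order; stop at the first node that claims x."""
    nodes, fallback = cascade
    for group, c_pos, c_rest in nodes:
        if sq_dist(x, c_pos) <= sq_dist(x, c_rest):
            return group
    return fallback


# Hypothetical toy data: three well-separated groups, two "genes" each.
samples = [(0, 0), (1, 0), (0, 1),
           (5, 5), (6, 5), (5, 6),
           (10, 0), (11, 0), (10, 1)]
labels = ["G1"] * 3 + ["G2"] * 3 + ["G3"] * 3
cascade = fit_cascade(samples, labels, order=["G1", "G2", "G3"])
print(predict(cascade, (5.5, 5.5)))
```

In a sketch like this, the genes that drive each node's decision would play the role of the node-specific biomarker set mentioned in the abstract.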

Keywords: Gleason score detection; classification; next generation sequencing; prostate cancer; supervised learning; transcriptomics.


Conflict of interest statement

The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Figures

**Figure 1**
Gleason groups and their distributions.

**Figure 2**
Hierarchical tree of classifications of Gleason groups against the rest, along with the corresponding classification accuracies.

**Figure 3**
Accuracy obtained by each classifier for classifying one versus the rest for all five Gleason groups.

**Figure 4**
Classification accuracies obtained after applying the model on the second dataset.

**Figure 5**
An interactive figure taken from the STRING proteomics database. It shows neighbouring protein binding and pathway interactions for a given gene, based on STRING and KEGG pathway analysis. Here, the gene of interest is *PIAS3*, identified as a possible biomarker for Gleason score 4 + 3 = 7; the figure shows its interactions with associated proteins and pathways.

**Figure 6**
Pre-processing steps of the proposed method.

**Figure 7**
Hypothetical example that shows how the synthetic minority oversampling technique (SMOTE) works.
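
The core SMOTE step that such an example illustrates can be sketched as follows: pick a minority-class sample, pick one of its nearest minority-class neighbours, and place a synthetic sample at a random point on the line segment between them. This is a minimal illustration of the general technique, not the paper's code; the function name `smote`, its parameters, and the toy points are assumptions.

```python
import random


def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic samples from minority-class points.

    Assumes at least two minority points; each point is a tuple of floats.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbours of `base`, excluding `base` itself.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


# Hypothetical minority-class points in a two-feature space.
minority = [(1.0, 1.0), (2.0, 1.5), (1.5, 2.5), (3.0, 3.0)]
for point in smote(minority, n_new=3):
    print(point)
```

Because every synthetic point is an interpolation between two real minority samples, the new points stay inside the region the minority class already occupies.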

**Figure 8**
Machine learning pipeline used in the proposed method.



Grants and funding

RGPIN-2019-04696/Natural Sciences and Engineering Research Council of Canada

