Reducing variability of breast cancer subtype predictors by grounding deep learning models in prior knowledge

Paul Anderson¹, Richa Gadgil¹, William A Johnson², Ella Schwab², Jean M Davidson³

Affiliations

¹ Department of Computer Science and Software Engineering, California Polytechnic State University, San Luis Obispo, CA, USA.
² Department of Biology, California Polytechnic State University, San Luis Obispo, CA, USA.
³ Department of Biology, California Polytechnic State University, San Luis Obispo, CA, USA. Electronic address: jdavid06@calpoly.edu.

PMID: 34536702
DOI: 10.1016/j.compbiomed.2021.104850

Free article

Paul Anderson et al. Comput Biol Med. 2021 Nov:138:104850. doi: 10.1016/j.compbiomed.2021.104850. Epub 2021 Sep 10.

Abstract

Deep learning neural networks have improved performance in many cancer informatics problems, including breast cancer subtype classification. However, many networks experience underspecification where multiple combinations of parameters achieve similar performance, both in training and validation. Additionally, certain parameter combinations may perform poorly when the test distribution differs from the training distribution. Embedding prior knowledge from the literature may address this issue by boosting predictive models that provide crucial, in-depth information about a given disease. Breast cancer research provides a wealth of such knowledge, particularly in the form of subtype biomarkers and genetic signatures. In this study, we draw on past research on breast cancer subtype biomarkers, label propagation, and neural graph machines to present a novel methodology for embedding knowledge into machine learning systems. We embed prior knowledge into the loss function in the form of inter-subject distances derived from a well-known published breast cancer signature. Our results show that this methodology reduces predictor variability on state-of-the-art deep learning architectures and increases predictor consistency leading to improved interpretation. We find that pathway enrichment analysis is more consistent after embedding knowledge. This novel method applies to a broad range of existing studies and predictive models. Our method moves the traditional synthesis of predictive models from an arbitrary assignment of weights to genes toward a more biologically meaningful approach of incorporating knowledge.

Keywords: Applied computing; Bioinformatics; Genomics; Transcriptomics.
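
The abstract describes embedding prior knowledge into the loss function as inter-subject distances derived from a published signature, in the spirit of neural graph machines. The sketch below illustrates one way such a loss could look and is not the authors' implementation. Everything named here is an assumption made for illustration: the Gaussian kernel that turns signature-space distances into similarity weights, the bandwidth `sigma`, the trade-off weight `alpha`, the normalization of the penalty, and the helper names `signature_weights`, `graph_penalty`, and `knowledge_grounded_loss`.

```python
import numpy as np

def signature_weights(X_sig, sigma=1.0):
    """Similarity weights between subjects from signature-gene expression.

    X_sig: (n_subjects, n_signature_genes) array.
    Assumption: a Gaussian kernel on squared Euclidean distance; the
    abstract does not specify how distances become edge weights.
    """
    diff = X_sig[:, None, :] - X_sig[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-edges
    return W

def cross_entropy(probs, labels):
    """Mean negative log-likelihood of the true subtype labels."""
    n = labels.shape[0]
    return -np.mean(np.log(probs[np.arange(n), labels] + 1e-12))

def graph_penalty(H, W):
    """Weighted mean squared distance between hidden representations.

    Subjects that are close in signature space (large W) are pushed to
    have similar hidden representations H. The division by sum(W) is an
    illustrative normalization choice.
    """
    diff = H[:, None, :] - H[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)
    return np.sum(W * d2) / max(np.sum(W), 1e-12)

def knowledge_grounded_loss(probs, labels, H, W, alpha=0.1):
    """Supervised loss plus the prior-knowledge penalty (alpha is hypothetical)."""
    return cross_entropy(probs, labels) + alpha * graph_penalty(H, W)

# Toy usage with random data standing in for real expression profiles.
rng = np.random.default_rng(0)
X_sig = rng.normal(size=(6, 5))    # signature-gene expression
H = rng.normal(size=(6, 3))        # hidden-layer activations
logits = rng.normal(size=(6, 4))   # four hypothetical subtype classes
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
labels = rng.integers(0, 4, size=6)

W = signature_weights(X_sig)
loss = knowledge_grounded_loss(probs, labels, H, W, alpha=0.1)
```

In a real training loop the same penalty would be written in an autodiff framework so that gradients flow through `H`. The NumPy version only shows how the two terms combine: with `alpha = 0` the loss reduces to plain cross-entropy, and larger values tie the learned representation more tightly to the signature-derived neighborhood structure.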
