Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
Review
. 2021 Sep 27;13(1):152.
doi: 10.1186/s13073-021-00968-x.

Deep learning in cancer diagnosis, prognosis and treatment selection

Affiliations
Review

Deep learning in cancer diagnosis, prognosis and treatment selection

Khoa A Tran et al. Genome Med. .

Abstract

Deep learning is a subdiscipline of artificial intelligence that uses a machine learning technique called artificial neural networks to extract patterns and make predictions from large data sets. The increasing adoption of deep learning across healthcare domains together with the availability of highly characterised cancer datasets has accelerated research into the utility of deep learning in the analysis of the complex biology of cancer. While early results are promising, this is a rapidly evolving field with new knowledge emerging in both cancer biology and deep learning. In this review, we provide an overview of emerging deep learning techniques and how they are being applied to oncology. We focus on the deep learning applications for omics data types, including genomic, methylation and transcriptomic data, as well as histopathology-based genomic inference, and provide perspectives on how the different data types can be integrated to develop decision support tools. We provide specific examples of how deep learning may be applied in cancer diagnosis, prognosis and treatment management. We also assess the current limitations and challenges for the application of deep learning in precision oncology, including the lack of phenotypically rich data and the need for more explainable deep learning models. Finally, we conclude with a discussion of how current obstacles can be overcome to enable future clinical utilisation of deep learning.

Keywords: Artificial intelligence; Cancer genomics; Cancer of unknown primary; Deep learning; Explainability; Molecular subtypes; Multi-modal learning; Pharmacogenomics; Precision oncology; Prognosis; Tumour microenvironment.

PubMed Disclaimer

Conflict of interest statement

John V Pearson and Nicola Waddell are co-founders and Board members of genomiQa. The remaining authors declare that they have no competing interests.

Figures

Fig. 1
Fig. 1
Deep learning may impact clinical oncology during diagnosis, prognosis and treatment. Specific areas of clinical oncology where deep learning is showing promise include cancer of unknown primary, molecular subtyping of cancers, prognosis and survivability and precision oncology. Examples of deep learning applications within each of these areas are listed. The data modalities utilised by deep learning models are numerous and include genomic, transcriptomic and histopathology data categories covered in this review
Fig. 2
Fig. 2
An overview of Deep Learning techniques and concepts in oncology. a Graph convolutional neural networks (GCNN) are designed to operate on graph-structured data. In this particular example inspired by [–19], gene expression values (upper left panel) are represented as graph signals structured by a protein–protein interactions graph (lower left panel) that serve as inputs to GCNN. For a single sample (highlighted with red outline), each node represents one gene with its expression value assigned to the corresponding protein node, and inter-node connections represent known protein–protein interactions. GCNN methods covered in this review require a graph to be undirected. Graph convolution filters are applied on each gene to extract meaningful gene expression patterns from the gene’s neighbourhood (nodes connected by orange edges). Pooling, i.e. combining clusters of nodes, can be applied following graph convolution to obtain a coarser representation of the graph. Output of the final graph convolution/pooling layer would then be passed through fully connected layers producing GCNN’s decision. b Semantic segmentation is applied to image data where it assigns a class label to each pixel within an image. A semantic segmentation model usually consists of an encoder, a decoder and a softmax function. The encoder consists of feature extraction layers to ‘learn’ meaningful and granular features from the input, while the decoder learns features to generate a coloured map of major object classes in the input (through the use of the softmax function). The example shows a H&E tumour section with infiltrating lymphocyte map generated by Saltz et al. [20] DL model c multimodal learning allows multiple datasets representing the same underlying phenotype to be combined to increase predictive power. Multimodal learning usually starts with encoding each input modality into a representation vector of lower dimension, followed by a feature combination step to aggregate these vectors together. d Explainability methods take a trained neural network and mathematically quantify how each input feature influences the model’s prediction. The outputs are usually feature contribution scores, capable of explaining the most salient features that dictate the model’s predictions. In this example, each input gene is assigned a contribution score by the explainability model (colour scale indicates the influence on the model prediction). An example of gene interaction network is shown coloured by contribution scores (links between red dots represent biological connections between genes)

References

    1. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521:436–444. doi: 10.1038/nature14539. - DOI - PubMed
    1. Libbrecht MW, Noble WS. Machine learning applications in genetics and genomics. Nat Rev Genet. 2015;16:321–332. doi: 10.1038/nrg3920. - DOI - PMC - PubMed
    1. Jones W, Alasoo K, Fishman D, Parts L. Computational biology: deep learning. Skolnick J, editor. Emerg Top Life Sci. 2017;1:257–274. doi: 10.1042/ETLS20160025. - DOI - PMC - PubMed
    1. Wainberg M, Merico D, Delong A, Frey BJ. Deep learning in biomedicine. Nat Biotechnol. 2018;36:829–838. doi: 10.1038/nbt.4233. - DOI - PubMed
    1. Zou J, Huss M, Abid A, Mohammadi P, Torkamani A, Telenti A. A primer on deep learning in genomics. Nat Genet. 2019;51:12–18. doi: 10.1038/s41588-018-0295-5. - DOI - PMC - PubMed

Publication types