Review
Brief Bioinform. 2020 Mar 23;21(2):368-394.
doi: 10.1093/bib/bby120.

Graph- and rule-based learning algorithms: a comprehensive review of their applications for cancer type classification and prognosis using genomic data

Saurav Mallik et al. Brief Bioinform.

Abstract

Cancer is well recognized as a complex disease with dysregulated molecular networks or modules. Graph- and rule-based analytics have been applied extensively to cancer classification and prognosis using large genomic and other data over the past decade. This article provides a comprehensive review of graph- and rule-based machine learning algorithms that have been applied to numerous genomic data sets to determine cancer-specific gene modules, identify gene signature-based classifiers and carry out other related objectives of potential therapeutic value. The review focuses mainly on the methodological design and features of these algorithms, to facilitate their application to cancer classification and prognosis. Based on the type of data integration, we divided the algorithms into three categories: model-based integration, pre-processing integration and post-processing integration. Each category is further divided into four sub-categories (supervised, unsupervised, semi-supervised and survival-driven learning analyses) based on learning style. A total of 11 categories of methods are summarized with their inputs, objectives, descriptions, advantages and potential limitations. Next, we briefly present well-known and recently developed algorithms for each sub-category along with salient information, such as data profiles, statistical or feature selection methods and outputs. Finally, we summarize the appropriate use and efficiency of all categories of graph- and rule mining-based learning methods for a given input data set and specific objective. This review aims to help readers select and use appropriate algorithms for cancer classification and prognosis studies.

Keywords: association rule mining; cancer classification; cancer prognosis; data set integration; gene signature; graph mining; learning technique.

Figures

Figure 1
Flowcharts of the three categories of integration: (a) model-based integration, (b) pre-processing integration and (c) post-processing integration.
Figure 2
Workflows of the four categories of learning: (a) supervised learning, (b) unsupervised learning, (c) semi-supervised learning and (d) survival-based learning.
