BMC Syst Biol. 2011 Feb 26:5:35.
doi: 10.1186/1752-0509-5-35.

A Boolean-based systems biology approach to predict novel genes associated with cancer: Application to colorectal cancer

Shivashankar H Nagaraj et al. BMC Syst Biol. 2011.

Abstract

Background: Cancer has remarkable complexity at the molecular level, with multiple genes, proteins, pathways and regulatory interconnections being affected. We introduce a systems biology approach to study cancer that formally integrates the available genetic, transcriptomic, epigenetic and molecular knowledge on cancer biology and, as a proof of concept, we apply it to colorectal cancer.

Results: We first classified all the genes in the human genome into cancer-associated and non-cancer-associated genes based on extensive literature mining. We then selected a set of functional attributes proven to be highly relevant to cancer biology that includes protein kinases, secreted proteins, transcription factors, post-translational modifications of proteins, DNA methylation and tissue specificity. The cancer-associated genes were used to extract 'common cancer fingerprints' through these molecular attributes, and Boolean logic was implemented in such a way that both the expression data and functional attributes could be rationally integrated, allowing for the generation of a guilt-by-association algorithm to identify novel cancer-associated genes. Finally, these candidate genes were interlaced with the known cancer-related genes in a network analysis aimed at identifying highly conserved gene interactions that impact cancer outcome. We demonstrate the effectiveness of this approach using colorectal cancer as a test case and identify several novel candidate genes that are classified according to their functional attributes. These genes include the following: 1) secreted proteins as potential biomarkers for the early detection of colorectal cancer (FXYD1, GUCA2B, REG3A); 2) kinases as potential drug candidates to prevent tumor growth (CDC42BPB, EPHB3, TRPM6); and 3) potential oncogenic transcription factors (CDK8, MEF2C, ZIC2).

Conclusion: We argue that this is a holistic approach that faithfully mimics cancer characteristics, efficiently predicts novel cancer-associated genes and has universal applicability to the study and advancement of cancer research.
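
The guilt-by-association idea described in the Results can be illustrated with a short sketch. The abstract does not give the actual scoring rule, so the following is only one plausible reading, under stated assumptions: each gene carries a tuple of Boolean attributes, each distinct tuple (pattern) is scored by the share of known cancer-associated genes among the genes that carry it, and non-cancer-associated genes inherit the score of their pattern. The function names, the three attributes and the toy gene names are all hypothetical.

```python
from collections import Counter

def pattern_scores(attributes, cancer_genes):
    """Score each Boolean pattern by the share of known cancer genes carrying it.

    This scoring rule is an assumption for illustration, not the paper's rule.
    """
    totals = Counter(attributes.values())
    hits = Counter(p for g, p in attributes.items() if g in cancer_genes)
    # Counter returns 0 for patterns that no known cancer gene carries.
    return {p: hits[p] / n for p, n in totals.items()}

def rank_candidates(attributes, cancer_genes):
    """Rank genes not yet labelled cancer-associated by their pattern score."""
    scores = pattern_scores(attributes, cancer_genes)
    candidates = [
        (gene, scores[pattern])
        for gene, pattern in attributes.items()
        if gene not in cancer_genes
    ]
    # Highest score first; ties broken alphabetically for reproducibility.
    return sorted(candidates, key=lambda gs: (-gs[1], gs[0]))

# Toy data (hypothetical): (is_kinase, is_secreted, is_transcription_factor)
toy_attributes = {
    "KNOWN_A": (True, False, False),
    "KNOWN_B": (True, False, False),
    "KNOWN_C": (False, True, False),
    "GENE_W": (False, True, False),
    "GENE_X": (True, False, False),
    "GENE_Y": (False, True, False),
    "GENE_Z": (False, False, True),
}
known_cancer = {"KNOWN_A", "KNOWN_B", "KNOWN_C"}

ranking = rank_candidates(toy_attributes, known_cancer)
for gene, score in ranking:
    print(f"{gene}\t{score:.2f}")
```

In this toy example GENE_X ranks first because it shares its pattern with two of the three known cancer genes. In the paper the Boolean variables also encode expression data, which this sketch leaves out.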

Figures

Figure 1
The schema for the identification of novel genes associated with complex diseases. The expression profiles from the cancer data are analyzed to predict differentially expressed and condition-specific genes. The functional attributes over-represented in cancer are selected, and representative datasets are mined from public resources. The common cancer fingerprints from cancer-associated genes are processed through Boolean logic to develop a guilt-by-association classifier which, applied to non-cancer-associated genes, predicts novel candidate cancer-associated genes. Finally, novel candidate genes are further analyzed using network theory approaches.
Figure 2
The classification of differentially expressed (DE) genes resulting from the expression data analysis. The top 15 DE genes in each of the three categories are tabulated with their expression values in normal, adenoma, carcinoma and inflammation.
Figure 3
Trends showing the distribution of genes across 13 binarized Boolean variables. Four classes of genes were used for the comparison: (i) all the genes in the human genome (21,892); (ii) cancer-associated genes (749); (iii) guilt-by-association (GBA) ranked candidate genes (1,017); and (iv) top candidate genes (134, 13.2% of the GBA ranked candidate genes). The PTM and SEC classes are enriched in cancer-associated genes as well as in the candidate gene categories.
Figure 4
Two-step computational validation approach to ascertain the inferential validity of the proposed GBA. (A) The ratio of the average Boolean score given to cancer genes over the average score given to the other genes. Candidate genes comprising the top 13.2% of genes guarantee a 2.71-fold over-representation of cancer genes. (B) Standard cross-validation in which the proportion of cancer-associated genes is compared to genes with extreme Boolean scores. Selecting the 50% most extreme genes captures 90% of all cancer genes.
Figure 5
The Always Conserved network visualized using the Cytoscape software at four levels of resolution: (A) connections involving at least one top candidate gene; (B) derived from A, where only genes with more than two connections are displayed; (C) derived from B, where only connections that were deemed to be significant across the four original networks (Adenoma, Carcinoma, Inflammation and Normal) are displayed; and (D) only those connections involving at least one top candidate gene in the four networks. The specific nature of edges, nodes and other features such as shape and color, along with the Cytoscape file, is provided on our website http://www.livestockgenomics.csiro.au/courses/crc.html
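
The two validation quantities mentioned in the Figure 4 caption (fold over-representation of cancer genes among top-scoring genes, and the share of cancer genes captured by a top fraction) can be sketched as follows. The captions do not define how these were computed, so the definitions here are assumptions: fold enrichment is taken as the cancer-gene rate in the top fraction divided by the rate over all scored genes, and capture is the share of all cancer genes that fall in the top fraction. Function names and toy data are hypothetical.

```python
def top_fraction(scored, fraction):
    """Return the highest-scoring `fraction` of (gene, score) pairs (at least one)."""
    k = max(1, round(len(scored) * fraction))
    return sorted(scored, key=lambda gs: -gs[1])[:k]

def fold_enrichment(scored, cancer_genes, fraction):
    """Cancer-gene rate in the top fraction divided by the overall rate.

    Assumes at least one cancer gene is present in `scored`.
    """
    top = top_fraction(scored, fraction)
    rate_top = sum(g in cancer_genes for g, _ in top) / len(top)
    rate_all = sum(g in cancer_genes for g, _ in scored) / len(scored)
    return rate_top / rate_all

def captured_share(scored, cancer_genes, fraction):
    """Share of all cancer genes that fall within the top fraction."""
    top = top_fraction(scored, fraction)
    found = sum(g in cancer_genes for g, _ in top)
    total = sum(g in cancer_genes for g, _ in scored)
    return found / total

# Toy data (hypothetical): ten genes with scores 10 down to 1.
toy_scored = [(f"g{i}", 11 - i) for i in range(1, 11)]
toy_cancer = {"g1", "g2", "g5"}

fold = fold_enrichment(toy_scored, toy_cancer, 0.2)
share = captured_share(toy_scored, toy_cancer, 0.5)
print(f"fold enrichment in top 20%: {fold:.2f}")
print(f"cancer genes captured by top 50%: {share:.0%}")
```

As a consistency check on the captions themselves, the 13.2% quoted for the top candidate genes matches 134 of the 1017 GBA ranked candidate genes (134 / 1017 ≈ 0.132).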
