Adv Bioinformatics. 2012;2012:373506.
doi: 10.1155/2012/373506. Epub 2012 Jun 28.

Detecting cancer outlier genes with potential rearrangement using gene expression data and biological networks

Mohammed Alshalalfa et al. Adv Bioinformatics. 2012.

Abstract

Gene alterations are a major component of the landscape of tumor genomes. To assess the significance of these alterations in the development of prostate cancer, it is necessary to identify them and analyze them from a systems biology perspective. Here, we present a new method (EigFusion) for predicting outlier genes with potential gene rearrangement. In tests on synthetic and real data, EigFusion demonstrated excellent performance in identifying outlier genes with potential rearrangement. EigFusion identified previously unrecognized genes such as FABP5 and KCNH8 and confirmed their association with primary and metastatic prostate samples, while confirming the metastatic specificity of other genes such as PAH, TOP2A, and SPINK1. We applied protein-network-based approaches to analyze the network context of potentially rearranged genes. Functional gene rearrangement modules were constructed by integrating functional protein networks. Rearranged genes were shown to be highly connected to well-known altered genes in cancer such as AR, RB1, MYC, and BRCA1. Finally, using clinical outcome data of prostate cancer patients, potentially rearranged genes demonstrated a significant association with prostate cancer-specific death.

Figures

Figure 1
Gene rearrangements: common gene rearrangements in cancer cells. (b) Deletion of some genes at the DNA level, which leads to depletion of the corresponding mRNA. (c) Fusion of two genes, which leads to fused mRNA and fused proteins. (d) A special rearrangement type that fuses a strong promoter of gene A to the 5' end of gene B, leading to underexpression of gene A and overexpression of gene B.
Figure 2
Evaluation of EigFusion performance on synthetic and real cancer data. AUC values are used to assess the performance of fusion gene detection methods. ROC curves were plotted as 1-specificity versus sensitivity. We plotted ROC curves for each method at several cancer sample sizes (x-axis) and used the area under the curve (AUC) as a measure of performance. (a) Using synthetic data, COPA and KS showed poor performance in all cases; ORT, GTI, and OS, on the other hand, showed poor performance that depended on the ratio of cancer samples to normal samples. (b) Applying all the methods to real prostate data (Singh data) showed that EigFusion outperforms the other methods.
Figure 3
Evaluation of EigFusion performance on synthetic and real cancer data. AUC values are used to assess the performance of fusion gene detection methods. We used the (a) positive FDR (PFDR) and (b) negative FDR (NFDR) to assess the FDR of each method under different proportions of cancer samples. EigFusion showed zero FDR and an F-measure of 1. (c) We further assessed the performance of the methods on real cancer data, using the Singh prostate cancer data with embedded test genes at different cancer proportions. Each method was assessed on its ability to identify the test10 and test20 test genes.
Figure 4
Genes altered in prostate samples. 54 genes were selected as overexpressed in a subset of samples (primary or metastatic). Some genes were overexpressed only in metastatic samples (genes with all red bars). Other genes were overexpressed in both primary and metastatic samples but not in normal samples (genes with red and blue bars). The frequency on the axis is the fraction of all samples that show rearrangement (overexpression in a subset of samples). The red bars, for example, represent the frequency of gene rearrangement in primary samples.
Figure 5
Integrating the discovered potentially rearranged genes with functional protein interactions revealed functional modularity of the rearranged genes, with enriched pathways.
Figure 6
Functional modules of altered genes in prostate. Analyzing the rearranged genes by integrating functional protein networks and copy number alteration data revealed modularity of the rearranged genes and a high association with master regulators of well-known dysregulated pathways in cancer, such as AR, P53, and KLK3. Nodes with a solid black border were identified by EigFusion.
Figure 7
Validating potentially rearranged prostate genes using prostate CNA revealed that half of the genes are amplified or deleted in a set of samples.
Figure 8
Validating prostate rearranged genes using ovarian CNA revealed that most of the prostate rearranged genes are altered in a larger portion of samples compared with prostate CNA.
Figure 9
Clinical association of genes with death and aggressiveness of cancer. To understand the effect of alterations in gene expression, Kaplan-Meier (KM) survival curves were plotted for two sets of genes. (a) KM curves for all genes in the top 25 altered genes. Samples with alterations demonstrated high-risk disease. (b) KM curves using only ERG, SPINK1, KCNH8, and FABP5. Alterations in these four genes showed higher risk compared with the whole set of genes. (c) Hamming distance is used as a measure to find genes that have a high association with death and aggressive cancer. Both death and aggressiveness were represented as vectors over samples. The distance shows how much a gene's rearrangement vector differs from the clinical vectors (death, aggressive). For example, ERG has a distance of 0.16 to the death vector, which means that 84% of the samples with ERG fusion have a death outcome.
Figure 10
Hierarchical clustering of prostate rearranged genes. Validating the prostate rearranged genes on a Swedish cohort revealed three prostate tumor subgroups with distinct rearrangement profiles and different cancer-specific death profiles.
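
The Hamming distance comparison described in the Figure 9(c) caption can be illustrated with a minimal sketch. It assumes that each gene's rearrangement status and each clinical outcome are binary vectors over the same samples, and that the distance is normalized by the number of samples; the caption does not state the exact normalization, so this is an assumption. The function name normalized_hamming and the toy vectors are hypothetical and are not data from the paper.

```python
from typing import Sequence


def normalized_hamming(a: Sequence[int], b: Sequence[int]) -> float:
    """Fraction of positions at which two equal-length vectors differ."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    if len(a) == 0:
        raise ValueError("vectors must not be empty")
    mismatches = sum(x != y for x, y in zip(a, b))
    return mismatches / len(a)


# Hypothetical toy data: one entry per sample (1 = altered / died, 0 = not).
gene_altered = [1, 1, 0, 0, 1, 0, 0, 1, 0, 0]
death = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

# The two vectors differ in exactly one of ten samples.
distance = normalized_hamming(gene_altered, death)
print(f"distance to death vector: {distance:.2f}")
```

Under this normalization, a smaller distance means the gene's alteration pattern tracks the clinical outcome more closely across samples.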
