BMC Syst Biol. 2010 May 17;4:67.
doi: 10.1186/1752-0509-4-67.

An integrative multi-dimensional genetic and epigenetic strategy to identify aberrant genes and pathways in cancer

Raj Chari et al. BMC Syst Biol.

Abstract

Background: Genomics has substantially changed our approach to cancer research. Gene expression profiling, for example, has been utilized to delineate subtypes of cancer, and facilitated derivation of predictive and prognostic signatures. The emergence of technologies for the high resolution and genome-wide description of genetic and epigenetic features has enabled the identification of a multitude of causal DNA events in tumors. This has afforded the potential for large scale integration of genome and transcriptome data generated from a variety of technology platforms to acquire a better understanding of cancer.

Results: Here we show how multi-dimensional analysis of genomics data enables the deciphering of mechanisms that disrupt regulatory/signaling cascades and their downstream effects. Since not all gene expression changes observed in a tumor are causal to cancer development, we demonstrate an approach based on multiple concerted disruption (MCD) analysis of genes that facilitates the rational deduction of aberrant genes and pathways, which would otherwise be overlooked in investigations of a single genomic dimension.

Conclusions: Notably, this is the first comprehensive study of breast cancer cells by parallel integrative genome-wide analyses of DNA copy number, LOH, and DNA methylation status to interpret changes in gene expression pattern. Our findings demonstrate the power of a multi-dimensional approach to elucidate events that would escape conventional single-dimensional analysis and, as such, to reduce the cohort sample size needed for cancer gene discovery.

Figures

Figure 1
Genomic profiles of breast cancer cell lines. (A) Whole genome frequency analysis of copy number gain (red), copy number loss (green), loss of heterozygosity/allelic imbalance (AI) (top blue) and copy number neutral LOH/AI (bottom blue). Vertical lines through all four graphs represent the genomic location of key breast cancer genes, using the hg18 build of the human genome map. (B) Illustration of copy number and LOH/AI status for ESR1, BRCA1, BRCA2, ERBB2 and TP53 in each of the samples. Each of these DNA events is evident in all of these genes.
Figure 2
Quantitative and qualitative benefits of integrative analyses. (A) Heatmap and bar plot illustration of the additive benefit of multi-dimensional DNA analysis for the explanation of consequential differential gene expression. Within a sample, when sequentially adding a DNA dimension of analysis, an increasing percentage of observed differential gene expression can be explained. For each dimension or combination of dimensions, in the bar plot, the median value is used (grey bars). Heatmaps display the percentage of differential expression explained by DNA mechanisms, with values near to 100 either dark red (overexpression) or green (underexpression) and values closer to 0 in white. (B) Two specific genes GNAS and CASP1 are given as examples to show multiple and complementary mechanisms of gene disruption, illustrating the importance of multi-dimensional analysis (MDA).
Figure 3
Determination and application of a disruption frequency threshold. (A) Results of the analyses of ten simulated datasets. Aggregating the results of the simulated analyses, the proportion of genes in random simulations at each observed frequency threshold is shown. From these analyses, approximately 2% of the simulations were ≥ 6/9. (B) Using a frequency cut-off of 6/9, the number of genes disrupted at that frequency using a single DNA dimension or a combination of DNA dimensions. With a single dimension alone, we can maximally identify 437 genes which are differentially expressed and exhibit a concerted change at the DNA level in a minimum of 6/9 samples. However, using all three dimensions, we find that 1162 genes are in fact differentially expressed and contain at least one concerted change in one of the DNA dimensions. This represents over a two-fold increase in the number of genes identified.
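The thresholding described in this caption can be illustrated with a minimal Python sketch. It assumes that a gene counts as disrupted in a sample when at least one of the DNA dimensions under consideration shows a change concerted with its expression change. The gene names, sample indices and the helper functions `disrupted_samples` and `genes_passing` are hypothetical and made up for illustration; they are not the authors' code or data.

```python
N_SAMPLES = 9
THRESHOLD = 6  # minimum number of samples (6/9), as in the caption

# Toy, made-up data: for each gene, the samples (0-8) in which a DNA-level
# change in that dimension agrees with the direction of the expression change.
concerted = {
    "geneA": {"copy_number": {0, 1, 2, 3}, "loh": {4, 5}, "methylation": {6, 7}},
    "geneB": {"copy_number": {0, 1, 2, 3, 4, 5}, "loh": set(), "methylation": {0}},
    "geneC": {"copy_number": {0}, "loh": {1}, "methylation": {2}},
}

def disrupted_samples(gene_events, dimensions):
    """Union of samples with a concerted change in any chosen dimension."""
    samples = set()
    for dim in dimensions:
        samples |= gene_events[dim]
    return samples

def genes_passing(events, dimensions, threshold=THRESHOLD):
    """Genes disrupted in at least `threshold` samples for these dimensions."""
    return sorted(
        gene for gene, ev in events.items()
        if len(disrupted_samples(ev, dimensions)) >= threshold
    )

all_dims = ("copy_number", "loh", "methylation")
single = {dim: genes_passing(concerted, (dim,)) for dim in all_dims}
combined = genes_passing(concerted, all_dims)

print(single)    # genes passing with each dimension on its own
print(combined)  # genes passing when the three dimensions are combined
```

In this toy input, geneA falls below 6/9 in every single dimension but passes once the dimensions are combined, which is the kind of gain panel (B) reports.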
Figure 4
Impact of multi-dimensional analysis on low frequency events. (A) Box plot analysis of the frequency distribution of single and multi-dimensional analyses (MDA) of the 1162 genes differentially expressed with a concerted change in one of the DNA dimensions. The area in red represents the number of genes (of the 1162) that would be missed if only a single DNA dimension were examined, while the area in blue represents the genes that would be detected. Examining the median values for the three right-most boxes, we see that even using the box with the highest median (copy number), we would not be able to detect about 50% of the 1162 genes. (B) Two specific examples highlighting the importance of multi-dimensional genomic analysis. Using single dimensional analyses (green shade) alone, disruption of CD70 (blue line graph) and ENG (red line graph) occurs at very low frequencies (44% and 33% respectively). However, when examining two (red shade) or three genomic dimensions (blue shade), the disruption of these genes occurs at very high frequencies, 88% and 77% respectively. The frequency threshold of 6/9 is denoted with a black dotted line.
Figure 5
Pathway analysis of the 1162 genes identified by multi-dimensional analysis. Ingenuity Pathway Analysis of the 1162 genes identified by MDA as well as genes meeting the same frequency criteria (6/9) from the analysis of the ten simulated datasets. In total, using the list of 1162 MDA genes, 53 canonical signaling pathways were identified as significant after multiple testing correction using a Benjamini-Hochberg correction (Additional File 5). In contrast, using the same statistical criteria, nine of the 10 simulated datasets yielded no significant pathways with one of the datasets yielding one pathway. In this figure, ten of the most well known, cancer-related pathways are shown. The yellow threshold line represents a Benjamini-Hochberg corrected p-value of 0.05 with bars above that line deemed significant. The first blue bar represents the analysis of the actual dataset and the subsequent ten bars represent the analyses of the ten simulated datasets.
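For readers unfamiliar with the correction named in this caption, the following is a minimal Python sketch of the standard Benjamini-Hochberg adjustment. The function name `benjamini_hochberg` and the four input p-values are illustrative assumptions; they are not values from the study, and Ingenuity Pathway Analysis may implement the procedure differently.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Made-up raw p-values for four hypothetical pathways.
raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)
significant = [p <= 0.05 for p in adj]
print(adj)
print(significant)
```

A pathway is called significant when its adjusted p-value is at or below 0.05, which corresponds to the yellow threshold line in the figure.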
Figure 6
Complex deregulation of the Neuregulin/ERBB2 signaling pathway. Each gene is color-coded red and green to represent over and underexpression respectively. Genes shown in both colors are over and underexpressed in different samples. Beside each gene is the status for gene expression, copy number, LOH/AI and DNA methylation, with the alterations in each dimension colored as per the legend. DNA alterations are only shown when a change in gene expression is observed. It should be noted that LOH can be derived from multiple mechanisms. In this study, we do not distinguish between these mechanisms. Likewise, methylation changes may affect one or both alleles. In this study, we do not distinguish the status of the alleles individually. Genes denoted with * have one sample exhibiting multiple concerted disruption (MCD). Samples are coded as follows: S1 = HCC38, S2 = HCC1008, S3 = HCC1143, S4 = HCC1395, S5 = HCC1599, S6 = HCC1937, S7 = HCC2218, S8 = BT474, and S9 = MCF7.
Figure 7
Deregulation of PTEN occurs differently between samples. In HCC1008 (top), PTEN is overexpressed with an associated gain in copy number and hypomethylation. Conversely, in HCC1395 (bottom), PTEN is underexpressed, with an associated loss in copy number, LOH, and DNA hypermethylation. This illustrates how each tumor may behave differently from another.
Figure 8
Multiple concerted disruption (MCD) analysis and its application to triple negative breast cancer. (A) Analysis of ten simulated datasets to determine the proportion of random simulations at each observed frequency of MCD. Notably, 99.7% of random simulations had an MCD frequency of 0/9, with the remaining 0.3% at 1/9. Moreover, no simulations showed a frequency ≥ 2/9. Thus, the observation of an MCD event suggests the event is likely non-random. (B) Using the knowledge database of Ingenuity Pathway Analysis, upstream and downstream components of FGFR2 were selected to assess their role in the subset of triple negative breast cancer (TNBC) cell lines. Only components which were shown to have a direct or indirect expression level relationship were selected. Of the seven components identified (four upstream and three downstream of FGFR2), one upstream and one downstream component were present in both the MDA list (Additional File 4) and MCD list (Additional File 7). Examining FGFR2 and COL1A1, while FGFR2 overexpression is not frequently associated with DNA level alteration, COL1A1 is frequently affected at DNA level. Moreover, in the five TNBC cell lines examined, four have DNA level alteration of COL1A1 and the remaining line has DNA level alteration of FGFR2.
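As a rough illustration of an MCD frequency, the Python sketch below counts the samples in which a gene shows concerted changes in two or more DNA dimensions at once. That working definition, the helper `mcd_frequency`, and the toy sample sets are assumptions made for this example; the abstract and captions do not spell out the exact criteria the authors used.

```python
from collections import Counter

def mcd_frequency(gene_events, min_dimensions=2):
    """Count samples with concerted changes in >= min_dimensions dimensions."""
    hits = Counter()
    for samples in gene_events.values():
        hits.update(samples)  # one count per dimension in which a sample appears
    return sum(1 for n in hits.values() if n >= min_dimensions)

# Made-up example: sample indices (0-8) with a concerted change per dimension.
toy_gene = {
    "copy_number": {0, 1, 2},
    "loh": {1, 2, 5},
    "methylation": {2},
}

freq_two_or_more = mcd_frequency(toy_gene)                 # samples 1 and 2
freq_all_three = mcd_frequency(toy_gene, min_dimensions=3)  # sample 2 only
print(f"{freq_two_or_more}/9", f"{freq_all_three}/9")
```

Under this assumed definition, the toy gene would have an MCD frequency of 2/9, a level that panel (A) reports was not reached in any random simulation.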
