Genome Med. 2010 Sep 7;2(9):65.
doi: 10.1186/gm186.

Large-scale data integration framework provides a comprehensive view on glioblastoma multiforme

Kristian Ovaska et al. Genome Med.

Abstract

Background: Coordinated efforts to collect large-scale data sets provide a basis for systems level understanding of complex diseases. In order to translate these fragmented and heterogeneous data sets into knowledge and medical benefits, advanced computational methods for data analysis, integration and visualization are needed.

Methods: We introduce a novel data integration framework, Anduril, for translating fragmented large-scale data into testable predictions. The Anduril framework allows rapid integration of heterogeneous data with state-of-the-art computational methods and existing knowledge in bio-databases. Anduril automatically generates thorough summary reports and a website that shows the most relevant features of each gene at a glance, allows sorting of data based on different parameters, and provides direct links to more detailed data on genes, transcripts or genomic regions. Anduril is open-source; all methods and documentation are freely available.

Results: Using Anduril, we have integrated multidimensional molecular and clinical data from 338 subjects with glioblastoma multiforme, one of the deadliest and most poorly understood cancers. The central objective of our approach is to identify genetic loci and genes that have a significant effect on survival. Our results suggest several novel genetic alterations linked to glioblastoma multiforme progression and, more specifically, reveal Moesin as a novel glioblastoma multiforme-associated gene that has a strong survival effect and whose depletion in vitro significantly inhibited cell proliferation. All analysis results are available as a comprehensive website.

Conclusions: Our results demonstrate that integrated analysis and visualization of multidimensional and heterogeneous data by Anduril enables drawing conclusions on functional consequences of large-scale molecular data. Many of the identified genetic loci and genes that have a significant survival effect have not been reported earlier in the context of glioblastoma multiforme. Thus, in addition to a generally applicable novel methodology, our results provide several glioblastoma multiforme candidate genes for further studies.

Anduril is available at http://csbi.ltdk.helsinki.fi/anduril/

The glioblastoma multiforme analysis results are available at http://csbi.ltdk.helsinki.fi/anduril/tcga-gbm/

Figures

Figure 1
Schematic of the Anduril platform. Anduril is an extensible framework for analyzing large-scale data sets using workflows. Elementary analysis and reporting methods, as well as connections to external databases, are implemented as reusable Anduril components. Components can utilize libraries such as Bioconductor and Weka and are not limited to a particular programming language. Components are then wired into custom workflows, which implement complete analyses that take complex high-throughput data as input and automatically produce comprehensive final reports as result. Reports include generated web sites that show the most relevant features of genes at a glance, and detailed figures and tables produced by analysis methods such as Kaplan-Meier analysis, Gene Ontology enrichment, and so on. Analysis workflows and their parameters are also documented in reports.
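The idea of wiring reusable components into a workflow can be illustrated with a minimal Python sketch. This is not Anduril's API or its workflow language; the function names (`run_workflow`, `fold_change`, `select_changed`, `report`), the threshold and the toy data layout are all assumptions made for illustration. Each step declares which earlier results it consumes, and a small scheduler runs a step once its inputs are available.

```python
import math
from typing import Callable, Dict, List, Tuple

# A step is (function, names of the results it consumes). All names here are
# hypothetical and chosen for illustration; they are not Anduril identifiers.
Step = Tuple[Callable[..., object], List[str]]


def run_workflow(steps: Dict[str, Step], inputs: Dict[str, object]) -> Dict[str, object]:
    """Run each step once all of its dependencies have produced a result."""
    results = dict(inputs)
    pending = dict(steps)
    while pending:
        ready = [name for name, (_, deps) in pending.items()
                 if all(dep in results for dep in deps)]
        if not ready:
            raise ValueError("unsatisfied or cyclic dependencies: "
                             + ", ".join(sorted(pending)))
        for name in ready:
            func, deps = pending.pop(name)
            results[name] = func(*(results[dep] for dep in deps))
    return results


def fold_change(tumor: Dict[str, float], control: Dict[str, float]) -> Dict[str, float]:
    """Log2 ratio of tumor to control expression for genes present in both."""
    return {gene: math.log2(tumor[gene] / control[gene])
            for gene in tumor if gene in control}


def select_changed(changes: Dict[str, float], threshold: float = 1.0) -> Dict[str, float]:
    """Keep genes whose absolute log2 fold change reaches the threshold."""
    return {gene: value for gene, value in changes.items() if abs(value) >= threshold}


def report(selected: Dict[str, float]) -> List[str]:
    """Render one tab-separated line per gene, sorted by gene name."""
    return [f"{gene}\t{value:+.2f}" for gene, value in sorted(selected.items())]


# Steps may be declared in any order; the scheduler resolves the dependencies.
WORKFLOW: Dict[str, Step] = {
    "report": (report, ["selected"]),
    "selected": (select_changed, ["fold_change"]),
    "fold_change": (fold_change, ["tumor", "control"]),
}
```

Calling `run_workflow(WORKFLOW, {"tumor": {...}, "control": {...}})` with two gene-to-expression dictionaries returns every intermediate result alongside the final report lines, which mirrors the caption's point that workflows document their own analysis steps.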
Figure 2
Example of Anduril-generated result website and links to external sources. Anduril generates a browsable website based on analysis results. (a) A screenshot of the gene level view of the data. The genes are sorted according to the survival P-value on the exon platform. The data are divided into 13 fields corresponding to analysis results and data sources. For example, the field 'GeneExpression' illustrates fold changes between GBM and control samples using data from gene expression microarrays. Exon array values are computed at the gene ('MedianExonExpression') and transcript levels ('TranscriptExpression'). For the transcript data the minimum and maximum transcript expression values show GBM-specific alternative splice variant candidates. The fields 'TranscriptExpression:Survival' and 'MedianExonExpression:Survival' show survival analysis P-values for the best transcript and gene in the exon arrays, whereas 'SNPSurvival' contains P-values for the survival-associated SNPs. For 'GeneExpression', 'FoldChange', 'Min', 'Max', 'Gain', 'Loss' and 'Methylation', green denotes downregulation and red denotes upregulation. For the P-value fields 'Survival', 'SNPSurvival' and 'ExonIntegration', red denotes low P-values. (b) A web page that opens after clicking the gene MSN. This page contains detailed results and external links. (c, d) Clicking 'GeneName' opens a page in GeneCards [28] (c), and 'GeneID' connects to Ensembl [29] (d). (e) Clicking 'Protein Interactions' opens a page listing known protein-protein interactions in PINA [27]. (f) Clicking an entry in 'KEGG pathway' gives access to pathways at the KEGG [26] website. (g) Each splice variant is listed separately, and if its survival P-value is < 0.01, users can view the Kaplan-Meier curves. The groups '1', '-1' and '0' denote overexpression, underexpression and stable expression, respectively; the '-1' group is not present for MSN and so does not appear in the figure. The dotted lines are 95% confidence intervals.
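For readers unfamiliar with the Kaplan-Meier curves shown in panel (g), the following is a minimal Python sketch of the product-limit estimator. It is not the code used to produce the figure; the function name `kaplan_meier` and the input layout (parallel lists of follow-up times and event flags) are assumptions for illustration, and the 95% confidence intervals drawn in the figure are not computed here.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float], events: Sequence[bool]) -> List[Tuple[float, float]]:
    """Product-limit survival estimate.

    times  -- follow-up time for each subject
    events -- True if the event (death) was observed, False if censored
    Returns (time, survival probability) at each distinct event time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Gather every subject whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                deaths += 1
            leaving += 1
            i += 1
        if deaths:
            # Multiply by the fraction of at-risk subjects surviving time t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored subjects both leave the risk set after t.
        at_risk -= leaving
    return curve
```

In a setting like the one described in the caption, one such curve would be estimated per expression group ('1', '-1', '0') and the groups compared, for example with a log-rank test; that comparison is left out of this sketch.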
Figure 3
Functional effects of knocking down MSN in three glioblastoma cell lines and one control cell line. Four MSN-targeting siRNAs at a final concentration of 13 nM were transfected with siLentFect (Bio-Rad) transfection reagent into the A172, LN405 and U87MG glioma cell lines and the SVGp12 control cell line. (a) Cell proliferation was assayed 72 h after transfection using the CellTiter-Glo Cell Viability assay. (b) Induction of caspase-3 and -7 activities was detected 48 h after transfection with the homogeneous Apo-ONE assay (Promega). Loess-normalized signals from the proliferation and caspase-3/7 assays are presented as scores relative to the mean of lipid-containing wells. Significant P-values calculated by t-test are marked as * (< 0.05), ** (< 0.01) and *** (< 0.001). Error bars indicate the standard error of the mean (SEM).

References

    1. Cancer Genome Atlas Research Network. Comprehensive genomic characterization defines human glioblastoma genes and core pathways. Nature. 2008;455:1061–1068. doi: 10.1038/nature07385.
    2. Furnari FB, Fenton T, Bachoo RM, Mukasa A, Stommel JM, Stegh A, Hahn WC, Ligon KL, Louis DN, Brennan C, Chin L, DePinho RA, Cavenee WK. Malignant astrocytic glioma: genetics, biology, and paths to treatment. Genes Dev. 2007;21:2683–2710. doi: 10.1101/gad.1596707.
    3. Bredel M, Scholtens DM, Harsh GR, Bredel C, Chandler JP, Renfrow JJ, Yadav AK, Vogel H, Scheck AC, Tibshirani R, Sikic BI. A network model of a cooperative genetic landscape in brain tumors. JAMA. 2009;302:261–275. doi: 10.1001/jama.2009.997.
    4. Brennan C, Momota H, Hambardzumyan D, Ozawa T, Tandon A, Pedraza A, Holland E. Glioblastoma subclasses can be defined by activity among signal transduction pathways and associated genomic alterations. PLoS One. 2009;4:e7752. doi: 10.1371/journal.pone.0007752.
    5. Cerami E, Demir E, Schultz N, Taylor BS, Sander C. Automated network analysis identifies core pathways in glioblastoma. PLoS One. 2010;5:e8918. doi: 10.1371/journal.pone.0008918.