Brief Bioinform. 2017 Nov 1;18(6):1044-1056.
doi: 10.1093/bib/bbw080.

Exploring and visualizing multidimensional data in translational research platforms

William Dunn Jr et al. Brief Bioinform. 2017.

Abstract

The unprecedented advances in technology and scientific research over the past few years have provided the scientific community with new and more complex forms of data. Large data sets collected from single groups or cross-institution consortia, containing hundreds of omic and clinical variables for thousands of patients, are becoming increasingly commonplace in the research setting. Visualization often plays a key role in the initial phases of research, before any core analyses are performed, especially for projects where no initial hypothesis is dominant. Proper high-level visualization of data helps researchers find trends, identify outliers and perform quality checks. In addition, research has uncovered the important role of visualization in data analysis and its benefits in improving our understanding of disease and, ultimately, patient care. In this work, we present a review of the current landscape of tools designed to facilitate the visualization of multidimensional data in translational research platforms. Specifically, we reviewed the biomedical literature for translational platforms allowing the visualization and exploration of clinical and omics data, and identified 11 platforms: cBioPortal, interactive genomics patient stratification explorer, Igloo-Plot, The Georgetown Database of Cancer Plus, tranSMART, an unnamed data-cube-based model supporting heterogeneous data, Papilio, Caleydo Domino, Qlucore Omics, Oracle Health Sciences Translational Research Center and OmicsOffice® powered by TIBCO Spotfire. In a health sector continuously witnessing an increase in data from multifarious sources, visualization tools used to better grasp these data will grow in importance, and we believe our work will be useful in guiding investigators in similar situations.

Keywords: data analytics; high-dimensional data; omics; translational research; visualization.

Figures

Figure 1
A sampling of commonly used visualization techniques for multidimensional data, using a subset of our data set, which compiles data from three groups of patients. Var1, Var2 and Var3 are neurocognitive dimensions, Var4 and Var5 are psychopathological dimensions and Var6 is a global genetic index. The specific visualizations are (A) a dynamic pivot table (using the R ‘rpivotTable’ package), (B) a correlation matrix (using the R ‘PerformanceAnalytics’ package), (C) a heatmap clustered by rows and columns (using the R ‘gplots’ package), (D) a 3D scatterplot using color and size (using the R ‘scatterplot3d’ package) and (E) parallel coordinates showing all data (using the D3 JavaScript library ‘d3.parcoords.js’ [21]). A colour version of this figure is available at BIB online: https://academic.oup.com/bib.
Figure 2
Overview of tranSMART. In a typical workflow, users define subsets of patients by dragging and dropping variables from the right column into the appropriate boxes (A). In this example, the summary statistics view (B) shows the age difference between patients with two genotypes of a candidate gene (subsets 1 and 2, respectively). A colour version of this figure is available at BIB online: https://academic.oup.com/bib.
Figure 3
A demonstration of Caleydo Domino exploring a set of multiple tabular data sets from a music data set containing song and musician information. This figure displays the main user interface of the program, where users can drag and position data subsets and choose which calculations or visualizations to use to explore the data and the relationships between them [63]. A colour version of this figure is available at BIB online: https://academic.oup.com/bib.
Figure 4
A demonstration of StratomeX exploring a set of multiple tabular data sets from the TCGA clear cell renal carcinoma data set. This figure displays the main user interface of the program, where users can drag and position data subsets and choose which calculations or visualizations to use to explore the data and the relationships between them. Here, users can visualize the relation between patient subtypes based on two different genomic clustering experiments [65]. A colour version of this figure is available at BIB online: https://academic.oup.com/bib.

References

    1. Heer J, Bostock M, Ogievetsky V. A tour through the visualization zoo. Commun ACM 2010;53:59–67.
    2. Pareek CS, Smoczynski R, Tretyn A. Sequencing technologies and genome sequencing. J Appl Genet 2011;52:413–35.
    3. Leonelli S. Data interpretation in the digital age. Perspect Sci 2014;22:397–417.
    4. Hey AJG, Tansley S, Tolle KM, et al. The Fourth Paradigm: Data-Intensive Scientific Discovery. Redmond, WA: Microsoft Research, 2009. http://202.120.81.220:81/inter/uploads/readings/four-paradigm.pdf
    5. Steenwijk MD, Milles J, Buchem MA. Integrated Visual Analysis for Heterogeneous Datasets in Cohort Studies. Eurographics Workshop Vis Comput Biomed, 2010. https://www.researchgate.net/profile/Johan_Reiber/publication/265438762_...