Circ Res. 2015 Mar 27;116(7):1115-9.
doi: 10.1161/CIRCRESAHA.115.306013.

Harnessing the heart of big data

Sarah B Scruggs et al. Circ Res.

Abstract

The exponential increase in Big Data generation, combined with limited capitalization on the wealth of information embedded within Big Data, has prompted us to revisit our scientific discovery paradigms. A successful transition into this digital era of medicine holds great promise for advancing fundamental knowledge in biology, innovating human health and driving personalized medicine; however, this will require a drastic shift of research culture in how we conceptualize science and use data. An e-transformation will require global adoption and synergism among computational science, biomedical research and clinical domains.

Keywords: crowdsourcing; database; heart diseases; information storage and retrieval; metabolomics; proteomics; user-computer interface.

Figures

Figure 1
Central Theme of Data Science—Data, Tools & Users. These are three essential components of data science architectures. Data refer to datasets that are reusable, accumulate value over time, and provide a multi-dimensional, systems-level understanding. Tools enable organization of and knowledge inference from data, in areas such as on-cloud data processing, multi-scale data integration, machine learning, crowdsourcing and text mining, data visualization and mechanistic modeling. Users are anyone who has access to a digital device and an Internet connection. Individuals such as healthcare professionals, biomedical investigators and layperson/patient populations are users.
Figure 2
Example of a Modular Data Science Architecture for Supporting Cardiovascular Investigations. The workflow above provides an example to illustrate data science platforms correlating multi-scale molecular expression and phenotypic data from different experiments and/or the literature. The workflow begins with users uploading their own genomics or proteomics data, or data shared on and retrieved from a Cloud-Based Infrastructure. Subsequently, with their submitted protein/gene data, they access the Knowledge Aggregation Tools that enable location and access of both knowledgebase and analytical tools for processing and analysis. Data types are automatically annotated using community intelligence Knowledgebase 1 (e.g., Gene Wiki). Multi-scale pathway information is integrated into a cohesive model via a Pathways Analysis Tool, which retrieves molecular interaction and biochemical pathway information from Analytical Tools 1 and 2 (e.g., PSICQUIC and Reactome, respectively). Results are output to Visualization Tools (e.g., BioJS) for tailored, multi-faceted visualization. Processed data can be stored and re-accessed via Knowledgebases 2 and 3 (e.g., COPaKB-Data or Sage Synapse (http://www.sagebase.org/)).
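
The modular workflow described in the Figure 2 caption can be sketched in a few lines of Python. This is only an illustrative outline, not the authors' implementation: the `Record` type, the stage functions, and the in-memory lookup tables are all hypothetical stand-ins for the knowledgebases and analytical tools named in the caption, and the identifiers and pathway names are placeholders rather than real data. No real service (Gene Wiki, PSICQUIC, Reactome, BioJS) is called.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Record:
    """One uploaded gene/protein identifier plus what later stages attach."""
    identifier: str
    annotations: Dict[str, str] = field(default_factory=dict)
    pathways: List[str] = field(default_factory=list)


# A stage takes the current records and returns the (possibly enriched) records.
Stage = Callable[[List[Record]], List[Record]]

# Hypothetical in-memory stand-ins for a knowledgebase and a pathway resource.
# The keys and values are placeholders, not real biological data.
TOY_KNOWLEDGEBASE = {"GENE_A": "placeholder description A",
                     "GENE_B": "placeholder description B"}
TOY_PATHWAYS = {"GENE_A": ["pathway_x"], "GENE_B": ["pathway_x"]}


def annotate(records: List[Record]) -> List[Record]:
    """Annotation stage: attach a description when the knowledgebase has one."""
    for record in records:
        if record.identifier in TOY_KNOWLEDGEBASE:
            record.annotations["description"] = TOY_KNOWLEDGEBASE[record.identifier]
    return records


def integrate_pathways(records: List[Record]) -> List[Record]:
    """Pathway stage: attach pathway memberships from the toy resource."""
    for record in records:
        record.pathways = list(TOY_PATHWAYS.get(record.identifier, []))
    return records


def run_pipeline(identifiers: List[str], stages: List[Stage]) -> List[Record]:
    """Pass uploaded identifiers through each stage in order."""
    records = [Record(identifier) for identifier in identifiers]
    for stage in stages:
        records = stage(records)
    return records


def summarize(records: List[Record]) -> Dict[str, List[str]]:
    """Visualization stand-in: group identifiers by pathway."""
    summary: Dict[str, List[str]] = {}
    for record in records:
        for pathway in record.pathways:
            summary.setdefault(pathway, []).append(record.identifier)
    return summary


if __name__ == "__main__":
    results = run_pipeline(["GENE_A", "GENE_B", "GENE_C"],
                           [annotate, integrate_pathways])
    print(summarize(results))
```

Because each stage shares the same signature, a stage can be swapped or reordered without touching the others, which is the sense in which the architecture is modular.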
