Health Care and Precision Medicine Research: Analysis of a Scalable Data Science Platform
- PMID: 30964441
- PMCID: PMC6477571
- DOI: 10.2196/13043
Health Care and Precision Medicine Research: Analysis of a Scalable Data Science Platform
Abstract
Background: Health care data are increasing in volume and complexity. Storing and analyzing these data to implement precision medicine initiatives and data-driven research has exceeded the capabilities of traditional computer systems. Modern big data platforms must be adapted to the specific demands of health care and designed for scalability and growth.
Objective: The objectives of our study were to (1) demonstrate the implementation of a data science platform built on open source technology within a large, academic health care system and (2) describe 2 computational health care applications built on such a platform.
Methods: We deployed a data science platform based on several open source technologies to support real-time, big data workloads. We developed data-acquisition workflows for Apache Storm and NiFi in Java and Python to capture patient monitoring and laboratory data for downstream analytics.
Results: Emerging data management approaches, along with open source technologies such as Hadoop, can be used to create integrated data lakes to store large, real-time datasets. This infrastructure also provides a robust analytics platform where health care and biomedical research data can be analyzed in near real time for precision medicine and computational health care use cases.
Conclusions: The implementation and use of integrated data science platforms offer organizations the opportunity to combine traditional datasets, including data from the electronic health record, with emerging big data sources, such as continuous patient monitoring and real-time laboratory results. These platforms can enable cost-effective and scalable analytics for the information that will be key to the delivery of precision medicine initiatives. Organizations that can take advantage of the technical advances found in data science platforms will have the opportunity to provide comprehensive access to health care data for computational health care and precision medicine research.
Keywords: big data; computational health care; data science; medical informatics computing; monitoring, physiologic.
©Jacob McPadden, Thomas JS Durant, Dustin R Bunch, Andreas Coppi, Nathaniel Price, Kris Rodgerson, Charles J Torre Jr, William Byron, Allen L Hsiao, Harlan M Krumholz, Wade L Schulz. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 09.04.2019.
Conflict of interest statement
Conflicts of Interest: HMK was a recipient of a research grant, through Yale, from Medtronic and the US Food and Drug Administration to develop methods for postmarket surveillance of medical devices; is a recipient of research agreements with Medtronic and Johnson & Johnson (Janssen), through Yale, to develop methods of clinical trial data sharing; works under contract with the US Centers for Medicare & Medicaid Services to develop and maintain performance measures that are publicly reported; chairs a Cardiac Scientific Advisory Board for UnitedHealth Group Inc; is a participant and participant representative of the IBM Watson Health Life Sciences Board; is a member of the Advisory Board for Element Science, Inc, and the Physician Advisory Board for Aetna Inc; and is the founder of Hugo, a personal health information platform. WLS is a consultant for Hugo, a personal health information platform.
Figures
References
-
- EMC . The digital universe driving data growth in healthcare. Hopkinton, MA: Dell Inc; 2014. [2018-10-03]. https://www.emc.com/analyst-report/digital-universe-healthcare-vertical-... .
-
- Manyika J, Chui M, Brown B, Bughin J, Dobbs R, Roxburgh C, Hung Byers A. Big data: the next frontier for innovation, competition, and productivity. New York, NY: McKinsey Global Institute; 2011. Jun, [2019-02-04]. https://www.mckinsey.com/~/media/McKinsey/Business%20Functions/McKinsey%... .
MeSH terms
Grants and funding
LinkOut - more resources
Full Text Sources
Medical
