J Med Syst. 2015 Mar;39(3):23.
doi: 10.1007/s10916-015-0220-8. Epub 2015 Feb 10.

Design and development of a medical big data processing system based on Hadoop

Qin Yao et al. J Med Syst. 2015 Mar.

Abstract

Secondary use of medical big data is increasingly common in healthcare services and clinical research. Understanding the logic behind medical big data reveals trends in hospital information technology and is highly significant for hospital information systems that are designing and expanding services. Big data has four characteristics, Volume, Variety, Velocity and Value (the 4 Vs), that make traditional standalone systems incapable of processing these data. Apache Hadoop MapReduce is a promising software framework for developing applications that process vast amounts of data in parallel on large clusters of commodity hardware in a reliable, fault-tolerant manner. With the Hadoop framework and the MapReduce application program interface (API), we can more easily develop our own MapReduce applications that scale from a single node to thousands of machines. This paper investigates a practical case of a Hadoop-based medical big data processing system. We developed this system to intelligently process medical big data and uncover features of hospital information system user behavior, studying the data that different hospital information systems produce in daily work. We also built a five-node Hadoop cluster to execute distributed MapReduce algorithms. Compared with single-node processing, our distributed algorithms show promise for efficient processing of medical big data in healthcare services and clinical research. Additionally, with medical big data analytics, we can design our hospital information systems to be much more intelligent and easier to use by making personalized recommendations.
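
To make the MapReduce pattern mentioned in the abstract concrete, the following is a minimal single-process sketch in Python of the map, shuffle and reduce phases, applied to counting how often each user accesses each hospital information system. It is not the authors' algorithm and does not use the Hadoop API: the log format, the sample records, and the names `SAMPLE_LOG`, `map_phase`, `shuffle`, `reduce_phase` and `run_job` are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Iterable, Iterator

# Hypothetical access-log format (an assumption, not taken from the paper):
#   "<user_id>,<system>,<action>"
SAMPLE_LOG = [
    "u01,EMR,open_record",
    "u02,LIS,view_result",
    "u01,EMR,open_record",
    "u01,PACS,view_image",
    "u02,LIS,view_result",
]


def map_phase(line: str) -> Iterator[tuple[tuple[str, str], int]]:
    """Map: emit ((user, system), 1) for every log record."""
    user_id, system, _action = line.strip().split(",")
    yield (user_id, system), 1


def shuffle(pairs: Iterable[tuple[tuple[str, str], int]]) -> dict:
    """Shuffle: group all emitted values by key.

    On a real cluster the framework performs this step between the
    map and reduce tasks; here it is simulated in memory.
    """
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: tuple[str, str], values: list[int]) -> tuple[tuple[str, str], int]:
    """Reduce: sum the counts collected for one (user, system) key."""
    return key, sum(values)


def run_job(lines: Iterable[str]) -> dict:
    """Run map, shuffle and reduce sequentially in one process."""
    mapped = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    for (user, system), count in sorted(run_job(SAMPLE_LOG).items()):
        print(user, system, count)
```

Because each record is mapped independently and each key is reduced independently, the same two functions could in principle be distributed across nodes, which is the property that a cluster such as the five-node one described above relies on. Per-user access counts of this kind are one plausible input for the personalized recommendations the abstract mentions.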
