Review
JMIR Med Inform. 2014 Jan 17;2(1):e1.
doi: 10.2196/medinform.2913.

Big data and clinicians: a review on the state of the science

Weiqi Wang et al. JMIR Med Inform.

Abstract

Background: Over the past few decades, the collection of medically related data has increased enormously, producing what is referred to as big data. These huge datasets bring challenges in storage, processing, and analysis. In clinical medicine, big data is expected to play an important role in identifying the causes of patient symptoms, in predicting the hazards of disease incidence or recurrence, and in improving primary-care quality.

Objective: The objective of this review was to provide an overview of the features of clinical big data, describe a few commonly employed computational algorithms, statistical methods, and software toolkits for data manipulation and analysis, and discuss the challenges and limitations in this realm.

Methods: We conducted a literature review to identify studies on big data in medicine, especially clinical medicine. We used different combinations of keywords to search PubMed, Science Direct, Web of Knowledge, and Google Scholar for literature of interest from the past 10 years.

Results: This paper reviewed studies that analyzed clinical big data and discussed issues related to storage and analysis of this type of data.

Conclusions: Big data is becoming a common feature of biological and clinical studies. Researchers who use clinical big data face multiple challenges, and the data itself has limitations. It is imperative that methodologies for data analysis keep pace with our ability to collect and store data.

Keywords: big data; clinical research; database; medical informatics; medicine.

Conflict of interest statement

Conflicts of Interest: None declared.

Figures

Figure 1
A schematic of the issues surrounding the storage and use of big data. Clinical big data, like big data in other disciplines, raises a number of issues and challenges, including (but not limited to) generation, storage, curation, extraction, integration, analysis, and visualization. ANN: artificial neural network; EMR: electronic medical record; MPP: massively parallel processing; PCA: principal component analysis; ROI: return on investment; SVM: support vector machine.
