Review
Front Vet Sci. 2017 Nov 16:4:194.
doi: 10.3389/fvets.2017.00194. eCollection 2017.

The Scope of Big Data in One Medicine: Unprecedented Opportunities and Challenges

Molly E McCue et al. Front Vet Sci. 2017.

Abstract

Advances in high-throughput molecular biology and electronic health records (EHR), coupled with increasing computer capabilities, have resulted in an increased interest in the use of big data in health care. Big data require collection and analysis of data at an unprecedented scale and represent a paradigm shift in health care, offering (1) the capacity to generate new knowledge more quickly than traditional scientific approaches; (2) unbiased collection and analysis of data; and (3) a holistic understanding of biology and pathophysiology. Big data promise more personalized and precision medicine for patients with improved accuracy and earlier diagnosis, and therapy tailored to an individual's unique combination of genes, environmental risk, and precise disease phenotype. This promise comes from data collected from numerous sources, ranging from molecules to cells, to tissues, to individuals and populations, and from the integration of these data into networks that improve understanding of health and disease. Big data-driven science should play a role in propelling comparative medicine and "one medicine" (i.e., the shared physiology, pathophysiology, and disease risk factors across species) forward. Merging of data from EHR across institutions will give access to patient data on a scale previously unimaginable, allowing for precise phenotype definition and objective evaluation of risk factors and response to therapy. High-throughput molecular data will give insight into previously unexplored molecular pathophysiology and disease etiology. Investigation and integration of big data from a variety of sources will result in stronger parallels drawn at the molecular level between human and animal disease, allow for predictive modeling of infectious disease and identification of key areas of intervention, and facilitate step-changes in our understanding of disease that can make a substantial impact on animal and human health.
However, the use of big data comes with significant challenges. Here we explore the scope of "big data," including its opportunities, its limitations, and what is needed to capitalize on big data in one medicine.

Keywords: bioinformatics; clinical informatics; deep phenotyping; environmental epidemiology; genetic epidemiology; multilayer disease module; network medicine; structural informatics.

Figures

Figure 1
The multiple levels of biomedical informatics data. (A) Population health informatics focuses on the study of infectious and genetic disease in populations and the impacts of environmental exposures (i.e., the exposome: internal, general external, and specific external environments). Although metagenomics is the study of the small molecules of the genome of microorganisms, the microbiome is considered an environmental factor by many investigators. (B) Clinical informatics includes all quantitative and qualitative clinical measures made on patients including history, physical examinations, clinical laboratory testing, and other clinical diagnostic procedures. (C) Imaging informatics encompasses measures made at the tissue or organ level and includes structural and functional imaging studies as well as histopathology and other microscopic studies. (D) Bioinformatics encompasses the largest level and includes all measurements of small molecules (i.e., the ‘omics studies). The bioinformatics level also incorporates studies of the interactions between molecules of the same or different molecular levels within a cell (the “interactome”) and describes the molecular phenotype of health and disease.
Figure 2
Relational databases capture related datasets. A relational database organizes a collection of multiple related datasets. Each dataset is organized within a table by rows and columns. Each table relates to one or more tables in the relational database, and tables communicate with each other to share information. Each table is a “relation,” which contains one or more data columns. Each row in a table is considered a “record” and contains unique data in the corresponding columns. One or more record(s) has data within column(s) that relate to one or many records contained in other tables.
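The relational structure described in this caption can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The table names (patient, lab_result), the columns, and the sample records are hypothetical and are not taken from the article; they only show how a record in one table relates to one or many records in another.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two related tables (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    # Each table is a "relation"; each row is a "record".
    conn.execute(
        "CREATE TABLE patient ("
        "patient_id INTEGER PRIMARY KEY, species TEXT NOT NULL)"
    )
    # lab_result.patient_id links each result to exactly one patient,
    # so one patient record can relate to many lab_result records.
    conn.execute(
        "CREATE TABLE lab_result ("
        "result_id INTEGER PRIMARY KEY, "
        "patient_id INTEGER NOT NULL REFERENCES patient(patient_id), "
        "test_name TEXT NOT NULL, value REAL NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO patient VALUES (?, ?)",
        [(1, "horse"), (2, "dog")],  # illustrative records only
    )
    conn.executemany(
        "INSERT INTO lab_result VALUES (?, ?, ?, ?)",
        [(1, 1, "glucose", 95.0), (2, 1, "insulin", 12.5), (3, 2, "glucose", 110.0)],
    )
    conn.commit()
    return conn


def results_for_species(conn, species):
    """Join the two tables on the shared key to collect results per species."""
    return conn.execute(
        "SELECT p.patient_id, r.test_name, r.value "
        "FROM patient AS p JOIN lab_result AS r ON r.patient_id = p.patient_id "
        "WHERE p.species = ? ORDER BY r.result_id",
        (species,),
    ).fetchall()
```

Calling results_for_species(build_demo_db(), "horse") returns both laboratory records linked to the single horse record, which is the one-to-many relationship the caption describes.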
Figure 3
Disease subclassification. (A) Risk factor-based disease subclassification includes factors that increase disease risk (even prior to disease). (B) Phenotype-based disease subclassification includes both clinical factors and molecular phenotypes. (C) A patient’s verotype or true phenotype is the result of risk factors and molecular and clinical subtypes.
Figure 4
Multilayer disease modules. Organizing heterogeneous big data into biologic networks can lead to a deeper understanding of normal function and dysfunction in disease. Networks consist of nodes representing an object of interest that are connected by edges that capture the relationship between the nodes. Networks can be built with data gathered across layers and tissues as well as across individuals. Vertical integration of networks across the levels of health-care big data (dashed red arrows) is an important goal of translational bioinformatics. Integration of data across tissues (solid red arrows) also provides an opportunity to understand tissue cross-talk in health and disease. In particular, changes in cross-talking proteins may signal rewiring in disease.
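The node-and-edge structure in this caption can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: each node is a hypothetical (layer, name) pair, and the layers, names, and edges below are invented for the example and do not come from the article.

```python
from collections import defaultdict

# Hypothetical edges; each node is a (layer, name) pair.
EDGES = [
    (("gene", "G1"), ("protein", "P1")),              # crosses layers
    (("protein", "P1"), ("protein", "P2")),           # stays within one layer
    (("protein", "P2"), ("clinical", "high_insulin")),  # crosses layers
]


def build_network(edges):
    """Store an undirected network as an adjacency mapping: node -> set of neighbors."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def cross_layer_edges(edges):
    """Return the edges whose endpoints sit in different layers (vertical integration)."""
    return [(a, b) for a, b in edges if a[0] != b[0]]


def reachable(adjacency, start):
    """Collect every node connected to `start` with a depth-first traversal."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbor in adjacency[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return seen
```

In this toy network, two of the three edges cross layers, and a traversal starting at the gene node reaches the clinical node, which is the kind of molecular-to-clinical link that vertical integration aims to expose.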
Figure 5
Types and uses of disease biomarkers. (A) Different biomarkers are used for different reasons at different points in disease progression. (B) Disease progression can be thought of as a continuous process that is the result of rewiring of important molecular, cellular, or tissue networks, which results in progression from a healthy state to severe clinical disease. (C) A combination of biomarker types allows for identification of at-risk or diseased patients at different stages of disease.
Figure 6
Biomarker development. (A) Biomarker development progresses through several stages from initial discovery to clinical validation, with an increasing number of individuals and a decreasing number of biomarkers at each stage in the process. (B) As biomarkers progress through each of these stages, there is an increasing amount of evidence and increasing clinical validity in support of the biomarkers' use.
Figure 7
Big data scientific method. Hypothesis-driven and data-driven scientific methods progress through parallel stages. (A) Framing the problem and general hypotheses. (B) Data collection and exploratory experimentation/analysis. (C) Formulation of specific hypotheses. (D) Testing the hypotheses. (E) Accepting or rejecting the hypotheses.
