BMC Med. 2019 Jul 17;17(1):133.
doi: 10.1186/s12916-019-1366-x.

Why we need a small data paradigm

Eric B Hekler et al. BMC Med.

Abstract

Background: There is great interest in and excitement about the concept of personalized or precision medicine and, in particular, advancing this vision via various 'big data' efforts. While these methods are necessary, they are insufficient to achieve the full personalized medicine promise. A rigorous, complementary 'small data' paradigm that can function both autonomously from and in collaboration with big data is also needed. By 'small data' we build on Estrin's formulation and refer to the rigorous use of data by and for a specific N-of-1 unit (i.e., a single person, clinic, hospital, healthcare system, community, city, etc.) to facilitate improved individual-level description, prediction and, ultimately, control for that specific unit.

Main body: The purpose of this piece is to articulate why a small data paradigm is needed and is valuable in itself, and to provide initial directions for future work that can advance study designs and data analytic techniques for a small data approach to precision health. Scientifically, the central value of a small data approach is that it can uniquely manage complex, dynamic, multi-causal, idiosyncratically manifesting phenomena, such as chronic diseases, in comparison to big data. Beyond this, a small data approach better aligns the goals of science and practice, which can result in more rapid agile learning with less data. There is also, feasibly, a unique pathway towards transportable knowledge from a small data approach, which is complementary to a big data approach. Future work should (1) further refine appropriate methods for a small data approach; (2) advance strategies for better integrating a small data approach into real-world practices; and (3) advance ways of actively integrating the strengths and limitations from both small and big data approaches into a unified scientific knowledge base that is linked via a robust science of causality.

Conclusion: Small data is valuable in its own right. That said, small and big data paradigms can and should be combined via a foundational science of causality. With these approaches combined, the vision of precision health can be achieved.

Keywords: Artificial intelligence; Data science; Personalized medicine; Precision health; Precision medicine; Small data.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1. Small versus big data paradigm pathways to help individuals and transportable knowledge
Fig. 2. Small data hypothesis-driven pyramid
Fig. 3. Different success criteria for big versus small data. While multiple methods can be used in each quadrant, to help illustrate, there is a rough mapping to different methods as used in different disciplines. Quadrant A includes techniques such as supervised and unsupervised machine learning, deep learning, reinforcement learning, and recommender systems, commonly used in computer science and the technology industry. Quadrant B includes techniques such as single case experimental designs, N-of-1 cross over designs, and system identification as respectively used in the social and behavioral sciences, medicine, and control systems engineering. Quadrant C includes techniques such as supervised and unsupervised machine learning and deep learning, commonly used in computer science, the technology industry, and various ‘-omics’ efforts. Quadrant D includes techniques articulated as part of the evidence-based pyramid and inferential statistics, commonly used in fields like medicine, epidemiology, public health, and psychology.
