Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2024 Jan 30:26:e50890.
doi: 10.2196/50890.

Machine Learning and Health Science Research: Tutorial

Affiliations

Machine Learning and Health Science Research: Tutorial

Hunyong Cho et al. J Med Internet Res. .

Abstract

Machine learning (ML) has seen impressive growth in health science research due to its capacity for handling complex data to perform a range of tasks, including unsupervised learning, supervised learning, and reinforcement learning. To aid health science researchers in understanding the strengths and limitations of ML and to facilitate its integration into their studies, we present here a guideline for integrating ML into an analysis through a structured framework, covering steps from framing a research question to study design and analysis techniques for specialized data types.

Keywords: health science researcher; machine learning; machine learning pipeline; medical machine learning; precision medicine; reproducibility; unsupervised learning.

PubMed Disclaimer

Conflict of interest statement

Conflicts of Interest: None declared.

Figures

Figure 1
Figure 1
Machine learning workflow for a health science research question, from research question refinement to results reporting, with additional considerations. The cyclic nature of the process is reflected in the arrows, as several different iterations may be considered before narrowing down to a decisive pipeline, leading to result reporting.
Figure 2
Figure 2
Commonly used algorithms in the supervised setting by algorithm type distinguished between classification and regression problems, as well as methods used in unsupervised learning.

References

    1. Jordan MI, Mitchell TM. Machine learning: trends, perspectives, and prospects. Science. 2015;349(6245):255–260. doi: 10.1126/science.aaa8415.349/6245/255 - DOI - PubMed
    1. Obermeyer Z, Emanuel EJ. Predicting the future—big data, machine learning, and clinical medicine. N Engl J Med. 2016;375(13):1216–1219. doi: 10.1056/NEJMp1606181. https://europepmc.org/abstract/MED/27682033 - DOI - PMC - PubMed
    1. Beam AL, Kohane IS. Big data and machine learning in health care. JAMA. 2018;319(13):1317–1318. doi: 10.1001/jama.2017.18391.2675024 - DOI - PubMed
    1. Wiens J, Saria S, Sendak M, Ghassemi M, Liu VX, Doshi-Velez F, Jung K, Heller K, Kale D, Saeed M, Ossorio PN, Thadaney-Israni S, Goldenberg A. Do no harm: a roadmap for responsible machine learning for health care. Nat Med. 2019;25(9):1337–1340. doi: 10.1038/s41591-019-0548-6.10.1038/s41591-019-0548-6 - DOI - PubMed
    1. Greely HT. The uneasy ethical and legal underpinnings of large-scale genomic biobanks. Annu Rev Genomics Hum Genet. 2007;8:343–364. doi: 10.1146/annurev.genom.7.080505.115721. - DOI - PubMed

Publication types

LinkOut - more resources