Guideline
J Med Internet Res. 2016 Dec 16;18(12):e323.
doi: 10.2196/jmir.5870.

Guidelines for Developing and Reporting Machine Learning Predictive Models in Biomedical Research: A Multidisciplinary View

Wei Luo et al. J Med Internet Res. 2016.

Abstract

Background: As more researchers turn to big data for new opportunities in biomedical discovery, machine learning models, the backbone of big data analysis, are mentioned more often in biomedical journals. However, owing to their inherent complexity, machine learning methods are prone to misuse. Moreover, because machine learning models can be specified so flexibly, results are often insufficiently reported in research articles, hindering reliable assessment of model validity and consistent interpretation of model outputs.

Objective: To develop a set of guidelines on the use of machine learning predictive models in clinical settings, so that the models are correctly applied and sufficiently reported and true discoveries can be distinguished from random coincidence.

Methods: A multidisciplinary panel of machine learning experts, clinicians, and traditional statisticians was interviewed, using an iterative process in accordance with the Delphi method.

Results: The process produced a set of guidelines that consists of (1) a list of reporting items to be included in a research article and (2) a set of practical sequential steps for developing predictive models.

Conclusions: A set of guidelines was generated to enable correct application of machine learning models and consistent reporting of model specifications and results in biomedical research. We believe that such guidelines will accelerate the adoption of big data analysis, particularly with machine learning methods, in the biomedical research community.

Keywords: clinical prediction rule; guideline; machine learning.

Conflict of interest statement

Conflicts of Interest: None declared.

Figures

Figure 1. Steps to identify the prediction problem.
Figure 2. Information flow in the predictive modelling process.
