Review
Mol Psychiatry. 2024 Feb;29(2):387-401.
doi: 10.1038/s41380-023-02334-2. Epub 2024 Jan 4.

A primer on the use of machine learning to distil knowledge from data in biological psychiatry

Thomas P Quinn et al. Mol Psychiatry. 2024 Feb.

Abstract

Applications of machine learning in the biomedical sciences are growing rapidly. This growth has been spurred by diverse cross-institutional and interdisciplinary collaborations, public availability of large datasets, an increase in the accessibility of analytic routines, and the availability of powerful computing resources. With this increased access and exposure to machine learning comes a responsibility for education and a deeper understanding of its bases and bounds, borne equally by data scientists seeking to ply their analytic wares in medical research and by biomedical scientists seeking to harness such methods to glean knowledge from data. This article provides an accessible and critical review of machine learning for a biomedically informed audience, as well as its applications in psychiatry. The review covers definitions and expositions of commonly used machine learning methods, and historical trends of their use in psychiatry. We also provide a set of standards, namely Guidelines for REporting Machine Learning Investigations in Neuropsychiatry (GREMLIN), for designing and reporting studies that use machine learning as a primary data-analysis approach. Lastly, we propose the establishment of the Machine Learning in Psychiatry (MLPsych) Consortium, enumerate its objectives, and identify areas of opportunity for future applications of machine learning in biological psychiatry. This review serves as a cautiously optimistic primer on machine learning for those on the precipice as they prepare to dive into the field, either as methodological practitioners or well-informed consumers.

Conflict of interest statement

In the past year, SVF received income, potential income, travel expenses, continuing education support, and/or research support from Aardvark, Akili, Genomind, Ironshore, KemPharm/Corium, Noven, Ondosis, Otsuka, Rhodes, Supernus, Takeda, Tris, and Vallon. With his institution, he holds US patent US20130217707 A1 for the use of sodium-hydrogen exchange inhibitors in the treatment of ADHD. In previous years, he received support from Alcobra, Arbor, Aveksham, CogCubed, Eli Lilly, Enzymotec, Impact, Janssen, Lundbeck/Takeda, McNeil, NeuroLifeSciences, Neurovance, Novartis, Pfizer, Shire, and Sunovion. He also receives royalties from books published by Guilford Press (Straight Talk about Your Child’s Mental Health), Oxford University Press (Schizophrenia: The Facts), and Elsevier (ADHD: Non-Pharmacologic Interventions). In addition, he is the program director of www.adhdinadults.com.

Figures

Figure 1.
Schematic illustration of trends in ML development relevant to psychiatric research.
Figure 2.
Distribution of types of ML methods used across 1,461 surveyed publications. Note that PCA, FA, and CA are techniques that are deeply rooted in classical statistics and should not be misinterpreted in this context as exclusive ML methods. Abstracts of these publications were classified into three categories based on the ML approach used: supervised, unsupervised, or other. Among those classified as ‘other’, some abstracts mentioned ‘machine learning’, but did not specify a particular method; these were categorized as “Unspecified”. Other abstracts mentioned both supervised and unsupervised methods; these were categorized as “Multiple”. Abbreviations. CA, cluster analysis; DT, decision trees; FA, factor analysis; PCA, principal component analysis (including other related independent or latent component analyses); NN, neural networks; penREG, penalized regression; REG, generalized linear regression; RF, random forest.
Figure 3.
Surveyed publications (n = 1,461) summarized by their period of publication, and by: A) 12 psychiatric disorders/phenotypes, and B) 4 data modalities (clinical, imaging, electrophysiology, and genomics) and their combinations (multiple). Abbreviations. BD, bipolar disorder; ED, eating disorders; EEG, electroencephalography; EPG, electrophysiology; MDD, major depressive disorder; PSD, psychotic spectrum disorders; SCZ, schizophrenia; SUD, substance use disorders; SVM, support vector machine.
Figure 4.
Surveyed publications (n = 1,461) summarized by year of modality and by (A) modelling method, as well as (B) psychiatric phenotype. Our review of the literature identified 1,461 studies which we grouped into 12 broad categories of psychiatric disorders and 9 ML modelling methods. Abbreviations. BD, bipolar disorder; CA, cluster analysis; DT, decision trees; ED, eating disorders; EEG, electroencephalography; EPG, electrophysiology; FA, factor analysis; PCA, independent/latent/principal component analysis; MDD, major depressive disorder; NN, neural networks; penREG, penalized regression; PSD, psychotic spectrum disorders; REG, generalized linear regression; RF, random forest; SCZ, schizophrenia; SUD, substance use disorders; SVM, support vector machine.
