Sci Rep. 2021 Jul 19;11(1):14636.
doi: 10.1038/s41598-021-94007-9.

An integrated machine learning framework for a discriminative analysis of schizophrenia using multi-biological data

Peng-Fei Ke et al. Sci Rep. 2021.

Abstract

Finding effective and objective biomarkers to inform the diagnosis of schizophrenia is of great importance yet remains challenging. Relatively little work has been conducted on multi-biological data for the diagnosis of schizophrenia. In this cross-sectional study, we extracted multiple features from three types of biological data, including gut microbiota data, blood data, and electroencephalogram data. Then, an integrated framework of machine learning consisting of five classifiers, three feature selection algorithms, and four cross validation methods was used to discriminate patients with schizophrenia from healthy controls. Our results show that the support vector machine classifier without feature selection using the input features of multi-biological data achieved the best performance, with an accuracy of 91.7% and an AUC of 96.5% (p < 0.05). These results indicate that multi-biological data showed better discriminative capacity for patients with schizophrenia than single biological data. The top 5% discriminative features selected from the optimal model include the gut microbiota features (Lactobacillus, Haemophilus, and Prevotella), the blood features (superoxide dismutase level, monocyte-lymphocyte ratio, and neutrophil count), and the electroencephalogram features (nodal local efficiency, nodal efficiency, and nodal shortest path length in the temporal and frontal-parietal brain areas). The proposed integrated framework may be helpful for understanding the pathophysiology of schizophrenia and developing biomarkers for schizophrenia using multi-biological data.
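The size of the model search described in the abstract can be made concrete with a small sketch. It assumes that every combination of input feature set, classifier, feature selection option, and cross-validation scheme was evaluated, which the abstract does not state explicitly. The labels are taken from the abstract and figure captions; the "none" option reflects the reported best model, which used no feature selection. The function name `enumerate_configurations` is hypothetical.

```python
from itertools import product

# Labels taken from the abstract and figure captions.
INPUT_FEATURES = ["GMF", "BF", "EF", "CF"]  # gut microbiota, blood, EEG, combined
CLASSIFIERS = ["SVM", "RF", "LDA", "LR", "KNN"]
# "none" is included because the best reported model used no feature selection.
FEATURE_SELECTION = ["none", "RFE", "PCA", "ANOVA"]
CROSS_VALIDATION = ["tenfold", "fivefold", "threefold", "leave-one-out"]


def enumerate_configurations():
    """Return every combination as a dict (assumes a full grid was searched)."""
    return [
        {"features": f, "classifier": c, "selection": s, "cv": v}
        for f, c, s, v in product(
            INPUT_FEATURES, CLASSIFIERS, FEATURE_SELECTION, CROSS_VALIDATION
        )
    ]


configs = enumerate_configurations()
print(len(configs))  # 4 * 5 * 4 * 4 combinations

# Configurations matching the best model reported in the abstract:
# SVM, no feature selection, combined multi-biological features.
best_like = [
    c for c in configs
    if c["features"] == "CF" and c["classifier"] == "SVM" and c["selection"] == "none"
]
print(len(best_like))  # one per cross-validation scheme
```

Under this assumption the grid contains 320 candidate models, four of which match the reported best combination (one per cross-validation scheme); the abstract does not say which scheme produced the reported figures.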

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
Flow chart of the brain network construction from the EEG signal. EEG electroencephalogram, PLV phase locking value. Panel (a) was generated with the EEG processing tool "EEGLAB" (Version 2019.0, https://sccn.ucsd.edu/eeglab/index.php), based on MATLAB (Version R2018a). Panels (b–d) were generated with the brain network visualization tool "BrainNet Viewer" (Version 1.62, https://www.nitrc.org/projects/bnv/), based on MATLAB (Version R2018a).
Figure 2
Overview of the proposed integrated machine learning framework for classifying schizophrenia, which consists of 5 M-methods. (a) Multi-biological data were collected from all subjects, including electroencephalogram (EEG) data, fecal data, and blood data. (b) Multi-biological features were extracted from the multi-biological data. (c) Multiple feature selection algorithms were used to eliminate redundant features, including recursive feature elimination (RFE), principal component analysis (PCA), and analysis of variance (ANOVA). (d) Multiple classifiers were used to match the heterogeneous biological features, including support vector machine (SVM), random forest (RF), linear discriminant analysis (LDA), logistic regression (LR), and k-nearest neighbor (KNN) methods. (e) Multiple cross-validation methods, including tenfold, fivefold, threefold, and leave-one-out, were used to evaluate the performance of the trained model.
Figure 3
Flowchart of the machine learning classification method.
Figure 4
Areas under the receiver operating characteristic (ROC) curves (AUC) for the best model, comparing the gut microbiota features, blood features, electroencephalogram features, and the combination of GMF, BF, and EF as the input for machine learning. Each curve in the figure represents the ROC curve of the best model using a different set of input features. GMF gut microbiota features, BF blood features, EF electroencephalogram features, CF combined features. This figure was generated by "Visual Studio Code" (Version 1.56, https://code.visualstudio.com/).
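For readers unfamiliar with the AUC values reported here, the following is a minimal sketch of how an area under the ROC curve can be computed from classifier scores. It uses the rank-based (Mann-Whitney) formulation, in which the AUC is the fraction of patient/control pairs where the patient receives the higher score, with ties counted as half. This is not the authors' code; the function name `roc_auc` and the example labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney).

    labels: iterable of 1 (positive, e.g. patient) or 0 (negative, e.g. control)
    scores: iterable of classifier scores, higher meaning "more likely positive"
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical scores for three patients (1) and three controls (0).
example_labels = [1, 1, 0, 0, 1, 0]
example_scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
example_auc = roc_auc(example_labels, example_scores)
print(round(example_auc, 3))  # 8 of 9 pairs are ordered correctly
```

An AUC of 1.0 means every patient is scored above every control, while 0.5 corresponds to chance-level ranking. The pairwise loop is quadratic in the sample size, which is acceptable for small cohorts; a sort-based version would be preferable for large ones.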

