Front Neurosci. 2023 Jun 26:17:1199312.

doi: 10.3389/fnins.2023.1199312. eCollection 2023.

Self-supervised pretraining improves the performance of classification of task functional magnetic resonance imaging

Chenwei Shi¹, Yanming Wang¹, Yueyang Wu¹, Shishuo Chen¹, Rongjie Hu¹, Min Zhang¹, Bensheng Qiu¹,², Xiaoxiao Wang¹

Affiliations

¹ Center for Biomedical Imaging, University of Science and Technology of China, Hefei, Anhui, China.
² Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, China.

PMID: 37434766
PMCID: PMC10330812
DOI: 10.3389/fnins.2023.1199312

Chenwei Shi et al. Front Neurosci. 2023.


Abstract

Introduction: Decoding brain activity has been one of the most popular topics in neuroscience in recent years. Deep learning has shown high performance in fMRI data classification and regression, but its requirement for large amounts of data conflicts with the high cost of acquiring fMRI data.

Methods: In this study, we propose an end-to-end temporal contrastive self-supervised learning algorithm, which learns internal spatiotemporal patterns within fMRI and allows the model to transfer to small datasets. We segmented a given fMRI signal into three sections: the beginning, middle, and end. We then applied contrastive learning, taking the end-middle (i.e., neighboring) pair as the positive pair and the beginning-end (i.e., distant) pair as the negative pair.
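The pairing scheme described in the Methods can be illustrated with a minimal sketch. The abstract does not state the loss function or the encoder, so everything below is an assumption: `embed` is a hypothetical stand-in for the network (it only computes the mean and standard deviation of a segment), and a triplet margin loss is swapped in as one common way to pull the neighboring pair together and push the distant pair apart. The end segment serves as the anchor because it appears in both pairs.

```python
import math


def split_three(signal):
    """Split a sequence of frames into beginning, middle, and end thirds."""
    n = len(signal) // 3
    return signal[:n], signal[n:2 * n], signal[2 * n:3 * n]


def embed(segment):
    """Hypothetical stand-in encoder: (mean, standard deviation) of a segment."""
    mean = sum(segment) / len(segment)
    var = sum((x - mean) ** 2 for x in segment) / len(segment)
    return (mean, math.sqrt(var))


def distance(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Assumed loss: zero once the negative is at least `margin` farther than the positive."""
    return max(0.0, distance(anchor, positive) - distance(anchor, negative) + margin)


# Toy one-dimensional "signal" that drifts over time (not real fMRI data).
signal = [0.1 * t for t in range(30)]
begin, middle, end = split_three(signal)

# End-middle is the positive (neighboring) pair, beginning-end the negative (distant) pair.
loss = triplet_loss(embed(end), embed(middle), embed(begin))
```

In this toy case the end segment already lies closer to the middle than to the beginning in the embedding space, so the loss is effectively zero; with a trainable encoder the same loss would be minimized by gradient descent.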

Results: We pretrained the model on 5 of the 7 tasks from the Human Connectome Project (HCP) and applied it to a downstream classification of the remaining two tasks. The pretrained model converged on data from 12 subjects, while a randomly initialized model required 100 subjects. We then transferred the pretrained model to a dataset containing unpreprocessed whole-brain fMRI from 30 participants, achieving an accuracy of 80.2 ± 4.7%, while the randomly initialized model failed to converge. We further validated the model's performance on the Multiple Domain Task Dataset (MDTB), which contains fMRI data of 26 tasks from 24 participants. fMRI data from thirteen tasks were selected as inputs, and the pretrained model succeeded in classifying 11 of the 13 tasks. When the 7 brain networks were used as input, performance varied: the visual network performed as well as whole-brain input, while the limbic network nearly failed on all 13 tasks.
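The abstract does not say how the network-restricted inputs were constructed. One plausible approach, sketched below purely as an assumption, is to zero every voxel outside the chosen network using a label atlas. The function name, the flattened list representation, and the integer labels are all hypothetical.

```python
def mask_by_network(frame, atlas, network_id):
    """Keep voxels whose atlas label equals network_id and zero out the rest.

    `frame` and `atlas` are flattened voxel lists of equal length; the label
    values are hypothetical and not taken from the paper.
    """
    if len(frame) != len(atlas):
        raise ValueError("frame and atlas must have the same number of voxels")
    return [v if label == network_id else 0.0 for v, label in zip(frame, atlas)]


# Toy example: 8 voxels, label 0 = background, labels 1..7 = networks.
atlas = [0, 1, 1, 2, 7, 7, 0, 3]
frame = [0.5, 1.2, -0.3, 0.8, 2.0, -1.1, 0.4, 0.9]
network_only = mask_by_network(frame, atlas, 1)
```

Applying the same mask to every frame of a run would give an input with the original shape but with signal from a single network, which keeps the model architecture unchanged across the whole-brain and per-network comparisons.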

Discussion: Our results demonstrated the potential of self-supervised learning for fMRI analysis with small datasets and unpreprocessed data, and for analysis of the correlation between regional fMRI activity and cognitive tasks.

Keywords: brain networks; deep learning; fMRI; interpretability; self-supervised.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

**Figure 1**
Seven brain networks in MNI152 space.

**Figure 2**
The proposed neural network. **(A)** The proposed framework of self-supervised learning. An fMRI signal segment is divided into three parts: the beginning, middle, and end. After feature extraction and nonlinear mapping, the beginning and end parts are more widely separated, whereas the middle and end parts are closer in the contrast space. **(B)** The model consists of a temporal convolutional layer, four 3D residual layers, and a projection head that maps the features to the contrast space.

**Figure 3**
Performance evaluation on Motor and Relational tasks. **(A–C)** Average accuracy on the Motor and Relational classification task when different numbers of subjects are used to finetune the model; the panels use different numbers of input frames (N = 15, 12, 9). **(D)** Average accuracy across training epochs for the model finetuned on data from 200 subjects. Performance improves as training progresses.

**Figure 4**
Performance evaluation on the OpenNeuro ds002938 dataset. **(A)** The average confusion matrix of OpenNeuro ds002938 task classification. **(B)** Performance of the different methods on the OpenNeuro ds002938 dataset. SSRes: self-supervised ResNet trained on the HCP dataset. SRes: supervised ResNet trained on HCP. Res_r: randomly initialized ResNet. 4DAtt: 4D attention model trained on HCP. 4DAtt_r: randomly initialized 4D attention model.

**Figure 5**
Performance on the MDTB dataset. **(A)** '*' marks F1 scores that were significantly greater than the random classification level (t-test, p < 0.05, FDR corrected). ToM: Theory of Mind, Observe: Action Observation, Arith: Arithmetic, Object: Object Viewing, BioMotion: Biological Motion, Interval: Interval Timing, NBack: Object N-Back, ResAlt: Response Alternatives, SpaMap: Spatial Map, SpaImag: Spatial Imagery, Verb: Verb Generation, Visual: Visual Search. **(B)** The average confusion matrix of the task classification using the whole-brain fMRI signal as input.
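The caption mentions FDR correction without naming the procedure. As an illustration only, the sketch below assumes the Benjamini-Hochberg step-up procedure, which is a common choice; the p-values are made up and are not results from the paper.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per p-value: True where the test is rejected at FDR level alpha."""
    m = len(p_values)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose sorted p-value satisfies p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    # Reject every hypothesis ranked at or below max_k (step-up rule).
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected


# Made-up p-values for five hypothetical per-task tests.
p_values = [0.001, 0.008, 0.039, 0.041, 0.6]
significant = benjamini_hochberg(p_values)
```

With these toy values only the first two tests survive correction, even though four of the five raw p-values are below 0.05.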

Cited by

Strategies to Improve the Robustness and Generalizability of Deep Learning Segmentation and Classification in Neuroimaging.
Tran AT, Zeevi T, Payabvash S. BioMedInformatics. 2025 Jun;5(2):20. doi: 10.3390/biomedinformatics5020020. Epub 2025 Apr 14. PMID: 40271381. Free PMC article.
