A self-learning multimodal approach for fake news detection

Hao Chen¹, Yue Yu², Hui Guo³, Baochen Hu⁴, Shu Hu⁵, Jinrong Hu¹, Siwei Lyu⁶, Xi Wu¹, Ching-Sheng Lin⁷, Xin Wang⁸

Affiliations

¹ School of Computer Science, Chengdu University of Information Technology, Chengdu, China.
² CAACSRI, Chengdu, China.
³ Department of Mathematics & Statistics, University at Albany, New York, NY, United States.
⁴ Dropbox Inc., California, CA, United States.
⁵ School of Applied and Creative Computing, Purdue University, West Lafayette, IN, United States.
⁶ Department of Computer Science and Engineering, University of Buffalo, New York, NY, United States.
⁷ Master Program of Digital Innovation, Tunghai University, Taichung, Taiwan.
⁸ Department of Epidemiology and Biostatistics, College of Integrated Health Sciences, and AI Plus Institute, University at Albany, New York, NY, United States.

PMID: 41280884
PMCID: PMC12631648
DOI: 10.3389/frai.2025.1665798

Hao Chen et al. Front Artif Intell. 2025 Nov 6;8:1665798. doi: 10.3389/frai.2025.1665798. eCollection 2025.

Abstract

The rapid growth of social media has resulted in an explosion of online news content, leading to a significant increase in the spread of misleading or false information. While machine learning techniques have been widely applied to detect fake news, the scarcity of labeled datasets remains a critical challenge. Misinformation frequently appears as paired text and images, where a news article or headline is accompanied by related visuals. In this paper, we introduce a self-learning multimodal model for fake news classification. The model leverages contrastive learning, a robust method for feature extraction that does not require labeled data, and integrates the strengths of Large Language Models (LLMs) to jointly analyze text and image features. LLMs excel at this task because they can process diverse linguistic data drawn from extensive training corpora. Our experimental results on a public dataset demonstrate that the proposed model outperforms several state-of-the-art classification approaches, achieving over 85% accuracy, precision, recall, and F1-score. These findings highlight the model's effectiveness in tackling the challenges of multimodal fake news detection.

Keywords: contrastive learning; fake news; large language model; machine learning; multimodal.
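
The abstract describes contrastive learning as a way to extract features without labels. As a rough illustration of that idea only, the sketch below computes an InfoNCE-style loss with NumPy, where row i of `keys` is treated as the positive for row i of `queries` and every other row acts as a negative. The function name `info_nce_loss`, the temperature value, and the choice of InfoNCE itself are assumptions made for illustration; the abstract does not state which contrastive objective the authors use.

```python
import numpy as np

def info_nce_loss(queries, keys, temperature=0.07):
    """InfoNCE-style loss (illustrative; not taken from the paper).

    queries, keys: arrays of shape (N, D). Row i of `keys` is the
    positive for row i of `queries`; all other rows are negatives.
    """
    # L2-normalise so the dot product is a cosine similarity.
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    k = keys / np.linalg.norm(keys, axis=1, keepdims=True)

    # (N, N) similarity matrix scaled by the temperature (assumed value).
    logits = q @ k.T / temperature

    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # The positive pair sits on the diagonal.
    return float(-np.mean(np.diag(log_probs)))
```

In this sketch, two lightly perturbed views of the same feature vectors should give a lower loss than mismatched pairs, which is the signal a contrastive encoder is trained on.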

Conflict of interest statement

BH was employed by Dropbox Inc. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

**Figure 1**
An example of fake news (mismatched image and text) from the dataset (reproduced from Nakamura et al., 2020, European Language Resources Association (ELRA), licensed under CC-BY-NC).

**Figure 2**
The overall structure of the multimodal fake news detection model (images reproduced from Nakamura et al., 2020, the Fakeddit dataset, https://github.com/entitize/Fakeddit). The model is composed of three components: a contrastive learning module that learns image features from a small sample of training data; an infusing module that aligns text and image features and then applies the large language model to combine the two modalities; and a classification module that predicts whether the news is fake.

**Figure 3**
Momentum configuration for contrastive learning (image reproduced from Nakamura et al., 2020, the Fakeddit dataset, https://github.com/entitize/Fakeddit).

**Figure 4**
Q-Former structure adopted from Li et al. (2023).
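
Figure 3 refers to a momentum configuration for contrastive learning. One common reading of that phrase, assumed here rather than confirmed by this page, is a MoCo-style setup in which a key encoder is updated as an exponential moving average of the query encoder. The sketch below shows that update on plain dictionaries of NumPy arrays; `momentum_update`, the parameter names, and the coefficient `m` are hypothetical.

```python
import numpy as np

def momentum_update(query_params, key_params, m=0.999):
    """Return key-encoder parameters after one momentum (EMA) step.

    Assumed MoCo-style rule: key <- m * key + (1 - m) * query.
    Both arguments map parameter names to NumPy arrays of equal shape.
    """
    return {
        name: m * key_params[name] + (1.0 - m) * query_params[name]
        for name in key_params
    }
```

With `m` close to 1 the key encoder changes slowly, which keeps the representations of previously encoded negatives consistent across training steps.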
