Prompt-based fine-tuning with multilingual transformers for language-independent sentiment analysis

Faizad Ullah et al. Sci Rep. 2025 Jul 1;15(1):20834. doi: 10.1038/s41598-025-03559-7.

Abstract

In the era of global digital communication, understanding user sentiment across multiple languages is a critical challenge with wide-ranging applications in opinion mining, customer feedback analysis, and social media monitoring. This study advances the field of language-independent sentiment analysis by leveraging prompt-based fine-tuning with state-of-the-art transformer models. The performance of classical machine learning approaches, hybrid deep learning architectures, and multilingual transformer models is evaluated across eight typologically diverse languages: Arabic, English, French, German, Hindi, Italian, Portuguese, and Spanish. Baseline models are established using traditional machine learning approaches such as Support Vector Machines (SVM) and Logistic Regression, with feature extraction methods such as TF-IDF. A hybrid deep learning model is introduced, combining Long Short-Term Memory (LSTM) networks and Convolutional Neural Networks (CNNs) to capture sequential and local text patterns. Building on these, pre-trained multilingual transformer models, specifically BERT-base-multilingual and XLM-RoBERTa, are fine-tuned for language-independent sentiment classification tasks. The key contribution lies in the implementation of prompt-based fine-tuning strategies for language-independent sentiment analysis. Using (1) prefix prompts and (2) cloze-style prompts, a unified framework is established that employs templates designed in one language and evaluates their performance on data from the remaining n - 1 languages. Experimental results demonstrate that transformer models, particularly XLM-RoBERTa with prompt-based fine-tuning, outperform both classical and deep learning methods. With only 32 training examples per class, prefix prompts produce results comparable to standard fine-tuning, which typically uses 70-80% of the data for training. This highlights the potential of prompt-based learning for scalable, multilingual sentiment analysis in diverse language settings.
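
As a concrete illustration of the cloze-style prompting described above, the sketch below scores one verbalizer word per sentiment class at the masked slot of a template, using an off-the-shelf XLM-RoBERTa masked language model via the Hugging Face transformers library. The template, the verbalizer words, and the three-class label set are illustrative assumptions rather than the paper's published design; in the few-shot setting the abstract reports, the model would additionally be fine-tuned on the 32 labelled examples per class before scoring.

    # Minimal sketch of cloze-style prompt scoring with XLM-RoBERTa, assuming
    # the Hugging Face `transformers` and `torch` packages. The template and
    # verbalizer below are hypothetical, not the paper's published ones.
    import torch
    from transformers import AutoModelForMaskedLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
    model = AutoModelForMaskedLM.from_pretrained("xlm-roberta-base")
    model.eval()

    # Hypothetical verbalizer: one English word per sentiment class. The study
    # designs templates in one language and evaluates on the remaining n - 1.
    VERBALIZER = {"positive": "good", "negative": "bad", "neutral": "okay"}

    def cloze_sentiment(text: str) -> str:
        """Return the class whose verbalizer word scores highest at the mask."""
        # Illustrative English cloze template wrapped around a tweet in any language.
        prompt = f"{text} Overall, it was {tokenizer.mask_token}."
        inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
        # Position of the <mask> token in the encoded sequence.
        mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
        with torch.no_grad():
            logits = model(**inputs).logits[0, mask_pos]
        scores = {}
        for label, word in VERBALIZER.items():
            # First sub-token of the class word; the leading space matches
            # SentencePiece's word-initial tokenization.
            word_id = tokenizer(" " + word, add_special_tokens=False).input_ids[0]
            scores[label] = logits[word_id].item()
        return max(scores, key=scores.get)

    # Cross-lingual usage as in Fig. 1: a German tweet paired with an English template.
    print(cloze_sentiment("Das Essen war ausgezeichnet!"))

Roughly speaking, the prefix-prompt variant the paper also evaluates places the template before the tweet rather than wrapping a cloze sentence around it.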


Conflict of interest statement

Competing interests: The authors declare no competing interests.

Figures

Fig. 1
Overview of the proposed approach: tweets in eight languages (left) are paired with a prompt template in any of the eight languages. The model predicts the masked word from context and maps it to a sentiment class, yielding consistent multilingual sentiment analysis.
Fig. 2
Graph visualizing the performance metrics (Precision, Recall, F1-Score, and Accuracy) for the Prefix Prompt approach using the two multilingual models (BERT-base-multilingual-cased and XLM-RoBERTa) across eight languages.
Fig. 3
Graph illustrating the performance metrics (Precision, Recall, F1-Score, and Accuracy) for two multilingual models (BERT and XLM-RoBERTa) across eight languages.


