Multi-Agent Reinforcement Learning in Games: Research and Applications
- PMID: 40558344
- PMCID: PMC12190516
- DOI: 10.3390/biomimetics10060375
Abstract
Biological systems, ranging from ant colonies to neural ecosystems, exhibit remarkable self-organizing intelligence. Inspired by these phenomena, this study investigates how bio-inspired computing principles can bridge game-theoretic rationality and multi-agent adaptability. It systematically reviews the convergence of multi-agent reinforcement learning (MARL) and game theory, elucidating the potential of this integrated paradigm for collective intelligent decision-making in dynamic open environments. Building upon stochastic game and extensive-form game frameworks, we establish a methodological taxonomy across three dimensions (value function optimization, policy gradient learning, and online search planning), thereby clarifying the evolutionary logic and innovation trajectories of algorithmic advances. Focusing on complex smart city scenarios, including intelligent transportation coordination and UAV swarm scheduling, we identify technical breakthroughs in MARL applications for policy space modeling and distributed decision optimization. By incorporating bio-inspired optimization approaches, the review particularly highlights evolutionary computation mechanisms for dynamic strategy generation in search planning, alongside population-based learning paradigms for enhancing exploration efficiency in policy refinement. The findings reveal core principles governing how groups make optimal choices in complex environments and map the technological development pathways created by blending cross-disciplinary methods to enhance multi-agent systems.
Keywords: evolutionary computation; game theory; multi-agent reinforcement learning; stochastic games.
Conflict of interest statement
The authors declare no conflicts of interest.