Multi-Agent Deep Reinforcement Learning for Integrated Demand Forecasting and Inventory Optimization in Sensor-Enabled Retail Supply Chains

Yongbin Yang¹, Mengdie Wang², Jiyuan Wang³, Pan Li⁴, Mengjie Zhou⁵

Affiliations

¹ Viterbi School of Engineering, University of Southern California, Los Angeles, CA 90007, USA.
² School of Taxation and Public Administration, Shanghai Lixin University of Accounting and Finance, Shanghai 201620, China.
³ The Fuqua School of Business, Duke University, Durham, NC 27708, USA.
⁴ The Business School, University of Hull, Hull HU6 7R, UK.
⁵ Department of Computer Science, The University of Bristol, Bristol BS8 1QU, UK.

PMID: 40285118
PMCID: PMC12031219
DOI: 10.3390/s25082428

Yongbin Yang et al. Sensors (Basel). 2025 Apr 11;25(8):2428. doi: 10.3390/s25082428.

Abstract

The retail industry faces increasing challenges in matching supply with demand due to evolving consumer behaviors, market volatility, and supply chain disruptions. While existing approaches employ statistical and machine learning methods for demand forecasting, they often fail to capture complex temporal dependencies and lack the ability to simultaneously optimize inventory decisions. This paper proposes a novel multi-agent deep reinforcement learning framework that jointly optimizes demand forecasting and inventory management in retail supply chains, leveraging data from IoT sensors, RFID tracking systems, and smart shelf monitoring devices. Our approach combines transformer-based sequence modeling for demand patterns with hierarchical reinforcement learning agents that coordinate inventory decisions across distribution networks. The framework integrates both historical sales data and real-time sensor measurements, employing attention mechanisms to capture seasonal patterns, promotional effects, and environmental conditions detected through temperature and humidity sensors. Through extensive experiments on large-scale retail datasets incorporating sensor network data, we demonstrate that our method achieves 18.2% lower forecast error and 23.5% reduced stockout rates compared with state-of-the-art baselines. The results show particular improvements in handling promotional events and seasonal transitions, where traditional methods often struggle. Our work provides new insights into leveraging deep reinforcement learning for integrated retail operations optimization and offers a scalable solution for modern sensor-enabled supply chain challenges.

Keywords: demand forecasting; inventory optimization; multi-agent reinforcement learning; supply chain management.
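
The abstract reports its headline results as relative reductions in forecast error and stockout rate but does not say which error metric is used or how a stockout is counted. The following is a minimal sketch of how such figures could be computed, under stated assumptions: forecast error is taken to be mean absolute percentage error (MAPE), a stockout is any period in which demand exceeds the stock on hand, and replenishment is instantaneous. The function names and the sample numbers are hypothetical illustrations, not taken from the paper.

```python
from typing import Sequence


def mape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute percentage error, skipping periods with zero demand.

    Assumption: MAPE stands in for the unspecified forecast error metric.
    """
    terms = [abs(a - f) / a for a, f in zip(actual, forecast) if a != 0]
    return 100.0 * sum(terms) / len(terms)


def stockout_rate(demand: Sequence[float], stock: Sequence[float]) -> float:
    """Fraction of periods in which demand exceeds the stock on hand.

    Assumption: stock[t] is the inventory available at the start of
    period t (zero lead time), so a stockout occurs when demand[t] > stock[t].
    """
    stockouts = sum(1 for d, s in zip(demand, stock) if d > s)
    return stockouts / len(demand)


def relative_reduction(baseline: float, proposed: float) -> float:
    """Percentage by which `proposed` is lower than `baseline`."""
    return 100.0 * (baseline - proposed) / baseline


# Hypothetical data: five periods of demand and two competing forecasts.
demand = [100, 120, 80, 150, 110]
baseline_forecast = [90, 100, 100, 120, 100]
improved_forecast = [95, 115, 85, 140, 110]

# Forecast error of each method and the relative improvement between them.
err_base = mape(demand, baseline_forecast)
err_new = mape(demand, improved_forecast)
print("forecast error reduction (%):", relative_reduction(err_base, err_new))

# Toy policy: stock each period exactly to the forecast, with no safety buffer.
so_base = stockout_rate(demand, baseline_forecast)
so_new = stockout_rate(demand, improved_forecast)
print("stockout rate reduction (%):", relative_reduction(so_base, so_new))
```

Under this reading, a statement such as "18.2% lower forecast error" means the proposed method's error is 18.2% below the baseline's error, not 18.2 percentage points below it.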

Conflict of interest statement

The authors declare no conflicts of interest.

Figures

**Figure 1**
Overview of the proposed MARIOD framework.

**Figure 2**
Forecast accuracy during promotional periods. MARIOD demonstrates superior adaptation to sudden demand changes compared with baseline methods.

**Figure 3**
Inventory-level trajectories comparing MARIOD with baseline methods. Note the reduced variability and more efficient inventory utilization.
