Sensors (Basel). 2025 Apr 11;25(8):2428.
doi: 10.3390/s25082428.

Multi-Agent Deep Reinforcement Learning for Integrated Demand Forecasting and Inventory Optimization in Sensor-Enabled Retail Supply Chains

Yongbin Yang et al. Sensors (Basel).

Abstract

The retail industry faces increasing challenges in matching supply with demand due to evolving consumer behaviors, market volatility, and supply chain disruptions. While existing approaches employ statistical and machine learning methods for demand forecasting, they often fail to capture complex temporal dependencies and lack the ability to simultaneously optimize inventory decisions. This paper proposes a novel multi-agent deep reinforcement learning framework that jointly optimizes demand forecasting and inventory management in retail supply chains, leveraging data from IoT sensors, RFID tracking systems, and smart shelf monitoring devices. Our approach combines transformer-based sequence modeling for demand patterns with hierarchical reinforcement learning agents that coordinate inventory decisions across distribution networks. The framework integrates both historical sales data and real-time sensor measurements, employing attention mechanisms to capture seasonal patterns, promotional effects, and environmental conditions detected through temperature and humidity sensors. Through extensive experiments on large-scale retail datasets incorporating sensor network data, we demonstrate that our method achieves 18.2% lower forecast error and 23.5% reduced stockout rates compared with state-of-the-art baselines. The results show particular improvements in handling promotional events and seasonal transitions, where traditional methods often struggle. Our work provides new insights into leveraging deep reinforcement learning for integrated retail operations optimization and offers a scalable solution for modern sensor-enabled supply chain challenges.

Keywords: demand forecasting; inventory optimization; multi-agent reinforcement learning; supply chain management.
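
The two headline results in the abstract are relative improvements: lower forecast error and a reduced stockout rate. The abstract does not state which error metric or stockout definition was used, so the following is only a minimal sketch under stated assumptions: forecast error is taken to be mean absolute percentage error (MAPE), and a stockout is counted whenever demand in a period exceeds the inventory on hand. All function names are hypothetical and the numbers are illustrative, not taken from the paper.

```python
def mape(actual, forecast):
    """Mean absolute percentage error (assumed metric; zero-demand periods are skipped)."""
    pairs = [(a, f) for a, f in zip(actual, forecast) if a != 0]
    return sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)


def stockout_rate(demand, on_hand):
    """Fraction of periods in which demand exceeded available inventory (assumed definition)."""
    return sum(d > s for d, s in zip(demand, on_hand)) / len(demand)


def relative_reduction(baseline, candidate):
    """Relative improvement of candidate over baseline; 0.25 means 25% lower."""
    return (baseline - candidate) / baseline


# Illustrative numbers only.
baseline_error = mape([100, 200], [80, 240])    # each period is off by 20%
candidate_error = mape([100, 200], [90, 220])   # each period is off by 10%
improvement = relative_reduction(baseline_error, candidate_error)
```

Read this way, a statement such as "18.2% lower forecast error" corresponds to `relative_reduction` returning about 0.182 when the baseline's error is compared with the proposed method's error on the same test periods.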

Conflict of interest statement

The authors declare no conflicts of interest.

Figures

Figure 1. Overview of the proposed MARIOD framework.
Figure 2. Forecast accuracy during promotional periods. MARIOD demonstrates superior adaptation to sudden demand changes compared with baseline methods.
Figure 3. Inventory-level trajectories comparing MARIOD with baseline methods. Note the reduced variability and more efficient inventory utilization.
