Appl Intell (Dordr). 2023;53(4):4063-4098.
doi: 10.1007/s10489-022-03605-1. Epub 2022 Jun 6.

Explaining deep reinforcement learning decisions in complex multiagent settings: towards enabling automation in air traffic flow management

Theocharis Kravaris et al. Appl Intell (Dordr). 2023.

Abstract

With the objective of enhancing human performance and maximizing engagement during the performance of tasks, we aim to advance automation for decision making in complex and large-scale multi-agent settings. Towards these goals, this paper presents a deep multi-agent reinforcement learning method for resolving demand-capacity imbalances in real-world Air Traffic Management settings with thousands of agents. The agents comprising the system jointly decide on the measures to be applied to resolve imbalances and provide explanations of their decisions; this information is rendered and explored via appropriate visual analytics tools. The paper presents how the major challenges of scalability and complexity are addressed, and provides results from evaluation tests that show the ability of the models to provide high-quality solutions and high-fidelity explanations.

Keywords: Air traffic management; Explainability; Interpretability; Multi-agent deep reinforcement learning; Stochastic decision trees; Visualization.

Figures

Fig. 1
Evolution of daily demand in sector LECPDEO in a single day
Fig. 2
Payoff matrices for 2×2 games
Fig. 3
DQN architecture
Fig. 4
Simulation rounds for delay regulations
Fig. 5
Selecting alternative flight plans due to level capping measures
Fig. 6
Combining level capping and ground delay measures
Fig. 7
The mimicking approach for interpretability
Fig. 8
An overview of a solution: a table view with columns corresponding to sectors and rows to time intervals
Fig. 9
An overview of a solution involving flight delays. Segmented bars of different shades of grey indicate proportions of delayed flights with different delay durations. Light grey corresponds to non-delayed flights; the darkness increases proportionally to the delay duration
Fig. 10
Comparison of three solutions with a focus on a selected sector and the sectors connected to it
Fig. 11
Components and contents of Sector Explorer in the mode of showing a single solution
Fig. 12
Hourly demands for sectors are represented by time-based histograms with overlapping bars
Fig. 13
List of flights with accumulated delays. The dynamics of delay accumulation are shown in columns “Cumulative delays” and “Added delays”, representing total and momentary delays, respectively
Fig. 14
Decisions applied to a single flight over multiple model steps are explained by a series of sets of arguments. Arguments, intervals of feature values, and actual values for each step are shown in table rows
Fig. 15
Box plots for delay regulations
Fig. 16
Box plots for delay regulations and level capping measures
Fig. 17
Box plots for M20190622-0708 results in scenario 20190714
Fig. 18
Box plots for M20190622-0708 results in scenario 20190705
Fig. 19
Histogram of M20190622-0708 distribution of delays in 20190714
Fig. 20
Histogram for M20190622-0708 distribution of unresolved hotspots in 20190714
