Front Artif Intell. 2021 May 20;4:550030. doi: 10.3389/frai.2021.550030. eCollection 2021.

Explainable AI and Reinforcement Learning: A Systematic Review of Current Approaches and Trends

Lindsay Wells et al.

Abstract

Research into Explainable Artificial Intelligence (XAI) has been increasing in recent years as a response to the need for increased transparency and trust in AI. This is particularly important as AI is used in sensitive domains with societal, ethical, and safety implications. Work in XAI has primarily focused on Machine Learning (ML) for classification, decision, or action, with detailed systematic reviews already undertaken. This review explores current approaches and limitations for XAI in the area of Reinforcement Learning (RL). From 520 search results, 25 studies (including 5 snowball sampled) are reviewed, highlighting visualization, query-based explanations, policy summarization, human-in-the-loop collaboration, and verification as trends in this area. Limitations in the studies are presented, in particular a lack of user studies, the prevalence of toy examples, and difficulties in providing understandable explanations. Areas for future study are identified, including immersive visualization and symbolic representation.

Keywords: artificial intelligence; explainable AI; machine learning; reinforcement learning; visualization.


Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1. Number of papers included in the review after various stages of filtering.

Figure 2. Categorization of papers by domain. Note that some papers were in multiple domains.

Figure 3. Screenshots of the game-based applications in the studied papers. Where more than one paper used a game, the number of papers using it is shown in brackets.

Figure 4. Distribution of surveyed papers by year, indicating increasing academic interest in this area.

Figure 5. Categorization of papers by scope. Note that some papers were multi-faceted and covered multiple categories.

Figure 6. Example visualizations from DQNVis, showing (a,b) episode duration and (c) actions taken over time, and how experts identified these as “hesitating” and “repeating” behaviors that were non-optimal (from Wang et al., p. 294, reproduced with permission).

Figure 7. Screenshots (left) and their matching object saliency maps (right) in the game Ms Pacman (from Iyer et al., p. 148, reproduced with permission). A minimal code sketch of gradient-based saliency is given after these captions.

Figure 8. Summarized policies for Montezuma's Revenge (left) and the “taxi” problem (right) (from Lyu et al. (2019), p. 2975, reproduced with permission).

Figure 9. Plot of steering actions generated by a standard DRL agent vs. the summarized NDPS policy, which resulted in much smoother steering movements (from Verma et al., p. 7, reproduced with permission).
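
The saliency maps in Figure 7 illustrate a recurring visualization trend in the surveyed work. As a rough illustration only, and not the specific object-saliency method of Iyer et al. or any other reviewed paper, the sketch below shows how a simple gradient-based pixel saliency map can be computed for a trained RL policy; it assumes a PyTorch policy network, and the name policy_net is hypothetical.

    # Generic sketch: gradient-based pixel saliency for a trained RL policy.
    # Assumes `policy_net` is a PyTorch module mapping an image observation
    # of shape (C, H, W) to action logits; illustrative only, not a method
    # taken from the reviewed papers.
    import torch

    def saliency_map(policy_net, observation):
        """Return |d(greedy action logit)/d(pixel)| as an (H, W) map."""
        obs = observation.clone().detach().unsqueeze(0).requires_grad_(True)
        logits = policy_net(obs)                   # shape (1, num_actions)
        logits.max(dim=1).values.sum().backward()  # gradient of greedy logit
        # Aggregate gradient magnitude over channels into an (H, W) map.
        return obs.grad.abs().sum(dim=1).squeeze(0)

Brighter regions of the resulting map mark pixels whose perturbation most affects the chosen action, which is the intuition behind the saliency-style explanations discussed in the review.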

