Front Artif Intell. 2021 May 20:4:550030.

doi: 10.3389/frai.2021.550030. eCollection 2021.

Explainable AI and Reinforcement Learning-A Systematic Review of Current Approaches and Trends

Lindsay Wells¹, Tomasz Bednarz¹ ²

Affiliations

¹ Expanded Perception and Interaction Center, Faculty of Art and Design, University of New South Wales, Sydney, NSW, Australia.
² Data61, Commonwealth Scientific and Industrial Research Organisation, Sydney, NSW, Australia.

PMID: 34095817
PMCID: PMC8172805
DOI: 10.3389/frai.2021.550030

Lindsay Wells et al. Front Artif Intell. 2021.

Abstract

Research into Explainable Artificial Intelligence (XAI) has grown in recent years in response to the need for greater transparency and trust in AI. This is particularly important as AI is used in sensitive domains with societal, ethical, and safety implications. Work in XAI has primarily focused on Machine Learning (ML) for classification, decision, or action, with detailed systematic reviews already undertaken. This review explores current approaches and limitations for XAI in the area of Reinforcement Learning (RL). From 520 search results, 25 studies (including 5 snowball sampled) are reviewed, highlighting visualization, query-based explanations, policy summarization, human-in-the-loop collaboration, and verification as trends in this area. Limitations in the studies are presented, particularly a lack of user studies, the prevalence of toy examples, and difficulties in providing understandable explanations. Areas for future study are identified, including immersive visualization and symbolic representation.

Keywords: artificial intelligence; explainable AI; machine learning; reinforcement learning; visualization.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

**Figure 1**
Number of papers included in review after various stages of filtering.

**Figure 2**
Categorization of papers by domain. Note that some papers were in multiple domains.

**Figure 3**
Screenshots of the game-based applications in the studied papers. Where more than one paper used a game, the number of papers using it is shown in brackets.

**Figure 4**
Distribution of surveyed papers by year, indicating an increase of academic interest in this area.

**Figure 5**
Categorization of papers by scope. Note that some papers were multi-faceted and covered multiple categories.

**Figure 6**
Example visualizations from DQNVis, showing **(a,b)** episode duration and **(c)** actions taken over time, along with how experts identified these as non-optimal “hesitating” and “repeating” behaviors (from Wang et al., p. 294, reproduced with permission).

**Figure 7**
Screenshots (left) and their matching object saliency maps (right) in the game Ms Pacman (from Iyer et al., p. 148, reproduced with permission).

**Figure 8**
Summarized policies for Montezuma's Revenge (left) and the “taxi” problem (right) (from Lyu et al., 2019, p. 2975, reproduced with permission).

**Figure 9**
Plot of steering actions generated by a standard DRL agent vs. the summarized NDPS policy, which resulted in much smoother steering movements (from Verma et al., p. 7, reproduced with permission).
