Review
Front Robot AI. 2022 Apr 11:9:799893.
doi: 10.3389/frobt.2022.799893. eCollection 2022.

Robot Learning From Randomized Simulations: A Review

Fabio Muratore et al. Front Robot AI. 2022.

Abstract

The rise of deep learning has caused a paradigm shift in robotics research, favoring methods that require large amounts of data. Unfortunately, it is prohibitively expensive to generate such data sets on a physical platform. Therefore, state-of-the-art approaches learn in simulation where data generation is fast as well as inexpensive and subsequently transfer the knowledge to the real robot (sim-to-real). Despite becoming increasingly realistic, all simulators are by construction based on models, hence inevitably imperfect. This raises the question of how simulators can be modified to facilitate learning robot control policies and overcome the mismatch between simulation and reality, often called the "reality gap." We provide a comprehensive review of sim-to-real research for robotics, focusing on a technique named "domain randomization" which is a method for learning from randomized simulations.

Keywords: domain randomization; reality gap; reinforcement learning; robotics; sim-to-real; simulation; simulation optimization bias.

Conflict of interest statement

Author FM was employed by the Technical University of Darmstadt in collaboration with the Honda Research Institute Europe. Author FR was employed by NVIDIA. Author WY was employed by Google. Author MG was employed by the Honda Research Institute Europe. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. The authors declare that this study received funding from the Honda Research Institute Europe. The funder had the following involvement in the study: the structuring and improvement of this article jointly with the authors, and the decision to submit it for publication.

Figures

FIGURE 1
Examples of sim-to-real robot learning research using domain randomization: (left) Multiple simulation instances of robotic in-hand manipulation (OpenAI et al., 2020), (middle top) transformation to a canonical simulation (James et al., 2019), (middle bottom) synthetic 3D hallways generated for indoor drone flight (Sadeghi and Levine, 2017), (right top) ball-in-a-cup task solved with adaptive dynamics randomization (Muratore et al., 2021a), (right bottom) quadruped locomotion (Tan et al., 2018).
FIGURE 2
Topological overview of the sim-to-real research and a selection of related fields.
FIGURE 3
Topological overview of domain randomization methods.
FIGURE 4
Conceptual illustration of static domain randomization.
FIGURE 5
Conceptual illustration of adaptive domain randomization.
FIGURE 6
Conceptual illustration of adversarial domain randomization.

References

    1. Abdulsamad H., Dorau T., Belousov B., Zhu J., Peters J. (2021). Distributionally Robust Trajectory Optimization under Uncertain Dynamics via Relative-Entropy Trust Regions. arXiv 2103.15388
    1. Alghonaim R., Johns E. (2020). Benchmarking Domain Randomisation for Visual Sim-To-Real Transfer. arXiv 2011.07112
    1. Allevato A., Short E. S., Pryor M., Thomaz A. (2019). TuneNet: One-Shot Residual Tuning for System Identification and Sim-To-Real Robot Task Transfer. In Conference on Robot Learning (CoRL), Osaka, Japan, October 30 - November 1 (PMLR), vol. 100 of Proc. Machine Learn. Res., 445–455.
    1. Amari S.-i. (1977). Dynamics of Pattern Formation in Lateral-Inhibition Type Neural Fields. Biol. Cybern. 27, 77–87. doi: 10.1007/bf00337259
    1. Andrychowicz M., Crow D., Ray A., Schneider J., Fong R., Welinder P., et al. (2017). “Hindsight Experience Replay,” in Conference on Neural Information Processing Systems (NIPS), December 4-9 (Long Beach, CA, USA), 5048–5058.