2024 May 9;11:1329270.
doi: 10.3389/frobt.2024.1329270. eCollection 2024.

A comparison of visual and auditory EEG interfaces for robot multi-stage task control

Kai Arulkumaran et al. Front Robot AI. 2024.

Abstract

Shared autonomy holds promise for assistive robotics, whereby physically impaired people can direct robots to perform various tasks for them. However, a robot that is capable of many tasks also introduces many choices for the user, such as which object or location should be the target of interaction. In the context of non-invasive brain-computer interfaces for shared autonomy (most commonly electroencephalography-based), the two most common choices are to provide either auditory or visual stimuli to the user, each with their respective pros and cons. Using the oddball paradigm, we designed comparable auditory and visual interfaces to speak/display the choices to the user, and had users complete a multi-stage robotic manipulation task involving location and object selection. Users displayed differing competencies, and preferences, for the different interfaces, highlighting the importance of considering modalities beyond vision when constructing human-robot interfaces.
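In the oddball paradigm, the attended (target) stimulus is rare relative to the stream of non-target stimuli, which is what elicits a detectable P300 response. A minimal sketch of how such a stimulus sequence might be generated; the choice names, repetition count, and shuffling scheme here are illustrative assumptions, not the study's actual protocol:

```python
import random

def oddball_sequence(choices, target, repetitions=10, seed=0):
    """Build a randomised stimulus sequence in which every choice is
    presented `repetitions` times; the attended `target` is the rare
    event relative to all other (non-target) presentations combined."""
    rng = random.Random(seed)
    seq = [c for c in choices for _ in range(repetitions)]
    rng.shuffle(seq)
    # Label each presentation: 1 = target (rare), 0 = non-target (frequent).
    return [(stim, int(stim == target)) for stim in seq]

seq = oddball_sequence(["drawer_top", "drawer_mid", "drawer_bottom"],
                       target="drawer_mid")
print(len(seq), sum(label for _, label in seq))  # 30 presentations, 10 targets
```

With more choices in the set, the target fraction drops further, strengthening the oddball effect; the decoder's job is then to classify which presentations carry the target label from the EEG epochs.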

Keywords: brain-computer interface; human-robot interaction; imitation learning; multitask learning; shared autonomy.


Conflict of interest statement

Authors KA, MD, RJ, SA, DL, MS, KT, and SS were employed by Araya Inc.

Figures

FIGURE 1
(A) Experimental layout. The user with the EEG cap sits in front of a small table facing the robot, with a display (for the visual P300 interface only). The robot is behind a table with a chest of drawers and three kitchen objects: a cup, a spoon, and a bottle. (B) Multi-stage task control flow. After the robot environment is initialised, the robot provides a set of actions that can be performed. These actions are used to create the user interface. Once an action has been decoded by the user interface, the robot performs the chosen action and then provides the next set of actions. This generic process can be repeated; in our particular task there are two decision points for the user (picking a drawer to open, and an item to pick and place), after which the robot performs a third and final action (closing the drawer) autonomously.
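The control flow in panel (B) can be sketched as a simple loop. Everything below is illustrative: `ScriptedRobot`, `run_task`, and the action names are assumptions made for the sketch, not the paper's actual API.

```python
class ScriptedRobot:
    """Hypothetical stand-in for the real robot: serves a fixed list of
    action sets, mirroring the drawer/object/close stages of the task."""
    def __init__(self, stages):
        self.stages = list(stages)
        self.executed = []

    def get_available_actions(self):
        return self.stages.pop(0) if self.stages else []

    def execute(self, action):
        self.executed.append(action)

def run_task(robot, decode_choice, max_stages=10):
    """Generic multi-stage loop: the robot proposes actions, the BCI
    interface decodes the user's choice, and the robot executes it.
    A stage with a single available action (e.g. closing the drawer)
    runs autonomously, without querying the user."""
    for _ in range(max_stages):
        actions = robot.get_available_actions()
        if not actions:
            break  # no actions left: task complete
        if len(actions) == 1:
            choice = actions[0]              # autonomous stage
        else:
            choice = decode_choice(actions)  # P300 selection stage
        robot.execute(choice)

robot = ScriptedRobot([
    ["open_top_drawer", "open_middle_drawer", "open_bottom_drawer"],
    ["place_cup", "place_spoon", "place_bottle"],
    ["close_drawer"],  # single action: performed autonomously
])
run_task(robot, decode_choice=lambda actions: actions[0])
print(robot.executed)  # ['open_top_drawer', 'place_cup', 'close_drawer']
```

The `decode_choice` callback is where either the visual or the auditory oddball interface would plug in, which is what makes the two modalities directly comparable on the same task.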
FIGURE 2
EEG decoding results: hatched bars correspond to decoder training, plain bars to online decoding. The average balanced accuracy, precision and accuracy across users were 0.50 ± 0.05, 0.17 ± 0.03 and 0.47 ± 0.15 for the visual interface, and 0.55 ± 0.09, 0.20 ± 0.06 and 0.48 ± 0.19 for the auditory interface, respectively. Average task success was 0.23 ± 0.42 for the visual interface and 0.25 ± 0.43 for the auditory interface. Only one user experienced timeouts (three, on the auditory interface).
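Balanced accuracy (the mean of per-class recall) is a sensible headline metric for P300 decoding because non-target presentations greatly outnumber targets, so plain accuracy can look respectable for a decoder that never predicts the target class at all. A minimal sketch with made-up labels, not the study's data:

```python
def balanced_accuracy(y_true, y_pred):
    """Mean recall over classes; robust to class imbalance."""
    recalls = []
    for c in set(y_true):
        idx = [i for i, y in enumerate(y_true) if y == c]
        correct = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(correct / len(idx))
    return sum(recalls) / len(recalls)

# Imbalanced example: 8 non-targets (0), 2 targets (1), one target missed.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 8 + [1, 0]
print(balanced_accuracy(y_true, y_pred))  # 0.75 (recalls 1.0 and 0.5)
```

Note that with two classes, a degenerate decoder that always predicts the non-target class scores exactly 0.5 on this metric, which gives a chance-level reference point when reading the reported averages.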
FIGURE 3
Usability questionnaire results. Users understood the interfaces without much difficulty, but were frustrated with poor decoding accuracy. The most significant difference between the interfaces was the ability to ignore the non-targets, with users finding this particularly difficult with the visual interface.
FIGURE 4
PerAct success rates. On average, PerAct had a success rate of 72% on opening the drawer, 78% on closing the drawer, and 45% on picking and placing objects.

