Front Neurosci. 2020 Aug 5;14:637. doi: 10.3389/fnins.2020.00637. eCollection 2020.

Hand-Gesture Recognition Based on EMG and Event-Based Camera Sensor Fusion: A Benchmark in Neuromorphic Computing


Enea Ceolini et al. Front Neurosci. 2020.

Abstract

Hand gestures are a form of non-verbal communication that individuals use in conjunction with speech. Nowadays, with the increasing use of technology, hand-gesture recognition is considered an important aspect of Human-Machine Interaction (HMI), allowing the machine to capture and interpret the user's intent and to respond accordingly. The ability to discriminate between human gestures can help in several applications, such as assisted living, healthcare, neuro-rehabilitation, and sports. Recently, multi-sensor data fusion mechanisms have been investigated to improve discrimination accuracy. In this paper, we present a sensor fusion framework that integrates complementary systems: the electromyography (EMG) signal from muscles and visual information. This multi-sensor approach, while improving accuracy and robustness, introduces the disadvantage of high computational cost, which grows exponentially with the number of sensors and the number of measurements. Furthermore, this large amount of data to process can increase classification latency, which is crucial in real-world scenarios such as prosthetic control. Neuromorphic technologies can be deployed to overcome these limitations since they allow parallel, real-time processing at low power consumption. In this paper, we present a fully neuromorphic sensor fusion approach for hand-gesture recognition comprising an event-based vision sensor and three neuromorphic processors. In particular, we used the Dynamic Vision Sensor (DVS) event-based camera and two neuromorphic platforms, Loihi and ODIN + MorphIC. The EMG signals were recorded using traditional electrodes and then converted into spikes to be fed into the chips. We collected a dataset of five sign-language gestures with synchronized visual and electromyography signals. We compared the fully neuromorphic approach to a baseline implemented with traditional machine learning on a portable GPU system. Within each chip's constraints, we designed specific spiking neural networks (SNNs) for sensor fusion that showed classification accuracy comparable to the software baseline. The neuromorphic alternatives have 20-40% longer inference times than the GPU system, but their significantly smaller energy-delay product (EDP) makes them between 30× and 600× more efficient. The proposed work represents a new benchmark that moves neuromorphic computing toward real-world scenarios.
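
The spike conversion described above uses the delta modulation scheme named in the Figure 2 caption. As a rough illustration, below is a minimal Python sketch of a delta-modulation encoder; the threshold and the synthetic test signal are hypothetical choices for the example, not the parameters used in the paper.

    import numpy as np

    def delta_modulation(signal, threshold):
        """Convert a 1-D analog signal into ON/OFF spike events.

        Emits an ON spike (+1) when the signal rises `threshold` above
        the last reconstruction level, and an OFF spike (-1) when it
        falls `threshold` below it (hypothetical parameterization).
        """
        level = signal[0]                  # running reconstruction level
        spikes = np.zeros(len(signal), dtype=np.int8)
        for t, x in enumerate(signal):
            if x - level >= threshold:     # upward crossing -> ON spike
                spikes[t] = 1
                level += threshold
            elif level - x >= threshold:   # downward crossing -> OFF spike
                spikes[t] = -1
                level -= threshold
        return spikes

    # Encode a synthetic EMG-like burst (illustrative signal only).
    t = np.linspace(0, 1, 1000)
    emg = 0.5 * np.sin(2 * np.pi * 10 * t) * np.exp(-3 * t)
    print(np.count_nonzero(delta_modulation(emg, threshold=0.02)), "spikes")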

Keywords: electromyography (EMG) signal processing; event-based camera; hand-gesture classification; neuromorphic engineering; sensor fusion; spiking neural networks (SNNs).


Figures

Figure 1. Example spike streams for the gesture “elle”: DVS (left) and EMG (right). In the EMG panel, the spikes are represented by dots while the continuous line is the raw EMG; different channels have different colors.
Figure 2. System overview, from left to right: (A) data collection setup featuring the DVS, the traditional camera, and the subject wearing the EMG armband sensor; (B) data streams of (b1) the DVS and (b2) the EMG transformed into spikes via the delta modulation approach; (C) the two neuromorphic systems, namely (c1) Loihi and (c2) ODIN + MorphIC; (D) the hand gestures that the system can recognize in real time.
Figure 3. Architectures of the neural networks implemented on the neuromorphic systems and used in the baselines. (A) CNN architecture implemented on Loihi; the corresponding baseline CNN receives APS frames instead of DVS events. (B) subMLP architectures implemented on MorphIC; the corresponding baseline subMLPs receive APS frames instead of DVS events. (C) MLP architecture for the EMG data implemented on Loihi (c1) and on ODIN (c2); the corresponding baseline MLPs receive EMG features instead of EMG events. The shading indicates the layers that are concatenated during the fusion of the networks.
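
As a concrete reading of the shaded fusion layers in Figure 3, the sketch below shows a two-branch PyTorch network whose vision and EMG hidden layers are concatenated before the classifier. All layer sizes, the 32x32 input resolution, and the 8-channel EMG input are illustrative assumptions; the paper's exact per-chip topologies differ.

    import torch
    import torch.nn as nn

    class FusionNet(nn.Module):
        """Two-branch network fused by concatenating hidden layers."""
        def __init__(self, n_classes=5):
            super().__init__()
            # Vision branch: small CNN over 32x32 frames (assumed size).
            self.cnn = nn.Sequential(
                nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Flatten(), nn.Linear(16 * 8 * 8, 128), nn.ReLU(),
            )
            # EMG branch: MLP over 8-channel inputs (assumed size).
            self.mlp = nn.Sequential(
                nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 128), nn.ReLU(),
            )
            # Fusion: concatenate the two 128-unit hidden layers.
            self.head = nn.Linear(128 + 128, n_classes)

        def forward(self, frames, emg):
            fused = torch.cat([self.cnn(frames), self.mlp(emg)], dim=1)
            return self.head(fused)

    net = FusionNet()
    logits = net(torch.randn(4, 1, 32, 32), torch.randn(4, 8))
    print(logits.shape)  # torch.Size([4, 5])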
Figure 4. Accuracy vs. stimulus duration for the Loihi system and its software baseline counterpart. In green, the results for the CNN (GPU); in purple, the results for the spiking CNN (Loihi). No classification is present for APS frames before 25 ms since the frame rate is 20 fps.
Figure 5. Accuracy vs. stimulus duration for the ODIN + MorphIC system and its software baseline counterpart. In blue, the results for the MLP (GPU); in red, the results for the spiking MLP (ODIN + MorphIC). No classification is present for APS frames before 25 ms since the frame rate is 20 fps.
Figure 6. Comparison between the two neuromorphic systems with respect to (A) the energy-delay product (EDP) (see section 1), (B) the number of synaptic operations (SOPs) (see section 2.3.1), and (C) the EDP normalized by the number of SOPs.
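
The EDP compared in panel (A) is energy per inference multiplied by inference latency. The toy calculation below, with hypothetical numbers rather than the paper's measurements, shows how a chip that is somewhat slower than a GPU can still be orders of magnitude more efficient by this metric.

    def energy_delay_product(energy_uj, latency_ms):
        """Energy-delay product: energy per inference x latency (uJ*ms)."""
        return energy_uj * latency_ms

    # Hypothetical per-inference numbers, for illustration only.
    gpu  = energy_delay_product(energy_uj=25_000, latency_ms=5.0)
    chip = energy_delay_product(energy_uj=30, latency_ms=7.0)  # ~40% slower
    print(f"neuromorphic chip is {gpu / chip:.0f}x better by EDP")  # ~595x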

