End-to-End Ultrasonic Hand Gesture Recognition

Elfi Fertl et al. Sensors (Basel). 2024 Apr 25;24(9):2740. doi: 10.3390/s24092740.

Abstract

As the number of electronic gadgets in our daily lives grows, and most of them require some kind of human interaction, innovative and convenient input methods are in demand. State-of-the-art (SotA) ultrasound-based hand gesture recognition (HGR) systems are limited in robustness and accuracy. This research presents a novel machine learning (ML)-based end-to-end solution for hand gesture recognition with low-cost micro-electromechanical system (MEMS) ultrasonic transducers. In contrast to prior methods, our ML model processes the raw echo samples directly instead of pre-processed data. Consequently, the processing flow presented in this work leaves it to the ML model to extract the important information from the echo data. The success of this approach is demonstrated as follows. Four MEMS ultrasonic transducers are placed in three different geometrical arrangements. For each arrangement, different types of ML models are optimized and benchmarked on datasets acquired with the presented custom hardware (HW): convolutional neural networks (CNNs), gated recurrent units (GRUs), long short-term memory (LSTM), vision transformer (ViT), and cross-attention multi-scale vision transformer (CrossViT). The last three models (LSTM, ViT, and CrossViT) reached more than 88% accuracy. The most important innovation described in this paper is the demonstration that little pre-processing is necessary to obtain high accuracy in ultrasonic HGR for several arrangements of cost-effective, low-power MEMS ultrasonic transducer arrays; even the computationally intensive Fourier transform can be omitted. The presented approach is further compared with HGR systems based on other sensor types, such as vision, WiFi, and radar, as well as with state-of-the-art ultrasound-based HGR systems. Direct processing of the sensor signals by a compact model makes ultrasonic hand gesture recognition a truly low-cost and power-efficient input method.
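The end-to-end idea described above can be illustrated with a minimal sketch: a raw multi-channel echo frame is fed to a small 1D convolutional classifier with no FFT or envelope extraction in between. All dimensions here are hypothetical (3 receive channels, 2048 samples per frame, 6 gesture classes, loosely matching the three-channel arrangements mentioned in the paper), the weights are random placeholders, and this is not the authors' trained model — only a shape-level illustration of "raw samples in, gesture class out".

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: 3 receive channels, 2048 raw echo samples per frame,
# 6 gesture classes. A real model would be trained on recorded gesture data.
N_CH, N_SAMP, N_CLASSES = 3, 2048, 6
N_FILT, KERNEL = 8, 15

# Random placeholder weights standing in for trained parameters.
conv_w = rng.standard_normal((N_FILT, N_CH, KERNEL)) * 0.1
dense_w = rng.standard_normal((N_FILT, N_CLASSES)) * 0.1

def classify(frame: np.ndarray) -> int:
    """Classify one raw echo frame of shape (N_CH, N_SAMP) -- no FFT, no envelope."""
    # 1D convolution over the time axis, summed across channels per filter.
    feat = np.empty((N_FILT, N_SAMP - KERNEL + 1))
    for f in range(N_FILT):
        acc = np.zeros(N_SAMP - KERNEL + 1)
        for c in range(N_CH):
            # Reverse the kernel so np.convolve performs cross-correlation.
            acc += np.convolve(frame[c], conv_w[f, c][::-1], mode="valid")
        feat[f] = np.maximum(acc, 0.0)          # ReLU activation
    pooled = feat.mean(axis=1)                  # global average pooling over time
    logits = pooled @ dense_w                   # linear classifier head
    return int(np.argmax(logits))

frame = rng.standard_normal((N_CH, N_SAMP))     # stand-in for recorded echo samples
print(classify(frame))
```

Skipping the Fourier transform, as the paper proposes, removes the costliest pre-processing step; the convolution filters themselves learn whatever time-domain structure the echoes carry.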

Keywords: Fourier transform; HMI; MEMS ultrasonic transducer; machine learning; pre-processing.


Conflict of interest statement

Authors Elfi Fertl, Do Dinh Tan Nguyen, Martin Krueger, and Georg Stettinger were employed by the company Infineon Technologies AG. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1. Sensor-agnostic HGR processing flow.
Figure 2. Suggested change in the HGR processing flow.
Figure 3. Data acquisition setup.
Figure 4. Processing shield with the linear array; transducer locations are marked with yellow and white circles: yellow circle, sending transducer; white circles, receiving transducers.
Figure 5. Corner (left) and center (right) transducer arrays without the processing shield; transducer locations are marked with yellow and white circles: yellow circle, sending transducer; white circles, receiving transducers.
Figure 6. Gestures used in the dataset.
Figure 7. Part of a gesture frame of one channel of a pp gesture; see Figure 6 for an illustration of the pp gesture.
Figure 8. Plot of a pulse train with echo (in blue) compared to a pulse train without echo (in yellow) of transducer 1.
Figure 9. Plot of a pulse train with echo (in blue) compared to a pulse train without echo (in yellow) of transducer 2.
Figure 10. Plot of a pulse train with echo (in blue) compared to a pulse train without echo (in yellow) of transducer 3.
Figure 11. Schema of the CNN model with the best results. With a two-channel input (linear dataset), the output is 1 of 4 possible gestures; with a three-channel input (center and corner datasets), the output is 1 of 6 gestures.
Figure 12. Accuracies per model per dataset.
Figure 13. Sizes of the best models.
Figure 14. Best accuracy of the best models on all classes, compared to their accuracy on four classes and the average accuracy of all models on each of the three datasets.
Figure 15. Accuracies per model, averaged over the datasets.
