Sensors (Basel). 2019 Oct 14;19(20):4441. doi: 10.3390/s19204441.

Hands-Free User Interface for AR/VR Devices Exploiting Wearer's Facial Gestures Using Unsupervised Deep Learning

Jaekwang Cha et al. Sensors (Basel). 2019.

Abstract

Developing a user interface (UI) suitable for headset environments is one of the challenges in the field of augmented reality (AR) technologies. This study proposes a hands-free UI for an AR headset that exploits the wearer's facial gestures to recognize user intentions. The facial gestures of the headset wearer are detected by a custom-designed sensor that measures skin deformation based on the infrared (IR) diffusion characteristics of human skin. We designed a deep neural network classifier to determine the user's intended gestures from the skin-deformation data, which serve as user input commands for the proposed UI system. The proposed classifier is composed of a spatiotemporal autoencoder and a deep embedded clustering algorithm, trained in an unsupervised manner. The UI device was embedded in a commercial AR headset, and several experiments were performed on online sensor data to verify its operation. The resulting hands-free UI recognized user commands with an average accuracy of 95.4% in tests with participants.

Keywords: augmented reality; deep embedded clustering; hands-free interface; spatiotemporal autoencoder.
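
The abstract describes a pipeline that maps an IR skin-deformation image to a UI command through preprocessing, STAE feature extraction, and DEC cluster assignment. The sketch below only outlines that flow; the paper does not publish code, so every function name, shape, and the cluster-to-command mapping here is an illustrative assumption, with stand-in stubs so the snippet runs end to end.

```python
# Hypothetical end-to-end recognition loop: IR frame -> preprocessing ->
# STAE encoder features -> DEC cluster -> UI command. All components below
# are stubs standing in for the trained modules described in the paper.
import numpy as np

IMG_SIZE = 28                                  # clustering-network input size reported in the paper
CLUSTER_TO_COMMAND = {0: "idle", 1: "select"}  # hypothetical mapping from clusters to UI commands

def preprocess(prev_frame: np.ndarray, frame: np.ndarray) -> np.ndarray:
    """Stand-in for the thresholding/subtraction/resizing step (Figure 5)."""
    diff = np.abs(frame.astype(np.int16) - prev_frame.astype(np.int16))
    return diff.astype(np.uint8)  # the real system also thresholds and resizes to 28x28

def encode(x: np.ndarray) -> np.ndarray:
    """Stand-in for the trained STAE encoder; returns a feature vector."""
    return x.reshape(-1)[:10].astype(np.float32)

def assign_cluster(z: np.ndarray, centroids: np.ndarray) -> int:
    """Hard assignment to the nearest DEC cluster centroid."""
    return int(np.argmin(np.linalg.norm(centroids - z, axis=1)))

# Centroids would be learned by DEC during unsupervised training; random here.
centroids = np.random.rand(len(CLUSTER_TO_COMMAND), 10).astype(np.float32)

prev = np.zeros((IMG_SIZE, IMG_SIZE), dtype=np.uint8)
cur = (np.random.rand(IMG_SIZE, IMG_SIZE) * 255).astype(np.uint8)

z = encode(preprocess(prev, cur))
print("recognized command:", CLUSTER_TO_COMMAND[assign_cluster(z, centroids)])
```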


Conflict of interest statement

The authors declare no conflicts of interest.

Figures

Figure 1. Example of common interaction methods with a user wearing an augmented reality (AR) headset: (a) hand-held controller and (b) button or touchpad.
Figure 2. Overall system diagram: the sensor module includes an IR laser diode (LD) and an IR camera. The LD emits IR light onto the skin, and the IR camera captures images of the resulting IR diffusion patterns.
Figure 3. Implementation of the sensor module: (a) the sensor module, which includes a USB camera and an NIR laser diode, installed on the left side of the Epson BT-350 AR glasses; (b) a photograph of the headset worn by a user. The laser diode is aimed at the skin near the left cheek, which deforms when the user makes a winking gesture.
Figure 4. Images of IR laterally propagated through the skin: (a) IR SRDR image captured with no facial gesture and (b) IR SRDR image captured during a wink gesture by the user. The brightness of the white region in these images represents the intensity of the IR SRDR.
Figure 5. Preprocessing procedure of the clustering network: the preprocessing unit computes the difference between two images by thresholding the input images, performing pixel-wise subtraction between them, and resizing the result to 28 × 28 pixels to match the clustering-network input size.
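
As a concrete illustration of the Figure 5 procedure, the following is a minimal Python/OpenCV sketch, assuming grayscale IR frames as NumPy arrays; the threshold value and interpolation mode are assumptions, and only the 28 × 28 target size comes from the caption.

```python
# Minimal sketch of the preprocessing step in Figure 5 (assumed parameters).
import cv2
import numpy as np

THRESH = 128         # assumed binarization threshold
NET_SIZE = (28, 28)  # clustering-network input size from the caption

def preprocess_pair(prev_frame: np.ndarray, cur_frame: np.ndarray) -> np.ndarray:
    """Threshold both frames, subtract pixel-wise, and resize to 28x28."""
    _, prev_t = cv2.threshold(prev_frame, THRESH, 255, cv2.THRESH_BINARY)
    _, cur_t = cv2.threshold(cur_frame, THRESH, 255, cv2.THRESH_BINARY)
    diff = cv2.absdiff(cur_t, prev_t)  # pixel-wise difference of binarized frames
    return cv2.resize(diff, NET_SIZE, interpolation=cv2.INTER_AREA)

# Example with synthetic frames standing in for IR SRDR camera images.
prev = (np.random.rand(480, 640) * 255).astype(np.uint8)
cur = (np.random.rand(480, 640) * 255).astype(np.uint8)
print(preprocess_pair(prev, cur).shape)  # (28, 28)
```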
Figure 6. Structure of the network that extracts sensor-data features: the spatiotemporal autoencoder (STAE) consists of an encoder and a decoder. After the STAE is trained, only the encoder is used as the feature extractor.
Figure 7. Detailed configuration of the STAE used for feature extraction.
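
Because the exact layer configuration appears only in Figure 7, the following PyTorch sketch shows a generic spatiotemporal autoencoder of the kind described in Figure 6, treating a short clip of preprocessed 28 × 28 frames as a single 3D volume. The layer counts, channel widths, clip length, and latent dimension are assumptions; as in the caption, only the encoder would be kept as the feature extractor after training.

```python
# Hedged sketch of a spatiotemporal autoencoder (STAE); not the paper's exact architecture.
import torch
import torch.nn as nn

class STAE(nn.Module):
    def __init__(self, latent_dim: int = 10, clip_len: int = 8):
        super().__init__()
        # Encoder: 3D convolutions capture spatial and temporal structure jointly.
        self.encoder = nn.Sequential(
            nn.Conv3d(1, 16, kernel_size=3, stride=2, padding=1),   # -> (16, T/2, 14, 14)
            nn.ReLU(),
            nn.Conv3d(16, 32, kernel_size=3, stride=2, padding=1),  # -> (32, T/4, 7, 7)
            nn.ReLU(),
            nn.Flatten(),
            nn.Linear(32 * (clip_len // 4) * 7 * 7, latent_dim),
        )
        # Decoder mirrors the encoder; it is discarded after training.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 32 * (clip_len // 4) * 7 * 7),
            nn.ReLU(),
            nn.Unflatten(1, (32, clip_len // 4, 7, 7)),
            nn.ConvTranspose3d(32, 16, kernel_size=3, stride=2, padding=1, output_padding=1),
            nn.ReLU(),
            nn.ConvTranspose3d(16, 1, kernel_size=3, stride=2, padding=1, output_padding=1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))

# Reconstruction training step (sketch); only the encoder is kept afterwards.
model = STAE()
x = torch.rand(4, 1, 8, 28, 28)            # batch of 8-frame clips of 28x28 images
loss = nn.functional.mse_loss(model(x), x)
loss.backward()
features = model.encoder(x)                # (4, latent_dim) feature vectors
```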
Figure 8. Proposed classifier network, consisting of an STAE-based feature extractor and a deep embedded clustering (DEC)-based feature classifier.
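
The DEC stage in Figure 8 can be sketched using the standard deep embedded clustering formulation: Student's t soft assignments around learnable centroids, sharpened into a target distribution and trained with a KL-divergence loss. The cluster count, latent dimension, and the way encoder features feed in below are assumptions, not values from the paper.

```python
# Sketch of a DEC clustering head on top of STAE encoder features (assumed sizes).
import torch
import torch.nn as nn
import torch.nn.functional as F

class DECHead(nn.Module):
    def __init__(self, n_clusters: int = 4, latent_dim: int = 10, alpha: float = 1.0):
        super().__init__()
        self.alpha = alpha
        self.centroids = nn.Parameter(torch.randn(n_clusters, latent_dim))

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # Soft assignment q_ij: Student's t similarity between feature z_i and centroid mu_j.
        dist_sq = torch.cdist(z, self.centroids).pow(2)
        q = (1.0 + dist_sq / self.alpha).pow(-(self.alpha + 1) / 2)
        return q / q.sum(dim=1, keepdim=True)

def target_distribution(q: torch.Tensor) -> torch.Tensor:
    # Sharpened target p_ij that emphasizes high-confidence assignments.
    weight = q.pow(2) / q.sum(dim=0, keepdim=True)
    return weight / weight.sum(dim=1, keepdim=True)

# Training step sketch: z would come from the STAE encoder; random here.
head = DECHead()
z = torch.randn(32, 10)
q = head(z)
p = target_distribution(q).detach()
loss = F.kl_div(q.log(), p, reduction="batchmean")
loss.backward()
pred_cluster = q.argmax(dim=1)  # hard cluster label, mapped to a UI command downstream
```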
Figure 9. Clustering results obtained with the proposed DEC method: (a) clustering results for the 81,758 training dataset images and (b) clustering results for real-time sensing from users (online validation).
Figure 10. Screenshots from the demonstration using a custom-made application. A user could pop balloons (a,b) or select buttons to change the background (c,d). The user selects an object by aiming (targeting) a red center dot at it and then executes the selection with a winking gesture.
