Review
Sensors (Basel). 2021 Feb 17;21(4):1399.
doi: 10.3390/s21041399.

Biosignal Sensors and Deep Learning-Based Speech Recognition: A Review

Wookey Lee et al. Sensors (Basel).

Abstract

Voice is one of the essential mechanisms by which humans communicate and express their intentions. Voice loss has several causes, including disease, accident, vocal abuse, surgery, ageing, and environmental pollution, and the risk of voice loss continues to increase. Novel approaches to speech recognition and production are needed because voice loss seriously undermines quality of life and sometimes leads to isolation from society. In this review, we survey mouth interface technologies, which are mouth-mounted devices for speech recognition, production, and volitional control, and the corresponding research on artificial mouth technologies based on various sensors, including electromyography (EMG), electroencephalography (EEG), electropalatography (EPG), electromagnetic articulography (EMA), permanent magnet articulography (PMA), gyros, images, and 3-axial magnetic sensors, especially in combination with deep learning techniques. In particular, we examine deep learning technologies related to voice recognition, including visual speech recognition and silent speech interfaces, analyze their flow, and systematize them into a taxonomy. Finally, we discuss methods for solving the communication problems of people with speech disabilities, and future research with respect to deep learning components.

Keywords: EMG; artificial larynx; biosignal; deep learning; mouth interface; voice production.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1. How to help people with voice disorders communicate with others: biosignals are processed by various methods to recognize speech and are transmitted to devices suited to the user. The approach can be used in several fields, such as robotics, medical engineering, and image processing.
Figure 2. Measurement of muscle electrical signals using EMG technology.
Figure 3. Examples of electrode positions on a face [25].
Figure 4. Example of speech recognition using the EMG signal.
Figure 5. Collecting the subject's tongue movements, lip movements, and voice data with a development device for wireless tongue tracking that combines a camera and several acceleration sensors [35].
Figure 6. The interface attached to the palate (iTDS-1) [33].
Figure 7. Structure of a voice remote control system based on language recognition processing.
Figure 8. LipNet architecture [64].
Figure 9. View of AlterEgo being worn [99].
Figure 10. SottoVoce, based on ultrasonic images [100].
Figure 11. Speech recognition technologies and services using sensor-based deep learning models.

References

    1. Voice Disorders: Overview. [(accessed on 29 October 2019)]; Available online: https://www.asha.org/practice-portal/clinical-topics/voice-disorders/
    2. Cheah L.A., Gilbert J.M., Gonzalez J.A., Bai J., Ell S.R., Green P.D., Moore R.K. Towards an Intraoral-Based Silent Speech Restoration System for Post-laryngectomy Voice Replacement; Proceedings of the International Joint Conference on Biomedical Engineering Systems and Technologies; Rome, Italy. 21–23 February 2016.
    3. Shin Y.H., Seo J. Towards contactless silent speech recognition based on detection of active and visible articulators using IR-UWB radar. Sensors. 2016;16:1812. doi: 10.3390/s16111812.
    4. Sharpe G., Camoes Costa V., Doubé W., Sita J., McCarthy C., Carding P. Communication changes with laryngectomy and impact on quality of life: A review. Qual. Life Res. 2019;28:863–877. doi: 10.1007/s11136-018-2033-y.
    5. Li W. Silent speech interface design methodology and case study. Chin. J. Electron. 2016;25:88–92. doi: 10.1049/cje.2016.01.014.