Review

Comput Intell Neurosci. 2019 Jun 25:2019:4368036.

doi: 10.1155/2019/4368036. eCollection 2019.

Speech Technology Progress Based on New Machine Learning Paradigm

Vlado Delić¹, Zoran Perić², Milan Sečujski¹, Nikša Jakovljević¹, Jelena Nikolić², Dragiša Mišković¹, Nikola Simić², Siniša Suzić¹, Tijana Delić¹

Affiliations

¹ University of Novi Sad, Faculty of Technical Sciences, 21000 Novi Sad, Serbia.
² University of Niš, Faculty of Electronic Engineering, 18000 Niš, Serbia.

PMID: 31341467
PMCID: PMC6614991
DOI: 10.1155/2019/4368036

Vlado Delić et al. Comput Intell Neurosci. 2019.

Abstract

Speech technologies have been developed for decades as a typical signal processing area, while the last decade has brought huge progress based on new machine learning paradigms. Owing not only to their intrinsic complexity but also to their relation with cognitive sciences, speech technologies are now viewed as a prime example of an interdisciplinary knowledge area. This review article on speech signal analysis and processing, the corresponding machine learning algorithms, and applied computational intelligence aims to give insight into several fields: speech production and auditory perception, cognitive aspects of speech communication and language understanding, speech recognition and text-to-speech synthesis (both in more detail), and, consequently, the main directions in the development of spoken dialogue systems. Additionally, the article discusses the concepts and recent advances in speech signal compression, coding, and transmission, including cognitive speech coding. To conclude, the main intention of this article is to highlight recent achievements and challenges based on new machine learning paradigms that, over the last decade, have had an immense impact on the field of speech signal processing.

Figures

**Figure 1**
Interdisciplinary nature of speech technologies, i.e., spoken language processing (adopted from [2]).

**Figure 2**
Unified framework that encompasses speech signal processing fields in the scope of the article.

**Figure 3**
Block diagram of speech production and speech perception and corresponding processes performed by machines carrying out text-to-speech synthesis (TTS) and automatic speech recognition (ASR).

**Figure 4**
Components of a human-machine speech dialogue system.

**Figure 5**
Speech signal quality according to MOS versus bit rate for various speech signal coding techniques.

**Figure 6**
Forward adaptive PCM: (a) encoder; (b) decoder.

**Figure 7**
One of the realizations of backward adaptive PCM with one codeword memory: (a) encoder; (b) decoder.

**Figure 8**
Dual mode quantization scheme: (a) encoder; (b) decoder.

**Figure 9**
DPCM: (a) encoder; (b) decoder.

References

1. Kuhn T. S. The Structure of Scientific Revolutions-50th Anniversary Edition. 4th ed. Vol. 3. Chicago, IL, USA: The University of Chicago Press; 2012.
2. Moore R. K. Cognitive informatics: the future of spoken language processing? Proceedings of the 10th International Conference on Speech and Computer (SPECOM); October 2005; Patras, Greece.
3. Paul J. D. Re-creating the sigsaly quantizer: this 1943 analog-to-digital converter gave the allies an unbreakable scrambler-(resources). IEEE Spectrum. 2019;56(2):16–17. doi: 10.1109/mspec.2019.8635806.
4. Jayant N. S., Noll P. Digital coding of waveforms. Principles and applications to speech and video. Signal Processing. 1985;9(2):139–140. doi: 10.1016/0165-1684(85)90053-2.
5. Chu W. C. Speech Coding Algorithms: Foundation and Evolution of Standardized Coders. Hoboken, NJ, USA: John Wiley & Sons; 2003.
