Statistical modeling of speech Poincaré sections in combination of frequency analysis to improve speech recognition performance

doi:10.1063/1.3463722

Chaos. 2010 Sep;20(3):033106.

Ayyoob Jafari¹, Farshad Almasganj, Maryam Nabi Bidhendi

PMID: 20887046
DOI: 10.1063/1.3463722

Ayyoob Jafari et al. Chaos. 2010 Sep.

Affiliation

¹ Department of Biomedical Engineering, Amirkabir University of Technology, Tehran, Iran. ajafari20@aut.ac.ir

Abstract

This paper introduces a combined feature extraction approach to improve speech recognition systems. The main idea is to benefit simultaneously from features obtained from Poincaré sections of the speech reconstructed phase space (RPS) and from conventional Mel frequency cepstral coefficients (MFCCs), which have a proven role in speech recognition. With an appropriate embedding dimension, the RPS of a speech signal is guaranteed to be topologically equivalent to the dynamics of the speech production system, and can therefore carry information that linear analysis approaches may miss. Moreover, complicated systems such as the speech production system can exhibit cyclic and oscillatory patterns, and Poincaré sections are an effective tool for analyzing such trajectories. In this research, a statistical modeling approach based on Gaussian mixture models (GMMs) is applied to Poincaré sections of the speech RPS. A final pruned feature set is obtained by applying an efficient feature selection approach to the combination of the GMM parameters and the MFCC-based features. A hidden Markov model-based speech recognition system and the TIMIT speech database are used to evaluate the proposed feature set in isolated and continuous speech recognition experiments. With the proposed feature set, an absolute improvement of 5.7% in isolated phoneme recognition is obtained over MFCC-only features.
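The first two steps named in the abstract, building a reconstructed phase space and taking a Poincaré section of it, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `delay_embed` and `poincare_section`, the embedding dimension, the delay, the choice of section plane, and the sinusoidal test signal are all assumptions made for illustration, since the abstract does not state these details.

```python
import math


def delay_embed(signal, dim, tau):
    """Time-delay embedding: row n is [x[n], x[n+tau], ..., x[n+(dim-1)*tau]]."""
    n_vectors = len(signal) - (dim - 1) * tau
    if n_vectors <= 0:
        raise ValueError("signal too short for this dim and tau")
    return [[signal[n + k * tau] for k in range(dim)] for n in range(n_vectors)]


def poincare_section(points, normal, offset=0.0):
    """Intersections of the trajectory with the hyperplane normal . x = offset.

    Only crossings from the negative to the non-negative side are kept, and
    each intersection is located by linear interpolation between samples.
    """
    def side(p):
        return sum(a * b for a, b in zip(normal, p)) - offset

    crossings = []
    for p, q in zip(points, points[1:]):
        sp, sq = side(p), side(q)
        if sp < 0.0 <= sq:
            t = sp / (sp - sq)  # fraction of the step at which the plane is hit
            crossings.append([a + t * (b - a) for a, b in zip(p, q)])
    return crossings


# Toy stand-in for a voiced speech frame (assumption): a sinusoid with a
# 50-sample period and an arbitrary phase.
signal = [math.sin(2.0 * math.pi * n / 50.0 + 0.3) for n in range(500)]

# dim=2 and tau=12 (about a quarter period) are illustrative values only.
trajectory = delay_embed(signal, dim=2, tau=12)

# Hypothetical section plane: first embedding coordinate equal to zero.
section = poincare_section(trajectory, normal=[1.0, 0.0])

# Summarize the section points along the remaining coordinate.
second = [p[1] for p in section]
mean_second = sum(second) / len(second)
print(len(section), round(mean_second, 3))
```

For a periodic signal the section points cluster tightly, whereas real speech frames would give a scattered cloud. A mixture model could then be fitted to that cloud, for example with scikit-learn's `GaussianMixture`, to obtain parameters comparable in spirit to the GMM-based features the abstract describes.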
