Training and search methods for speech recognition

F Jelinek¹

PMID: 7479810
PMCID: PMC40719
DOI: 10.1073/pnas.92.22.9964

F Jelinek. Proc Natl Acad Sci U S A. 1995 Oct 24;92(22):9964-9. doi: 10.1073/pnas.92.22.9964.

Author

F Jelinek¹

Affiliation

¹ IBM Thomas J. Watson Research Center, Yorktown Heights, NY 10598, USA.

Abstract

Speech recognition involves three processes: extraction of acoustic indices from the speech signal, estimation of the probability that the observed index string was caused by a hypothesized utterance segment, and determination of the recognized utterance via a search among hypothesized alternatives. This paper is not concerned with the first process. Estimation of the probability of an index string involves a model of index production by any given utterance segment (e.g., a word). Hidden Markov models (HMMs) are used for this purpose [Makhoul, J. & Schwartz, R. (1995) Proc. Natl. Acad. Sci. USA 92, 9956-9963]. Their parameters are state transition probabilities and output probability distributions associated with the transitions. The Baum algorithm that obtains the values of these parameters from speech data via their successive reestimation will be described in this paper. The recognizer wishes to find the most probable utterance that could have caused the observed acoustic index string. That probability is the product of two factors: the probability that the utterance will produce the string and the probability that the speaker will wish to produce the utterance (the language model probability). Even if the vocabulary size is moderate, it is impossible to search for the utterance exhaustively. One practical algorithm is described [Viterbi, A. J. (1967) IEEE Trans. Inf. Theory IT-13, 260-267] that, given the index string, has a high likelihood of finding the most probable utterance.
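As an illustration of the search step mentioned in the abstract, the following is a minimal sketch of Viterbi decoding over a toy HMM. It is not the paper's recognizer. It assumes, for simplicity, that output distributions are attached to states rather than to transitions, and it covers only the HMM factor of the score, not the language model probability. The function name `viterbi`, the two states `A` and `B`, the index symbols `x` and `y`, and all probability values are hypothetical.

```python
import math


def viterbi(obs, states, log_init, log_trans, log_emit):
    """Return (best_log_prob, best_state_path) for an observed index string.

    Assumption: outputs are emitted by states (not by transitions), and all
    probabilities are given as natural logarithms.
    """
    # best[s] = log-probability of the best path that ends in state s so far
    best = {s: log_init[s] + log_emit[s][obs[0]] for s in states}
    backptrs = []  # one dict per time step: state -> best predecessor
    for o in obs[1:]:
        new_best, ptr = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[p] + log_trans[p][s])
            new_best[s] = best[prev] + log_trans[prev][s] + log_emit[s][o]
            ptr[s] = prev
        best = new_best
        backptrs.append(ptr)
    # Trace back from the best final state to recover the full path.
    last = max(states, key=lambda s: best[s])
    path = [last]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    path.reverse()
    return best[last], path


def to_log(table):
    """Convert a nested dict of probabilities into log-probabilities."""
    return {k: {j: math.log(p) for j, p in row.items()} for k, row in table.items()}


# Hypothetical toy model (all values are illustrative assumptions).
states = ["A", "B"]
log_init = {"A": math.log(0.6), "B": math.log(0.4)}
log_trans = to_log({"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.4, "B": 0.6}})
log_emit = to_log({"A": {"x": 0.9, "y": 0.1}, "B": {"x": 0.2, "y": 0.8}})

best_log_prob, best_path = viterbi(["x", "y", "y"], states, log_init, log_trans, log_emit)
print(best_path)  # ['A', 'B', 'B']
```

Working in log-probabilities avoids numerical underflow on long index strings. In a full recognizer the path score would also be combined with the language model probability of the hypothesized utterance.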

References

1. Makhoul J, Schwartz R. Proc Natl Acad Sci U S A. 1995 Oct 24;92(22):9956-63.
