Segmental intelligibility of synthetic speech produced by rule

J S Logan¹, B G Greene, D B Pisoni

PMID: 2527884
PMCID: PMC3507386
DOI: 10.1121/1.398236

Comparative Study

J Acoust Soc Am. 1989 Aug;86(2):566-81.

Affiliation

¹ Department of Psychology, Indiana University, Bloomington 47405.

Abstract

This paper reports the results of an investigation that employed the modified rhyme test (MRT) to measure the segmental intelligibility of synthetic speech generated automatically by rule. Synthetic speech produced by ten text-to-speech systems was studied and compared to natural speech. A variation of the standard MRT was also used to study the effects of response set size on perceptual confusions. Results indicated that the segmental intelligibility scores formed a continuum. Several systems displayed very high levels of performance that were close to or equal to scores obtained with natural speech; other systems displayed substantially worse performance compared to natural speech. The overall performance of the best system, DECtalk--Paul, was equivalent to the data obtained with natural speech for consonants in syllable-initial position. The findings from this study are discussed in terms of the use of a set of standardized procedures for measuring intelligibility of synthetic speech under controlled laboratory conditions. Recent work investigating the perception of synthetic speech under more severe conditions in which greater demands are made on the listener's processing resources is also considered. The wide range of intelligibility scores obtained in the present study demonstrates important differences in perception and suggests that not all synthetic speech is perceptually equivalent to the listener.

Figures

**FIG. 1**
Overall error rates (in percent) for each of the ten text-to-speech systems tested using the MRT. Natural speech is included here as a control condition and benchmark.

**FIG. 2**
Error rates (in percent) for consonants in initial and final position for each of the systems tested in the MRT. Open bars designate initial consonant error rates and striped bars designate final consonant error rates.

**FIG. 3**
Error rates (in percent) for all systems tested in both the closed- and open-response format MRT. Open bars designate error rates for the closed-response format and striped bars designate error rates for the open-response format.
