Ear Hear. 2024 Mar-Apr;45(2):351-369.
doi: 10.1097/AUD.0000000000001430. Epub 2023 Oct 26.

Identifying Links Between Latent Memory and Speech Recognition Factors

Adam K Bosen et al. Ear Hear. 2024 Mar-Apr.

Abstract

Objectives: The link between memory ability and speech recognition accuracy is often examined by correlating summary measures of performance across various tasks, but interpretation of such correlations critically depends on assumptions about how these measures map onto underlying factors of interest. The present work takes an alternative approach, wherein latent factor models are fit to trial-level data from multiple tasks to directly test hypotheses about the underlying structure of memory and the extent to which latent memory factors are associated with individual differences in speech recognition accuracy. Latent factor models with different numbers of factors were fit to the data and compared to one another to select the structures which best explained vocoded sentence recognition in a two-talker masker across a range of target-to-masker ratios, performance on three memory tasks, and the link between sentence recognition and memory.

Design: Young adults with normal hearing (N = 52 for the memory tasks, of which 21 participants also completed the sentence recognition task) completed three memory tasks and one sentence recognition task: reading span, auditory digit span, visual free recall of words, and recognition of 16-channel vocoded Perceptually Robust English Sentence Test Open-set sentences in the presence of a two-talker masker at target-to-masker ratios between +10 and 0 dB. Correlations between summary measures of memory task performance and sentence recognition accuracy were calculated for comparison to prior work, and latent factor models were fit to trial-level data and compared against one another to identify the number of latent factors which best explains the data. Models with one or two latent factors were fit to the sentence recognition data and models with one, two, or three latent factors were fit to the memory task data. Based on findings with these models, full models that linked one speech factor to one, two, or three memory factors were fit to the full data set. Models were compared via Expected Log pointwise Predictive Density and post hoc inspection of model parameters.

Results: Summary measures were positively correlated across memory tasks and sentence recognition. Latent factor models revealed that sentence recognition accuracy was best explained by a single factor that varied across participants. Memory task performance was best explained by two latent factors, of which one was generally associated with performance on all three tasks and the other was specific to digit span recall accuracy at lists of six digits or more. When these models were combined, the general memory factor was closely related to the sentence recognition factor, whereas the factor specific to digit span had no apparent association with sentence recognition.

Conclusions: Comparison of latent factor models enables testing hypotheses about the underlying structure linking cognition and speech recognition. This approach showed that multiple memory tasks assess a common latent factor that is related to individual differences in sentence recognition, although performance on some tasks was associated with multiple factors. Thus, while these tasks provide some convergent assessment of common latent factors, caution is needed when interpreting what they tell us about speech recognition.

Conflict of interest statement

The authors have no conflicts of interest to disclose.

Figures

Figure 1.
Performance on working memory and speech tasks. For reading span (top left) and digit span (bottom left), edit distance score as a percentage of maximum possible score is shown as a function of list length. Thin lines show recall accuracy for each participant, and the thick line shows the group average. In-lab participants are depicted with black lines and remote participants are depicted with gray lines. For free recall (top right), the number of items in each list was fixed, so the average number of items recalled across all lists is shown as a bar for each participant. In-lab participants are depicted with black bars and remote participants are depicted with gray bars. For speech recognition (bottom right), keyword recognition accuracy is shown as a function of target-to-masker ratio. Thin lines show keyword recognition accuracy for each participant, and the thick line shows the group average.
Figure 2.
Mean PRESTO sentence recognition accuracy across target-to-masker ratios as a function of performance in each memory task. Each dot shows data from one individual, and lines show standard major axis regression fits (Legendre, 2013) across individuals. All correlations were significant (p < 0.05).
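For readers unfamiliar with standard major axis regression, the following is a minimal sketch of how such a fit can be computed. It is not the authors' code; the function name `sma_fit` and the example data are hypothetical. The slope is the ratio of the standard deviations of y and x, signed by their covariance, and the line passes through the point of means.

```python
import math

def sma_fit(x, y):
    """Standard major axis fit: returns (slope, intercept).

    Hypothetical helper for illustration only. Assumes x and y are
    equal-length sequences with nonzero variance in x.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Slope magnitude is sd(y) / sd(x); its sign follows the covariance
    # (a covariance of exactly zero is treated as positive here)
    slope = math.copysign(math.sqrt(syy / sxx), sxy)
    # The fitted line passes through (mean_x, mean_y)
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up example data, not values from the study
slope, intercept = sma_fit([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

Unlike ordinary least squares, this fit treats x and y symmetrically, which is a common reason to prefer it when both variables are measured with error.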
Figure 3.
The best-fitting latent factor model for speech recognition across target-to-masker ratios. (A) Group-level keyword recognition accuracy for each target-to-masker ratio is given as μ (reported in log odds), and individual performance varied relative to the group based on the product of their individual speech recognition ability η and the amount that individual ability contributed to accuracy for each target-to-masker ratio, λ. (B) Each shaded region depicts the posterior probability density of speech recognition ability η for one participant.
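The structure described in this caption can be sketched as follows, assuming a logistic link (suggested by μ being reported in log odds, but not confirmed here). The function name `predicted_accuracy` and all numeric values are hypothetical; this illustrates the single-factor form only and is not the authors' implementation.

```python
import math

def predicted_accuracy(mu, lam, eta):
    """Predicted probability of a correct keyword for one participant
    at one target-to-masker ratio.

    mu  : group-level accuracy at this ratio, in log odds
    lam : how strongly individual ability contributes at this ratio
    eta : the participant's latent speech recognition ability

    Assumption: accuracy is the inverse logit of mu + lam * eta.
    """
    log_odds = mu + lam * eta
    return 1.0 / (1.0 + math.exp(-log_odds))

# Made-up parameter values for three target-to-masker ratios
mus = [1.5, 0.5, -0.5]
lams = [0.8, 1.0, 1.2]
eta = 0.3  # one hypothetical participant, slightly above the group mean
accuracies = [predicted_accuracy(m, l, eta) for m, l in zip(mus, lams)]
```

Under this form, a participant with η = 0 performs at the group level at every ratio, and a larger λ means individual differences matter more at that ratio.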
Figure 4.
The best-fitting latent factor model for working memory tasks and across list lengths within digit and reading span tasks. (A) Parameters are depicted as in Figure 3, although arrows depicting the contribution of each latent factor, η, to accuracy on each task, λ, are omitted for clarity. Model parameters λ that had a Bayes Factor greater than 3:1 in favor of lying outside of the region of practical equivalence (see text for details) are in bold, parameters with Bayes Factors between 3:1 and 1:1 are shown in black, and parameters with Bayes Factors less than 1:1 are shown in gray. (B) The two latent working memory factors differ across individuals and are not correlated across the group. Most likely values of each latent factor for each participant are depicted as circles, and shaded gradients around each most likely value depict the corresponding probability density for that participant.
Figure 5.
The best-fitting latent factor model linking individual differences in latent working memory abilities, η1 and η2, to individual differences in speech recognition ability, ηspeech. (A) The parameters linking individual differences in latent factors to performance in each task and task condition are depicted as in Figures 3 and 4. In this model, the regression coefficient β1 was fixed to 1 because preliminary models failed to converge when attempting to approach this boundary value. (B) The two latent individual difference factors obtained by this model fit are depicted as in Figure 4B for participants who completed both the memory and speech tasks.

