Ann Neurol. 2023 Oct;94(4):647-657.
doi: 10.1002/ana.26744. Epub 2023 Aug 18.

Measuring Sentence Information via Surprisal: Theoretical and Clinical Implications in Nonfluent Aphasia

Neguine Rezaii et al. Ann Neurol. 2023 Oct.

Abstract

Objective: Nonfluent aphasia is characterized by simplified sentence structures and word-level abnormalities, including reduced use of verbs and function words. The predominant belief about the disease mechanism is that a core deficit in syntax processing causes both structural and word-level abnormalities. Here, we propose an alternative view based on information theory to explain the symptoms of nonfluent aphasia. We hypothesize that the word-level features of nonfluency constitute a distinct compensatory process to augment the information content of sentences to the level of healthy speakers. We refer to this process as lexical condensation.

Methods: We use a computational approach based on language models to measure sentence information through surprisal, a metric calculated from the average negative log probability of the words in a sentence, given their preceding context. We apply this method to the language of patients with nonfluent primary progressive aphasia (nfvPPA; n = 36) and healthy controls (n = 133) as they describe a picture.
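
As a minimal illustration of the surprisal measure described in the Methods, the sketch below averages per-word surprisal over a sentence. It assumes the conditional probabilities of each word given its preceding context have already been obtained from some language model; the abstract does not name the model or the logarithm base, so base 2 (bits), the function names, and the example probabilities are all hypothetical choices made for illustration.

```python
import math
from typing import List, Sequence


def word_surprisals(probs: Sequence[float]) -> List[float]:
    """Surprisal of each word, -log2 P(word | preceding context), in bits.

    `probs` holds one conditional probability per word. The base-2
    logarithm is an assumption; the abstract does not state the base.
    """
    if any(not (0.0 < p <= 1.0) for p in probs):
        raise ValueError("each probability must be in the interval (0, 1]")
    return [-math.log2(p) for p in probs]


def sentence_surprisal(probs: Sequence[float]) -> float:
    """Average surprisal over all words of one sentence."""
    values = word_surprisals(probs)
    if not values:
        raise ValueError("a sentence needs at least one word")
    return sum(values) / len(values)


# Made-up probabilities for a three-word sentence (not data from the study).
example_probs = [0.5, 0.25, 0.125]
print(word_surprisals(example_probs))     # [1.0, 2.0, 3.0]
print(sentence_surprisal(example_probs))  # 2.0
```

Less probable words contribute higher surprisal, so a sentence built from rarer words in context carries more information per word under this measure.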

Results: We found that nfvPPA patients produced sentences with the same sentence surprisal as healthy controls by using richer words in their structurally impoverished sentences. Furthermore, higher surprisal in nfvPPA sentences correlated with the canonical features of agrammatism: a lower function-to-all-word ratio, a lower verb-to-noun ratio, a higher heavy-to-all-verb ratio, and a higher ratio of verbs in -ing forms.
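
Some of the word-level ratios named in the Results can be sketched from part-of-speech-tagged tokens, as below. This is an illustrative assumption rather than the study's procedure: the abstract does not describe how words were tagged or which classes count as function words, so the Universal POS-style tag names, the `FUNCTION_TAGS` set, and the helper names are hypothetical. The heavy-to-all-verb ratio is left out because it needs a list of heavy versus light verbs that the abstract does not give.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

# Assumed set of function-word tags (Universal POS-style labels).
FUNCTION_TAGS = {"DET", "ADP", "AUX", "PRON", "CCONJ", "SCONJ", "PART"}


def _ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator / denominator, or None when the denominator is 0."""
    return numerator / denominator if denominator else None


def word_level_ratios(tagged: List[Tuple[str, str]]) -> Dict[str, Optional[float]]:
    """Compute three sentence-level ratios from (word, tag) pairs."""
    counts = Counter(tag for _, tag in tagged)
    total = len(tagged)
    function_words = sum(counts[tag] for tag in FUNCTION_TAGS)
    verbs = counts["VERB"]
    nouns = counts["NOUN"]
    # Crude check for -ing forms: a verb whose spelling ends in "ing".
    ing_verbs = sum(
        1 for word, tag in tagged
        if tag == "VERB" and word.lower().endswith("ing")
    )
    return {
        "function_to_all": _ratio(function_words, total),
        "verb_to_noun": _ratio(verbs, nouns),
        "ing_to_all_verbs": _ratio(ing_verbs, verbs),
    }


# Hand-tagged example sentence (not taken from the study's data).
sentence = [
    ("The", "DET"), ("boy", "NOUN"), ("is", "AUX"), ("reaching", "VERB"),
    ("for", "ADP"), ("the", "DET"), ("cookie", "NOUN"),
]
ratios = word_level_ratios(sentence)
print(ratios["verb_to_noun"])      # 0.5
print(ratios["ing_to_all_verbs"])  # 1.0
```

Dropping the function words from the example sentence ("boy reaching cookie") would lower the function-to-all-word ratio to 0, the direction the Results associate with higher surprisal.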

Interpretation: Using surprisal enables testing an alternative account of nonfluent aphasia that regards its word-level features as adaptive, rather than defective, symptoms, a finding that would call for revisions in the therapeutic approach to nonfluent language production. ANN NEUROL 2023;94:647-657.

Conflict of interest statement

Potential Conflict of Interest

The Authors declare no Competing Financial or Non-Financial Interests.

Figures

Figure 1.
The working hypothesis of the study. Each sentence, represented by the black sliding bar, can be made up of a different share of lexical and syntactic information. In a healthy individual, the bar can slide over a wide range of possible combinations of the two sources of information while keeping the sentence information constant. In nfvPPA, the pathological process limits the use of complex syntax, pushing the sliding bar to the left, where more informative words must be selected to convey the intended message. Sentence information is measured by average surprisal, lexical information by average content word frequency, and syntactic information by average syntax frequency.
Figure 2.
Sentence surprisal is predicted by both word frequency and syntax frequency within each sentence. Partial effect plots for each smooth term in the generalized additive model (GAM), word frequency and syntax frequency, illustrate each component of the model predicting sentence surprisal. Shaded zones show 95% confidence intervals around the mean of the effect. “Adjusted sentence surprisal” is the sentence surprisal adjusted for the other frequency variable (e.g., in the plot showing that word frequency is inversely correlated with adjusted sentence surprisal, sentence surprisal is adjusted for syntax frequency).
Figure 3.
Density plots of word frequency, syntax frequency, and surprisal at the sentence level, comparing spoken and written modalities in nfvPPA patients and healthy controls. * denotes p < 0.05, ** p < 0.01, and NS not significant.
Figure 4.
The radar chart shows the absolute value of the repeated measures correlation coefficient (r_rm) between surprisal and various word-level features at the sentence level in the written and spoken samples of nfvPPA patients.
