Comparing Pre-trained and Feature-Based Models for Prediction of Alzheimer's Disease Based on Speech

Aparna Balagopalan^{1,2,3}, Benjamin Eyre¹, Jessica Robin¹, Frank Rudzicz^{2,3,4}, Jekaterina Novikova¹

Affiliations

¹ Winterlight Labs Inc., Toronto, ON, Canada.
² Department of Computer Science, University of Toronto, Toronto, ON, Canada.
³ Vector Institute for Artificial Intelligence, Toronto, ON, Canada.
⁴ Unity Health Toronto, Toronto, ON, Canada.

PMID: 33986655
PMCID: PMC8110916
DOI: 10.3389/fnagi.2021.635945

Aparna Balagopalan et al. Front Aging Neurosci. 2021 Apr 27:13:635945. doi: 10.3389/fnagi.2021.635945. eCollection 2021.

Abstract

Introduction: Research on the automatic detection of Alzheimer's disease (AD) is important, given the high prevalence of AD and the high cost of traditional diagnostic methods. Since AD significantly affects the content and acoustics of spontaneous speech, natural language processing and machine learning provide promising techniques for reliably detecting AD. There has been a recent proliferation of classification models for AD, but these vary in the datasets used, model types, and training and testing paradigms. In this study, we compare the performance of two common approaches for automatic AD detection from speech on the same, well-matched dataset, to determine the advantages of using domain knowledge vs. pre-trained transfer models.

Methods: Audio recordings and corresponding manually transcribed speech transcripts of a picture description task administered to 156 demographically matched older adults, 78 with AD and 78 cognitively intact (healthy), were classified using machine learning and natural language processing as "AD" or "non-AD." The audio was acoustically enhanced and post-processed to improve the quality of the speech recordings as well as to control for variation caused by recording conditions. Two approaches were used for classification of these speech samples: (1) using domain knowledge: extracting an extensive set of clinically relevant linguistic and acoustic features derived from speech and transcripts based on prior literature, and (2) using transfer learning and leveraging large pre-trained machine learning models: using transcript representations that are automatically derived from state-of-the-art pre-trained language models, by fine-tuning Bidirectional Encoder Representations from Transformers (BERT)-based sequence classification models.
Results: We compared the utility of speech transcript representations obtained from recent natural language processing models (i.e., BERT) to more clinically interpretable language feature-based methods. Both the feature-based approaches and fine-tuned BERT models significantly outperformed the baseline linguistic model using a small set of linguistic features, demonstrating the importance of extensive linguistic information for detecting cognitive impairments relating to AD. We observed that fine-tuned BERT models numerically outperformed feature-based approaches on the AD detection task, but the difference was not statistically significant. Our main contribution is the observation that when evaluated on the same, demographically balanced dataset and on independent, unseen data, both domain knowledge and pre-trained linguistic models have good predictive performance for detecting AD based on speech. It is notable that linguistic information alone is capable of achieving comparable, and even numerically better, performance than models including both acoustic and linguistic features here. We also try to shed light on the inner workings of the more black-box natural language processing model by performing an interpretability analysis, and find that attention weights reveal interesting patterns such as higher attribution to more important information content units in the picture description task, as well as to pauses and filler words.

Conclusion: This approach supports the value of well-performing machine learning and linguistically focussed processing techniques to detect AD from speech and highlights the need to compare model performance on carefully balanced datasets, using the same training parameters and independent test datasets, in order to determine the best performing predictive model.

Keywords: Alzheimer's disease; BERT; MMSE regression; dementia detection; feature engineering; transfer learning.
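
As a rough illustration of what the feature-based approach in the abstract involves, the following minimal sketch computes a handful of simple lexical features from a transcript. It is not the authors' feature set: the filler list, the `[pause]` marker, the `CONTENT_UNITS` set, and the function name `extract_features` are all hypothetical placeholders chosen for this example.

```python
import re
from collections import Counter

# All three constants below are hypothetical placeholders, not taken from the paper.
FILLERS = {"uh", "um", "er", "ah"}
PAUSE_TOKEN = "[pause]"
CONTENT_UNITS = {"boy", "girl", "cookie", "stool", "sink", "mother"}


def extract_features(transcript: str) -> dict:
    """Compute a few illustrative lexical features from one transcript."""
    # Tokenize into pause markers and lowercase word tokens.
    tokens = re.findall(r"\[pause\]|[a-z']+", transcript.lower())
    words = [t for t in tokens if t != PAUSE_TOKEN]
    n_words = len(words)
    counts = Counter(words)
    return {
        "n_words": n_words,
        # Lexical diversity: distinct words divided by total words.
        "type_token_ratio": len(counts) / n_words if n_words else 0.0,
        # Share of word tokens that are filler words.
        "filler_rate": sum(counts[f] for f in FILLERS) / n_words if n_words else 0.0,
        "n_pauses": tokens.count(PAUSE_TOKEN),
        # Number of distinct placeholder content units mentioned at least once.
        "n_content_units": sum(1 for u in CONTENT_UNITS if u in counts),
    }


if __name__ == "__main__":
    print(extract_features("the boy is uh [pause] taking a cookie"))
```

In a feature-based pipeline, dictionaries like this one would be stacked into a feature matrix and passed to a conventional classifier, whereas the transfer-learning approach feeds the raw transcript to a fine-tuned language model instead.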

Conflict of interest statement

Authors AB, BE, JR, and JN were employed by the company Winterlight Labs Inc. The remaining author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

**Figure 1**
A t-SNE plot showing class separation. Note we only use the 13 features significantly different between classes (see Table 10) in feature representation for this plot.

**Figure 2**
An attention visualization plot showing attention contributions of embeddings corresponding to each word to the “pooled” representation. This example is a sub-sample (first two utterances) of a speech transcript from a healthy person.
