Alignment of brain embeddings and artificial contextual embeddings in natural language points to common geometric patterns
- PMID: 38553456
- PMCID: PMC10980748
- DOI: 10.1038/s41467-024-46631-y
Erratum in
- Author Correction: Alignment of brain embeddings and artificial contextual embeddings in natural language points to common geometric patterns. Nat Commun. 2024 Oct 1;15(1):8500. doi: 10.1038/s41467-024-52626-6. PMID: 39353920 Free PMC article. No abstract available.
Abstract
Contextual embeddings, derived from deep language models (DLMs), provide a continuous vectorial representation of language. This embedding space differs fundamentally from the symbolic representations posited by traditional psycholinguistics. We hypothesize that language areas in the human brain, similar to DLMs, rely on a continuous embedding space to represent language. To test this hypothesis, we densely record the neural activity patterns in the inferior frontal gyrus (IFG) of three participants using dense intracranial arrays while they listened to a 30-minute podcast. From these fine-grained spatiotemporal neural recordings, we derive a continuous vectorial representation for each word (i.e., a brain embedding) in each patient. Using stringent zero-shot mapping we demonstrate that brain embeddings in the IFG and the DLM contextual embedding space have common geometric patterns. The common geometric patterns allow us to predict the brain embedding in IFG of a given left-out word based solely on its geometrical relationship to other non-overlapping words in the podcast. Furthermore, we show that contextual embeddings capture the geometry of IFG embeddings better than static word embeddings. The continuous brain embedding space exposes a vector-based neural code for natural language processing in the human brain.
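The zero-shot mapping described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes a ridge-regularized linear map from contextual embeddings to brain embeddings, folds built over unique words so that no test word appears in training, and synthetic data in place of the recordings. The function name `zero_shot_predict`, the fold count, the penalty `alpha`, and all array sizes are hypothetical choices made for illustration.

```python
import numpy as np


def zero_shot_predict(X, Y, words, n_folds=10, alpha=1.0, seed=0):
    """Predict the rows of Y for held-out words using a linear map
    fitted only on tokens of other, non-overlapping words.

    X     : (n_tokens, d_model) contextual embeddings (assumed input)
    Y     : (n_tokens, d_brain) brain embeddings (assumed input)
    words : (n_tokens,) word identity of each token
    """
    rng = np.random.default_rng(seed)
    unique_words = np.unique(words)
    rng.shuffle(unique_words)
    # Folds are formed over unique words, not tokens, so every token of a
    # held-out word is excluded from training.
    folds = np.array_split(unique_words, n_folds)

    Y_pred = np.zeros(Y.shape, dtype=float)
    for held_out in folds:
        test = np.isin(words, held_out)
        train = ~test

        # Center with training statistics only, to avoid leakage.
        x_mean = X[train].mean(axis=0)
        y_mean = Y[train].mean(axis=0)
        Xc = X[train] - x_mean
        Yc = Y[train] - y_mean

        # Closed-form ridge solution: W = (Xc'Xc + alpha*I)^-1 Xc'Yc
        gram = Xc.T @ Xc + alpha * np.eye(X.shape[1])
        W = np.linalg.solve(gram, Xc.T @ Yc)

        Y_pred[test] = (X[test] - x_mean) @ W + y_mean
    return Y_pred


# Synthetic stand-in data: a noisy linear relation between the two spaces.
data_rng = np.random.default_rng(0)
n_tokens, d_model, d_brain = 400, 20, 8
words = data_rng.integers(0, 100, size=n_tokens).astype(str)
X = data_rng.normal(size=(n_tokens, d_model))
W_true = data_rng.normal(size=(d_model, d_brain))
Y = X @ W_true + 0.1 * data_rng.normal(size=(n_tokens, d_brain))

Y_pred = zero_shot_predict(X, Y, words)

# Correlation between predicted and actual values, per brain dimension.
corr = np.array(
    [np.corrcoef(Y[:, j], Y_pred[:, j])[0, 1] for j in range(d_brain)]
)
mean_corr = float(corr.mean())
print(f"mean held-out correlation: {mean_corr:.3f}")
```

Because the folds are formed over unique words, any successful prediction for a held-out word has to come from the geometry shared between the two spaces rather than from having seen that word during fitting.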
© 2024. The Author(s).
Conflict of interest statement
The authors declare no competing interests.
Similar articles
- Shared computational principles for language processing in humans and deep language models. Nat Neurosci. 2022 Mar;25(3):369-380. doi: 10.1038/s41593-022-01026-4. Epub 2022 Mar 7. PMID: 35260860 Free PMC article.
- Enhancing clinical concept extraction with contextual embeddings. J Am Med Inform Assoc. 2019 Nov 1;26(11):1297-1304. doi: 10.1093/jamia/ocz096. PMID: 31265066 Free PMC article.
- A unified acoustic-to-speech-to-language embedding space captures the neural basis of natural language processing in everyday conversations. Nat Hum Behav. 2025 May;9(5):1041-1055. doi: 10.1038/s41562-025-02105-9. Epub 2025 Mar 7. PMID: 40055549 Free PMC article.
- Visualization of medical concepts represented using word embeddings: a scoping review. BMC Med Inform Decis Mak. 2022 Mar 29;22(1):83. doi: 10.1186/s12911-022-01822-9. PMID: 35351120 Free PMC article.
- SECNLP: A survey of embeddings in clinical natural language processing. J Biomed Inform. 2020 Jan;101:103323. doi: 10.1016/j.jbi.2019.103323. Epub 2019 Nov 8. PMID: 31711972 Review.
Cited by
- Evaluating large language models in theory of mind tasks. Proc Natl Acad Sci U S A. 2024 Nov 5;121(45):e2405460121. doi: 10.1073/pnas.2405460121. Epub 2024 Oct 29. PMID: 39471222 Free PMC article.
- The "Podcast" ECoG dataset for modeling neural activity during natural language comprehension. Sci Data. 2025 Jul 3;12(1):1135. doi: 10.1038/s41597-025-05462-2. PMID: 40610484 Free PMC article.
- Individual Differences in Statistical Learning and Semantic Adaptation: An N400 Study. Psychophysiology. 2025 Aug;62(8):e70125. doi: 10.1111/psyp.70125. PMID: 40767276 Free PMC article.
- Approximating the semantic space: word embedding techniques in psychiatric speech analysis. Schizophrenia (Heidelb). 2024 Dec 2;10(1):114. doi: 10.1038/s41537-024-00524-7. PMID: 39622800 Free PMC article.
- Brain-model neural similarity reveals abstractive summarization performance. Sci Rep. 2025 Jan 2;15(1):370. doi: 10.1038/s41598-024-84530-w. PMID: 39747634 Free PMC article.