Attention in Natural Language Processing
- PMID: 32915750
- DOI: 10.1109/TNNLS.2020.3019893
Abstract
Attention is an increasingly popular mechanism used in a wide range of neural architectures, and it has been realized in a variety of forms. However, because of the fast-paced advances in this domain, a systematic overview of attention is still missing. In this article, we define a unified model for attention architectures in natural language processing, with a focus on those designed to work with vector representations of textual data. We propose a taxonomy of attention models according to four dimensions: the representation of the input, the compatibility function, the distribution function, and the multiplicity of the input and/or output. We present examples of how prior information can be exploited in attention models and discuss ongoing research efforts and open challenges in the area, providing the first extensive categorization of the vast body of literature in this domain.
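Two of the taxonomy's dimensions, the compatibility function and the distribution function, can be illustrated with a minimal sketch. The code below is not the article's unified model. It assumes a common query/keys/values formulation, a dot-product compatibility function, and a softmax distribution function. All names (`attend`, `dot_compatibility`, `softmax_distribution`) are hypothetical and chosen for illustration only.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def dot_compatibility(query: Vector, key: Vector) -> float:
    """Assumed compatibility function: dot product of query and key."""
    return sum(q * k for q, k in zip(query, key))


def softmax_distribution(scores: Sequence[float]) -> List[float]:
    """Assumed distribution function: softmax, so the weights sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(
    query: Vector,
    keys: Sequence[Vector],
    values: Sequence[Vector],
    compatibility: Callable[[Vector, Vector], float] = dot_compatibility,
    distribution: Callable[[Sequence[float]], List[float]] = softmax_distribution,
) -> Tuple[List[float], List[float]]:
    """Return (attention weights, context vector) for a single query."""
    # 1. Score each key against the query.
    scores = [compatibility(query, key) for key in keys]
    # 2. Turn the scores into attention weights.
    weights = distribution(scores)
    # 3. Build the context vector as the weighted sum of the values.
    dim = len(values[0])
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, context


# Toy input: two 2-dimensional keys that also serve as the values.
keys = [[1.0, 0.0], [0.0, 1.0]]
weights, context = attend(query=[1.0, 0.0], keys=keys, values=keys)
```

Because the compatibility and distribution functions are passed as arguments, either one can be swapped (for example, an additive scoring function or a sparse normalizer) without changing `attend`, which loosely mirrors how the taxonomy treats them as independent dimensions.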