SN Comput Sci. 2023;4(5):428.
doi: 10.1007/s42979-023-01866-2. Epub 2023 Jun 7.

A Semi-automated Approach for Bengali Neologism

Apurbalal Senapati. SN Comput Sci. 2023.

Abstract

Neologisms are newly coined words or phrases adopted by a language; their emergence is a slow but ongoing process in all languages. Rarely used or obsolete words are sometimes also considered neologisms. Certain events, such as wars, the emergence of new diseases, or advancements like computers and the internet, can trigger the creation of new words. The COVID-19 pandemic is one such event: it rapidly produced an explosion of neologisms, both about the disease itself and in several other social contexts. Even the term COVID-19 is itself newly coined. Studying and quantifying such adaptation or change is essential from a linguistic perspective. However, identifying newly coined terms, or extracting neologisms computationally, is a challenging task. Standard tools and techniques for finding newly coined terms in English and similar languages may not be suitable for Bengali and other Indic languages. This study uses a semi-automated approach to investigate the emergence or modification of words in the Bengali language during the COVID-19 pandemic. For this purpose, a Bengali web corpus was compiled from COVID-19-related articles drawn from various Bengali web sources. The current experiment focuses solely on COVID-19-related neologisms, but the method can be adapted for general purposes and extended to other languages.
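The abstract does not spell out the extraction procedure, so the following is only a minimal sketch of one common first step in corpus-based neologism detection: collect word forms from a topical corpus, discard those found in a reference word list, and rank the remainder by frequency for manual review (the "semi-automated" part). The function name `candidate_neologisms`, the `reference_lexicon` set, and the `min_count` threshold are hypothetical and are not taken from the paper.

```python
import re
from collections import Counter

# Assumption: a "word" is a run of characters from the Bengali Unicode
# block (U+0980..U+09FF). Punctuation such as the danda and any Latin
# text fall outside this range and act as separators.
BENGALI_WORD = re.compile(r"[\u0980-\u09FF]+")


def candidate_neologisms(corpus_text, reference_lexicon, min_count=1):
    """Return (word, count) pairs for corpus words absent from the lexicon.

    Results are sorted by descending count, then by word, so that a human
    reviewer can inspect the most frequent candidates first.
    """
    counts = Counter(BENGALI_WORD.findall(corpus_text))
    candidates = [
        (word, count)
        for word, count in counts.items()
        if word not in reference_lexicon and count >= min_count
    ]
    candidates.sort(key=lambda pair: (-pair[1], pair[0]))
    return candidates


# Toy example: the lexicon is a hypothetical pre-pandemic word list.
sample = "করোনা ভাইরাস করোনা লকডাউন দেশ"
lexicon = {"দেশ", "ভাইরাস"}
ranked = candidate_neologisms(sample, lexicon)
```

A filter of this kind only proposes candidates. Inflected forms, spelling variants, and named entities would still need manual checking or further rules before anything is accepted as a neologism.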

Keywords: Bengali; COVID-19; Corpus; Language; Linguistic analysis; Neologisms; Word formation.

Conflict of interest statement

The author, Apurbalal Senapati, declares that he has no conflict of interest.

Figures

Fig. 1. System of the corpus creation from the web
Fig. 2. System flowchart: execution flow
Fig. 3. Frequency-wise rank list of COVID-19 related newly added terms in Bengali
Fig. 4. Lowest frequency-wise rank list of COVID-19 related newly added terms in Bengali
Fig. 5. Derived neologisms
Fig. 6. Word-formation rules
Fig. 7. Error classification
Fig. 8. System identified neologism and errors
