J Med Internet Res. 2023 Sep 12:25:e44897.
doi: 10.2196/44897.

Construction of an Emotional Lexicon of Patients With Breast Cancer: Development and Sentiment Analysis

Chaixiu Li et al. J Med Internet Res.

Abstract

Background: The innovative method of sentiment analysis based on an emotional lexicon shows prominent advantages in capturing emotional information, such as individual attitudes, experiences, and needs, which provides a new perspective and method for emotion recognition and management for patients with breast cancer (BC). However, at present, sentiment analysis in the field of BC is limited, and there is no emotional lexicon for this field. Therefore, it is necessary to construct an emotional lexicon that conforms to the characteristics of patients with BC so as to provide a new tool for accurate identification and analysis of the patients' emotions and a new method for their personalized emotion management.

Objective: This study aimed to construct an emotional lexicon of patients with BC.

Methods: Emotional words were obtained by merging the words in 2 general sentiment lexicons, the Chinese Linguistic Inquiry and Word Count (C-LIWC) and HowNet, and the words in text corpora acquired from patients with BC via Weibo, semistructured interviews, and expressive writing. The lexicon was constructed using manual annotation and classification under the guidance of Russell's valence-arousal space. Ekman's basic emotional categories, Lazarus' cognitive appraisal theory of emotion, and a qualitative text analysis based on the text corpora of patients with BC were combined to determine the fine-grained emotional categories of the lexicon we constructed. Precision, recall, and the F1-score were used to evaluate the lexicon's performance.
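The merge-and-evaluate workflow described in Methods can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the English placeholder words, the `pos`/`neg` labels, and the helper names `merge_lexicons` and `precision_recall_f1` are all assumptions made for the example, and the real lexicon was built by manual annotation rather than by a simple dictionary merge.

```python
from typing import Dict, Tuple


def merge_lexicons(*lexicons: Dict[str, str]) -> Dict[str, str]:
    """Merge word -> polarity maps; later lexicons override earlier ones."""
    merged: Dict[str, str] = {}
    for lexicon in lexicons:
        merged.update(lexicon)
    return merged


def precision_recall_f1(
    gold: Dict[str, str], predicted: Dict[str, str], target: str
) -> Tuple[float, float, float]:
    """Score one polarity class against gold labels.

    TP: gold is target and prediction is target.
    FP: gold is not target but prediction is target.
    FN: gold is target but prediction is missing or different.
    """
    tp = sum(1 for w, y in gold.items() if y == target and predicted.get(w) == target)
    fp = sum(1 for w, y in gold.items() if y != target and predicted.get(w) == target)
    fn = sum(1 for w, y in gold.items() if y == target and predicted.get(w) != target)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical toy data (not taken from the study).
general_lexicon = {"hopeful": "pos", "grateful": "pos", "afraid": "neg"}
domain_words = {"relieved": "pos", "nauseous": "neg", "hopeless": "pos"}  # last label is a deliberate error
gold_labels = {
    "hopeful": "pos", "grateful": "pos", "relieved": "pos",
    "afraid": "neg", "nauseous": "neg", "hopeless": "neg",
}

lexicon = merge_lexicons(general_lexicon, domain_words)
p, r, f1 = precision_recall_f1(gold_labels, lexicon, target="pos")
print(f"positive class: P={p:.2f} R={r:.2f} F1={f1:.2f}")
```

In this toy case one negative word is mislabeled as positive, so precision for the positive class drops while recall stays at 1.0, which is the kind of trade-off the three metrics are meant to expose.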

Results: The text corpora collected from patients in different stages of BC included 150 written materials, 17 interviews, and 6689 original posts and comments from Weibo, with a total of 1,923,593 Chinese characters. The emotional lexicon of patients with BC contained 9357 words and covered 8 fine-grained emotional categories: joy, anger, sadness, fear, disgust, surprise, somatic symptoms, and BC terminology. Experimental results showed that precision, recall, and the F1-score of positive emotional words were 98.42%, 99.73%, and 99.07%, respectively, and those of negative emotional words were 99.73%, 98.38%, and 99.05%, respectively, which all significantly outperformed the C-LIWC and HowNet.
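As a quick consistency check on the figures above, the F1-score can be recomputed from the reported precision and recall, assuming the standard definition F1 = 2PR / (P + R). The function name `f1_from_pr` is introduced here only for illustration.

```python
def f1_from_pr(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both in the same units, here percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) in percent, as stated in the Results.
reported = {
    "positive": (98.42, 99.73, 99.07),
    "negative": (99.73, 98.38, 99.05),
}

for polarity, (p, r, f1_reported) in reported.items():
    f1 = f1_from_pr(p, r)
    print(f"{polarity}: recomputed F1 = {f1:.2f}%, reported = {f1_reported:.2f}%")
```

Rounded to two decimals, the recomputed values match the reported F1-scores for both polarities.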

Conclusions: The emotional lexicon with fine-grained emotional categories conforms to the characteristics of patients with BC. It identifies and classifies domain-specific emotional words in BC better than the C-LIWC and HowNet. The lexicon provides a new tool for sentiment analysis in the field of BC and a new perspective for recognizing the specific emotional states and needs of patients with BC and for formulating tailored emotional management plans.

Keywords: breast cancer; domain emotional lexicon; lexicon construction; natural language processing; sentiment analysis.

Conflict of interest statement

Conflicts of Interest: None declared.

Figures

Figure 1. Construction process of the emotional lexicon of patients with BC. BC: breast cancer; C-LIWC: Chinese Linguistic Inquiry and Word Count.
Figure 2. Process of text corpora preprocessing.
Figure 3. Process of emotional word screening and determination. C-LIWC: Chinese Linguistic Inquiry and Word Count.
Figure 4. Process of lexicon performance evaluation. C-LIWC: Chinese Linguistic Inquiry and Word Count; FN: false negative; FP: false positive; P: precision; R: recall; TN: true negative; TP: true positive.
