Comparing text mining and manual coding methods: Analysing interview data on quality of care in long-term care for older adults

Coen Hacking¹ ², Hilde Verbeek¹ ², Jan P H Hamers¹ ², Sil Aarts¹ ²

Affiliations

¹ Faculty of Health Medicine and Life Sciences, Department of Health Services Research, CAPHRI Care and Public Health Research Institute, Maastricht University, Maastricht, The Netherlands.
² The Living Lab in Ageing & Long-Term Care, Maastricht, The Netherlands.

PMID: 37939098
PMCID: PMC10631650
DOI: 10.1371/journal.pone.0292578

Coen Hacking et al. PLoS One. 2023 Nov 8;18(11):e0292578. doi: 10.1371/journal.pone.0292578. eCollection 2023.

Abstract

Objectives: In long-term care for older adults, large amounts of text are collected relating to the quality of care, such as transcribed interviews. Researchers currently analyze textual data manually to gain insights, which is a time-consuming process. Text mining could provide a solution, as this methodology can be used to analyze large amounts of text automatically. This study aims to compare text mining to manual coding with regard to sentiment analysis and thematic content analysis.

Methods: Data were collected from interviews with residents (n = 21), family members (n = 20), and care professionals (n = 20). Text mining models were developed and compared to the manual approach. The results of the manual and text mining approaches were evaluated based on three criteria: accuracy, consistency, and expert feedback. Accuracy assessed the similarity between the two approaches, while consistency determined whether each individual approach found the same themes in similar text segments. Expert feedback served as a representation of the perceived correctness of the text mining approach.
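The accuracy criterion can be illustrated with a minimal sketch, assuming accuracy is computed as the share of text segments that receive the same label from both approaches. The function name `agreement_rate`, the label values, and the sample data are hypothetical and are not taken from the study.

```python
def agreement_rate(manual_labels, mined_labels):
    """Return the fraction of segments labelled identically by both approaches."""
    if len(manual_labels) != len(mined_labels):
        raise ValueError("Both label lists must cover the same segments.")
    if not manual_labels:
        raise ValueError("At least one segment is required.")
    # Count segments where the manual code equals the text mining prediction.
    matches = sum(m == t for m, t in zip(manual_labels, mined_labels))
    return matches / len(manual_labels)


# Hypothetical sentiment labels for five text segments (illustrative only).
manual = ["positive", "negative", "neutral", "positive", "negative"]
mined = ["positive", "negative", "positive", "positive", "negative"]

rate = agreement_rate(manual, mined)
print(f"Agreement: {rate:.0%}")
```

In this made-up example the two approaches agree on four of the five segments.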

Results: An accuracy analysis revealed that more than 80% of the text segments were assigned the same themes and sentiment using both text mining and manual approaches. Interviews coded with text mining demonstrated higher consistency compared to those coded manually. Expert feedback identified certain limitations in both the text mining and manual approaches.

Conclusions and implications: While these analyses highlighted the current limitations of text mining, they also exposed certain inconsistencies in manual analysis. This information suggests that text mining has the potential to be an effective and efficient tool for analysing large volumes of textual data in the context of long-term care for older adults.

Copyright: © 2023 Hacking et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

**Fig 1. Confusion matrix comparing sentiment analysis results of the manual and text mining approach.**
The matrix compares manual coding (rows, y-axis) against text mining predictions (columns, x-axis) for the sentiment of the text. Each cell gives the percentage of occurrences of a particular sentiment alignment (or misalignment) between the two approaches. The diagonal cells (from top left to bottom right) show agreement between the two methods, whereas all off-diagonal cells indicate discrepancies. For instance, the cell at the intersection of the "Positive" row and the "Negative" column covers text that was manually coded as positive but predicted as negative by text mining.
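A matrix laid out as described here could be built roughly as follows. This is a sketch under stated assumptions, not the authors' implementation: the function `confusion_percentages`, the two-class label set, and the sample labels are all hypothetical.

```python
from collections import Counter


def confusion_percentages(manual_labels, mined_labels, classes):
    """Build a matrix with manual codes as rows and text mining predictions
    as columns; each cell is a percentage of all segments."""
    if len(manual_labels) != len(mined_labels) or not manual_labels:
        raise ValueError("Need two non-empty label lists of equal length.")
    total = len(manual_labels)
    # Count each (manual, predicted) pair once per segment.
    counts = Counter(zip(manual_labels, mined_labels))
    return [
        [100.0 * counts[(row, col)] / total for col in classes]
        for row in classes
    ]


# Hypothetical labels for five segments (illustrative only).
classes = ["positive", "negative"]
manual = ["positive", "positive", "negative", "negative", "positive"]
mined = ["positive", "negative", "negative", "negative", "positive"]

matrix = confusion_percentages(manual, mined, classes)
# The diagonal cells sum to the overall percentage of agreement.
agreement = sum(matrix[i][i] for i in range(len(classes)))

for label, row in zip(classes, matrix):
    print(label, row)
print("agreement %:", agreement)
```

Here `matrix[0][1]` corresponds to the "Positive" row and "Negative" column, that is, segments coded positive by hand but predicted negative.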

**Fig 2. Comparison of results from the thematic content analysis.**
A confusion matrix is shown for each of the main INDEXQUAL themes (Experienced quality of care, Experiences, Expectations and Context). The y-axis of each matrix represents the presence or absence of a theme as determined through manual analysis, while the x-axis indicates the text mining predictions. Cells on the diagonals capture instances of agreement between manual coding and text mining for each theme. Off-diagonal cells detail discrepancies, indicating false positives or false negatives. Percentages within cells show the proportion of occurrences for each scenario in relation to the total dataset.
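Because a segment can carry more than one theme, a per-theme matrix reduces to a presence/absence comparison. The sketch below assumes each segment's codes are stored as a set of theme names; the function `theme_confusion`, the cell names, and the sample data are hypothetical, and only the four theme names come from the caption.

```python
THEMES = ["Experienced quality of care", "Experiences", "Expectations", "Context"]


def theme_confusion(manual_sets, mined_sets, theme):
    """Return the four cells of a presence/absence confusion matrix for one
    theme, each as a percentage of all segments."""
    if len(manual_sets) != len(mined_sets) or not manual_sets:
        raise ValueError("Need two non-empty lists of equal length.")
    cells = {"both_present": 0, "manual_only": 0, "mined_only": 0, "both_absent": 0}
    for manual, mined in zip(manual_sets, mined_sets):
        in_manual, in_mined = theme in manual, theme in mined
        if in_manual and in_mined:
            cells["both_present"] += 1   # agreement: theme found by both
        elif in_manual:
            cells["manual_only"] += 1    # false negative for text mining
        elif in_mined:
            cells["mined_only"] += 1     # false positive for text mining
        else:
            cells["both_absent"] += 1    # agreement: theme found by neither
    total = len(manual_sets)
    return {name: 100.0 * count / total for name, count in cells.items()}


# Hypothetical theme codes for four segments (illustrative only).
manual = [{"Experiences"}, {"Expectations", "Context"},
          {"Experiences", "Context"}, {"Experienced quality of care"}]
mined = [{"Experiences"}, {"Expectations"},
         {"Context"}, {"Experienced quality of care", "Experiences"}]

results = {theme: theme_confusion(manual, mined, theme) for theme in THEMES}
for theme, cells in results.items():
    print(theme, cells)
```

Treating manual coding as the reference, `manual_only` plays the role of a false negative and `mined_only` that of a false positive.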
