A comparison of rule-based and machine learning approaches for classifying patient portal messages

doi:10.1016/j.ijmedinf.2017.06.004

Comparative Study

Int J Med Inform. 2017 Sep;105:110-120.

doi: 10.1016/j.ijmedinf.2017.06.004. Epub 2017 Jun 23.

Robert M Cronin¹, Daniel Fabbri², Joshua C Denny³, S Trent Rosenbloom⁴, Gretchen Purcell Jackson⁵

Affiliations

¹ Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA; Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA; Department of Pediatrics, Vanderbilt University Medical Center, Nashville, TN, USA. Electronic address: robert.cronin@vanderbilt.edu.
² Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA; Department of Computer Science, Vanderbilt University, Nashville, TN, USA.
³ Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA; Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA.
⁴ Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA; Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA; Department of Pediatrics, Vanderbilt University Medical Center, Nashville, TN, USA.
⁵ Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA; Department of Pediatrics, Vanderbilt University Medical Center, Nashville, TN, USA; Department of Pediatric Surgery, Vanderbilt University Medical Center, Nashville, TN, USA.

PMID: 28750904
PMCID: PMC5546247
DOI: 10.1016/j.ijmedinf.2017.06.004

Robert M Cronin et al. Int J Med Inform. 2017 Sep.

Abstract

Objective: Secure messaging through patient portals is an increasingly popular way that consumers interact with healthcare providers. The increasing burden of secure messaging can affect clinic staffing and workflows. Manual management of portal messages is costly and time consuming. Automated classification of portal messages could potentially expedite message triage and delivery of care.

Materials and methods: We developed automated patient portal message classifiers with rule-based and machine learning techniques using bag-of-words and natural language processing (NLP) approaches. To evaluate classifier performance, we used a gold standard of 3253 portal messages manually categorized using a taxonomy of communication types (i.e., main categories of informational, medical, logistical, social, and other communications, and subcategories including prescriptions, appointments, problems, tests, follow-up, contact information, and acknowledgement). We evaluated our classifiers' accuracies in identifying individual communication types within portal messages with the area under the receiver operating characteristic curve (AUC). Portal messages often contain more than one type of communication. To evaluate prediction of all communication types within single messages, we used the Jaccard Index. We extracted the variables of importance for the random forest classifiers.
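As an illustration of the multi-label evaluation described above, the following is a minimal sketch of a per-message Jaccard Index averaged over messages. The abstract does not state exactly how the index was aggregated, so the averaging scheme, the function names (`jaccard_index`, `mean_jaccard`), the handling of empty label sets, and the toy labels are all assumptions made for illustration.

```python
def jaccard_index(true_labels, predicted_labels):
    """Jaccard Index between two label sets: |A & B| / |A | B|."""
    true_set, pred_set = set(true_labels), set(predicted_labels)
    union = true_set | pred_set
    if not union:
        # Assumption: two empty label sets count as perfect agreement.
        return 1.0
    return len(true_set & pred_set) / len(union)


def mean_jaccard(gold, predicted):
    """Average the per-message Jaccard Index over all messages (assumed scheme)."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("at least one message is required")
    return sum(jaccard_index(g, p) for g, p in zip(gold, predicted)) / len(gold)


# Toy data (hypothetical): communication types per message.
gold = [{"medical", "logistical"}, {"social"}, {"informational", "medical"}]
pred = [{"medical"}, {"social"}, {"informational", "logistical"}]

score = mean_jaccard(gold, pred)
print(round(score, 3))
```

A message with one of two gold labels predicted scores 0.5, an exact match scores 1.0, and a prediction that adds a wrong label while missing a correct one is penalized in both the intersection and the union.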

Results: The best performing approaches to classification for the major communication types were: logistic regression for medical communications (AUC: 0.899); basic (rule-based) for informational communications (AUC: 0.842); and random forests for social communications and logistical communications (AUCs: 0.875 and 0.925, respectively). Among classifiers for individual communication subtypes, the best performance was achieved by random forests for Logistical-Contact Information (AUC: 0.963). The Jaccard Indices by approach were: basic classifier, Jaccard Index: 0.674; Naïve Bayes, Jaccard Index: 0.799; random forests, Jaccard Index: 0.859; and logistic regression, Jaccard Index: 0.861. For medical communications, the most predictive variables were NLP concepts (e.g., Temporal_Concept, which maps to 'morning', 'evening' and Idea_or_Concept which maps to 'appointment' and 'refill'). For logistical communications, the most predictive variables contained similar numbers of NLP variables and words (e.g., Telephone mapping to 'phone', 'insurance'). For social and informational communications, the most predictive variables were words (e.g., social: 'thanks', 'much', informational: 'question', 'mean').
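For readers unfamiliar with the AUC values reported above, the sketch below computes AUC for a single binary communication type as the probability that a randomly chosen positive message receives a higher score than a randomly chosen negative one, with ties counted as one half. This is the standard rank-based definition, not necessarily the software the authors used; the function name `auc` and the toy labels and scores are hypothetical.

```python
def auc(labels, scores):
    """Rank-based AUC: fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Toy data (hypothetical): 1 = message contains the communication type.
labels = [1, 0, 1, 0, 1]
scores = [0.9, 0.4, 0.65, 0.7, 0.8]

result = auc(labels, scores)
print(round(result, 3))
```

Here five of the six positive-negative pairs are ordered correctly, so the AUC is 5/6. The pairwise loop is quadratic and intended only for small illustrations.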

Conclusions: This study applies automated classification methods to the content of patient portal messages and evaluates the application of NLP techniques on consumer communications in patient portal messages. We demonstrated that random forest and logistic regression approaches accurately classified the content of portal messages, although the best approach to classification varied by communication type. Words were the most predictive variables for classification of most communication types, although NLP variables were most predictive for medical communication types. As adoption of patient portals increases, automated techniques could assist in understanding and managing growing volumes of messages. Further work is needed to improve classification performance to potentially support message triage and answering.

Keywords: Machine learning; Natural language processing; Patient portal; Text classification.

Conflict of interest statement

Competing interests: None.

Figures

**Figure 1**
The taxonomy of consumer health information communication types [17, 33, 34].

**Figure 2**
Example message labeled by communication types.

**Figure 3**
Area under the curve (AUC) of the different major communication types. The Basic Classifier was the Rule Based classifier. The error bars represent the 95% Confidence Interval.

**Figure 4**
Bar charts of the Jaccard Indices of the different communication types. The Basic Classifier was the Rule Based classifier. The error bars represent the 95% Confidence Interval.

Cited by

Armstrong M, Benda NC, Seier K, Rogers C, Ancker JS, Stetson PD, Peng Y, Diamond LC. Improving Cancer Care Communication: Identifying Sociodemographic Differences in Patient Portal Secure Messages Not Authored by the Patient. Appl Clin Inform. 2023 Mar;14(2):296-299. doi: 10.1055/a-2015-8679. Epub 2023 Jan 19. PMID: 36657471.
Sulieman L, Robinson JR, Jackson GP. Automating the Classification of Complexity of Medical Decision-Making in Patient-Provider Messaging in a Patient Portal. J Surg Res. 2020 Nov;255:224-232. doi: 10.1016/j.jss.2020.05.039. Epub 2020 Jun 19. PMID: 32570124.
Yin Z, Sulieman LM, Malin BA. A systematic literature review of machine learning in online personal health data. J Am Med Inform Assoc. 2019 Jun 1;26(6):561-576. doi: 10.1093/jamia/ocz009. PMID: 30908576.
Ren Y, Wu Y, Fan JW, Khurana A, Fu S, Wu D, Liu H, Huang M. Automatic uncovering of patient primary concerns in portal messages using a fusion framework of pretrained language models. J Am Med Inform Assoc. 2024 Aug 1;31(8):1714-1724. doi: 10.1093/jamia/ocae144. PMID: 38934289.
Sim JA, Huang X, Horan MR, Stewart CM, Robison LL, Hudson MM, Baker JN, Huang IC. Natural language processing with machine learning methods to analyze unstructured patient-reported outcomes derived from electronic health records: A systematic review. Artif Intell Med. 2023 Dec;146:102701. doi: 10.1016/j.artmed.2023.102701. Epub 2023 Nov 1. PMID: 38042599.

References

1. Shapochka A. Providers Turn to Portals to Meet Patient Demand. Meaningful Use / Journal of AHIMA. 2012.
2. Tang PC, Lansky D. The missing link: bridging the patient-provider health information gap. Health Aff (Millwood). 2005;24:1290–1295.
3. Calabretta N. Consumer-driven, patient-centered health care in the age of electronic information. J Med Libr Assoc. 2002;90:32–37.
4. Koonce TY, Giuse DA, Beauregard JM, Giuse NB. Toward a more informed patient: bridging health care information through an interactive communication portal. J Med Libr Assoc. 2007;95:77–81.
5. Bussey-Smith KL, Rossen RD. A systematic review of randomized control trials evaluating the effectiveness of interactive computerized asthma patient education programs. Ann Allergy Asthma Immunol. 2007;98:507–516; quiz 516, 566.

Grants and funding

T15 LM007450/LM/NLM NIH HHS/United States
