J Med Internet Res. 2020 Aug 12;22(8):e18912.

doi: 10.2196/18912.

A Novel Approach for Continuous Health Status Monitoring and Automatic Detection of Infection Incidences in People With Type 1 Diabetes Using Machine Learning Algorithms (Part 2): A Personalized Digital Infectious Disease Detection Mechanism

Ashenafi Zebene Woldaregay¹, Ilkka Kalervo Launonen², David Albers³,⁴, Jorge Igual⁵, Eirik Årsand¹, Gunnar Hartvigsen¹

Affiliations

¹ Department of Computer Science, University of Tromsø - The Arctic University of Norway, Tromsø, Norway.
² Department of Clinical Research, University Hospital of North Norway, Tromsø, Norway.
³ Department of Pediatrics, Informatics and Data Science, University of Colorado, Aurora, CO, United States.
⁴ Department of Biomedical Informatics, Columbia University, New York, NY, United States.
⁵ Universitat Politècnica de València, Valencia, Spain.

PMID: 32784179
PMCID: PMC7450372
DOI: 10.2196/18912

Ashenafi Zebene Woldaregay et al. J Med Internet Res. 2020.

Abstract

Background: Semisupervised and unsupervised anomaly detection methods have been widely used in various applications to detect anomalous objects in a given data set. These methods are especially popular in the medical domain because they suit applications in which sufficient data for the other classes are lacking. In people with type 1 diabetes, infection incidence often brings prolonged hyperglycemia and frequent insulin injections, which are significant anomalies. Despite this potential, very few studies have focused on detecting infection incidences in individuals with type 1 diabetes using a dedicated personalized health model.

Objective: This study aims to develop a personalized health model that can automatically detect the incidence of infection in people with type 1 diabetes using blood glucose levels and insulin-to-carbohydrate ratio as input variables. The model is expected to detect deviations from the norm caused by infection incidences, based on elevated blood glucose levels coupled with unusual changes in the insulin-to-carbohydrate ratio.
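As an illustration of the two input variables, the sketch below aggregates self-recorded entries into one pair per day: the average blood glucose and the ratio of total bolus insulin to total carbohydrate, as plotted in the figures. The record layout (`day`, `bg`, `bolus`, `carbs`), the function name `daily_features`, and the sample values are assumptions made for this example; they are not taken from the study.

```python
from collections import defaultdict
from statistics import mean


def daily_features(records):
    """Return {day: (mean blood glucose, total bolus / total carbohydrate)}.

    `records` is a hypothetical list of dicts with keys `day`, `bg`,
    `bolus`, and `carbs`; the field names are assumptions for this sketch.
    """
    by_day = defaultdict(lambda: {"bg": [], "bolus": 0.0, "carbs": 0.0})
    for r in records:
        d = by_day[r["day"]]
        if r.get("bg") is not None:
            d["bg"].append(r["bg"])
        d["bolus"] += r.get("bolus", 0.0)
        d["carbs"] += r.get("carbs", 0.0)

    features = {}
    for day, d in sorted(by_day.items()):
        # Skip days where a feature is undefined (no reading or no carbs logged).
        if not d["bg"] or d["carbs"] <= 0:
            continue
        features[day] = (mean(d["bg"]), d["bolus"] / d["carbs"])
    return features


# Made-up entries: day 1 looks regular, day 2 has elevated values.
records = [
    {"day": "2020-01-01", "bg": 6.0, "bolus": 4.0, "carbs": 40.0},
    {"day": "2020-01-01", "bg": 8.0, "bolus": 6.0, "carbs": 60.0},
    {"day": "2020-01-02", "bg": 12.0, "bolus": 9.0, "carbs": 30.0},
    {"day": "2020-01-02", "bg": 14.0, "bolus": 6.0, "carbs": 20.0},
]
feats = daily_features(records)
```

The same aggregation could be run per hour instead of per day by changing the grouping key.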

Methods: Three groups of one-class classifiers were trained on target data sets (regular days) and tested on a data set containing both target (regular days) and nontarget (infection days) data. For comparison, two unsupervised models were also tested. The data set consists of high-precision self-recorded data collected from three real subjects with type 1 diabetes and incorporates blood glucose, insulin, diet, and events of infection. The models were evaluated on two groups of data (raw and filtered) and compared on performance, computational time, and number of samples required.
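The train-on-target, test-on-mixed protocol can be sketched with a simple nearest neighbor one-class classifier. This is a minimal illustration, not the authors' implementation: the class `OneClassKNN`, the z-score scaling, the quantile threshold (`reject_fraction`), and the synthetic data are all assumptions made for this example.

```python
import math
from statistics import mean, pstdev


def f1_score(y_true, y_pred):
    """F1-score for the positive (infection) class from boolean labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


class OneClassKNN:
    """Hypothetical k-NN one-class model fitted on regular days only."""

    def __init__(self, k=2, reject_fraction=0.05):
        self.k = k
        self.reject_fraction = reject_fraction

    def _scale(self, point):
        # z-score each feature so neither one dominates the distance
        return tuple((x - m) / s
                     for x, m, s in zip(point, self.means, self.stds))

    def _kth_distance(self, z, skip_self=False):
        dists = sorted(math.dist(z, q) for q in self.train)
        if skip_self:
            dists = dists[1:]  # drop the zero distance to the point itself
        return dists[self.k - 1]

    def fit(self, regular_days):
        cols = list(zip(*regular_days))
        self.means = [mean(c) for c in cols]
        self.stds = [pstdev(c) or 1.0 for c in cols]
        self.train = [self._scale(p) for p in regular_days]
        # Threshold: quantile of leave-one-out k-th neighbor distances
        scores = sorted(self._kth_distance(z, skip_self=True)
                        for z in self.train)
        idx = math.ceil((1 - self.reject_fraction) * len(scores)) - 1
        self.threshold = scores[max(0, min(idx, len(scores) - 1))]
        return self

    def predict(self, days):
        """True marks a day flagged as anomalous (possible infection)."""
        return [self._kth_distance(self._scale(p)) > self.threshold
                for p in days]


# Synthetic regular days: (average blood glucose, insulin-to-carb ratio)
regular = [(6.0 + 0.5 * i, 0.08 + 0.01 * j)
           for i in range(5) for j in range(5)]
model = OneClassKNN(k=2, reject_fraction=0.05).fit(regular)

# Mixed test set: two regular-looking days and two elevated days
test_days = [(7.2, 0.10), (6.4, 0.11), (13.0, 0.30), (12.0, 0.25)]
is_infection = [False, False, True, True]
flags = model.predict(test_days)
score = f1_score(is_infection, flags)
```

Because the model is fitted on regular days only, no infection examples are needed for training; they appear only in the test set used for scoring.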

Results: The one-class classifiers achieved excellent performance. In comparison, the unsupervised models suffered from performance degradation mainly because of the atypical nature of the data. Among the one-class classifiers, the boundary and domain-based method produced a better description of the data. Regarding the computational time, nearest neighbor, support vector data description, and self-organizing map took considerable training time, which typically increased as the sample size increased, and only local outlier factor and connectivity-based outlier factor took considerable testing time.

Conclusions: We demonstrated the applicability of one-class classifiers and unsupervised models for the detection of infection incidence in people with type 1 diabetes. In this patient group, detecting infection can provide an opportunity to devise tailored services and also to detect potential public health threats. The proposed approaches achieved excellent performance; in particular, the boundary and domain-based method performed better. Among the respective groups, particular models such as one-class support vector machine, K-nearest neighbor, and K-means achieved excellent performance in all the sample sizes and infection cases. Overall, we foresee that the results could encourage researchers to look beyond the presented features to additional features of the self-recorded data, for example, continuous glucose monitoring features and physical activity data, on a large scale.

Keywords: decision support techniques; infection detection; outbreak detection system; self-recorded health data; syndromic surveillance; type 1 diabetes.

©Ashenafi Zebene Woldaregay, Ilkka Kalervo Launonen, David Albers, Jorge Igual, Eirik Årsand, Gunnar Hartvigsen. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 12.08.2020.

Conflict of interest statement

Conflicts of Interest: None declared.

Figures

**Figure 1**
Daily scatter plot of average blood glucose levels versus total insulin (bolus) to total carbohydrate ratio for a specific regular or normal patient year without any infection incidences.

**Figure 2**
Hourly scatter plot of average blood glucose levels versus total insulin (bolus) to total carbohydrate ratio for a specific regular or normal patient year without any infection incidences.

**Figure 3**
Daily scatter plot of average blood glucose levels versus total insulin (bolus) to total carbohydrate ratio for a specific patient year with an infection incidence (flu).

**Figure 4**
Hourly scatter plot of average blood glucose levels versus total insulin (bolus) to total carbohydrate ratio for a specific patient year with an infection incidence (flu).

**Figure 5**
Plot of models’ average computational time for the training phase. The x-axis depicts the sample size, and each label stands for total sample size divided by 24. The y-axis depicts the computational time required by each model. Gauss: Gaussian; IncSVDD: incremental support vector data description; K-NN: K-nearest neighbor; LOF: local outlier factor; MCD: minimum covariance determinant; MOG: mixture of Gaussian; MST: minimum spanning tree; NN: nearest neighbor; NParzen: naïve Parzen; PCA: principal component analysis; SOM: self-organizing maps; SVDD: support vector data description; V-SVM: one-class support vector machine.

**Figure 6**
Plot of models’ average computational time for the testing phase. The x-axis depicts the sample size, and each label stands for total sample size divided by 24. The y-axis depicts the computational time required by each model. Gauss: Gaussian; IncSVDD: incremental support vector data description; K-NN: K-nearest neighbor; LOF: local outlier factor; MCD: minimum covariance determinant; MOG: mixture of Gaussian; MST: minimum spanning tree; NN: nearest neighbor; NParzen: naïve Parzen; PCA: principal component analysis; SOM: self-organizing maps; SVDD: support vector data description; V-SVM: one-class support vector machine.

**Figure 7**
Quadrants of wellness in people with type 1 diabetes. The figure depicts the 4 possible scenarios of different parameters: carbohydrate action, insulin action, physical activity action, and abnormality because of metabolic change such as infection and stress. BG: blood glucose; PA: physical activity.

**Figure 8**
Average performance (F1-score) of each model across all the infection cases. AE: auto-encoder; Gauss: Gaussian; IncSVDD: incremental support vector data description; K-NN: K-nearest neighbor; LOF: local outlier factor; MCD: minimum covariance determinant; MOG: mixture of Gaussian; MST: minimum spanning tree; NN: nearest neighbor; NP: naïve Parzen; PCA: principal component analysis; SOM: self-organizing maps; SVDD: support vector data description; V-SVM: one-class support vector machine.

Grants and funding

R01 LM012734/LM/NLM NIH HHS/United States
