Int J Environ Res Public Health. 2020 Aug 7;17(16):5707.
doi: 10.3390/ijerph17165707.

Automatic Process Comparison for Subpopulations: Application in Cancer Care

Francesca Marazza et al. Int J Environ Res Public Health.

Abstract

Processes in organisations, such as hospitals, may deviate from the intended standard processes due to unforeseeable events and the complexity of the organisation. For hospitals, the knowledge of actual patient streams for patient populations (e.g., severe or non-severe cases) is important for quality control and improvement. Process discovery from event data in electronic health records can shed light on the patient flows, but comparing these flows across populations is cumbersome and time-consuming. In this paper, we present an approach for the automatic comparison of process models that were extracted from events in electronic health records. Concretely, we propose comparing processes for different patient populations by cross-log conformance checking and by standard graph similarity measures obtained from the directed graph underlying the process model. We perform a user study with 20 participants in order to obtain a ground truth for similarity of process models. We evaluate our approach on two data sets: the publicly available MIMIC database, with a focus on different cancer patients in intensive care, and a database on breast cancer patients from a Dutch hospital (ZGT). In our experiments, we found average fitness to be a good indicator for visual similarity in the ZGT use case, while the average precision and graph edit distance are strongly correlated with visual impression for cancer process models on MIMIC. These results are a call for further research and evaluation for determining which similarity or combination of similarities is needed in which type of process model comparison.

Keywords: MIMIC database; breast cancer care; cancer types; process comparison; process mining; quality control.
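
As a rough illustration of the graph-based comparison mentioned in the abstract, the sketch below computes a simplified graph edit distance between two directed graphs. It is not the authors' implementation. It assumes that every node is a uniquely labelled event and that relabelling is not allowed, in which case the edit distance reduces to counting the nodes and edges present in only one of the two graphs. The function names, the normalisation, and the event names are all hypothetical.

```python
def nodes_of(edges):
    """Collect the set of event labels that appear in a list of directed edges."""
    return {node for edge in edges for node in edge}


def edit_distance(edges_a, edges_b):
    """Simplified graph edit distance for graphs with uniquely labelled nodes.

    Assumption: nodes match only if they carry the same event label, and
    the only edits are unit-cost insertions and deletions of nodes and edges.
    """
    node_edits = len(nodes_of(edges_a) ^ nodes_of(edges_b))
    edge_edits = len(set(edges_a) ^ set(edges_b))
    return node_edits + edge_edits


def similarity(edges_a, edges_b):
    """Map the distance to [0, 1], where 1 means identical graphs (hypothetical normalisation)."""
    size = (len(nodes_of(edges_a)) + len(set(edges_a))
            + len(nodes_of(edges_b)) + len(set(edges_b)))
    if size == 0:
        return 1.0
    return 1.0 - edit_distance(edges_a, edges_b) / size


# Two toy patient flows with made-up event names.
model_a = [("admission", "lab"), ("lab", "surgery"), ("surgery", "discharge")]
model_b = [("admission", "lab"), ("lab", "discharge")]

distance = edit_distance(model_a, model_b)
score = similarity(model_a, model_b)
print(distance, round(score, 3))
```

Graphs derived from real process models also contain non-event nodes (for example, XOR branch nodes) that match by type rather than by label, so a general graph edit distance algorithm would be needed there.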

Conflict of interest statement

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Figures

Figure A1. Discovered process models for different cancer types in MIMIC.
Figure A2. Relation of non-aggregated measures for MIMIC. Based on measures reported in Table 3. Diagonal: normalized histogram and kernel density estimation of the distribution. Lower triangle: scatterplot with estimated linear regression line. Upper triangle: pairwise Spearman’s rank correlations with p-values.
Figure A3. Discovered process models on ZGT for sub-populations with events on level 1.
Figure A4. Relation of non-aggregated measures for ZGT. Based on measures reported in Table 7. Diagonal: normalized histogram and kernel density estimation of the distribution. Lower triangle: scatterplot with estimated linear regression line. Upper triangle: pairwise Spearman’s rank correlations with p-values.
Figure 1. Overview of the approach.
Figure 2. Overview of methods for pairwise comparison of process models.
Figure 3. Intuition for graph-based similarity. Left: for graphs constructed from IvM models; right: for graphs constructed from Petri nets. Event nodes match only if they represent the same event. Other nodes match if they have the same type, e.g., represent the start of an XOR branch (XOR-S), independent of the ID of the graph node.
Figure 4. Discovered process models for cancer types 2, 7, and 8 in MIMIC.
Figure 5. Process model (left) and constructed directed graph (right) for cancer type 10. The graph on the right was drawn using the networkx implementation of the Kamada-Kawai path-length cost function. Note that there is node and edge overlap for all event nodes, making the position nodes hardly visible.
Figure 6. Relation of aggregated measures for MIMIC. Based on measures reported in Table 3. Diagonal: normalized histogram and kernel density estimation of the distribution. Lower triangle: scatterplot with estimated linear regression line. Upper triangle: pairwise Spearman’s rank correlations (ρs) with p-values.
Figure 7. Event types on different levels of granularity (level 1 on top, level 2 at the bottom).
Figure 8. Discovered process models on ZGT for sub-populations with events on level 1.
Figure 9. Process model for NoSVOB population (top) and derived directed graph (bottom). Edge weights and loops omitted in the graph for readability.
Figure 10. Discovered process models on ZGT for sub-populations with second-level events.
Figure 11. Relation of aggregated measures for ZGT. Based on measures reported in Table 7. Diagonal: normalized histogram and kernel density estimation of the distribution. Lower triangle: scatterplot with estimated linear regression line. Upper triangle: pairwise Spearman’s rank correlations (ρs) with p-values.
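
Several of the captions above report pairwise Spearman’s rank correlations between similarity measures. The sketch below shows one way such a coefficient can be computed: both series are converted to ranks (ties receive the average rank) and the Pearson correlation of the ranks is returned. It is a minimal illustration, not the authors' analysis code; the function names are hypothetical, the numbers are made up, and p-values are not computed.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, assigning tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length with at least 2 items")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Made-up example: a similarity score that falls as an edit distance grows.
similarity_scores = [0.9, 0.7, 0.5, 0.2]
edit_distances = [1, 2, 4, 9]
rho = spearman_rho(similarity_scores, edit_distances)
print(rho)
```

Because only the ordering of the values matters, a perfectly monotonic but non-linear relationship still yields a coefficient of magnitude 1.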
