BMC Med Res Methodol. 2023 Sep 27;23(1):212.
doi: 10.1186/s12874-023-02019-y.

Analytical methods for identifying sequences of utilization in health data: a scoping review

Amelie Flothow et al. BMC Med Res Methodol. 2023.

Abstract

Background: Healthcare, as with other sectors, has undergone progressive digitalization, generating an ever-increasing wealth of data that enables research and the analysis of patient movement. This can help to evaluate treatment processes and outcomes, and in turn improve the quality of care. This scoping review provides an overview of the algorithms and methods that have been used to identify care pathways from healthcare utilization data.

Method: This review was conducted according to the methodology of the Joanna Briggs Institute and the Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews (PRISMA-ScR) checklist. The PubMed, Web of Science, Scopus, and EconLit databases were searched, and studies published in English between 2000 and 2021 were considered. The search strategy used keywords divided into three categories: the method of data analysis, the requirement profile for the data, and the intended presentation of results. Studies were included if they analyzed health data, described the methodology used, and considered the chronology of care events. In a two-stage review process, two researchers independently reviewed records for inclusion. Results were synthesized narratively.

Results: The literature search yielded 2,865 entries; 51 studies met the inclusion criteria. Health data from different countries ([Formula: see text]) and of different types of disease ([Formula: see text]) were analyzed with respect to different care events. Applied methods can be divided into those identifying subsequences of care and those describing full care trajectories. Variants of pattern mining or Markov models were mostly used to extract subsequences, with clustering often applied to find care trajectories. Statistical algorithms such as rule mining, probability-based machine learning algorithms or a combination of methods were also applied. Clustering methods were sometimes used for data preparation or result compression. Further characteristics of the included studies are presented.
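As a rough illustration of the two subsequence-oriented method families named above, the following minimal sketch counts frequent contiguous subsequences (a very simple form of pattern mining) and estimates first-order Markov transition probabilities between care events. The toy data, the event labels, and the function names are assumptions made for this example; they are not taken from any of the included studies, and real pattern mining algorithms typically also handle gaps, time windows, and much larger event vocabularies.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: one chronologically ordered list of care events per patient.
sequences = [
    ["GP", "Specialist", "Hospital"],
    ["GP", "Specialist", "GP"],
    ["GP", "Hospital"],
]


def frequent_subsequences(seqs, length, min_support):
    """Return contiguous subsequences of the given length together with their
    support, i.e. the number of patients in whose sequence they occur."""
    support = Counter()
    for seq in seqs:
        # A set ensures each pattern is counted at most once per patient.
        seen = {tuple(seq[i:i + length]) for i in range(len(seq) - length + 1)}
        support.update(seen)
    return {pattern: n for pattern, n in support.items() if n >= min_support}


def transition_probabilities(seqs):
    """Estimate first-order Markov transition probabilities from the observed
    counts of consecutive event pairs."""
    counts = defaultdict(Counter)
    for seq in seqs:
        for current, following in zip(seq, seq[1:]):
            counts[current][following] += 1
    return {
        current: {nxt: n / sum(followers.values()) for nxt, n in followers.items()}
        for current, followers in counts.items()
    }


patterns = frequent_subsequences(sequences, length=2, min_support=2)
probabilities = transition_probabilities(sequences)
print(patterns)
print(probabilities["GP"])
```

In this toy data set, only the pair GP followed by Specialist reaches a support of two patients, and two of the three observed transitions out of GP lead to Specialist.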

Conclusion: Various data mining methods are already being applied to gain insight from health data. The great heterogeneity of the methods used shows the need for a scoping review. We performed a narrative review and found that clustering methods currently dominate the literature for identifying complete care trajectories, while variants of pattern mining dominate for identifying subsequences of limited length.
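To make the idea of clustering complete care trajectories more concrete, the sketch below groups whole sequences by pairwise edit distance and merges any two trajectories whose distance does not exceed a threshold, which corresponds to single-linkage clustering cut at that threshold. The choice of edit distance, the threshold, the toy trajectories, and all function names are assumptions for illustration; the abstract does not state which dissimilarity measures or clustering algorithms the included studies used.

```python
def edit_distance(a, b):
    """Levenshtein distance between two event sequences
    (insertions, deletions, and substitutions each cost 1)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x
                curr[j - 1] + 1,         # insert y
                prev[j - 1] + (x != y),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def single_linkage_clusters(seqs, max_distance):
    """Group sequence indices so that any two sequences within max_distance
    of each other end up in the same cluster (union-find over close pairs)."""
    parent = list(range(len(seqs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(seqs)):
        for j in range(i + 1, len(seqs)):
            if edit_distance(seqs[i], seqs[j]) <= max_distance:
                parent[find(j)] = find(i)

    clusters = {}
    for i in range(len(seqs)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Hypothetical toy trajectories, one per patient.
trajectories = [
    ["GP", "GP", "Specialist"],
    ["GP", "Specialist"],
    ["Hospital", "Rehab", "GP"],
    ["Hospital", "Rehab"],
]

print(single_linkage_clusters(trajectories, max_distance=1))
```

With a threshold of 1, the two ambulatory trajectories form one group and the two hospital-based trajectories form another.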

Keywords: Care pathway; Claims data; Data mining method; Health data; Patient pathway; Pattern mining; Scoping review; Sequences.

Conflict of interest statement

The authors declare that they have no conflicts of interest. Prof. Leonie Sundmacher was a co-author of one included study [57]. The authors made every effort to treat and analyze all studies equally.

Figures

Fig. 1
PRISMA diagram of the study selection process
Fig. 2
Relation between stated aim, method used, and presentation of results. The left panel shows, for each stated aim, the proportion of studies using a particular method (size of the same-color bubbles). The right panel shows, for each type of presented result, the proportion of studies using a particular method. Bubble size represents the proportion [in percent] within the stated aim (left panel) or within a parent result type (right panel). The values within the graphic indicate the number of studies that meet the characteristics given on the respective axes. Applied methods include clustering, Markov models (MM), pattern mining (PM), and other methods (Other). Presented results include trajectories (T), patterns (P), or both (T+P), shown in tabular (tab) or visualized (viz) form

References

    1. Rydning DRJGJ, Reinsel J, Gantz J. The digitization of the world from edge to core, vol. 16. Framingham: International Data Corporation; 2018. p. 1–28.
    1. Kreis K, Neubauer S, Klora M, Lange A, Zeidler J. Status and perspectives of claims data analyses in Germany: a systematic review. Health Pol (Amsterdam, Netherlands). 2016;120(2):213–26. doi: 10.1016/j.healthpol.2016.01.007.
    1. Blin P, Lassalle R, Thurin N, Bosco-Levy P, Droz-Perroteau C, Moore N. SNDS, the French nationwide claims database: a powerful tool for pharmacoeconomy and pharmacoepidemiology. Value Health 21. doi: 10.1016/j.jval.2018.09.221.
    1. Novelli A, Frank-Teewag J, Bleek J, Guenster C, Schneider U, Marschall U, Schloessler K, Donner-Banzhoff N, Sundmacher L. Identifying and investigating ambulatory care sequences before invasive coronary angiography. Med Care 60. doi: 10.1097/MLR.0000000000001738.
    1. Vanasse A, Courteau J, Courteau M, Benigeri M, Chiu YM, Dufour I, Couillard S, Larivee P, Hudon C. Healthcare utilization after a first hospitalization for COPD: a new approach of state sequence analysis based on the ‘6W’ multidimensional model of care trajectories. BMC Health Serv Res. 2020;20(1). doi: 10.1186/s12913-020-5030-0.
