Crit Care Med. 2021 Jun 1;49(6):e563-e577.
doi: 10.1097/CCM.0000000000004916.

Sharing ICU Patient Data Responsibly Under the Society of Critical Care Medicine/European Society of Intensive Care Medicine Joint Data Science Collaboration: The Amsterdam University Medical Centers Database (AmsterdamUMCdb) Example

Patrick J Thoral et al. Crit Care Med. 2021.

Abstract

Objectives: Critical care medicine is a natural environment for machine learning approaches to improve outcomes for critically ill patients, as admissions to ICUs generate vast amounts of data. However, technical, legal, ethical, and privacy concerns have so far kept the critical care medicine community from making these data readily available. The Society of Critical Care Medicine and the European Society of Intensive Care Medicine have identified ICU patient data sharing as one of the priorities under their Joint Data Science Collaboration. To encourage ICUs worldwide to share their patient data responsibly, we now describe the development and release of Amsterdam University Medical Centers Database (AmsterdamUMCdb), the first freely available critical care database in full compliance with privacy laws from both the United States and Europe, as an example of the feasibility of sharing complex critical care data.

Setting: University hospital ICU.

Subjects: Data from ICU patients admitted between 2003 and 2016.

Interventions: We used a risk-based deidentification strategy to maintain data utility while preserving privacy. In addition, we implemented contractual and governance processes, and a communication strategy. Patient organizations, supporting hospitals, and experts on ethics and privacy audited these processes and the database.

Measurements and main results: AmsterdamUMCdb contains approximately 1 billion clinical data points from 23,106 admissions of 20,109 patients. The privacy audit concluded that reidentification is not reasonably likely, and AmsterdamUMCdb can therefore be considered anonymous information, both in the context of the U.S. Health Insurance Portability and Accountability Act and the European General Data Protection Regulation. The ethics audit concluded that responsible data sharing imposes minimal burden, whereas the potential benefit is tremendous.

Conclusions: Technical, legal, ethical, and privacy challenges related to responsible data sharing can be addressed using a multidisciplinary approach. A risk-based deidentification strategy that complies with both U.S. and European privacy regulations should be the preferred approach to releasing ICU patient data. This supports the shared Society of Critical Care Medicine and European Society of Intensive Care Medicine vision to improve critical care outcomes through scientific inquiry of vast and combined ICU datasets.
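
As an illustration only, the sketch below shows one common way to quantify reidentification risk in a risk-based deidentification workflow: a k-anonymity check over quasi-identifiers, followed by one round of generalization. The abstract does not state which risk measure the authors used, so k-anonymity is a swapped-in technique here, and the field names (`age`, `sex`, `admission_year`), the toy records, and the threshold `k=2` are all hypothetical.

```python
from collections import Counter


def class_key(record, quasi_identifiers):
    """Return the combination of quasi-identifier values for one record."""
    return tuple(record[q] for q in quasi_identifiers)


def records_below_k(records, quasi_identifiers, k):
    """Return records whose quasi-identifier combination is shared by fewer than k records."""
    sizes = Counter(class_key(r, quasi_identifiers) for r in records)
    return [r for r in records if sizes[class_key(r, quasi_identifiers)] < k]


def generalize_age(record, width=10):
    """Replace an exact age with a coarser bin, trading precision for privacy."""
    low = (record["age"] // width) * width
    return {**record, "age": f"{low}-{low + width - 1}"}


# Hypothetical toy records; not taken from AmsterdamUMCdb.
records = [
    {"age": 61, "sex": "F", "admission_year": 2010},
    {"age": 64, "sex": "F", "admission_year": 2010},
    {"age": 67, "sex": "F", "admission_year": 2010},
    {"age": 72, "sex": "M", "admission_year": 2011},
]
QUASI_IDENTIFIERS = ["age", "sex", "admission_year"]

# Step 1: measure risk on the raw data (every record is unique here).
at_risk_before = records_below_k(records, QUASI_IDENTIFIERS, k=2)

# Step 2: generalize one attribute and measure again.
generalized = [generalize_age(r) for r in records]
at_risk_after = records_below_k(generalized, QUASI_IDENTIFIERS, k=2)

print(len(at_risk_before), len(at_risk_after))
```

In this toy data, binning age leaves one record still unique, so a further round of generalization or suppression would be needed. Repeating measure-and-transform steps until the residual risk is acceptable is one way to read the iterative strategy the abstract describes.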

Conflict of interest statement

Dr. Sijbrands’ institution received funding from European Institute of Innovation and Technology (EIT) Health and Amgen. Drs. Kaplan and Bailey received funding from Society of Critical Care Medicine. Dr. Cecconi received funding from Directed Systems, Edwards Lifesciences, and Cheetah Medical. Dr. Churpek’s institution received funding from an EarlySense research grant; he is supported by National Institutes of Health (NIH) R01 (GM123193), and he has a patent pending for risk stratification algorithm for hospitalized patients (money from royalties from the University of Chicago). Dr. Clermont received funding from the NIH, Department of Defense, National Science Foundation, and NOMA AI. The remaining authors have disclosed that they do not have any potential conflicts of interest.

Figures

Figure 1. An overview of the process to create AmsterdamUMCdb from different source databases (A). The anonymization threshold separates personal data from anonymous data (B). The applied risk-based deidentification strategy demonstrating the iterative nature of performing deidentification (C). Final table structure depicting the relations with the admissions table (D). Capitalized words in the tables refer to data types used: INTEGER (whole number), SMALLINT (small-range integer), BIGINT (large-range integer), FLOAT (floating-point number) or VARCHAR (variable size character data). DBs = databases, EHR = electronic health record, GLIMS = General Laboratory Information Management System, ID = identifier, LIS = Laboratory Information System, PDMS = Patient Data Management System, PSS = patient scoring system.
Figure 2. Overview of the diversity of data in Amsterdam University Medical Centers Database (AmsterdamUMCdb). The plots show a selection of the most common data shown as percentage of availability for all admissions: device data that have been automatically filed (A), observations and scores that were entered manually (B), laboratory measurements (C), and administered drugs (D). ALAT = alanine aminotransferase, APACHE = Acute Physiology and Chronic Health Evaluation, ASAT = aspartate aminotransferase, CO = cardiac output, DNAR = do not attempt resuscitation, PEEP = positive end-expiratory pressure, SDD = selective decontamination of the digestive tract, Spo2 = peripheral oxygen saturation, TSH = thyroid-stimulating hormone.
Figure 3. Example of time series data from an ICU admission in Amsterdam University Medical Centers Database (AmsterdamUMCdb) displayed as a graphical timeline. The data are from a patient admitted after cardiopulmonary resuscitation who received mild therapeutic hypothermia and developed shock and acute kidney injury with initiation of renal replacement therapy. The series show a selection of data documented throughout the admission: vital variables, clinical observations, infusions of medication, fluid input and output, supportive care, and inserted catheters and drains. Data have been downsampled for readability and translated to English from the original Dutch variables and values. ABP = arterial blood pressure, CPAP = continuous positive airway pressure, CPR = cardiopulmonary resuscitation, CVVH = continuous veno-venous hemofiltration, I/O = input/output, NSR = normal sinus rhythm, PC = pressure control ventilation, PEEP = positive end-expiratory pressure, PS = pressure support ventilation, Spo2 = peripheral oxygen saturation, Svo2 = venous oxygen saturation.
