Intensive Care Med. 2025 Mar;51(3):556-569.
doi: 10.1007/s00134-025-07848-7. Epub 2025 Mar 13.

A common longitudinal intensive care unit data format (CLIF) for critical illness research

Juan C Rojas et al. Intensive Care Med. 2025 Mar.

Erratum in

  • Correction: A common longitudinal intensive care unit data format (CLIF) for critical illness research.
    Rojas JC, Lyons PG, Chhikara K, Chaudhari V, Bhavani SV, Nour M, Buell KG, Smith KD, Gao CA, Amagai S, Mao C, Luo Y, Barker AK, Nuppnau M, Hermsen M, Koyner JL, Beck H, Baccile R, Liao Z, Carey KA, Park-Egan B, Han X, Ortiz AC, Schmid BE, Weissman GE, Hochberg CH, Ingraham NE, Parker WF. Intensive Care Med. 2025 Apr;51(4):836-839. doi: 10.1007/s00134-025-07869-2. PMID: 40163136.

Abstract

Rationale: Critical illness threatens millions of lives annually. Electronic health record (EHR) data are a source of granular information that could generate crucial insights into the nature and optimal treatment of critical illness.

Objectives: Overcome the data management, security, and standardization barriers to large-scale critical illness EHR studies.

Methods: We developed a Common Longitudinal Intensive Care Unit (ICU) data Format (CLIF), an open-source database format to harmonize EHR data necessary to study critical illness. We conducted proof-of-concept studies with a federated research architecture: (1) an external validation of an in-hospital mortality prediction model for critically ill patients and (2) an assessment of 72-h temperature trajectories and their association with mechanical ventilation and in-hospital mortality using group-based trajectory models.

Measurements and main results: We converted longitudinal data from 111,440 critically ill patient admissions from 2020 to 2021 (mean age 60.7 years [standard deviation 17.1], 28% Black, 7% Hispanic, 44% female) across 9 health systems and 39 hospitals into CLIF databases. The in-hospital mortality prediction model had varying performance across CLIF consortium sites (AUCs: 0.73-0.81, Brier scores: 0.06-0.10) with degradation in performance relative to the derivation site. Temperature trajectories were similar across health systems. Hypothermic and hyperthermic-slow-resolver patients consistently had the highest mortality.
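The two metrics reported above can be illustrated with a minimal sketch. It assumes binary outcomes (1 = in-hospital death) and predicted probabilities held in plain lists; the function names and the toy data are hypothetical and are not taken from the study's code.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome.
    Lower is better; 0 is a perfect score."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auc(y_true, y_prob):
    """AUC as the probability that a randomly chosen positive case receives a
    higher predicted risk than a randomly chosen negative case (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (illustrative only, not from the study)
y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8]

site_auc = auc(y_true, y_prob)
site_brier = brier_score(y_true, y_prob)
print(f"AUC={site_auc:.2f}, Brier={site_brier:.3f}")
```

The pairwise AUC above is quadratic in the number of cases, which is fine for illustration; a rank-based implementation would be preferable at the scale of a real site's cohort.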

Conclusions: CLIF enables transparent, efficient, and reproducible critical care research across diverse health systems. Our federated case studies showcase CLIF's potential for disease sub-phenotyping and clinical decision-support evaluation. Future applications include pragmatic EHR-based trials, target trial emulations, foundational artificial intelligence (AI) models of critical illness, and real-time critical care quality dashboards.

Keywords: Critical care data; Machine learning; Temperature trajectory modeling.

Figures

Fig. 1
CLIF Entity Relationship Diagram. This diagram depicts the relationships between the tables in the Common Longitudinal ICU Data Format (CLIF) schema. The diagram includes the following 22 tables: 1. Patient, 2. Hospitalization, 3. Admission Diagnosis, 4. Provider, 5. ADT (Admission, Discharge, Transfer), 6. Vitals, 7. Scores, 8. Dialysis, 9. Intake/Output, 10. Procedures, 11. Therapy Session, 12. Therapy Details, 13. Respiratory Support, 14. Position, 15. ECMO (Extracorporeal Membrane Oxygenation) and Mechanical Circulatory Support (MCS), 16. Labs, 17. Microbiology Culture, 18. Sensitivity, 19. Microbiology Non-culture, 20. Medication Orders, 21. Medication Admin Intermittent, 22. Medication Admin Continuous. Each table represents a specific aspect of ICU data, and lines between tables indicate how they are related through shared identifiers, primarily hospitalization_id. The depicted entity-relationship model is version 1.0, which was used for the case studies in this manuscript. The CLIF format is maintained under the git version control system; release 2.0.0 is available at https://clif-consortium.github.io/website/
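The role of hospitalization_id as the shared key can be sketched with an in-memory SQLite database. Only the table names (Hospitalization, Vitals) and the hospitalization_id key come from the caption; every other column name, and all of the data, are hypothetical placeholders rather than the actual CLIF column definitions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical columns; only hospitalization_id is named in the caption.
    CREATE TABLE hospitalization (
        hospitalization_id TEXT PRIMARY KEY,
        patient_id         TEXT
    );
    CREATE TABLE vitals (
        hospitalization_id TEXT,
        recorded_at        TEXT,
        vital_name         TEXT,
        vital_value        REAL
    );
    INSERT INTO hospitalization VALUES ('H1', 'P1'), ('H2', 'P2');
    INSERT INTO vitals VALUES
        ('H1', '2020-01-01T00:00', 'temp_c', 38.9),
        ('H1', '2020-01-01T04:00', 'temp_c', 37.6),
        ('H2', '2020-02-10T12:00', 'temp_c', 35.8),
        ('H2', '2020-02-10T12:00', 'heart_rate', 112.0);
""")

# Link longitudinal vitals back to the patient through the shared key.
rows = conn.execute("""
    SELECT h.patient_id, v.recorded_at, v.vital_value
    FROM hospitalization AS h
    JOIN vitals AS v ON v.hospitalization_id = h.hospitalization_id
    WHERE v.vital_name = 'temp_c'
    ORDER BY h.patient_id, v.recorded_at
""").fetchall()

for row in rows:
    print(row)
```

Because every longitudinal table carries the same key, the same join pattern would apply to labs, respiratory support, or medication tables in a layout like this one.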
Fig. 2
Mortality model performance validation via receiver operating characteristic (ROC) curves (A), calibration curves (B), and decision curves (C).
(A) ROC curve. Purpose: Demonstrates the model’s ability to distinguish between patients who survive and those who do not (discrimination) across sites. How to interpret: A curve closer to the top left corner reflects better discrimination. The Area Under the Curve (AUC) quantifies this, with higher values (closer to 1) indicating stronger predictive performance.
(B) Calibration plot. Purpose: Evaluates how closely the model’s predicted probabilities align with the actual outcomes at each site (calibration). How to interpret: A curve that closely follows the diagonal “ideal” line means the predicted risks are accurate. Deviations at higher probabilities suggest the model may over- or underestimate risk in those cases, with the degree of deviation varying by site.
(C) Decision analysis curve. Purpose: Evaluates how beneficial the model is for decision-making at different prediction thresholds, balancing true positives and false positives. How to interpret: The vertical red line marks the high-risk threshold chosen by RUSH University (0.218). At this threshold, the model’s net benefit varies across sites, with the University of Chicago showing the highest value (0.026), indicating the model would add the most clinical value there. Conversely, the University of Minnesota and OHSU have the lowest net benefits (0.002 and 0.007, respectively), followed by Northwestern (0.009), meaning the model is less beneficial but still better than a ‘no alert’ or ‘alert all’ approach at these sites.
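The net benefit plotted in a decision curve can be sketched as follows. The caption does not state the formula, so this uses the standard decision-curve definition, net benefit = TP/N − (FP/N) × pt/(1 − pt), as an assumption; the function name, the choice to alert when the predicted risk is at or above the threshold, and the toy data are hypothetical. Only the threshold value 0.218 comes from the caption.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of alerting on every patient whose predicted risk is >= threshold.
    False positives are weighted by the odds of the threshold, pt / (1 - pt)."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


# Toy data (illustrative only, not from the study)
y_true = [1, 0, 0, 1, 0]
y_prob = [0.9, 0.3, 0.1, 0.25, 0.05]
pt = 0.218  # high-risk threshold quoted in the caption

nb_model = net_benefit(y_true, y_prob, pt)
# "Alert all": every patient is flagged regardless of predicted risk.
nb_alert_all = net_benefit(y_true, [1.0] * len(y_true), pt)
# "No alert": nobody is flagged, so net benefit is zero by definition.
nb_no_alert = 0.0

print(f"model={nb_model:.3f}, alert all={nb_alert_all:.3f}, no alert={nb_no_alert:.3f}")
```

A model is worth using at a given threshold only when its net benefit exceeds both reference strategies, which is the comparison the caption makes for each site.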
Fig. 3
Temperature trends across temperature trajectory subphenotypes at all sites. Purpose: Visualize temperature trajectories over time (hours) for different patient subphenotypes: hyperthermic slow resolvers (HSR, yellow), hyperthermic fast resolvers (HFR, green), normothermic (NT, red), and hypothermic (HT, blue). How to interpret: HSR patients exhibit the highest initial temperature, which stabilizes at a higher average compared to other groups. HFR patients experience a sharp temperature drop within the first 24 h. NT patients maintain a relatively stable temperature around 37 °C. HT patients have the lowest temperature trajectory, remaining below 37 °C throughout. The variations in site-specific trends suggest that patient temperature responses may differ based on institutional practices or population characteristics.
