Clin Epidemiol. 2022 Mar 22;14:369-384.
doi: 10.2147/CLEP.S323292. eCollection 2022.

Unraveling COVID-19: A Large-Scale Characterization of 4.5 Million COVID-19 Cases Using CHARYBDIS

Kristin Kostka, Talita Duarte-Salles, Albert Prats-Uribe, Anthony G Sena, Andrea Pistillo, Sara Khalid, Lana Y H Lai, Asieh Golozar, Thamir M Alshammari, Dalia M Dawoud, Fredrik Nyberg, Adam B Wilcox, Alan Andryc, Andrew Williams, Anna Ostropolets, Carlos Areia, Chi Young Jung, Christopher A Harle, Christian G Reich, Clair Blacketer, Daniel R Morales, David A Dorr, Edward Burn, Elena Roel, Eng Hooi Tan, Evan Minty, Frank DeFalco, Gabriel de Maeztu, Gigi Lipori, Hiba Alghoul, Hong Zhu, Jason A Thomas, Jiang Bian, Jimyung Park, Jordi Martínez Roldán, Jose D Posada, Juan M Banda, Juan P Horcajada, Julianna Kohler, Karishma Shah, Karthik Natarajan, Kristine E Lynch, Li Liu, Lisa M Schilling, Martina Recalde, Matthew Spotnitz, Mengchun Gong, Michael E Matheny, Neus Valveny, Nicole G Weiskopf, Nigam Shah, Osaid Alser, Paula Casajust, Rae Woong Park, Robert Schuff, Sarah Seager, Scott L DuVall, Seng Chan You, Seokyoung Song, Sergio Fernández-Bertolín, Stephen Fortin, Tanja Magoc, Thomas Falconer, Vignesh Subbian, Vojtech Huser, Waheed-Ul-Rahman Ahmed, William Carter, Yin Guan, Yankuic Galvan, Xing He, Peter R Rijnbeek, George Hripcsak, Patrick B Ryan, Marc A Suchard, Daniel Prieto-Alhambra
Kristin Kostka et al. Clin Epidemiol. 2022.

Abstract

Purpose: Routinely collected real world data (RWD) have great utility in aiding the novel coronavirus disease (COVID-19) pandemic response. Here we present the international Observational Health Data Sciences and Informatics (OHDSI) Characterizing Health Associated Risks and Your Baseline Disease In SARS-COV-2 (CHARYBDIS) framework for standardisation and analysis of COVID-19 RWD.

Patients and methods: We conducted a descriptive retrospective database study using a federated network of data partners in the United States, Europe (the Netherlands, Spain, the UK, Germany, France and Italy) and Asia (South Korea and China). The study protocol and analytical package were released on 11th June 2020 and are iteratively updated via GitHub. We identified three non-mutually exclusive cohorts: 4,537,153 individuals with a clinical COVID-19 diagnosis or positive test, 886,193 hospitalized with COVID-19, and 113,627 hospitalized with COVID-19 requiring intensive services.

Results: We aggregated over 22,000 unique characteristics describing patients with COVID-19. All comorbidities, symptoms, medications, and outcomes are described by cohort in aggregate counts and are readily available online. Globally, we observed similarities in the USA and Europe: more women than men were diagnosed but more men than women were hospitalized, and most diagnosed cases were between 25 and 60 years of age whereas most hospitalized cases were between 60 and 80 years of age. South Korea differed, with more women than men hospitalized. Common comorbidities included type 2 diabetes, hypertension, chronic kidney disease and heart disease. Common presenting symptoms were dyspnea, cough and fever. Symptom data were more commonly available in the hospitalized cohorts than in the diagnosed cohort.

Conclusion: We constructed a global, multi-centre view to describe trends in COVID-19 progression, management and evolution over time. By characterising baseline variability in patients and geography, our work provides critical context that may otherwise be misconstrued as data quality issues. This is important as we perform studies on adverse events of special interest in COVID-19 vaccine surveillance.

Keywords: OHDSI; OMOP CDM; descriptive epidemiology; open science; real world data; real world evidence.

Conflict of interest statement

Ms. Kostka was an employee of IQVIA during the conduct of this study and received grant funding from the NIH NCATS National COVID Cohort Collaborative and the Bill and Melinda Gates Foundation. Mr. Sena is an employee and holds stock at Janssen Research & Development, a Johnson and Johnson family of companies. Dr. Golozar reports personal fees from Regeneron Pharmaceuticals, outside the submitted work. She is a full-time employee at Regeneron Pharmaceuticals. This work was not conducted at Regeneron Pharmaceuticals. Dr. Nyberg was an employee of AstraZeneca until 2019 and holds some shares. Dr. Wilcox reports grants from Bill and Melinda Gates Foundation, grants from National Institute of Health, during the conduct of the study. Mr. Andryc is an employee of Janssen Research & Development, a subsidiary of Johnson & Johnson. Dr. Reich is an employee of IQVIA. Dr. Blacketer reports she is an employee and holds stock at Janssen Research & Development, a Johnson and Johnson family of companies. Dr. Morales is supported by a Wellcome Trust Clinical Research Development Fellowship (Grant 214588/Z/18/Z) and reports grants from Chief Scientist Office (CSO), grants from Health Data Research UK (HDR-UK), grants from National Institute of Health Research (NIHR), outside the submitted work. Mr. DeFalco reports he is an employee and holds stock at Janssen Research & Development, a Johnson and Johnson family of companies. Mr. Thomas reports grants from Bill and Melinda Gates Foundation (INV-016910), grants from National Center for Advancing Translational Sciences (NCATS), National Institutes of Health, through Grant Award Number UL1TR002369 to his institution, during the conduct of the study. Dr Jiang Bian reports grants from NIH/NIEHS (R21ES032762), during the conduct of the study. Dr. Posada reports grants from National Library of Medicine, during the conduct of the study. Dr. Natarajan reports grants from US NIH, during the conduct of the study. Dr. Matheny reports grants from US NIH, grants from US VA HSR&D, during the conduct of the study. Dr. Weiskopf reports personal fees from Merck, during the conduct of the study and outside the submitted work. Dr. Shah reports grants from National Library of Medicine, during the conduct of the study. Dr. Park reports grants from Ministry of Trade, Industry & Energy, Republic of Korea, grants from Ministry of Health & Welfare, Republic of Korea, grants from Bill & Melinda Gates Foundation, during the conduct of the study. Mr Robert Schuff reports grants from Gates Foundation, grants from NIH-NCATS, during the conduct of the study. Ms. Seager is an employee of IQVIA. Dr. DuVall reports grants from Anolinx, LLC, Astellas Pharma, Inc, AstraZeneca Pharmaceuticals LP, Boehringer Ingelheim International GmbH, Celgene Corporation, Eli Lilly and Company, Genentech Inc., Genomic Health, Inc., Gilead Sciences Inc., GlaxoSmithKline PLC, Innocrin Pharmaceuticals Inc., Janssen Pharmaceuticals, Inc., Kantar Health, Myriad Genetic Laboratories, Inc., Novartis International AG, and Parexel International Corporation through the University of Utah or Western Institute for Veteran Research outside the submitted work. Dr. Fortin is an employee of Janssen R&D, a subsidiary of Johnson and Johnson. Dr. Subbian reports grants from State of Arizona; Arizona Board of Regents, during the conduct of the study; grants from National Science Foundation (grant# 1838745), grants from Agency for Healthcare Research and Quality, grants from National Institutes of Health, outside the submitted work. Dr. Rijnbeek reports grants from Innovative Medicines Initiative, Janssen Research and Development, during the conduct of the study. He also works for a research institute which receives/received unconditional research grants from Yamanouchi, Pfizer-Boehringer Ingelheim, GSK, Amgen, UCB, Novartis, Astra-Zeneca, Chiesi, Janssen Research and Development, none of which relate to the content of this work. Dr. Hripcsak reports grants from US NIH and Janssen Research. Dr. Ryan is an employee of Janssen Research and Development and shareholder of Johnson & Johnson. Dr. Suchard reports grants from US National Institutes of Health, Department of Veterans Affairs, during the conduct of the study; grants and/or personal fees from IQVIA, Janssen Research and Development, US Food and Drug Administration, and Private Health Management, outside the submitted work. Dr. Prieto-Alhambra reports grants, non-financial support, speaker/consultancy services and/or advisory board membership from AMGEN, UCB Biopharma, and Les Laboratoires Servier, outside the submitted work; and Janssen, on behalf of IMI-funded EHDEN and EMIF consortiums, and Synapse Management Partners have supported training programmes organised by DPA’s Department and open for external participants. The views expressed are those of the authors and do not necessarily represent the views or policy of the Department of Veterans Affairs or the United States Government. There are no other relationships or activities that could appear to have influenced the submitted work. The authors report no other conflicts of interest in this work.

Figures

Figure 1. COVID-19 cases across the OHDSI COVID-19 network.
Figure 2. Distribution of diagnosed, hospitalized and requiring intensive services COVID-19 cases by age and sex across the OHDSI COVID-19 network in the United States.

Update of

  • Unraveling COVID-19: a large-scale characterization of 4.5 million COVID-19 cases using CHARYBDIS.
    Prieto-Alhambra D, Kostka K, Duarte-Salles T, Prats-Uribe A, Sena A, Pistillo A, Khalid S, Lai L, Golozar A, Alshammari TM, Dawoud D, Nyberg F, Wilcox A, Andryc A, Williams A, Ostropolets A, Areia C, Jung CY, Harle C, Reich C, Blacketer C, Morales D, Dorr DA, Burn E, Roel E, Tan EH, Minty E, DeFalco F, de Maeztu G, Lipori G, Alghoul H, Zhu H, Thomas J, Bian J, Park J, Roldán JM, Posada J, Banda JM, Horcajada JP, Kohler J, Shah K, Natarajan K, Lynch K, Liu L, Schilling L, Recalde M, Spotnitz M, Gong M, Matheny M, Valveny N, Weiskopf N, Shah N, Alser O, Casajust P, Park RW, Schuff R, Seager S, DuVall S, You SC, Song S, Fernández-Bertolín S, Fortin S, Magoc T, Falconer T, Subbian V, Huser V, Ahmed WU, Carter W, Guan Y, Galvan Y, He X, Rijnbeek P, Hripcsak G, Ryan P, Suchard M. Res Sq [Preprint]. 2021 Mar 1:rs.3.rs-279400. doi: 10.21203/rs.3.rs-279400/v1. Update in: Clin Epidemiol. 2022 Mar 22;14:369-384. doi: 10.2147/CLEP.S323292. PMID: 33688639.
