Review
Gigascience. 2021 Nov 25;10(11):giab076.
doi: 10.1093/gigascience/giab076.

An overview of the National COVID-19 Chest Imaging Database: data quality and cohort analysis

Dominic Cushnan et al. Gigascience.

Erratum in

Abstract

Background: The National COVID-19 Chest Imaging Database (NCCID) is a centralized database containing mainly chest X-rays and computed tomography scans from patients across the UK. The objective of the initiative is to support a better understanding of the coronavirus SARS-CoV-2 disease (COVID-19) and the development of machine learning technologies that will improve care for patients hospitalized with a severe COVID-19 infection. This article introduces the training dataset, including a snapshot analysis covering the completeness of clinical data, and availability of image data for the various use-cases (diagnosis, prognosis, longitudinal risk). An additional cohort analysis measures how well the NCCID represents the wider COVID-19-affected UK population in terms of geographic, demographic, and temporal coverage.

Findings: The NCCID offers high-quality DICOM images acquired across a variety of imaging machinery; multiple time points including historical images are available for a subset of patients. This volume and variety make the database well suited to development of diagnostic/prognostic models for COVID-associated respiratory conditions. Historical images and clinical data may aid long-term risk stratification, particularly as availability of comorbidity data increases through linkage to other resources. The cohort analysis revealed good alignment to general UK COVID-19 statistics for some categories, e.g., sex, whilst identifying areas for improvements to data collection methods, particularly geographic coverage.

Conclusion: The NCCID is a growing resource that provides researchers with a large, high-quality database that can be leveraged both to support the response to the COVID-19 pandemic and as a test bed for building clinically viable medical imaging models.

Keywords: COVID-19; SARS-CoV2; machine learning; medical imaging; thoracic imaging.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Figure 1:
Diagram of the data collection pipeline for the NCCID warehouse.
Figure 2:
Completeness of clinical data fields related to (A) dates, (B) patient medical history, (C) symptoms on admission, and (D) COVID-related information. bp: blood pressure; crp: C-reactive protein; itu: intensive therapy unit; o2: oxygen; pmh: past medical history; wcc: white blood cell count.
Figure 3:
Number of historical/acute/total image studies per NCCID COVID-positive patient (n = 2,826) for (A) X-rays and (B) CTs. In both sets of box plots, outliers are shown as dots beyond the plot whiskers, and the whiskers correspond to Q1 − 1.5 × IQR and Q3 + 1.5 × IQR (IQR: interquartile range).
Figure 4:
(A) Number of days between the patient’s RT-PCR swab test and the image acquisition (nXRAY = 2,410, nCT = 507) and (B) number of days between patient symptom onset and image acquisition (nXRAY = 803, nCT = 133). In both sets of box plots, outliers are shown as dots beyond the plot whiskers, and the whiskers correspond to Q1 − 1.5 × IQR and Q3 + 1.5 × IQR (IQR: interquartile range).
Figure 5:
Number of COVID-positive and COVID-negative (A) X-ray studies by manufacturer and (B) CT studies by manufacturer. In both cases the manufacturers are ordered from highest to lowest total (positive + negative) number of studies.
Figure 6:
NCCID positive and negative patients submitted by region, sorted by total contribution.
Figure 7:
Comparison of national COVID-19 admissions at a regional level with NCCID positive cases.
Figure 8:
Comparison of sex split within (A) the NCCID COVID-19 patients, the general UK population (as reported in the 2011 census), and COVID-19 hospital admissions (reported by ISARIC); (B) NCCID recorded deaths and NHS England COVID-19 hospital mortality data.
Figure 9:
Comparison of ethnicity proportions within (A) the NCCID COVID-19 patients, the UK population (as reported in the 2011 national census), and COVID-19 hospital admissions (reported by ISARIC); (B) the NCCID recorded deaths and NHS England COVID-19 hospital mortality data.
Figure 10:
Comparison of age proportions between COVID-19 hospital admissions (reported by PHE) and NCCID positive patients for (A) England, (B) East of England, (C) London, (D) Midlands, (E) Northeast and Yorkshire, (F) Northwest, (G) Southeast, and (H) Southwest.
Figure 11:
Comparison of age distributions between recorded COVID-19 deaths (as reported by NHSE) and the NCCID (England only).
Figure 12:
Comparison of COVID-19 admissions to NCCID positive cases by week.
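
The captions of Figures 3 and 4 place the box-plot whiskers at 1.5 times the interquartile range beyond the first and third quartiles, with points outside those limits drawn as outliers. The sketch below illustrates that rule only; it is not the authors' plotting code. The function names, the sample values, and the choice of quartile method are assumptions made for illustration, and plotting libraries may compute quartiles slightly differently or clip whiskers to the most extreme data point inside the limits.

```python
from statistics import quantiles


def whisker_limits(values):
    """Return (lower, upper) limits at Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    # Assumption: quartiles use the "inclusive" interpolation method.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def outliers(values):
    """Return the values that fall outside the whisker limits."""
    lower, upper = whisker_limits(values)
    return [v for v in values if v < lower or v > upper]


# Hypothetical counts of image studies per patient (not NCCID data).
studies_per_patient = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]
print(whisker_limits(studies_per_patient))
print(outliers(studies_per_patient))
```

Any value reported by `outliers` would appear as a dot beyond the whiskers in a plot drawn with this rule.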
