Health Informatics J. 2024 Jul-Sep;30(3):14604582241276969.
doi: 10.1177/14604582241276969.

Demonstrating the data integrity of routinely collected healthcare systems data for clinical trials (DEDICaTe): A proof-of-concept study

Macey L Murray et al. Health Informatics J. 2024 Jul-Sep.

Abstract

Introduction/aims: Healthcare systems data (also known as real-world or routinely collected health data) could transform the conduct of clinical trials. Demonstrating the integrity and provenance of these data is critical for clinical trials, both to enable their use where appropriate and to avoid duplication that consumes scarce trial resources. Building on previous work, this proof-of-concept study used a data intelligence tool, the "Central Metastore," to provide metadata and lineage information for nationally held data.

Methods: The feasibility of using NHS England's Central Metastore to capture detailed records of the origins, processes, and methods that produce four datasets was assessed. These were England's Hospital Episode Statistics (Admitted Patient Care, Outpatients, Critical Care) and the Civil Registration of Deaths (England and Wales). The process comprised: information gathering; information ingestion using the tool; and auto-generation of lineage diagrams/content to show data integrity. A guidance document to standardise this process was developed.

Results/Discussion: The tool can ingest, store and display data provenance in sufficient detail to support trust and transparency in using these datasets for trials. The slowest step was information gathering from multiple sources, so consistency in record-keeping is essential.

Keywords: clinical trials; data integrity; data provenance; data quality; healthcare systems data; metadata; routinely collected health data.

Conflict of interest statement

Declaration of conflicting interests: The authors declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: MRS declares research grants from Astellas, Clovis Oncology, Janssen, Novartis, and Sanofi-Aventis unrelated to this manuscript; consultancy fees from Eli Lilly, and speaker fees from Lilly Oncology, Janssen, and Eisai all unrelated to this manuscript. MM declares research grants from Novartis and Novo Nordisk unrelated to this manuscript. MKBP declares research funding support from Astellas, AstraZeneca, Baxter, Bayer, GlaxoSmithKline, and Janssen all unrelated to the work in this manuscript. All remaining authors declare no competing interests.

Figures

Figure 1. The elements of the metadata model captured in NHS England’s Central Metastore, reproduced from the study’s operating manual/guidance.

Figure 2. Summary business lineage view of the Admitted Patient Care (APC), Outpatients (OP), and Critical Care (CC) datasets of Hospital Episode Statistics (HES). The data journey runs from left (submission from hospitals using business rules via an XML schema) to right (production stage, where derivations and processing rules are applied), forming the final releasable HES schema containing APC, OP and CC tables in the data access environment (far right).

Figure 3. Field-level lineage of the Civil Registration of Deaths (CRD), with an example of a derivation to confirm NHS number. The data journey moves from left (submission via Message Exchange for Social Care and Health) to right (production stage, then releasable in the data access environment).

Figure 4. Example of data item details captured in the Central Metastore at NHS England (diagnosis code, Hospital Episode Statistics Admitted Patient Care).
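
The field-level lineage described in the Figure 3 caption (submission, then production, then releasable) can be pictured as a small directed graph. The sketch below is illustrative only and is not the Central Metastore's data model or API: the class `LineageGraph`, its methods, and the field names such as `submission.nhs_number` are hypothetical names chosen for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class LineageGraph:
    """Hypothetical field-level lineage store: node -> downstream nodes."""

    edges: dict[str, list[str]] = field(default_factory=dict)

    def add_step(self, source: str, target: str) -> None:
        """Record that `target` is produced from `source`."""
        self.edges.setdefault(source, []).append(target)
        self.edges.setdefault(target, [])

    def upstream(self, node: str) -> set[str]:
        """Return every field that `node` is directly or indirectly derived from."""
        parents = {src for src, targets in self.edges.items() if node in targets}
        ancestors = set(parents)
        for parent in parents:
            ancestors |= self.upstream(parent)
        return ancestors


# Assumed, simplified journey for one field; the stage names follow the
# figure captions, the field names are invented for illustration.
lineage = LineageGraph()
lineage.add_step("submission.nhs_number", "production.nhs_number_confirmed")
lineage.add_step("production.nhs_number_confirmed", "releasable.nhs_number")

# List everything the releasable field depends on.
print(sorted(lineage.upstream("releasable.nhs_number")))
```

Asking for the upstream fields of a releasable data item is the kind of provenance question a lineage view is meant to answer. A real tool would also record the derivation rule applied at each step, which this sketch leaves out.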

