Observational Study
BMJ Open. 2025 Apr 23;15(4):e093080.
doi: 10.1136/bmjopen-2024-093080.

Replicating a COVID-19 study in a national England database to assess the generalisability of research with regional electronic health record data

Richard Williams et al. BMJ Open.

Abstract

Objectives: To assess the degree to which we can replicate a study between a regional and a national database of electronic health record data in the UK. The original study examined the risk factors associated with hospitalisation following COVID-19 infection in people with diabetes.

Design: A replication of a retrospective cohort study.

Setting: Observational electronic health record data from primary and secondary care sources in the UK. The original study used data from a large, urbanised region (Greater Manchester Care Record, Greater Manchester, UK; 2.8 million patients). This replication study used a national database covering the whole of England, UK (NHS England's Secure Data Environment service for England, accessed via the BHF Data Science Centre's CVD-COVID-UK/COVID-IMPACT Consortium; 54 million patients).

Participants: Individuals with a diagnosis of type 1 diabetes or type 2 diabetes prior to a positive COVID-19 test result. The matched controls (3:1) were individuals who had a positive COVID-19 test result, but who did not have a diagnosis of diabetes on the date of their positive COVID-19 test result. Matching was done on age at COVID-19 diagnosis, sex and approximate date of COVID-19 test.
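
The 3:1 matching described above can be illustrated with a minimal Python sketch. It is not the authors' code: the abstract does not say how "approximate date" was defined, so the 14-day window, exact matching on age in whole years, matching without replacement, and all field names (`id`, `age`, `sex`, `test_date`) are assumptions made for illustration.

```python
import random
from collections import defaultdict
from datetime import date


def match_controls(cases, pool, ratio=3, date_window_days=14, seed=0):
    """Greedily pick up to `ratio` controls per case.

    Assumptions (not from the source): exact match on age in years and sex,
    test dates within `date_window_days`, and each control used at most once.
    """
    rng = random.Random(seed)

    # Index candidate controls by (age, sex) so each lookup is cheap.
    by_key = defaultdict(list)
    for person in pool:
        by_key[(person["age"], person["sex"])].append(person)

    used = set()
    matches = {}
    for case in cases:
        candidates = [
            c
            for c in by_key.get((case["age"], case["sex"]), [])
            if c["id"] not in used
            and abs((c["test_date"] - case["test_date"]).days) <= date_window_days
        ]
        rng.shuffle(candidates)  # avoid always picking the first-listed controls
        chosen = candidates[:ratio]
        used.update(c["id"] for c in chosen)
        matches[case["id"]] = [c["id"] for c in chosen]
    return matches


# Hypothetical toy data: one case with diabetes and six people without.
cases = [{"id": "case-1", "age": 60, "sex": "F", "test_date": date(2021, 1, 10)}]
pool = [
    {"id": f"ctrl-{i}", "age": 60, "sex": "F", "test_date": date(2021, 1, 10 + i)}
    for i in range(5)
] + [{"id": "ctrl-other-age", "age": 61, "sex": "F", "test_date": date(2021, 1, 10)}]

print(match_controls(cases, pool))
```

A case with fewer than three eligible controls simply receives a shorter list here; a real analysis would need an explicit rule for such cases.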

Primary and secondary outcome measures: Hospitalisation within 28 days of a positive COVID-19 test.

Results: Many of the effect sizes did not differ significantly between the regional and national studies, but some did. Where an effect size was statistically significant in the regional study, it remained significant in the national study, in the same direction and of similar magnitude.
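
The abstract does not state which test was used to compare effect sizes between the two studies. One common approach, shown here only as a sketch, is a z-test on the difference of two log odds ratios, with standard errors recovered from the reported 95% confidence intervals. The function names and the numbers in the usage lines are hypothetical, not results from either study.

```python
import math


def se_from_ci(lower, upper, z_crit=1.96):
    """Standard error of log(OR), recovered from a 95% confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z_crit)


def compare_odds_ratios(or_a, ci_a, or_b, ci_b):
    """Two-sided z-test for a difference between two independent odds ratios.

    Assumes the two estimates come from independent samples, which is only
    approximately true if the regional cohort is a subset of the national one.
    """
    se_a = se_from_ci(*ci_a)
    se_b = se_from_ci(*ci_b)
    z = (math.log(or_a) - math.log(or_b)) / math.sqrt(se_a**2 + se_b**2)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value under N(0, 1)
    return z, p


# Hypothetical estimates for one risk factor in each study.
z, p = compare_odds_ratios(2.0, (1.5, 2.67), 1.0, (0.75, 1.33))
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value would indicate that the regional and national estimates differ by more than sampling error alone would explain.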

Conclusions: There is some evidence that the findings from studies in smaller regional datasets can be extrapolated to a larger, national setting. However, there were some differences, and therefore replication studies remain an essential part of healthcare research.

Keywords: DIABETES & ENDOCRINOLOGY; Electronic Health Records; Observational Study; Retrospective Studies.

Conflict of interest statement

Competing interests: None declared.

Figures

Figure 1. Univariable analysis for patients with type 1 diabetes. ‘GMCR’ is the original published study (Greater Manchester Care Record), ‘N1’ is the first replication analysis using COVID-19 test data from the primary care data feed and ‘N2’ is the second replication analysis using the Second-Generation Surveillance System for the COVID-19 test results.
Figure 2. Univariable analysis for patients with type 2 diabetes. ‘GMCR’ is the original published study (Greater Manchester Care Record), ‘N1’ is the first replication analysis using COVID-19 test data from the primary care data feed and ‘N2’ is the second replication analysis using the Second-Generation Surveillance System for the COVID-19 test results.
Figure 3. Multivariable analysis for patients with type 1 diabetes and their controls. ‘GMCR’ is the original published study (Greater Manchester Care Record), ‘N1’ is the first replication analysis using COVID-19 test data from the primary care data feed and ‘N2’ is the second replication analysis using the Second-Generation Surveillance System for the COVID-19 test results.
Figure 4. Multivariable analysis for patients with type 2 diabetes and their controls. ‘GMCR’ is the original published study (Greater Manchester Care Record), ‘N1’ is the first replication analysis using COVID-19 test data from the primary care data feed and ‘N2’ is the second replication analysis using the Second-Generation Surveillance System for the COVID-19 test results.
