PLoS One. 2016 Sep 15;11(9):e0162388.
doi: 10.1371/journal.pone.0162388. eCollection 2016.

Algorithms for the Capture and Adjudication of Prevalent and Incident Diabetes in UK Biobank


Sophie V Eastwood et al. PLoS One. 2016.

Abstract

Objectives: UK Biobank is a UK-wide cohort of 502,655 people aged 40-69, recruited from National Health Service registrants between 2006 and 2010, with healthcare data linkage. Type 2 diabetes is a key exposure and outcome. We developed algorithms to define prevalent and incident diabetes for UK Biobank. The algorithms will be implemented by UK Biobank and their results made available to researchers on request.

Methods: We used UK Biobank self-reported medical history and medication to assign prevalent diabetes and type, and tested this against linked primary and secondary care data in Welsh UK Biobank participants. Additionally, we derived and tested algorithms for incident diabetes using linked primary and secondary care data in the English Clinical Practice Research Datalink, and ran these on secondary care data in UK Biobank.

Results and significance: For prevalent diabetes, 0.001% and 0.002% of people classified as "diabetes unlikely" in UK Biobank had evidence of diabetes in their primary and secondary care records, respectively. Of those classified as "probable" type 2 diabetes, 75% and 96% had specific type 2 diabetes codes in their primary and secondary care records, respectively. For incidence, 95% of people with the type 2 diabetes-specific C10F Read code in primary care had corroborative evidence of diabetes from medications, blood testing or diabetes-specific process-of-care codes. Only 41% of people identified with type 2 diabetes in primary care had secondary care evidence of type 2 diabetes. In contrast, of incident cases identified using ICD-10 type 2 diabetes-specific codes in secondary care, 77% had corroborative evidence of diabetes in primary care. We suggest our definition of prevalent diabetes from UK Biobank baseline data has external validity, and recommend that specific primary care Read codes be used for incident diabetes to ensure precision. Secondary care data should be used for incident diabetes with caution, as around half of all cases are missed, and a quarter have no corroborative evidence of diabetes in primary care.
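The cross-source percentages above are all of one kind: the share of cases found in one data source that are corroborated in another. A minimal sketch of that calculation follows. The function name and the participant IDs are hypothetical and purely illustrative; they are not study data and this is not the authors' code.

```python
def corroboration_rate(cases, corroborating):
    """Share of `cases` (participant IDs) that also appear in `corroborating`."""
    cases = set(cases)
    if not cases:
        raise ValueError("no cases to evaluate")
    return len(cases & set(corroborating)) / len(cases)


# Hypothetical participant IDs, for illustration only (not study data).
primary_care_t2d = {1, 2, 3, 4, 5}
secondary_care_t2d = {4, 5, 6, 7}

# Fraction of primary care cases that also have secondary care evidence.
primary_seen_in_secondary = corroboration_rate(primary_care_t2d, secondary_care_t2d)

# Fraction of secondary care cases corroborated in primary care.
secondary_seen_in_primary = corroboration_rate(secondary_care_t2d, primary_care_t2d)
```

The two rates use different denominators, which is why the abstract can report 41% in one direction and 77% in the other for the same pair of sources.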


Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Data sources used in the development of UK Biobank diabetes prevalence and incidence algorithms.
Solid arrows indicate established linkages; dotted arrows indicate anticipated linkages. Superscripts: 1 = data used to derive prevalence algorithms, 2 = data used to test prevalence algorithms, 3 = data used to derive incidence algorithms, 4 = data used to test incidence algorithms. HES = Hospital Episode Statistics, SMR01 = Scottish Morbidity Record, SAIL = Secure Anonymised Information Linkage databank, PEDW = Patient Episode Database for Wales, CPRD = Clinical Practice Research Datalink.
Fig 2. (a) Prevalence algorithm 1: Distinction between diabetes presence or absence, and initial sorting of diabetes type using baseline UK Biobank assessment data. See S1 appendix for rationale and further data for each step. (b) Prevalence algorithm 2: Finalising type 1 diabetes diagnosis and classification into probable and possible categories. See S1 appendix for rationale and further data for each step. (c) Prevalence algorithm 3: Finalising type 2 diabetes diagnosis and classification into probable and possible categories. See S1 appendix for rationale and further data for each step. (d) Final diabetes diagnostic status in UKB.
Fig 3. (a) Diabetes incidence algorithms for primary care data, run in CPRD. (b) Diabetes incidence algorithm for secondary care data, run in UK Biobank-held in-patient data. *Includes categories: probable type 1 diabetes, probable type 2 diabetes,. **ICD-10: E10, E11, E13, E14. Includes main or secondary diagnostic codes for in-patient data.
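The secondary care rule in the Fig 3 caption (ICD-10 codes E10, E11, E13 or E14 in a main or secondary diagnostic position) can be sketched as a simple prefix match. The record layout below (`main_diagnosis`, `secondary_diagnoses`) is an assumed, hypothetical structure for illustration, not the UK Biobank or HES field naming, and the sketch covers only the code-matching step, not the full published algorithm.

```python
# ICD-10 categories listed in the figure caption.
DIABETES_ICD10_PREFIXES = ("E10", "E11", "E13", "E14")


def has_diabetes_code(episode):
    """Return True if any main or secondary diagnosis is a listed diabetes code.

    `episode` is a dict with hypothetical keys `main_diagnosis` (str or None)
    and `secondary_diagnoses` (list of str); this layout is an assumption.
    """
    codes = [episode.get("main_diagnosis")]
    codes += list(episode.get("secondary_diagnoses", []))
    return any(
        code is not None
        # Normalise "e11.9" / "E119" style variants before matching.
        and code.upper().replace(".", "").startswith(DIABETES_ICD10_PREFIXES)
        for code in codes
    )


# Hypothetical in-patient episodes, for illustration only.
episodes = [
    {"main_diagnosis": "I21.0", "secondary_diagnoses": ["E11.9"]},
    {"main_diagnosis": "J18.1", "secondary_diagnoses": []},
    {"main_diagnosis": "E10.1"},
]
flags = [has_diabetes_code(e) for e in episodes]
```

Checking secondary positions as well as the main diagnosis matters here because diabetes is often recorded as a comorbidity rather than the reason for admission.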
Fig 4. Flow of participants identified with diabetes in UK Biobank.
*or mid-point of last consultation/episode without diabetes diagnosis (UK Biobank inception if not available) and 1st diabetes diagnosis dates.
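One reading of the footnote above is that the diagnosis date can be estimated as the mid-point between the last consultation or episode without a diabetes diagnosis and the first diabetes diagnosis, with UK Biobank inception used as the start of the interval when no earlier consultation is available. A minimal sketch under that reading follows; the function and parameter names are hypothetical and the dates are invented for illustration.

```python
from datetime import date, timedelta


def estimated_diagnosis_date(first_diagnosis, last_without_diagnosis=None, inception=None):
    """Mid-point between the last diabetes-free contact and the first diagnosis.

    Falls back to `inception` when no diabetes-free contact date is available.
    All names here are illustrative assumptions, not the authors' code.
    """
    start = last_without_diagnosis if last_without_diagnosis is not None else inception
    if start is None:
        raise ValueError("need a diabetes-free contact date or an inception date")
    if start > first_diagnosis:
        raise ValueError("interval start is after the first diagnosis date")
    # Whole-day mid-point, rounded down.
    half_days = (first_diagnosis - start).days // 2
    return start + timedelta(days=half_days)


# Invented dates, for illustration only.
with_prior_contact = estimated_diagnosis_date(
    first_diagnosis=date(2010, 1, 11),
    last_without_diagnosis=date(2010, 1, 1),
)
with_inception_fallback = estimated_diagnosis_date(
    first_diagnosis=date(2008, 1, 31),
    inception=date(2008, 1, 1),
)
```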
