Neurology. 2020 Aug 11;95(6):e697-e707.
doi: 10.1212/WNL.0000000000009924. Epub 2020 Jul 2.

Accuracy of identifying incident stroke cases from linked health care data in UK Biobank

Kristiina Rannikmäe et al. Neurology.

Abstract

Objective: In UK Biobank (UKB), a large population-based prospective study, cases of many diseases are ascertained through linkage to routinely collected, coded national health datasets. We assessed the accuracy of these for identifying incident strokes.

Methods: In a regional UKB subpopulation (n = 17,249), we identified all participants with ≥1 code signifying a first stroke after recruitment (incident stroke-coded cases) in linked hospital admission, primary care, or death record data. Stroke physicians reviewed their full electronic patient records (EPRs) and generated reference standard diagnoses. We evaluated the number and proportion of cases that were true-positives (i.e., positive predictive value [PPV]) for all codes combined and by code source and type.

Results: Of 232 incident stroke-coded cases, 97% had EPR information available. Data sources were 30% hospital admission only, 39% primary care only, 28% hospital and primary care, and 3% death records only. While 42% of cases were coded as unspecified stroke type, review of EPRs enabled a pathologic type to be assigned in >99%. PPVs (95% confidence intervals) were 79% (73%-84%) for any stroke (89% for hospital admission codes, 80% for primary care codes) and 83% (74%-90%) for ischemic stroke. PPVs for small numbers of death record and hemorrhagic stroke codes were low but imprecise.
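As a rough illustration of how a PPV and its 95% confidence interval can be derived from review counts, here is a minimal Python sketch. The abstract does not state which interval method was used, so the Wilson score interval is an assumption, and the function name and the counts in the usage lines are hypothetical rather than study data.

```python
import math


def ppv_with_wilson_ci(true_positives, coded_cases, z=1.96):
    """Return (ppv, lower, upper) using the Wilson score interval.

    true_positives: coded cases confirmed by the reference standard.
    coded_cases: all coded cases that were reviewed.
    z: normal quantile (1.96 gives an approximate 95% interval).
    """
    if coded_cases <= 0 or not 0 <= true_positives <= coded_cases:
        raise ValueError("need 0 <= true_positives <= coded_cases and coded_cases > 0")

    n = coded_cases
    p = true_positives / n                      # point estimate of the PPV
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom     # interval midpoint, shrunk toward 0.5
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return p, centre - half_width, centre + half_width


# Hypothetical counts for illustration only (not taken from the study).
ppv, lower, upper = ppv_with_wilson_ci(80, 100)
print(f"PPV {ppv:.0%} (95% CI {lower:.0%}-{upper:.0%})")
```

With few coded cases, as for the death record and hemorrhagic stroke codes, the interval becomes wide, which is what "low but imprecise" refers to.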

Conclusions: Stroke and ischemic stroke cases in UKB can be ascertained through linked health datasets with sufficient accuracy for many research studies. Further work is needed to understand the accuracy of death record and hemorrhagic stroke codes and to develop scalable approaches for better identifying stroke types.

Figures

Figure 1. Selection of included UK Biobank (UKB) participants
GP = general practitioner; NHS = National Health Service.
Figure 2. Positive predictive values (PPVs) of stroke codes
PPVs of stroke codes stratified by code source (A) and code type (B). Primary position: includes primary care codes, where no code position is specified, and only primary position hospital admission codes. CI = confidence interval; ICH = intracerebral hemorrhage; SAH = subarachnoid hemorrhage.
Figure 3. Assessing administrative vs overall accuracy
High clinical certainty mentions of stroke: “stroke,” “probable stroke,” “presumptive stroke,” “consistent with stroke,” “compatible with stroke,” “likely stroke,” “treated as stroke,” or equivalent stroke terms (ICH, SAH, intracerebral hemorrhage, subarachnoid hemorrhage, ischemic stroke, infarct). Medium clinical certainty mentions of stroke: the above plus “possible stroke,” “suspected stroke,” “impression of stroke,” “suggestive of stroke,” “query stroke,” or equivalent stroke terms (ICH, SAH, intracerebral hemorrhage, subarachnoid hemorrhage, ischemic stroke, infarct). Low clinical certainty mentions of stroke: the above plus “TIA” and “transient ischaemic attack” with any level of certainty preceding it. The hierarchy of clinical certainty was based on the ICD-10 clinical coding instruction manual (isdscotland.org/Products-and-Services/Terminology-Services/Clinical-Coding-Guidelines/, April 2010 version), which is used by the coding departments in UK hospitals. CI = confidence interval; PPV = positive predictive value.
Figure 4. Exploratory analyses to improve accuracy of hemorrhagic stroke codes
*Excluding other stroke code same day: excluded cases with a diagnostic code for >1 stroke pathologic type on the same day. This was done because a patient who has one pathologic stroke type (e.g., ischemic stroke) can sometimes develop a complication and subsequent brain scan appearances similar to another pathologic stroke type (e.g., a patient with ischemic stroke can have a bleed in the brain as a result of the ischemic stroke, which could lead to a false diagnosis of an intracerebral hemorrhage [ICH]). CI = confidence interval; PPV = positive predictive value; SAH = subarachnoid hemorrhage.
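The same-day exclusion rule described in the Figure 4 note could be sketched as below. This is an illustration only: the record layout (participant ID, code date, pathologic type) and the function name are hypothetical and do not reflect the study's actual data format or code.

```python
from collections import defaultdict


def exclude_same_day_multi_type(records):
    """Drop every participant who has codes for more than one stroke
    pathologic type on the same day.

    records: iterable of (participant_id, code_date, stroke_type) tuples
    (a hypothetical layout chosen for this sketch).
    """
    records = list(records)

    # Collect the set of pathologic types coded per participant per day.
    types_by_day = defaultdict(set)
    for participant_id, code_date, stroke_type in records:
        types_by_day[(participant_id, code_date)].add(stroke_type)

    # A participant is excluded if any single day carries >1 type.
    excluded = {
        participant_id
        for (participant_id, _), types in types_by_day.items()
        if len(types) > 1
    }
    return [r for r in records if r[0] not in excluded]
```

Participants with different stroke types coded on different days are kept, since the rule targets only same-day conflicts.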
