Phenotypes and rates of cancer-relevant symptoms and tests in the year before cancer diagnosis in UK Biobank and CPRD Gold
- PMID: 38100737
- PMCID: PMC10723831
- DOI: 10.1371/journal.pdig.0000383
Abstract
Early diagnosis of cancer relies on accurate assessment of cancer risk in patients presenting with symptoms, when screening is not appropriate. However, the symptoms recorded for cancer patients before diagnosis may vary between different sources of electronic health records (EHRs), either genuinely or due to differential completeness of symptom recording. To assess possible differences, we analysed primary care EHRs in the year pre-diagnosis of cancer in UK Biobank and Clinical Practice Research Datalink (CPRD) populations linked to cancer registry data. We developed harmonised phenotypes in Read v2 and CTV3 coding systems for 21 symptoms and eight blood tests relevant to cancer diagnosis. Among 22,601 CPRD and 11,594 UK Biobank cancer patients, 54% and 36%, respectively, had at least one consultation for possible cancer symptoms recorded in the year before their diagnosis. Adjusted comparisons between datasets were made using multivariable Poisson models, comparing rates of symptoms/tests in CPRD against expected rates if cancer site-age-sex-deprivation associations were the same as in UK Biobank. UK Biobank cancer patients compared with those in CPRD had lower rates of consultation for possible cancer symptoms [RR: 0.61 (95% CI 0.59-0.63)], and lower rates of any primary care consultation [RR: 0.86 (0.85-0.87)]. Differences were larger for 'non-alarm' symptoms [RR: 0.54 (0.52-0.56)], and smaller for 'alarm' symptoms [RR: 0.80 (0.76-0.84)] and blood tests [RR: 0.93 (0.90-0.95)]. In the CPRD cohort, which is approximately representative of the UK population, half of cancer patients had recorded symptoms in the year before diagnosis. The frequency of non-specific presenting symptoms recorded in the year pre-diagnosis of cancer was substantially lower among UK Biobank participants. The degree to which results based on highly selected biobank cohorts are generalisable needs to be examined in disease-specific contexts.
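The observed-versus-expected comparison described in the abstract can be illustrated with a much simpler calculation than the multivariable Poisson models the authors fitted. The sketch below is not their method: it applies stratum-specific rates from a reference cohort to the person-time of a study cohort, then divides observed by expected events. The function name, the two strata, and all counts and rates are hypothetical, and the interval treats the expected count as fixed (Poisson variation in the observed count only).

```python
import math


def standardised_rate_ratio(reference_rates, study_strata, z=1.96):
    """Observed/expected event ratio with an approximate confidence interval.

    reference_rates: stratum -> events per person-year in the reference cohort
    study_strata:    stratum -> (observed events, person-years) in the study cohort
    """
    observed = sum(events for events, _ in study_strata.values())
    # Expected events if the study cohort had the reference cohort's rates.
    expected = sum(
        reference_rates[stratum] * person_years
        for stratum, (_, person_years) in study_strata.items()
    )
    ratio = observed / expected
    # Wald interval on the log scale; expected count is treated as fixed.
    se_log = 1.0 / math.sqrt(observed)
    lower = ratio * math.exp(-z * se_log)
    upper = ratio * math.exp(z * se_log)
    return ratio, lower, upper


# Hypothetical strata (for example, two age bands); not data from the study.
reference_rates = {"A": 0.5, "B": 1.0}
study_strata = {"A": (30, 100.0), "B": (90, 100.0)}

ratio, lower, upper = standardised_rate_ratio(reference_rates, study_strata)
print(f"ratio = {ratio:.2f} ({lower:.2f}-{upper:.2f})")
```

A ratio below 1 means the study cohort recorded fewer events than the reference rates would predict for the same mix of strata. A regression model, as used in the paper, additionally handles many covariates at once and accounts for uncertainty in the reference rates.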
Copyright: © 2023 Barclay et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Conflict of interest statement
MB reports personal fees from Grail Inc for membership of an Independent Data Monitoring Committee, outside the submitted work. No other disclosures were reported.