Development and evaluation of a de-identification procedure for a case register sourced from mental health electronic records
- PMID: 23842533
- PMCID: PMC3751474
- DOI: 10.1186/1472-6947-13-71
Abstract
Background: Electronic health records (EHRs) offer enormous potential for health research but also present data governance challenges. De-identification is a prerequisite for using EHR data without prior consent. The South London and Maudsley NHS Trust (SLaM), one of the largest secondary mental healthcare providers in Europe, has developed from its EHRs a de-identified psychiatric case register, the Clinical Record Interactive Search (CRIS), for secondary research.
Methods: We describe the development, implementation and evaluation of a bespoke de-identification algorithm used to create the register. The algorithm builds dictionaries from patient identifiers (PIs) entered into dedicated source fields, then identifies, matches and masks those PIs (with ZZZZZ) when they appear in medical texts. We judged that this approach would be effective given the high coverage of PIs in the dedicated fields and the effectiveness of the masking when combined with elements of a security model. We conducted two separate performance tests: (i) to test the performance of the algorithm in masking individual true PIs entered in dedicated fields and then found in text (using 500 patient notes), and (ii) to compare the performance of the CRIS pattern-matching algorithm with that of a machine learning algorithm, the MITRE Identification Scrubber Toolkit (MIST), using 70 patient notes (50 notes for training and 20 for testing). We also report any instances of potential breaches, defined as occurrences of 3 or more true or apparent PIs in the same patient's notes (and in an additional set of longitudinal notes for 50 patients), and we consider the possibility of inferring information despite de-identification.
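The dictionary-and-mask approach described in the Methods can be sketched roughly as follows. This is a minimal illustration only, not the CRIS implementation: the field names, the helper functions `build_dictionary`, `mask_text` and `is_potential_breach`, and the choice of case-insensitive whole-word matching are all assumptions made for the example. Only the ZZZZZ mask and the threshold of 3 PIs come from the abstract.

```python
import re

MASK = "ZZZZZ"  # mask string named in the abstract


def build_dictionary(structured_fields):
    """Collect non-empty identifier values from dedicated source fields.

    `structured_fields` is a hypothetical dict such as
    {"first_name": "Jane", "last_name": "Doe"}.
    """
    return {v.strip() for v in structured_fields.values() if v and v.strip()}


def mask_text(text, terms):
    """Replace every whole-word, case-insensitive match of a term with MASK."""
    if not terms:
        return text
    # Longest terms first, so multi-word identifiers win over their parts.
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(t) for t in ordered) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(MASK, text)


def is_potential_breach(residual_pis, threshold=3):
    """Flag a patient's notes when 3 or more PIs remain after masking."""
    return len(residual_pis) >= threshold


fields = {"first_name": "Jane", "last_name": "Doe", "address": "12 Elm Road"}
note = "Jane Doe lives at 12 Elm Road. JANE attended clinic."
masked = mask_text(note, build_dictionary(fields))
print(masked)
```

Because matching in this sketch is exact, a misspelt identifier such as "Jnae" would pass through unmasked; any tolerance for spelling variants would need additional logic that is not shown here.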
Results: True PIs were masked with 98.8% precision and 97.6% recall. As anticipated, potential PIs did appear, owing to misspellings entered within the EHRs. We found one potential breach. In a separate performance test, with a different set of notes, CRIS yielded 100% precision and 88.5% recall, while MIST yielded 95.1% precision and 78.1% recall. We discuss how implementation of the security model addresses the realistic, albeit low-probability, possibility of breaches.
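For readers less familiar with the two metrics reported above, the sketch below shows how precision and recall are conventionally computed from counts of masked and missed identifiers. The function name and the counts in the usage line are invented purely for illustration; they are not the study's data, and the abstract does not report the underlying counts.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Return (precision, recall) for a masking run.

    true_pos:  PIs that were correctly masked
    false_pos: non-PI strings that were masked
    false_neg: PIs that were left unmasked
    """
    masked_total = true_pos + false_pos
    pi_total = true_pos + false_neg
    precision = true_pos / masked_total if masked_total else 0.0
    recall = true_pos / pi_total if pi_total else 0.0
    return precision, recall


# Illustrative counts only (not taken from the paper).
precision, recall = precision_recall(true_pos=90, false_pos=10, false_neg=30)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

High precision means little non-identifying text is masked unnecessarily, while high recall means few identifiers slip through; the second is the one that bears most directly on the risk of a breach.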
Conclusion: CRIS is a de-identified psychiatric database sourced from EHRs, which protects patient anonymity and maximises data available for research. CRIS demonstrates the advantage of combining an effective de-identification algorithm with a carefully designed security model. The paper advances much-needed discussion of EHR de-identification, particularly in relation to the criteria used to assess de-identification and to the need to consider the context of de-identified research databases when assessing the risk of breaches of confidential patient information.