Standardised and Reproducible Phenotyping Using Distributed Analytics and Tools in the Data Analysis and Real World Interrogation Network (DARWIN EU)

Francesco Dernie¹ ², George Corby¹ ², Abigail Robinson¹ ², James Bezer¹ ², Nuria Mercade-Besora², Romain Griffier³, Guillaume Verdy³, Angela Leis⁴, Juan Manuel Ramirez-Anguita⁵, Miguel A Mayer⁴ ⁵, James T Brash⁶, Sarah Seager⁶, Rowan Parry⁷, Annika Jodicke², Talita Duarte-Salles⁷ ⁸, Peter R Rijnbeek⁷, Katia Verhamme⁷, Alexandra Pacurariu⁹, Daniel Morales⁹ ¹⁰, Luis Pinheiro⁹, Daniel Prieto-Alhambra² ⁷, Albert Prats-Uribe²

Affiliations

¹ Medical Sciences Division, University of Oxford, Oxford, UK.
² Pharmaco- and Device Epidemiology, Centre for Statistics in Medicines, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, UK.
³ Public Health Department, Medical Information Service, Medical Informatics and Archiving Unit (IAM), University Hospital of Bordeaux, Bordeaux, France.
⁴ Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Research Institute, Barcelona, Spain.
⁵ Management and Control Department, Consorci Mar Parc de Salut de Barcelona, Barcelona, Spain.
⁶ Real World Solutions, IQVIA, Brighton, UK.
⁷ Department of Medical Informatics, Erasmus University Medical Centre, Rotterdam, The Netherlands.
⁸ Fundació Institut Universitari per a la Recerca a l'Atencio Primaria de Salut Jordi Gol I Gurina (IDIAPJGol), Universitat Autonoma de Barcelona, Barcelona, Spain.
⁹ Real World Evidence Workstream, European Medicines Agency, Amsterdam, The Netherlands.
¹⁰ Division of Population Health and Genomics, University of Dundee, Dundee, UK.

PMID: 39532529
DOI: 10.1002/pds.70042

Francesco Dernie et al. Pharmacoepidemiol Drug Saf. 2024 Nov;33(11):e70042. doi: 10.1002/pds.70042.

Abstract

Purpose: Generating representative disease phenotypes is important for ensuring the reliability of findings from observational studies. This manuscript outlines a reproducible framework for reliable and traceable phenotype generation from real-world data, for use in the Data Analysis and Real-World Interrogation Network (DARWIN EU). We illustrate the framework by generating phenotypes for two diseases: pancreatic cancer and systemic lupus erythematosus (SLE).

Methods: The phenotyping process comprises 14 steps and is based on a standard operating procedure co-created by the DARWIN EU Coordination Centre in collaboration with the European Medicines Agency. Several bespoke R packages were used to generate and review codelists for the two phenotypes, based on real-world data mapped to the OMOP Common Data Model.
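As an illustration only, the idea of generating a candidate codelist by keyword search can be sketched as follows. This is not the DARWIN EU tooling (which the abstract describes as bespoke R packages); it is a minimal Python sketch that assumes a toy, in-memory vocabulary table. The concept IDs, the `Concept` class, and the `candidate_codelist` helper are hypothetical; only the field names `concept_id`, `concept_name`, `domain_id`, and `standard_concept` (with "S" marking a standard concept) follow the OMOP concept table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    concept_id: int        # hypothetical IDs, not real OMOP concept IDs
    concept_name: str
    domain_id: str
    standard_concept: str  # "S" marks a standard concept


# Toy vocabulary for illustration; a real study would query the OMOP vocabulary.
TOY_CONCEPTS = [
    Concept(1, "Malignant tumor of pancreas", "Condition", "S"),
    Concept(2, "Malignant tumor of head of pancreas", "Condition", "S"),
    Concept(3, "Acute pancreatitis", "Condition", "S"),
    Concept(4, "Family history of malignant tumor of pancreas", "Observation", "S"),
    Concept(5, "Systemic lupus erythematosus", "Condition", "S"),
    Concept(6, "Discoid lupus erythematosus", "Condition", "S"),
]


def candidate_codelist(concepts, include, exclude=(), domain="Condition"):
    """Return standard concepts in `domain` whose name contains every
    `include` keyword and none of the `exclude` keywords (case-insensitive)."""
    hits = []
    for c in concepts:
        if c.domain_id != domain or c.standard_concept != "S":
            continue
        name = c.concept_name.lower()
        has_all = all(k.lower() in name for k in include)
        has_excluded = any(k.lower() in name for k in exclude)
        if has_all and not has_excluded:
            hits.append(c)
    return hits


pancreatic = candidate_codelist(TOY_CONCEPTS, include=["malignant", "pancreas"])
sle = candidate_codelist(
    TOY_CONCEPTS, include=["lupus erythematosus"], exclude=["discoid"]
)
```

A list produced this way is only a starting point: the candidate concepts would still need clinical review before being used to define a cohort.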

Results: Codelists were generated for both pancreatic cancer and SLE, and cohorts were generated in six OMOP-mapped databases. Diagnostic checks showed that incidence and prevalence in these cohorts were broadly similar to figures in previously published literature, despite significant inter-database variability. Co-occurring symptoms, conditions, and medication use were in keeping with pre-specified clinical descriptions based on previous knowledge.
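The two quantities compared in these diagnostic checks can be written out as a short worked example. This is a generic sketch of the standard definitions (incidence rate per 100,000 person-years and point prevalence), not the study's own code; the function names and the input numbers are invented for illustration and are not results from the paper.

```python
def incidence_rate_per_100k(new_cases: int, person_years: float) -> float:
    """Incidence rate: new cases per 100,000 person-years at risk."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return new_cases * 100_000 / person_years


def point_prevalence(existing_cases: int, population: int) -> float:
    """Point prevalence: proportion of the population with the condition."""
    if population <= 0:
        raise ValueError("population must be positive")
    return existing_cases / population


# Invented inputs for illustration only (not figures from the study).
rate = incidence_rate_per_100k(new_cases=30, person_years=200_000)
prevalence = point_prevalence(existing_cases=50, population=100_000)
```

Comparing such figures across databases and against published estimates is one way to flag a phenotype definition that captures too many or too few patients.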

Conclusions: Our detailed phenotyping process makes use of bespoke tools and allows for comprehensive codelist generation and review, as well as large-scale exploration of the characteristics of the resulting cohorts. Wider use of structured and reproducible phenotyping methods will be important in ensuring the reliability of observational studies for regulatory purposes.

Keywords: pancreatic cancer; phenotyping; systemic lupus erythematosus.
