Development and Experience with Cancer Risk Prediction Models Using Federated Databases and Electronic Health Records
- PMID: 35605073
- Bookshelf ID: NBK580626
- DOI: 10.36255/exon-publications-digital-health-federated-databases
Excerpt
Early diagnosis is critical to improving survival rates of lethal cancers, such as pancreatic ductal adenocarcinoma (PDAC). However, there are no reliable screening tests for these cancers. In this chapter, we present potential methods for predicting early, evolving cancers by leveraging readily available electronic health record (EHR) data and machine learning. We discuss the various aspects of our collaborative experience, involving clinical and computer scientists, in navigating the process of using EHRs to develop cancer risk prediction models. This chapter is intended to serve as a guide to others performing this type of research. We cover the different steps involved, based on our initial experience of model development using single-institution data, including data acquisition, querying and downloading data, protecting patient confidentiality, data curation, model development, and validation. Challenges encountered when using single-institution data are presented, along with lessons learned. Drawing from our experience working with a federated database of EHR data from multiple institutions to develop a risk prediction model for PDAC, we also discuss how many of these challenges can be addressed by using such a database. Finally, we discuss future clinical opportunities that may arise from leveraging data from a federated network, such as the deployment of risk models for clinical studies.
Copyright: The Authors. The authors confirm that the materials included in this chapter do not violate copyright laws. Where relevant, appropriate permissions have been obtained from the original copyright holder(s), and all original sources have been appropriately acknowledged or referenced.
Sections
- INTRODUCTION
- THE GOAL OF DEVELOPING CANCER RISK PREDICTION MODELS
- THE RATIONALE FOR USING REAL-WORLD DATA TO BUILD MODELS
- MAXIMIZING THE POTENTIAL OF EHR DATA FOR MODEL DEVELOPMENT
- OUR INITIAL EXPERIENCE WITH LOCAL HOSPITAL DATA
- OVERCOMING THE LIMITATIONS OF SINGLE-INSTITUTION DATA USING A FEDERATED NETWORK DATABASE
- OPPORTUNITIES FOR REAL-TIME MODEL DEPLOYMENT AND ASSESSMENT
- CONCLUSION
- REFERENCES