Review
Trends Mol Med. 2023 Sep;29(9):765-776.
doi: 10.1016/j.molmed.2023.06.006. Epub 2023 Jul 18.

Opportunities and challenges for biomarker discovery using electronic health record data

P Singhal et al. Trends Mol Med. 2023 Sep.

Abstract

Electronic health records (EHRs) have become increasingly relied upon as a source for biomedical research. One important research application of EHRs is the identification of biomarkers associated with specific patient states, especially within complex conditions. However, using EHRs for biomarker identification can be challenging because the EHR was not designed with research as the primary focus. Despite this challenge, the EHR offers huge potential for biomarker discovery research to transform our understanding of disease etiology and treatment and generate biological insights informing precision medicine initiatives. This review paper provides an in-depth analysis of how EHR data is currently used for phenotyping and identifying molecular biomarkers, current challenges and limitations, and strategies we can take to mitigate challenges going forward.

Keywords: biomarker discovery; electronic health records; phenotyping; precision medicine.

Conflict of interest statement

Declaration of interests: No interests are declared.

Figures

Figure 1. Anatomy of the ideal patient electronic health record.
This ideal example of a patient chart contains routinely collected structured data, including billing-related data (gray); unstructured notes, oral history, and messaging data (blue); consumer health data and molecular omic data (green); and environmental and social determinants of health (SDOH) data (purple). Data type-specific methods, validation tests, and interpretations are needed to identify biomarkers effectively. Icons in patient chart customized using BioRender.com.
Figure 2. Biomarker discovery workflow.
Integration of existing patient data with other health variables can create a more comprehensive picture of patient health. Transforming structured and unstructured data using knowledge sources generates an interpretable clinical context. Lastly, implementation of machine learning and statistical models on large-scale, diverse patient data can enable biomarker discovery. Icons in patient chart customized using BioRender.com.
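
As a rough illustration of the workflow in Figure 2, the sketch below maps structured diagnosis codes to a phenotype and then tests whether a binary biomarker flag is associated with it. It is a minimal sketch under stated assumptions, not the authors' method: the phenotype mapping, the patient records, and every function name are hypothetical and invented for illustration, and a real study would use validated phenotype definitions and adjust for confounders.

```python
from math import exp, log, sqrt

# Hypothetical knowledge source: ICD-10 code prefixes that define a phenotype.
PHENOTYPE_PREFIXES = {"type_2_diabetes": ("E11",)}

# Toy patient records, invented purely for illustration.
PATIENTS = [
    {"id": 1, "icd10": ["E11.9", "I10"], "marker_high": True},
    {"id": 2, "icd10": ["E11.65"], "marker_high": True},
    {"id": 3, "icd10": ["E11.9"], "marker_high": False},
    {"id": 4, "icd10": ["I10"], "marker_high": False},
    {"id": 5, "icd10": ["J45.909"], "marker_high": False},
    {"id": 6, "icd10": ["I10"], "marker_high": True},
    {"id": 7, "icd10": ["K21.9"], "marker_high": False},
    {"id": 8, "icd10": ["E11.22"], "marker_high": True},
]


def has_phenotype(codes, phenotype):
    """Return True if any diagnosis code starts with a prefix of the phenotype."""
    prefixes = PHENOTYPE_PREFIXES[phenotype]
    return any(code.startswith(prefixes) for code in codes)


def contingency(patients, phenotype):
    """Count (case/high, case/low, control/high, control/low) patients."""
    a = b = c = d = 0
    for patient in patients:
        case = has_phenotype(patient["icd10"], phenotype)
        high = patient["marker_high"]
        if case and high:
            a += 1
        elif case:
            b += 1
        elif high:
            c += 1
        else:
            d += 1
    return a, b, c, d


def odds_ratio(a, b, c, d):
    """Odds ratio with an approximate 95% confidence interval.

    Adds 0.5 to every cell (Haldane-Anscombe correction), applied here to
    every table for simplicity so that zero cells do not cause division by zero.
    """
    a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    ratio = (a * d) / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(ratio) - 1.96 * se)
    upper = exp(log(ratio) + 1.96 * se)
    return ratio, lower, upper


if __name__ == "__main__":
    table = contingency(PATIENTS, "type_2_diabetes")
    ratio, lower, upper = odds_ratio(*table)
    print(f"table={table} OR={ratio:.2f} 95% CI=({lower:.2f}, {upper:.2f})")
```

With so few toy records the confidence interval is very wide, which is one reason the caption stresses large-scale, diverse patient data.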
