J Am Med Inform Assoc. 2018 Mar 1;25(3):289-294.
doi: 10.1093/jamia/ocx110.

High-fidelity phenotyping: richness and freedom from bias

George Hripcsak et al. J Am Med Inform Assoc. 2018.

Abstract

Electronic health record phenotyping is the use of raw electronic health record data to assert characterizations about patients. Researchers have been doing it since the beginning of biomedical informatics, under different names. Phenotyping will benefit from an increasing focus on fidelity, both in the sense of increasing richness, such as measured levels, degree or severity, timing, probability, or conceptual relationships, and in the sense of reducing bias. Research agendas should shift from merely improving binary assignment to studying and improving richer representations. The field is actively researching new temporal directions and abstract representations, including deep learning. The field would benefit from research in nonlinear dynamics, in combining mechanistic models with empirical data, including data assimilation, and in topology. The health care process produces substantial bias, and studying that bias explicitly rather than treating it as merely another source of noise would facilitate addressing it.

Figures

Figure 1.
Data assimilation to find latent phenotypes. Data assimilation on a mechanistic glucose model produces estimates for a set of physiologic parameters, including plasma insulin degradation, shown here. Starting with the same initial value but based on 5 different patients’ data, data assimilation evolves the parameter to a different value for each patient (P1–P5). This represents a latent phenotype.
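
The idea in Figure 1 can be illustrated with a toy example. The sketch below is not the mechanistic glucose model from the article: it assumes a hypothetical one-parameter exponential decay model, and it uses a scalar Kalman filter as a stand-in for the data assimilation method. All function names, the simulated "patients", and the noise settings are assumptions made for illustration. It shows only the behavior the caption describes: each patient's parameter estimate starts at the same initial value and evolves toward a different, patient-specific value.

```python
import math
import random


def simulate_decay(k_true, n_steps, noise_sd, rng, x0=100.0, dt=1.0):
    """Simulate a noisy exponential decay (hypothetical stand-in for a patient's data)."""
    xs = [x0]
    for _ in range(n_steps):
        # Each step decays at rate k_true, perturbed by log-normal noise.
        xs.append(xs[-1] * math.exp(-k_true * dt + noise_sd * rng.gauss(0.0, 1.0)))
    return xs


def assimilate_rate(xs, k_init=1.0, p_init=1.0, obs_var=0.01, dt=1.0):
    """Track a constant decay rate with a scalar Kalman filter.

    Returns the sequence of estimates, beginning with the shared initial value.
    """
    k_est, p = k_init, p_init
    trajectory = [k_est]
    for prev, cur in zip(xs, xs[1:]):
        z = -math.log(cur / prev) / dt   # noisy observation of the rate
        gain = p / (p + obs_var)         # Kalman gain
        k_est += gain * (z - k_est)      # move the estimate toward the observation
        p *= 1.0 - gain                  # shrink the estimate's variance
        trajectory.append(k_est)
    return trajectory


# Five simulated "patients" with different true rates (assumed values).
rng = random.Random(0)
true_rates = {"P1": 0.2, "P2": 0.5, "P3": 0.8, "P4": 1.1, "P5": 1.4}
trajectories = {
    name: assimilate_rate(simulate_decay(k, 50, 0.1, rng))
    for name, k in true_rates.items()
}
```

Every trajectory in `trajectories` starts at the same `k_init`, and its final entry is that patient's estimated parameter, which plays the role of the latent phenotype in this toy setting.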
Figure 2.
Topology. (A) Based on an underlying, unknown space that is shown in blue, a sample of black points is drawn. (B) We attempt to recreate the underlying space by creating green neighborhoods of radius epsilon around each point and joining touching neighborhoods. (C) As epsilon grows, we see features of the underlying space recreated, such as 2 distinct groups where one of them is a ring. (D) As epsilon grows further, all the points become joined. Algebraic topology supplies the tools needed to infer properties of the underlying space based on properties of the neighborhoods as epsilon varies over its range.
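
The neighborhood-joining step in Figure 2 can be sketched in a few lines. The example below is a minimal illustration, not the authors' method: it treats two neighborhoods of radius epsilon as touching when their centers are at most 2 × epsilon apart, and it counts the resulting connected groups with a union-find structure. The function name and the sample points (a ring of 12 points plus a small separate cluster) are assumptions. It tracks only how many groups exist at each epsilon; detecting that one group is a ring requires higher-dimensional tools such as persistent homology, which this sketch does not implement.

```python
import math


def count_components(points, epsilon):
    """Count groups formed by joining points whose epsilon-neighborhoods touch."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the group's root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            # Two balls of radius epsilon touch when centers are <= 2 * epsilon apart.
            if math.dist(points[i], points[j]) <= 2 * epsilon:
                parent[find(i)] = find(j)

    return len({find(i) for i in range(len(points))})


# Hypothetical sample: 12 points on a unit circle and a 4-point cluster near (4, 0).
ring = [(math.cos(2 * math.pi * k / 12), math.sin(2 * math.pi * k / 12)) for k in range(12)]
cluster = [(4.0, 0.0), (4.2, 0.0), (4.0, 0.2), (4.2, 0.2)]
sample = ring + cluster

for eps in (0.05, 0.3, 1.6):
    print(eps, count_components(sample, eps))
```

With this sample, a small epsilon leaves every point isolated, an intermediate epsilon yields two groups (the ring and the cluster), and a large epsilon joins everything, mirroring panels B through D.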
