BMC Med Inform Decis Mak. 2016 Sep 15;16(1):121.
doi: 10.1186/s12911-016-0358-4.

An information model for computable cancer phenotypes

Harry Hochheiser et al. BMC Med Inform Decis Mak. 2016.

Abstract

Background: Standards, methods, and tools supporting the integration of clinical data and genomic information are an area of significant need and rapid growth in biomedical informatics. Integrating cancer clinical data with cancer genomic information poses unique challenges because of the high volume and complexity of clinical data and because cancer genome data are more heterogeneous and less stable than germline data. Current information models of clinical and genomic data are not sufficiently expressive to represent individual observations and to aggregate those observations into longitudinal summaries over the course of cancer care. Such models are acutely needed to support the development of systems and tools for generating the so-called clinical "deep phenotype" of individual cancer patients, a process that remains almost entirely manual in cancer research and precision medicine.

Methods: Reviews of existing ontologies and interviews with cancer researchers were used to inform iterative development of a cancer phenotype information model. We translated a subset of the Fast Healthcare Interoperability Resources (FHIR) models into the OWL 2 Description Logic (DL) representation, and added extensions as needed for modeling cancer phenotypes with terms derived from the NCI Thesaurus. Models were validated with domain experts and evaluated against competency questions.

Results: The DeepPhe information model represents cancer phenotype data at increasing levels of abstraction, from individual mentions in clinical documents to summaries of key events and findings. We describe the model using breast cancer as an example, depicting methods to represent phenotypic features of cancers, tumors, treatment regimens, and specific biologic behaviors that span the entire course of a patient's disease.

Conclusions: We present a multi-scale information model for representing individual document mentions, document-level classifications, episodes along a disease course, and phenotype summarization, linking individual observations to high-level summaries in support of subsequent integration and analysis.
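
The four scales named in the conclusions (mentions, documents, episodes, and patient-level summaries) can be sketched as a small set of Python data classes. This is only an illustrative sketch under stated assumptions, not the DeepPhe model itself: the class names, the field names, and the idea that each mention already carries an episode label are all hypothetical simplifications introduced here.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Mention:
    """Level 1: one observation extracted from a clinical document."""
    document_id: str
    episode: str   # hypothetical label, e.g. "diagnosis" or "treatment"
    concept: str   # hypothetical free-text concept name


@dataclass
class Composition:
    """Level 2: all mentions found in a single document."""
    document_id: str
    mentions: List[Mention] = field(default_factory=list)


@dataclass
class Episode:
    """Level 3: events grouped into one episode of care."""
    name: str
    events: List[Mention] = field(default_factory=list)


@dataclass
class PatientSummary:
    """Level 4: distinct concepts per episode for one patient."""
    concepts_by_episode: Dict[str, List[str]]


def build_compositions(mentions: List[Mention]) -> List[Composition]:
    # Group mentions by the document they came from.
    by_doc: Dict[str, Composition] = {}
    for m in mentions:
        by_doc.setdefault(m.document_id, Composition(m.document_id)).mentions.append(m)
    return list(by_doc.values())


def build_episodes(compositions: List[Composition]) -> List[Episode]:
    # Regroup document-level mentions into episodes of care.
    by_name: Dict[str, Episode] = {}
    for c in compositions:
        for m in c.mentions:
            by_name.setdefault(m.episode, Episode(m.episode)).events.append(m)
    return list(by_name.values())


def summarize(episodes: List[Episode]) -> PatientSummary:
    # Collapse each episode to its sorted set of distinct concepts.
    return PatientSummary(
        {e.name: sorted({m.concept for m in e.events}) for e in episodes}
    )
```

Each level keeps references to the lower-level objects it was built from, which mirrors the abstract's point that high-level summaries stay linked to individual observations.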

Keywords: Cancer; Deep phenotyping; Information extraction; Information model.

Figures

Fig. 1. A schematic representation of the workflow used by the authors to generate the FHIR cancer models.
Fig. 2. Classes used in cancer phenotype representations. Individual mentions extracted by NLP (Level 1) are instantiated as FHIR objects, which are collected in Compositions corresponding to individual documents (Level 2). These FHIR objects become events that are aggregated into distinct Episodes of care (Level 3) and eventually analyzed to form patient- and phenotype-level summaries (Level 4).
Fig. 3. Example patient records and their representation as compositions.
Fig. 4. Summarization of records from Fig. 3 into Episodes and Patient/Phenotype Summary.
Fig. 5. An example abstraction rule and its expression in SWRL. Summarization rules convert assertions extracted from individual documents into higher-level summaries. (1) A subset of the upper levels of the information model showing key concepts in the representation of both instance and summary models. (2) A mapping of those concepts to levels in the information model. (3) A subset of the elements used in a Patient/Phenotype-level summary. (4) A graphical example of a rule taking instances (5) and transforming them into a summary representation (6). This rule indicates that the value of a FISH test takes precedence over the results of an IHC test. The rule is given in English (7), SWRL (8), and Drools (9).
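
The precedence rule described in the Fig. 5 caption (a FISH result overrides an IHC result) can be illustrated with a short Python sketch. This is not the authors' SWRL or Drools rule; the `TestResult` class, the `METHOD_PRIORITY` table, and the `summarize_test_value` function are hypothetical names introduced here, and the sketch assumes each result carries only a method name and a value.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class TestResult:
    """A document-level test result (hypothetical, simplified)."""
    method: str  # "FISH" or "IHC"
    value: str   # e.g. "positive", "negative", "equivocal"


# Assumed precedence table: a lower number wins, so FISH overrides IHC.
METHOD_PRIORITY = {"FISH": 0, "IHC": 1}


def summarize_test_value(results: Sequence[TestResult]) -> Optional[str]:
    """Return the summary value, preferring FISH over IHC.

    Results from methods outside the precedence table are ignored.
    Returns None when no usable result is present.
    """
    ranked = [r for r in results if r.method in METHOD_PRIORITY]
    if not ranked:
        return None
    # min() keeps the first result among ties, so input order breaks ties.
    return min(ranked, key=lambda r: METHOD_PRIORITY[r.method]).value
```

For example, given an IHC result of "equivocal" and a FISH result of "positive", the function returns the FISH value; with only an IHC result available, the IHC value is used as the summary.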
