PLoS One. 2017 Apr 7;12(4):e0172187.
doi: 10.1371/journal.pone.0172187. eCollection 2017.

Combining clinical and genomics queries using i2b2 - Three methods

Shawn N Murphy et al. PLoS One. 2017.

Abstract

We are fortunate to be living in an era of twin biomedical data surges: a burgeoning representation of human phenotypes in the medical records of our healthcare systems, and high-throughput sequencing making rapid technological advances. The difficulty representing genomic data and its annotations has almost by itself led to the recognition of a biomedical "Big Data" challenge, and the complexity of healthcare data only compounds the problem to the point that coherent representation of both systems on the same platform seems insuperably difficult. We investigated the capability for complex, integrative genomic and clinical queries to be supported in the Informatics for Integrating Biology and the Bedside (i2b2) translational software package. Three different data integration approaches were developed: The first is based on Sequence Ontology, the second is based on the tranSMART engine, and the third on CouchDB. These novel methods for representing and querying complex genomic and clinical data on the i2b2 platform are available today for advancing precision medicine.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Fig 1. Overview of the three different approaches.
1) Using i2b2 by adding patient facts whose concepts are coded per the Genome Sequence Ontology; 2) using i2b2/tranSMART by adding patient facts represented by a unique ontology that allows greater variant exploration; 3) using i2b2 to generate a patient set from the phenotypes contained in the i2b2 star schema database, then using an alternate NoSQL-NGS variant store to complete the genomic part of the query.
Fig 2. Classical i2b2 user interface for use case 1.
Which individuals with a lower mode of HLA-DQB1 protein levels (i.e., HLA-DQB1 log protein ratio < 0) have missense or nonsense mutations in that gene? The available ontologies are displayed on the left side and the phenotypic and genotypic concepts used to build the query are shown on the right.
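The logic of use case 1 can be sketched as an intersection of a phenotypic filter and a genomic filter. This is only an illustration on in-memory records, not the i2b2 query engine: the record classes, field names, and sample patients are hypothetical, and mapping "nonsense" to the Sequence Ontology term name `stop_gained` is an assumption made for this example.

```python
from dataclasses import dataclass


# Hypothetical record types; i2b2 itself stores such facts in its star schema.
@dataclass(frozen=True)
class ProteinFact:
    patient_id: str
    gene: str
    log_protein_ratio: float


@dataclass(frozen=True)
class VariantFact:
    patient_id: str
    gene: str
    consequence: str  # a Sequence Ontology style term name (assumed)


# Assumption: "missense or nonsense" is expressed with these two term names.
TARGET_CONSEQUENCES = {"missense_variant", "stop_gained"}


def use_case_1(proteins, variants, gene="HLA-DQB1"):
    """Return patients with log protein ratio < 0 AND a missense/nonsense variant in `gene`."""
    low_protein = {
        p.patient_id
        for p in proteins
        if p.gene == gene and p.log_protein_ratio < 0
    }
    mutated = {
        v.patient_id
        for v in variants
        if v.gene == gene and v.consequence in TARGET_CONSEQUENCES
    }
    return sorted(low_protein & mutated)


# Made-up sample data for illustration only.
proteins = [
    ProteinFact("P1", "HLA-DQB1", -0.8),
    ProteinFact("P2", "HLA-DQB1", 0.4),
    ProteinFact("P3", "HLA-DQB1", -0.2),
]
variants = [
    VariantFact("P1", "HLA-DQB1", "missense_variant"),
    VariantFact("P2", "HLA-DQB1", "stop_gained"),
    VariantFact("P3", "HLA-DQB1", "synonymous_variant"),
]

print(use_case_1(proteins, variants))  # ['P1']
```

Only P1 satisfies both conditions: P2 has a qualifying variant but a positive ratio, and P3 has a negative ratio but only a synonymous variant.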
Fig 3. Use case 1 in the i2b2/tranSMART interface: which individuals with a lower mode of HLA-DQB1 protein levels (i.e., HLA-DQB1 log protein ratio < 0) have missense or nonsense mutations in that gene?
Panel A: Designing the query using phenotypic and genomic variables. Panel B: Results of the query.
Fig 4. Display of counts per population in two subgroups in i2b2/tranSMART (use case 3).
Fig 5. System components and their inter-relationships.
The data annotation/upload process requires the user to provide one or more VCF files, which are functionally annotated with ANNOVAR and used to create one JSON document for each variant belonging to a single patient; these JSON documents are stored in CouchDB to be queried by the BigQ-NGS Cell. On the client side, the BigQ-NGS Plugin lets the user create a genetic query with drag-and-drop interactions within the i2b2 Webclient; the plugin then communicates with the cell to run the query and collect the results shown to the user.
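The "one JSON document per variant per patient" idea might look roughly like the sketch below. It assumes the annotated variants have already been parsed into dictionaries; the function name, field names, identifier scheme, and coordinates are all hypothetical placeholders, and no upload step is shown. The only CouchDB-specific detail used is `_id`, the field CouchDB uses as the document identifier.

```python
import json


def variant_documents(patient_id, annotated_rows):
    """Build one JSON-serializable document per variant for a single patient.

    `annotated_rows` is assumed to be a list of dicts produced by some
    earlier parsing of annotated VCF output (not shown here).
    """
    docs = []
    for row in annotated_rows:
        doc = {
            # Hypothetical id scheme: patient plus variant coordinates.
            "_id": f"{patient_id}:{row['chrom']}:{row['pos']}:{row['ref']}>{row['alt']}",
            "patient_id": patient_id,
            **row,
        }
        docs.append(doc)
    return docs


# Placeholder rows: positions and alleles are invented for illustration.
rows = [
    {"chrom": "6", "pos": 1000, "ref": "A", "alt": "G",
     "gene": "HLA-DQB1", "effect": "missense"},
    {"chrom": "6", "pos": 2000, "ref": "C", "alt": "T",
     "gene": "HLA-DQB1", "effect": "synonymous"},
]

docs = variant_documents("P1", rows)
print(json.dumps(docs[0], indent=2, sort_keys=True))
```

Keeping each variant in its own document means a query on a single attribute (for example `gene` or `effect`) can be answered without unpacking a per-patient blob, which fits the attribute-by-attribute querying described for the plugin.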
Fig 6. Screenshot of BigQ-NGS Plugin with user interactions highlighted.
(1) The user creates a query by dragging and dropping different blocks inside the plugin’s workspace. Each block represents a query on a single attribute that will be performed by the NoSQL-NGS Cell. After the blocks are connected to each other, the query is defined. (2) A patient set, previously created with a standard i2b2 query, is dragged and dropped on the Patient Result Set Drop (PRS Drop) block to define the patients whose exomes will be queried. (3) By double-clicking the standard query blocks (in yellow), it is possible to specify their query logic and query parameters. (4) Afterwards, the query process can start, and each block executes its query sequentially, calling the NoSQL-NGS Cell. (5) When all blocks have performed their query, the user can visualize the results by double-clicking the Patient Result Set Table (PRS Table) block.
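The flow in steps (1) to (5) can be approximated as a chain of single-attribute filters applied in order to the variants of a previously selected patient set. This is a simplified sketch, not the plugin's or the cell's actual code: the function names, the in-memory `store`, and the sample records are hypothetical stand-ins for the remote calls the caption describes.

```python
from typing import Callable, Dict, List, Set

Variant = Dict[str, object]
Block = Callable[[List[Variant]], List[Variant]]


def attribute_block(attribute: str, predicate: Callable[[object], bool]) -> Block:
    """A query block that filters on exactly one attribute."""
    def run(variants: List[Variant]) -> List[Variant]:
        return [v for v in variants if attribute in v and predicate(v[attribute])]
    return run


def run_query(patient_set: Set[str], store: List[Variant],
              blocks: List[Block]) -> Dict[str, List[Variant]]:
    # Restrict to the patient set produced by a standard i2b2 query.
    current = [v for v in store if v["patient_id"] in patient_set]
    # Each block executes in sequence on the previous block's output.
    for block in blocks:
        current = block(current)
    # Group surviving variants by patient, as a result table would.
    table: Dict[str, List[Variant]] = {}
    for v in current:
        table.setdefault(str(v["patient_id"]), []).append(v)
    return table


# Invented sample data standing in for the variant store.
store = [
    {"patient_id": "P1", "gene": "HLA-DQB1", "effect": "missense"},
    {"patient_id": "P1", "gene": "GENE_X", "effect": "missense"},
    {"patient_id": "P2", "gene": "HLA-DQB1", "effect": "synonymous"},
    {"patient_id": "P3", "gene": "HLA-DQB1", "effect": "nonsense"},
]
blocks = [
    attribute_block("gene", lambda g: g == "HLA-DQB1"),
    attribute_block("effect", lambda e: e in {"missense", "nonsense"}),
]

result = run_query({"P1", "P2"}, store, blocks)
print(sorted(result))  # ['P1']
```

P3 carries a matching variant but is excluded because it is not in the patient set, which mirrors how the phenotypic selection happens first and the genomic query only narrows it.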
