BMC Med Inform Decis Mak. 2018 Nov 21;18(1):107.

doi: 10.1186/s12911-018-0665-z.

Variant information systems for precision oncology

Johannes Starlinger¹ ², Steffen Pallarz³, Jurica Ševa³, Damian Rieke⁴ ⁵ ⁶, Christine Sers⁷, Ulrich Keilholz⁴, Ulf Leser³

Affiliations

¹ Department of Computer Science, Humboldt-Universität zu Berlin, Unter den Linden 6, Berlin, 10099, Germany. starlinger@informatik.hu-berlin.de.
² Department of Anesthesiology and Operative Intensive Care Medicine (CCM/CVK), Charité Universitätsmedizin Berlin, Charitéplatz 1, Berlin, 10117, Germany. starlinger@informatik.hu-berlin.de.
³ Department of Computer Science, Humboldt-Universität zu Berlin, Unter den Linden 6, Berlin, 10099, Germany.
⁴ Charité Comprehensive Cancer Center, Charité Universitätsmedizin Berlin, Charitéplatz 1, Berlin, 10117, Germany.
⁵ Department of Hematology and Medical Oncology, Campus Benjamin Franklin, Charité Universitätsmedizin Berlin, Hindenburgdamm 30, Berlin, 12203, Germany.
⁶ Berlin Institute of Health (BIH), Kapelle-Ufer 2, Berlin, 10117, Germany.
⁷ Institute of Pathology, Molecular Tumor Pathology, Charité Universitätsmedizin Berlin, Charitéplatz 1, Berlin, 10117, Germany.

PMID: 30463544
PMCID: PMC6249891
DOI: 10.1186/s12911-018-0665-z

Johannes Starlinger et al. BMC Med Inform Decis Mak. 2018.

Abstract

Background: The decreasing cost of obtaining high-quality calls of genomic variants and the increasing availability of clinically relevant data on such variants are important drivers for personalized oncology. To allow rational genome-based decisions in diagnosis and treatment, clinicians need intuitive access to up-to-date and comprehensive variant information, encompassing, for instance, prevalence in populations and diseases, functional impact at the molecular level, associations with druggable targets, or results from clinical trials. In practice, collecting such comprehensive information on genomic variants is difficult because the underlying data is dispersed over a multitude of distributed, heterogeneous, sometimes conflicting, and quickly evolving data sources. To work efficiently, clinicians require powerful Variant Information Systems (VIS) that automatically collect and aggregate the available evidence from such data sources without suppressing existing uncertainty.

Methods: We address the most important cornerstones of modeling a VIS. We draw on emerging community standards regarding the necessary breadth of variant information and the procedures for its clinical assessment, long-standing experience in implementing biomedical databases and information systems, our own clinical record of diagnosing and treating cancer patients based on molecular profiles, and an extensive literature review to derive a set of design principles, along which we develop a relational data model for variant-level data. In addition, we characterize a number of public variant data sources and describe a data integration pipeline to integrate their data into a VIS.

Results: We provide a number of contributions that are fundamental to the design and implementation of a comprehensive, operational VIS. In particular, we (a) present a relational data model to accurately reflect data extracted from public databases relevant for clinical variant interpretation, (b) introduce a fault-tolerant and performant integration pipeline for public variant data sources, and (c) offer recommendations regarding a number of intricate challenges encountered when integrating variant data for clinical interpretation.

Conclusion: The analysis of requirements for representing variant-level data in an operational data model, together with the implementation-ready relational data model presented here and the instructional description of methods to acquire comprehensive information to fill it, is an important step towards variant information systems for genomic medicine.

Keywords: Data model; Genomic variant data integration; Molecular cancer therapy; Variant information system.

Conflict of interest statement

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Figures

**Fig. 1**
A Variant Information System (VIS) integrates public data sources and makes their joint information available both to in-house systems for patient knowledge management and directly to domain expert users. (Clipart source: openclipart.org; public domain)

**Fig. 2**
The relational class model to represent minimum variant level data (MVLD) and possible extensions; colors correspond to Ritter et al. [7]: brown: somatic interpretive data; purple: allele interpretive data; blue: allele descriptive data; white: background data extending MVLD. Cardinalities of relationships indicated as follows: (A)1–n(B): one instance of (A) is associated with an arbitrary number of instances of (B); (A)0..1–n(B): no or one instance of (A) is associated with an arbitrary number of instances of (B)

**Fig. 3**
Overview of data integration: source databases are processed by extract/transform/load (ETL) scripts, which generate source-specific table spaces within the local database. From these, the relevant elements are semantically mapped to and loaded into the core data model
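
The two-stage integration described for Fig. 3, together with the 1–n cardinality notation of Fig. 2, can be illustrated with a minimal sketch. It is not the authors' schema or pipeline: the table names (`src_example_raw`, `variant`, `evidence`), the column names, the source label `example_source`, and the sample rows are all hypothetical, and SQLite stands in for whatever database a real VIS would use. The sketch first loads records unchanged into a source-specific staging table, then maps them onto a small core model in which one variant is linked to any number of evidence items, so that differing assertions are kept side by side rather than merged.

```python
import sqlite3

# Made-up placeholder rows standing in for one public source (not real data).
EXAMPLE_ROWS = [
    ("GENE_A", "p.X1Y", "assertion_1"),
    ("gene_a ", "p.X1Y", "assertion_2"),  # same variant, different spelling and assertion
    ("GENE_B", "p.Z2W", "assertion_1"),
]


def build_vis(conn, source_rows):
    """Stage raw source rows, then map them into a small core model."""
    cur = conn.cursor()

    # Stage 1: source-specific table space, holding rows as delivered.
    cur.execute(
        "CREATE TABLE src_example_raw (gene TEXT, hgvs TEXT, significance TEXT)"
    )
    cur.executemany("INSERT INTO src_example_raw VALUES (?, ?, ?)", source_rows)

    # Core model: (variant) 1-n (evidence).
    cur.execute(
        "CREATE TABLE variant ("
        " id INTEGER PRIMARY KEY,"
        " gene TEXT NOT NULL,"
        " hgvs TEXT NOT NULL,"
        " UNIQUE (gene, hgvs))"
    )
    cur.execute(
        "CREATE TABLE evidence ("
        " id INTEGER PRIMARY KEY,"
        " variant_id INTEGER NOT NULL REFERENCES variant(id),"
        " source TEXT NOT NULL,"
        " significance TEXT)"
    )

    # Stage 2a: normalize identifiers and create one core row per distinct variant.
    cur.execute(
        "INSERT OR IGNORE INTO variant (gene, hgvs)"
        " SELECT DISTINCT upper(trim(gene)), trim(hgvs) FROM src_example_raw"
    )

    # Stage 2b: attach every staged assertion to its variant; nothing is merged,
    # so conflicting assertions remain visible as separate evidence rows.
    cur.execute(
        "INSERT INTO evidence (variant_id, source, significance)"
        " SELECT v.id, 'example_source', r.significance"
        " FROM src_example_raw r"
        " JOIN variant v"
        "   ON v.gene = upper(trim(r.gene)) AND v.hgvs = trim(r.hgvs)"
    )
    conn.commit()
```

Calling `build_vis(sqlite3.connect(":memory:"), EXAMPLE_ROWS)` populates both stages in an in-memory database. Keeping the staging table separate from the core tables means a source can be re-imported or re-mapped without touching data from other sources, which is one plausible reading of why the pipeline uses source-specific table spaces.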

References

1. Landrum MJ, Lee JM, Benson M, Brown G, Chao C, Chitipiralla S, Gu B, Hart J, Hoffman D, Hoover J, et al. ClinVar: public archive of interpretations of clinically relevant variants. Nucleic Acids Res. 2016;44(D1):862–8. doi: 10.1093/nar/gkv1222.
2. Griffith M, Spies NC, Krysiak K, McMichael JF, Coffman AC, Danos AM, Ainscough BJ, Ramirez CA, Rieke DT, Kujan L, et al. CIViC is a community knowledgebase for expert crowdsourcing the clinical interpretation of variants in cancer. Nat Genet. 2017;49(2):170–4. doi: 10.1038/ng.3774.
3. Forbes SA, Beare D, Gunasekaran P, Leung K, Bindal N, Boutselakis H, Ding M, Bamford S, Cole C, Ward S, et al. COSMIC: exploring the world's knowledge of somatic mutations in human cancer. Nucleic Acids Res. 2015;43(D1):805–11. doi: 10.1093/nar/gku1075.
4. Knox C, Law V, Jewison T, Liu P, Ly S, Frolkis A, Pon A, Banco K, Mak C, Neveu V, et al. DrugBank 3.0: a comprehensive resource for 'omics' research on drugs. Nucleic Acids Res. 2010;39(suppl_1):1035–41.
5. Kanehisa M, Goto S. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res. 2000;28(1):27–30. doi: 10.1093/nar/28.1.27.
