JCO Clin Cancer Inform. 2021 Mar;5:256-265.
doi: 10.1200/CCI.20.00094.

OSIRIS: A Minimum Data Set for Data Sharing and Interoperability in Oncology

Julien Guérin et al. JCO Clin Cancer Inform. 2021 Mar.

Abstract

Purpose: Many institutions throughout the world have launched precision medicine initiatives in oncology, and a large amount of clinical and genomic data is being produced. Although there have been attempts to share these data with the community, such initiatives remain limited. In this context, a French task force composed of Integrated Cancer Research Sites (SIRICs), comprehensive cancer centers from the Unicancer network (one of Europe's largest cancer research organizations), and university hospitals launched an initiative to improve and accelerate retrospective and prospective clinical and genomic data sharing in oncology.

Materials and methods: For 5 years, the OSIRIS group has worked on structuring data and identifying technical solutions for collecting and sharing them. The group used a multidisciplinary approach that included weekly scientific and technical meetings over several months to foster a national consensus on a minimal data set.

Results: The resulting OSIRIS set and event-based data model, which is able to capture the disease course, was built with 67 clinical and 65 omics items. The group made it compatible with the HL7 Fast Healthcare Interoperability Resources (FHIR) format to maximize interoperability. The OSIRIS set was reviewed, approved by a National Plan Strategic Committee, and freely released to the community. A proof-of-concept study was carried out to put the OSIRIS set and Common Data Model into practice using a cohort of 300 patients.

Conclusion: Using a national, bottom-up approach, the OSIRIS group has defined a model including a minimal set of clinical and genomic data that can be used to accelerate the sharing of data produced in oncology. The model relies on clear and formally defined terminologies and, as such, may also benefit the larger international community.

Figures

FIG 1.
The overall methodology used to deliver the first release of the OSIRIS set. Over several months, several national groups (the SIRIC multidisciplinary group and the scientific and technical boards) met weekly to release the first version of the OSIRIS set. SIRIC, Integrated Cancer Research Sites.
FIG 2.
OSIRIS clinical data model. This figure shows the OSIRIS event-based clinical data model, which follows the disease course longitudinally. For each event type (primary tumor, local relapse, or metastatic relapse), the response to a treatment and its adverse events are associated with the event. Moreover, any analysis carried out on a sample (imaging, omics, biology, pathologic examination) is also linked to a specific event.
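
The event-based structure described in this caption can be illustrated with a minimal Python sketch. This is not the published OSIRIS schema: every class name, field name, and value below is a hypothetical stand-in chosen for illustration, and only the overall shape (events carrying treatments, responses, adverse events, and sample analyses) follows the caption.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum
from typing import List, Optional


class EventType(Enum):
    # The three event types named in the caption.
    PRIMARY_TUMOR = "primary tumor"
    LOCAL_RELAPSE = "local relapse"
    METASTATIC_RELAPSE = "metastatic relapse"


@dataclass
class Treatment:
    # Hypothetical fields: a treatment carries its response and adverse events.
    name: str
    response: Optional[str] = None
    adverse_events: List[str] = field(default_factory=list)


@dataclass
class Analysis:
    # kind is one of: imaging, omics, biology, pathologic examination.
    kind: str
    sample_id: str


@dataclass
class TumorEvent:
    # Treatments and analyses hang off the event, not off the patient.
    event_type: EventType
    start: date
    treatments: List[Treatment] = field(default_factory=list)
    analyses: List[Analysis] = field(default_factory=list)


@dataclass
class Patient:
    patient_id: str
    events: List[TumorEvent] = field(default_factory=list)

    def disease_course(self) -> List[TumorEvent]:
        """Return the events in chronological order."""
        return sorted(self.events, key=lambda e: e.start)


# Made-up example data, entered out of chronological order on purpose.
patient = Patient("P001")
patient.events.append(TumorEvent(EventType.LOCAL_RELAPSE, date(2020, 6, 1)))
primary = TumorEvent(EventType.PRIMARY_TUMOR, date(2018, 3, 15))
primary.treatments.append(
    Treatment("chemotherapy", response="partial response",
              adverse_events=["neutropenia"])
)
primary.analyses.append(Analysis("omics", sample_id="S001"))
patient.events.append(primary)

for event in patient.disease_course():
    print(event.start, event.event_type.value)
```

Attaching treatments and analyses to an event rather than directly to the patient is what lets the disease course be reconstructed as an ordered sequence.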
FIG 3.
OSIRIS omics data model. Thanks to an object-oriented model, the omics concepts were designed to be scalable and modular. The model uses inheritance to store common (ie, AlterationOnSample concept) and specific attributes of various kinds of genomic alterations. Each genomic alteration is annotated for cancer diagnosis (ie, Annotation concept) along with the confidence level of the prediction (ie, validation concept).
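
As a rough illustration of the inheritance pattern described in this caption, the Python sketch below defines a base class for common attributes and subclasses for alteration-specific ones. Only the concept names AlterationOnSample, Annotation, and Validation come from the caption; the subclasses, all field names, and the example values are assumptions made for this sketch and do not reproduce the actual OSIRIS omics items.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Validation:
    # Confidence information attached to an annotation (hypothetical fields).
    method: str
    status: str


@dataclass
class Annotation:
    # Interpretation of an alteration, optionally with its validation.
    interpretation: str
    validation: Optional[Validation] = None


@dataclass
class AlterationOnSample:
    # Attributes shared by every kind of genomic alteration.
    sample_id: str
    gene: str
    annotations: List[Annotation] = field(default_factory=list)

    def describe(self) -> str:
        return f"{self.gene} alteration on sample {self.sample_id}"


@dataclass
class SmallVariant(AlterationOnSample):
    # Hypothetical subclass: adds variant-specific attributes.
    ref: str = ""
    alt: str = ""

    def describe(self) -> str:
        return f"{self.gene} {self.ref}>{self.alt} on sample {self.sample_id}"


@dataclass
class CopyNumberAlteration(AlterationOnSample):
    # Hypothetical subclass: adds a copy-number attribute.
    copy_number: int = 2

    def describe(self) -> str:
        return f"{self.gene} copy number {self.copy_number} on sample {self.sample_id}"


# Placeholder gene names and values, not real findings.
alterations = [
    SmallVariant("S001", "GENE_A", ref="A", alt="G"),
    CopyNumberAlteration("S001", "GENE_B", copy_number=8),
]
alterations[0].annotations.append(
    Annotation("example interpretation", Validation("orthogonal assay", "confirmed"))
)

for alteration in alterations:
    print(alteration.describe())
```

Adding a new alteration type under this design means adding one subclass, while code that only needs the shared attributes keeps working against the base class.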
FIG A1.
Description of the use of the OSIRIS structured flat files. We use the OSIRIS flat files as an entry point to standardize data from different data sources (ie, EHRs, eCRFs, data warehouses, and cancer registries). These pivot files are then used to facilitate interoperability with other standards. For instance, we used them to construct ETLs with I2B2 CDM instances and the FHIR API. API, application programming interface; CDM, Common Data Model; EHR, Electronic Health Record; ETL, extract, transform, and load; FHIR, Fast Healthcare Interoperability Resources.
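
The pivot-file idea in this caption can be sketched as a small transform from a flat file to FHIR-style JSON. The column names (`patient_id`, `gender`, `birth_date`) and the rows are invented for illustration and are not the OSIRIS file layout; the output uses only the standard FHIR `Patient` fields `id`, `gender`, and `birthDate`, wrapped in a `Bundle` of type `collection`. The mapping used by the OSIRIS group may well differ.

```python
import csv
import io
import json

# Hypothetical pivot file content; column names and rows are made up.
PIVOT_CSV = """patient_id,gender,birth_date
P001,female,1960-04-12
P002,male,1955-11-03
"""


def row_to_fhir_patient(row: dict) -> dict:
    """Map one flat-file row to a minimal FHIR-style Patient resource."""
    return {
        "resourceType": "Patient",
        "id": row["patient_id"],
        "gender": row["gender"],
        "birthDate": row["birth_date"],
    }


def pivot_to_bundle(csv_text: str) -> dict:
    """Read the flat file and wrap the resulting resources in a Bundle."""
    reader = csv.DictReader(io.StringIO(csv_text))
    entries = [{"resource": row_to_fhir_patient(row)} for row in reader]
    return {"resourceType": "Bundle", "type": "collection", "entry": entries}


bundle = pivot_to_bundle(PIVOT_CSV)
print(json.dumps(bundle, indent=2))
```

Because every source system is first normalized into the same flat layout, a transform like `row_to_fhir_patient` only has to be written once per target standard rather than once per source.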
