2021 Dec 4;19(1):493.
doi: 10.1186/s12967-021-03147-z.

From biobank and data silos into a data commons: convergence to support translational medicine

Rebecca Asiimwe et al. J Transl Med.

Abstract

Background: To drive translational medicine, modern-day biobanks need to integrate with other sources of data (clinical, genomics) to support novel data-intensive research. Currently, vast amounts of research and clinical data remain in silos, held and managed by individual researchers operating under different standards and governance structures, a framework that impedes sharing and effective use of data. In this article, we describe the journey of British Columbia's Gynecological Cancer Research Program (OVCARE) in moving a traditional tumour biobank, an outcomes unit, and a collection of data silos into an integrated data commons that supports data standardization and resource sharing under collaborative governance. The aim is to give the gynecologic cancer research community in British Columbia access to tissue samples and associated clinical and molecular data from thousands of patients.

Results: Through several engagements with stakeholders from various institutions within our research community, we identified priorities and assessed the infrastructure needed to optimize and support data collection, storage and sharing under three main research domains: (1) biospecimen collections, (2) molecular and genomics data, and (3) clinical data. We further built a governance model and a resource portal to implement protocols and standard operating procedures for seamless collection, management and governance of interoperable data, making genomic and clinical data available to the broader research community.

Conclusions: Proper infrastructure for data collection, sharing and governance is a translational research imperative. We have consolidated our data holdings into a data commons, along with standard operating procedures, to meet the research and ethics requirements of the gynecologic cancer community in British Columbia. The developed infrastructure brings together diverse data, computing frameworks, and tools and applications for managing, analyzing, and sharing data. By providing access to the large datasets required for data-intensive science, our data commons bridges data access gaps and lowers barriers to precision medicine and to approaches for the diagnosis, treatment and prevention of gynecological cancers.

Keywords: Biobank-technologies; Biobanks; Biospecimens; Data commons; Data governance; Federated systems; Laboratory Information Management Systems (LIMS); Precision medicine.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Needs-to-biobank mapping and the number of requirements fulfilled by each LIMS. a Tiled plot mapping each biospecimen research need to the biobank solution meeting that need. Surveyed biobanks are plotted on the y-axis and research needs (desired biobank features) on the x-axis, grouped and colored by feature class. b Bar plot of the overall number of features provided by each LIMS. The LIMS solutions are plotted on the y-axis and the number of features provided on the x-axis
Fig. 2
OVCARE’s data commons infrastructure and software stack. The overall data commons infrastructure comprises five main components: (1) a clinical database (REDCap) that consolidates and manages clinical data collections from the BC Cancer Registry and the Cheryl Brown Gynecological Cancers Outcomes Unit, (2) a Laboratory Information Management System (OpenSpecimen) that stores and manages biospecimens collected from consented participants at different hospital sites (i.e. Vancouver General Hospital, the University of British Columbia Hospital, BC Cancer Vancouver, and now a few more centers in BC), (3) the cBioPortal, which supports the exploration, analysis and visualization of clinical attributes and molecular profiles from patient tumor samples, (4) the OVCARE Resource Portal (ORP), which governs data and resource sharing based on stipulated protocols, standard operating procedures and research ethics, and (5) the Research Community (this includes the OVCARE internal research and informatics team, and the broader research community that OVCARE serves). Each of the components identified to meet our research needs (REDCap, OpenSpecimen, cBioPortal, ORP) is separately hosted in our hospital’s computing environment and programmatically interlinked through API calls. The data from the different domains are interlinked using system-wide unique identifiers that link patients to their biospecimen collections and molecular/genomics data. To access the amassed clinical and biospecimen collections, authenticated researchers in the OVCARE research community send data and sample acquisition requests to the ORP, through which informatics staff fulfil those requests if all stipulated requirements, including ethics approval, are met. Upon successful data and sample acquisition, researchers conduct their respective studies, and the data generated from their research (raw or processed, and/or biospecimen derivatives) are returned to OVCARE, making them available for re-purposing/secondary use. Furthermore, molecular data returned to the data commons are linked back to the available and stored patient biospecimens. Together with clinical outcomes, these molecular profiles are further explored, analyzed and visualized using the cBioPortal
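The identifier-based linkage described in the Fig. 2 caption can be illustrated with a minimal sketch. The record layouts, field names (`patient_id`, `specimen_id`, `profile_id`) and sample values below are hypothetical; they are not taken from REDCap, OpenSpecimen or cBioPortal, and no real API is called. The sketch only shows how a shared patient identifier could let records from the three domains be joined, with molecular profiles resolved to patients through the specimen they were derived from.

```python
from dataclasses import dataclass, field


@dataclass
class PatientView:
    """Hypothetical consolidated view of one patient across the three domains."""
    patient_id: str
    clinical: dict = field(default_factory=dict)
    specimens: list = field(default_factory=list)
    molecular: list = field(default_factory=list)


def link_by_patient_id(clinical_rows, specimen_rows, molecular_rows):
    """Join clinical, biospecimen and molecular records on shared identifiers."""
    views = {}

    def view_for(pid):
        # Create the consolidated record the first time a patient is seen.
        if pid not in views:
            views[pid] = PatientView(patient_id=pid)
        return views[pid]

    for row in clinical_rows:
        view_for(row["patient_id"]).clinical.update(row)
    for row in specimen_rows:
        view_for(row["patient_id"]).specimens.append(row["specimen_id"])

    # Molecular profiles point at a specimen; map them back to the patient.
    specimen_to_patient = {r["specimen_id"]: r["patient_id"] for r in specimen_rows}
    for row in molecular_rows:
        pid = specimen_to_patient.get(row["specimen_id"])
        if pid is not None:  # skip profiles whose specimen is unknown
            view_for(pid).molecular.append(row["profile_id"])
    return views


# Illustrative (made-up) records from each domain.
clinical = [{"patient_id": "P001", "diagnosis": "ovarian"}]
specimens = [
    {"patient_id": "P001", "specimen_id": "S-10"},
    {"patient_id": "P002", "specimen_id": "S-11"},
]
molecular = [
    {"specimen_id": "S-10", "profile_id": "M-7"},
    {"specimen_id": "S-99", "profile_id": "M-8"},  # no matching specimen
]
linked = link_by_patient_id(clinical, specimens, molecular)
```

In this sketch a molecular profile with no matching specimen is skipped rather than attached to a guessed patient, which is one possible design choice; a production system might instead flag such records for review.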
Fig. 3
Implementation timeline of OVCARE’s data commons
Fig. 4
Clinical and outcome data on all gynecological cancer patients diagnosed in British Columbia. In the tiled plot, data elements (demographic, medical history, pathology, chemotherapy, radiation, surgery and quality of life data) are plotted on the y-axis against gynecological cancer patients (patient 1 to n) on the x-axis. Darker tiles indicate availability of data on a patient per data element. Clinical studies (study 1 to n) are interested in certain patients with available data on specific data elements. Subsets of patients overlap between clinical studies
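The selection logic implied by the Fig. 4 caption (studies drawing on patients who have data for specific elements, with overlapping subsets) can be sketched as follows. The availability table, element names and study definitions are invented for illustration and do not reflect actual OVCARE data.

```python
def eligible_patients(availability, required_elements):
    """Return the set of patients that have data for every required element."""
    required = set(required_elements)
    return {pid for pid, elements in availability.items() if required <= elements}


# Hypothetical availability table: patient -> data elements with data on file.
availability = {
    "patient_1": {"demographic", "pathology", "surgery"},
    "patient_2": {"demographic", "pathology", "chemotherapy"},
    "patient_3": {"demographic", "quality_of_life"},
}

# Two hypothetical studies, each needing a different combination of elements.
study_1 = eligible_patients(availability, ["demographic", "pathology"])
study_2 = eligible_patients(availability, ["pathology", "chemotherapy"])

# Patients who qualify for both studies.
overlap = study_1 & study_2
```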
