Neuroinformatics. 2022 Jan;20(1):139-153.
doi: 10.1007/s12021-021-09516-9. Epub 2021 May 18.

The C-BIG Repository: an Institution-Level Open Science Platform

Samir Das et al. Neuroinformatics. 2022 Jan.

Abstract

In January 2016, the Montreal Neurological Institute-Hospital (The Neuro) declared itself an Open Science organization. This vision extends beyond efforts by individual scientists to release individual datasets or software tools, or to build platforms for the free dissemination of such information. It involves multiple stakeholders and an infrastructure that considers governance, ethics, computational resourcing, physical design, workflows, training, education, and intra-institutional reporting structures. The C-BIG repository was built in response as The Neuro's institutional biospecimen and clinical data repository; it collects biospecimens as well as clinical, imaging, and genetic data from patients with neurological disease and from healthy controls. It is aimed at helping scientific investigators, in both academia and industry, advance our understanding of neurological diseases and accelerate the development of treatments. Because many neurological diseases are quite rare, their small patient populations present several challenges to researchers. Overcoming these challenges requires aggregating datasets from various projects and locations. The C-BIG repository achieves this goal and stands as a scalable working model for institutions to collect, track, curate, archive, and disseminate multimodal data from patients. In November 2020, a Registered Access layer was made available to the wider research community at https://cbigr-open.loris.ca, and in May 2021 fully open data will be released to complement the Registered Access data. This article outlines many aspects of The Neuro's transition to Open Science by describing the data to be released, C-BIG's full capabilities, and the design aspects that were implemented for effective data sharing.

Keywords: Biobank; Database; Genetic; Interoperability; Open Science; Registered access.

Figures

Fig. 1
Architectural diagram of the various components used in C-BIG to illustrate the workflows involved in acquiring, curating, processing, and disseminating data: Software: The LORIS platform undergoes continual development with regular releases to improve the functionality, security and interface of the C-BIG repository. Data Acquisition: C-BIG acquires a number of data modalities from consenting patients whose unique IDs are tracked via the SPI patient registry based on patient preference and their risk tolerance. Internal Database: Data are standardized and housed in the C-BIG Internal Database, where metadata/data can be viewed and manipulated by lab technicians and researchers across multiple projects via a web-based repository organizing the metadata in modules specific to their needs. Anonymization: Datasets are then anonymized to reduce the risk of re-identification. High Performance Computing can be leveraged using any HPC system depending on the user preferences. Public Access: Datasets are made available via the public layer where user access is regulated by tiers determined by the nature of the dataset. The data can be queried in a granular manner by researchers wanting to do specific processing and analysis, or seeking summary statistics, documentation or quality control results
Fig. 2
Data flow and interoperability chart between the various subsystems
Fig. 3
Biobank data entry form for biospecimens
Fig. 4
Biobank specimen page displays specimen metadata, processing stages & life cycle
Fig. 5
Biobank container page with graphic display of the container dimensions and contents
Fig. 6
The number of biospecimens collected over time in the C-BIG since 2008
Fig. 7
(LEFT) C-BIG currently houses data from 1720 patients (931 males/789 females) in 88 disease groups across 20 projects and 19 sites, 49% with a clinical diagnosis of Parkinson’s disease. (RIGHT) Over 32,500 biospecimen samples have been collected and archived in C-BIG. The storage infrastructure includes 367 Matrix Boxes, and 12 Freezers and Cryogenic Tanks
