Review
Neuroinformatics. 2022 Jan;20(1):203-219.
doi: 10.1007/s12021-021-09533-8. Epub 2021 Aug 4.

Promoting FAIR Data Through Community-driven Agile Design: the Open Data Commons for Spinal Cord Injury (odc-sci.org)

Abel Torres-Espín et al. Neuroinformatics. 2022 Jan.

Abstract

The past decade has seen accelerating movement from data protectionism in publishing toward open data sharing to improve reproducibility and translation of biomedical research. Developing data sharing infrastructures to meet these new demands remains a challenge. One model for data sharing involves simply attaching data, irrespective of its type, to publisher websites or general use repositories. However, some argue this creates a 'data dump' that does not promote the goals of making data Findable, Accessible, Interoperable and Reusable (FAIR). Specialized data sharing communities offer an alternative model where data are curated by domain experts to make them both open and FAIR. We report on our experiences developing one such data-sharing ecosystem focusing on 'long-tail' preclinical data, the Open Data Commons for Spinal Cord Injury (odc-sci.org). ODC-SCI was developed with community-based agile design requirements directly pulled from a series of workshops with multiple stakeholders (researchers, consumers, non-profit funders, governmental agencies, journals, and industry members). ODC-SCI focuses on heterogeneous tabular data collected by preclinical researchers including bio-behaviour, histopathology findings and molecular endpoints. This has led to an example of a specialized neurocommons that is well-embraced by the community it aims to serve. In the present paper, we provide a review of the community-based design template and describe the adoption by the community including a high-level review of current data assets, publicly released datasets, and web analytics. Although odc-sci.org is in its late beta stage of development, it represents a successful example of a specialized data commons that may serve as a model for other fields.

Keywords: Data sharing; FAIR; community repository; data reuse; neurotrauma; spinal cord injury.

Conflict of interest statement

J.-B.P. was partially funded by National Institutes of Health (NIH) NIH-NIBIB P41 EB019936 (ReproNim), NIH-NIMH R01 MH083320 and NIH RF1 MH120021 (NIDM), NIMH Award Number R01MH096906 (Neurosynth), as well as the Canada First Research Excellence Fund, awarded to McGill University for the Healthy Brains for Healthy Lives initiative.

Figures

Fig. 1
Staged development. We have divided the process by which the ODC-SCI and the SCI data-sharing community have come together into 4 stages (A). The first three stages laid the foundations for ODC-SCI, and stage 4, which has recently started, will bring ODC-SCI to maturity. During these stages, engagement with the SCI data-sharing community and the development of tools have occurred in parallel, in both cases using agile design principles (B). These consist of performing a requirement analysis (e.g., asking the community what data need to be shared), followed by a period of design and development of tools and policies, and a period of feedback (testing) by the users and the community. When the implementation satisfies the requirements, the new functionalities can be incorporated into the ODC-SCI.
Fig. 2
ODC-SCI data spaces and movement of data. Data on the ODC-SCI can live in different data spaces that differ in their level of privacy. The Personal space is the most private space, where users (registered users who are part of a Lab) can upload data, share their uploaded data with their Lab (after PI/Lab manager approval), and explore and access data present in their own Personal space. Datasets in the Lab space can be explored and accessed by all users who are members of the same Lab. In addition, PIs and Lab managers can release the data to the ODC-SCI Community space or request a DOI for publication. Datasets released to the ODC-SCI Community space can be explored and accessed by any registered user who has a Community member account (either general members or members of a laboratory). From the Community space, datasets can also be published by requesting a DOI. This tiered system is hierarchical: a dataset that is released to the Community space, for instance, is still present in, and still belongs to, the original Lab space and its uploaders.
Fig. 3
ODC-SCI account types and functions. Access to different functions on the site is determined by the account type. Visitors to the platform with no account can only explore the metadata for published datasets; they cannot view or download the data. Registered users who are not part of the ODC-SCI Community can explore and download published datasets. Registered users who become part of the ODC-SCI Community can explore and download published datasets and will also get private peer sharing (feature still under development). To gain access to the full suite of functions, users have to be part of a Lab in the ODC-SCI.
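The tiered access described in the captions of Fig. 2 and Fig. 3 can be pictured as nested permission sets. The following is only an illustrative sketch: the account and action names are hypothetical labels chosen for this example, not identifiers from the ODC-SCI platform, and the Lab tier lists just two of the functions mentioned in the Fig. 2 caption rather than the full suite.

```python
from enum import Enum


class Account(Enum):
    # Hypothetical labels for the four account situations in the caption.
    VISITOR = "visitor"          # no account
    REGISTERED = "registered"    # account, not a Community member
    COMMUNITY = "community"      # Community member account
    LAB = "lab"                  # Community member who is part of a Lab


# Each tier inherits everything from the tier before it (assumed nesting).
_visitor = {"explore_published_metadata"}
_registered = _visitor | {"explore_published_data", "download_published_data"}
_community = _registered | {"explore_community_data"}
_lab = _community | {"upload_data", "explore_lab_data"}

PERMISSIONS = {
    Account.VISITOR: _visitor,
    Account.REGISTERED: _registered,
    Account.COMMUNITY: _community,
    Account.LAB: _lab,
}


def can(account, action):
    """Return True if the given account type is allowed to perform the action."""
    return action in PERMISSIONS[account]
```

For example, `can(Account.VISITOR, "download_published_data")` is `False`, while the same call for `Account.REGISTERED` is `True`.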
Fig. 4
Machine- vs. human-readable tabular formats. How data are formatted in spreadsheets can affect their readability. As humans, we benefit from visual cues such as blank spaces or colors, and from complex data organizations that divide data into chunks (e.g., groups of subjects) (A). Although this formatting can be self-explanatory for humans, its complexity and the lack of a consistent structure across researchers make it challenging to generate standards that machines can use to process and understand data. The readability of a spreadsheet by a machine can be dramatically improved with simple rules (Broman & Woo, 2018) that organize the data in a structured manner (B to D). In ODC-SCI, data can be uploaded as a spreadsheet-type file (.csv) in which columns are variables (also known as fields), the first row contains the variable names or headers, and each subsequent row is a unique record, meaning that no two rows in the dataset are identical. Completely empty rows and columns are not allowed. The ODC-SCI database is organized around the subject identification number, which must therefore always be present in the dataset. This formatting can vary depending on the hierarchical relationships between variables (such as in the case of repeated measures over time). For example, when the same variables are collected at different timepoints, a time column can be specified and each subject can be repeated across rows, with the records for each timepoint in a different row; this is known as semi-long format (C). Alternatively, a new column can be created for every combination of variable and timepoint, known as wide format (D), in which case each subject is present in only one row. When possible, ODC-SCI recommends using semi-long formats.
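The formatting rules in the Fig. 4 caption can be sketched as a small checker plus a wide-to-semi-long conversion. This is a minimal illustration using only the Python standard library, not the ODC-SCI upload validator. The column name `subject_id` and the `_day` suffix convention for repeated measures (e.g. `score_day1`) are assumptions made for this example.

```python
import csv
import io


def validate_table(csv_text, subject_column="subject_id"):
    """Check CSV text against the caption's rules; return a list of problems."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ["file is empty"]
    header, records = rows[0], rows[1:]
    problems = []
    # The subject identifier must always be present (column name is assumed).
    if subject_column not in header:
        problems.append(f"missing subject column '{subject_column}'")
    seen = set()
    for line_no, record in enumerate(records, start=2):
        if all(cell.strip() == "" for cell in record):
            problems.append(f"row {line_no} is completely empty")
            continue
        # Every record must be unique.
        if tuple(record) in seen:
            problems.append(f"row {line_no} duplicates an earlier row")
        seen.add(tuple(record))
    # Completely empty columns are not allowed.
    for col, name in enumerate(header):
        if records and all(col >= len(r) or r[col].strip() == "" for r in records):
            problems.append(f"column '{name}' is completely empty")
    return problems


def wide_to_semi_long(records, time_marker="_day"):
    """Turn wide rows (dicts) into one row per subject and timepoint.

    Columns named like 'score_day7' are treated as variable 'score' measured
    at time '7'; all other columns are copied onto every output row.
    """
    long_rows = []
    for record in records:
        static = {k: v for k, v in record.items() if time_marker not in k}
        by_time = {}
        for name, value in record.items():
            if time_marker in name:
                variable, time = name.rsplit(time_marker, 1)
                by_time.setdefault(time, {})[variable] = value
        for time, measures in by_time.items():
            long_rows.append({**static, "time": time, **measures})
    return long_rows
```

With a wide row such as `{"subject_id": "r1", "group": "sham", "score_day1": "3", "score_day7": "9"}`, `wide_to_semi_long` yields two rows for subject `r1`, one with `time` equal to `"1"` and one with `time` equal to `"7"`. A suffix-based naming convention is fragile if an unrelated column name happens to contain the marker, so a real pipeline would likely use an explicit mapping of columns to timepoints.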
Fig. 5
ODC-SCI activity. We tracked when users registered on the site (A), when datasets were uploaded (B), the number of uploaded datasets per Lab (C), and the status or data space in which datasets are placed (D). A total of 234 datasets have been uploaded. An estimated 38% of the uploaded datasets are placeholder datasets created to explore the functionality of the portal, including datasets uploaded during development, test sets from outreach activities, and datasets by users who included “test” or “practice” in the description. Most of those datasets have subsequently been deleted, and only active datasets are shown in D. Note that although we had 11 requests for DOI at the time of writing, 2 datasets are still in preparation for upload and are therefore not reflected in D.
Fig. 6
Visitor traffic to odc-sci.org. Using Google Analytics traffic monitoring data, we identified new and returning visitors over time (A-B), as well as the time spent per session in minutes (C-D) and the number of pages viewed per visitor/session (E-F). A, C and E show the raw daily metrics, while B, D and F show the 3-week moving average over the same period. Some of the important outreach activities are annotated on the graphs: the SFN 2018 STREET-FAIR workshop, the SCI2020 meeting hosted by NINDS, the press release of the new multi-agency grant launch, the SFN 2019 ODC-SCI stand as part of NIF, and the IOSCIRS online workshop.
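A 3-week moving average of a daily metric, as used for the smoothed panels in Fig. 6, can be computed as sketched below. The caption does not say whether the window is trailing or centered, or how the first days are handled. This sketch assumes a trailing 21-day window and averages over the days available until the window fills.

```python
def moving_average(daily_values, window=21):
    """Trailing moving average over `window` days (21 days = 3 weeks).

    Early positions, where fewer than `window` days exist, are averaged
    over the days available so far (an assumption of this sketch).
    """
    averages = []
    running_sum = 0.0
    for i, value in enumerate(daily_values):
        running_sum += value
        if i >= window:
            # Drop the day that just fell out of the window.
            running_sum -= daily_values[i - window]
        averages.append(running_sum / min(i + 1, window))
    return averages
```

Called on a list of daily visitor counts, the function returns a list of the same length, so raw and smoothed series can be plotted against the same dates.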
Fig. 7
Geographical origin of internet traffic to odc-sci.org. New and returning visitors who have viewed odc-sci.org since we started monitoring traffic.
