Extension of research data repository system to support direct compute access to biomedical datasets: enhancing Dataverse to support large datasets
- PMID: 27862010
- PMCID: PMC5546227
- DOI: 10.1111/nyas.13272
Abstract
Access to experimental X-ray diffraction image data is important for validation and reproduction of macromolecular models and indispensable for the development of structural biology processing methods. In response to the evolving needs of the structural biology community, we recently established a diffraction data publication system, the Structural Biology Data Grid (SBDG, data.sbgrid.org), to preserve primary experimental datasets supporting scientific publications. All datasets published through the SBDG are freely available to the research community under a public domain dedication license, with metadata compliant with the DataCite Schema (schema.datacite.org). A proof-of-concept study demonstrated community interest and utility. Publication of large datasets is a challenge shared by several fields, and the SBDG has begun collaborating with the Institute for Quantitative Social Science at Harvard University to extend the Dataverse (dataverse.org) open-source data repository system to structural biology datasets. Several extensions are necessary to support the size and metadata requirements of structural biology datasets. In this paper, we describe one such extension: functionality that preserves file system structure within Dataverse, which is essential both for in-place computation and for non-HTTP data transfers.
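As a rough illustration of what preserving file system structure can mean in practice, the sketch below records each file of a dataset directory under its path relative to the dataset root, together with a SHA-256 checksum. It is not the Dataverse or SBDG implementation; the function names `sha256_of` and `build_manifest` and the manifest layout are hypothetical, chosen only for this example.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks so that
    large files (e.g. diffraction images) are never loaded whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's POSIX-style path, relative to `root`, to its checksum.

    Hypothetical helper: keeping relative paths (rather than flattening
    file names) is what lets the directory layout be rebuilt elsewhere.
    """
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            manifest[path.relative_to(root).as_posix()] = sha256_of(path)
    return manifest
```

Comparing the manifest of a transferred copy with the manifest of the original is one simple way to check that both the directory layout and the file contents survived a transfer, whatever protocol carried it.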
Keywords: Data Access Alliance; Dataverse; RDMS; SBGrid; X-ray diffraction; research data management system.
© 2016 New York Academy of Sciences.
Conflict of interest statement
The authors declare no conflicts of interest.