Review
Brief Bioinform. 2014 Jan;15(1):65-78.
doi: 10.1093/bib/bbs064. Epub 2012 Oct 9.

Data management strategies for multinational large-scale systems biology projects

Wasco Wruck et al. Brief Bioinform. 2014 Jan.

Abstract

Good accessibility of publicly funded research data is essential to securing an open scientific system and is eventually becoming mandatory [Wellcome Trust will Penalise Scientists Who Don't Embrace Open Access. The Guardian 2012]. With the use of high-throughput methods in many research areas, from physics to systems biology, large data collections are increasingly important as raw material for research. Here, we present strategies worked out by international and national institutions that target open access to publicly funded research data via incentives or obligations to share data. Funding organizations such as the British Wellcome Trust have therefore developed data sharing policies and request a commitment to data management and sharing in grant applications. Increased citation rates are a strong argument for sharing publication data. Pre-publication sharing might be rewarded by a data citation credit system via digital object identifiers (DOIs), which were initially in use for data objects. Besides policies and incentives, good practice in data management is indispensable. However, appropriate systems for data management of large-scale projects, for example in systems biology, are hard to find. We also give an overview of a selection of open-source data management systems that have been employed successfully in large-scale projects.

Keywords: data citation; data management; data sharing; open access; systems biology.


Figures

Figure 1:
SysMO-DB system chart: SysMO-DB was developed for a multinational large-scale project consisting of multiple ‘sub’-projects, each with its own data management solution, which were left unchanged. For that purpose the Just Enough Results Model (JERM) was introduced, which aims to find the minimal information needed to make data comparable across project borders. JERM templates support compliance with MIBBI. Data from multiple projects are brought together via upload to the assets catalogue, which can be performed automatically using so-called JERM ‘harvesters’. The yellow pages component provides details about projects, participating people and institutions to enable exchange of expertise and association of assets with people. Many external resources are connected to SysMO-DB; for example, integration of JWS-Online provides systems biology modelling facilities for project data.
Figure 2:
DIPSBC system chart. Data are first converted to the Solr XML format (‘normalized’) and afterwards indexed by the Solr search server. Data sets can then be found efficiently and are passed to document-type-specific objects, which initiate the processes corresponding to the respective data type (here MIAME data and PubMed data are shown). New data types can be introduced straightforwardly by deriving new objects from existing ones, e.g. the XML object.
Figure 3:
Benchmarking: data management systems for large-scale systems biology projects.
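
The flow described in the Figure 2 caption (normalize to Solr XML, then hand each document to a type-specific object) can be illustrated with a minimal Python sketch. It is not the DIPSBC implementation: the class names, the `doc_type` field, and the `process` method are hypothetical and chosen only for illustration. The only external convention relied on is Solr's standard XML update message shape (`<add><doc><field name="...">`); indexing and searching on a Solr server are left out.

```python
import xml.etree.ElementTree as ET


def to_solr_xml(record):
    """Normalize a flat dict into a Solr XML <add><doc> update message."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in record.items():
        # Multi-valued fields become repeated <field> elements.
        values = value if isinstance(value, (list, tuple)) else [value]
        for v in values:
            field = ET.SubElement(doc, "field", name=name)
            field.text = str(v)
    return ET.tostring(add, encoding="unicode")


class XmlDocument:
    """Hypothetical base object for any normalized document."""
    doc_type = "xml"

    def __init__(self, fields):
        self.fields = fields

    def process(self):
        return f"generic view of {self.fields['id']}"


class MiameDocument(XmlDocument):
    """Hypothetical object for MIAME (microarray experiment) records."""
    doc_type = "miame"

    def process(self):
        return f"microarray experiment view of {self.fields['id']}"


class PubMedDocument(XmlDocument):
    """Hypothetical object for PubMed literature records."""
    doc_type = "pubmed"

    def process(self):
        return f"literature view of {self.fields['id']}"


# Registry mapping a document type to the object that handles it.
HANDLERS = {cls.doc_type: cls for cls in (XmlDocument, MiameDocument, PubMedDocument)}


def dispatch(fields):
    """Pass a retrieved document to its type-specific object (default: XML)."""
    cls = HANDLERS.get(fields.get("doc_type"), XmlDocument)
    return cls(fields).process()
```

Under these assumptions, supporting a new data type amounts to subclassing `XmlDocument`, setting a new `doc_type`, and registering the class in `HANDLERS`, which mirrors the caption's point about deriving new objects from existing ones.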

References

    1. Committee on Issues in the Transborder Flow of Scientific Data, National Research Council. Bits of Power: Issues in Global Access to Scientific Data. Washington, D.C.: The National Academies Press; 1997.
    2. Organization for Economic Co-operation and Development (OECD). Principles and Guidelines for Access to Research Data from Public Funding. OECD Publications; 2007.
    3. Wellcome Trust will Penalise Scientists Who Don’t Embrace Open Access. The Guardian 2012. http://www.guardian.co.uk/science/2012/jun/28/wellcome-trust-scientists-... (24 September 2012, date last accessed)
    4. Lyon L. Dealing with data: roles, rights, responsibilities and relationships. Consultancy Report. 2007:54.
    5. Gattiker A, Hermida L, Liechti R, et al. MIMAS 3.0 is a multiomics information management and annotation system. BMC Bioinformatics. 2009;10:151.