Proc IEEE Int Conf Escience. 2017 Oct:2017:79-88.
doi: 10.1109/eScience.2017.20. Epub 2017 Nov 16.

Experiences with Deriva: An Asset Management Platform for Accelerating eScience

Alejandro Bugacov et al. Proc IEEE Int Conf Escience. 2017 Oct.

Abstract

The pace of discovery in eScience is increasingly dependent on a scientist's ability to acquire, curate, integrate, analyze, and share large and diverse collections of data. It is all too common for investigators to spend inordinate amounts of time developing ad hoc procedures to manage their data. In previous work, we presented Deriva, a Scientific Asset Management System designed to accelerate data-driven discovery. In this paper, we report on the use of Deriva in a number of substantial and diverse eScience applications. We describe the lessons we have learned, both about the Deriva technology itself and about the ability and willingness of scientists to incorporate Scientific Asset Management into their daily workflows.

Figures

Fig. 1. Deriva architecture consisting of metadata catalog (ERMrest), object storage (Hatrac), web applications (Chaise), ingest/export and automation agents (IObox), and policy enforcement and authentication.
Fig. 2. IObox enabled workflow depiction. Boxes indicate key operations performed by IObox while interactions and data flow are indicated by arrows.
Fig. 3. FaceBase ERM. Metadata are organized broadly as investigation, biosample, bioassay, and asset entities with relationships indicated by arrows.
Fig. 4. FaceBase Data Curation Pipeline. Shaded boxes indicate Spoke responsibilities versus clear boxes for the Hub’s activities.
Fig. 5. Select elements of GPCR catalog model. From top to bottom, four tiers of entities and relationships have been added in phases: core protein concepts; core assets including alignment and expression data; experiment metadata; and experiment assets capturing experimental results.
Fig. 6. Use of RecordEdit application to create GPCR experiments.
Fig. 7. GPCR condition-action processing pipelines. Observable data states are depicted as labeled conditions, while processing actions are implied as arrows transitioning from one state to the next: A) a new construct is aligned using a third-party service, GPCRDB [16]; B) an aggregate alignment is maintained for each target, tracking its most recent construct alignments; C) a multi-sample FCS source file is processed in bulk, generating idempotent checkpoints for D) a single-sample FCS file.
Fig. 8. Dynamically generated display of GPCR Target including metadata, activity tracking graph, and alignment.
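
The condition-action pattern described in the Fig. 7 caption can be illustrated with a small sketch. It is not Deriva or IObox code: the rows are plain in-memory dictionaries standing in for catalog entries, and the names `Rule`, `run_once`, `needs_alignment`, and `align` are hypothetical. The only idea taken from the caption is that an observable data state (the condition) triggers a processing step (the action), and that repeating a pass does no extra work once the state has advanced.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical in-memory stand-in for a catalog row; not the ERMrest API.
Row = Dict[str, object]


@dataclass
class Rule:
    name: str
    condition: Callable[[Row], bool]  # observable data state that triggers work
    action: Callable[[Row], None]     # step that moves the row to its next state


def run_once(rows: List[Row], rules: List[Rule]) -> List[str]:
    """Apply each rule to every row whose condition holds; return a log of what fired."""
    fired = []
    for rule in rules:
        for row in rows:
            if rule.condition(row):
                rule.action(row)
                fired.append(f"{rule.name}:{row['id']}")
    return fired


def needs_alignment(row: Row) -> bool:
    # Condition: a construct has a sequence but no alignment yet.
    return row.get("sequence") is not None and row.get("alignment") is None


def align(row: Row) -> None:
    # Action: placeholder for a call to an external alignment service (assumed).
    row["alignment"] = f"aligned({row['sequence']})"


rules = [Rule("align_construct", needs_alignment, align)]
rows = [
    {"id": "c1", "sequence": "MKT", "alignment": None},
    {"id": "c2", "sequence": "MAL", "alignment": "aligned(MAL)"},
]

first = run_once(rows, rules)   # only c1 satisfies the condition
second = run_once(rows, rules)  # nothing left to do, so the pass is a no-op
```

Because the action changes the very state that the condition tests, a second pass finds no matching rows. That is the sense in which such a step can be repeated safely in this sketch.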

References

    1. Borgman CL. The conundrum of sharing research data. Journal of the American Society for Information Science and Technology. 2012;63(6):1059–1078.
    2. Kandel S, et al. Enterprise data analysis and visualization: An interview study. IEEE Transactions on Visualization and Computer Graphics. 2012;18(12):2917–2926.
    3. Begley CG. Six red flags for suspect work. Nature. 2013 May;497(7450):433–4.
    4. Schuler R, Kesselman C, Czajkowski K. Accelerating data-driven discovery with scientific asset management. IEEE 12th International Conference on eScience; IEEE; 2016.
    5. Goble C, De Roure D, Bechhofer S. Accelerating scientists knowledge turns. International Joint Conference on Knowledge Discovery, Knowledge Engineering, and Knowledge Management; Springer; 2011. pp. 3–25.
