Gigascience. 2018 May 1;7(5):giy023.
doi: 10.1093/gigascience/giy023.

The Scientific Filesystem

Vanessa Sochat. Gigascience.

Abstract

Background: Here, we present the Scientific Filesystem (SCIF), an organizational format that exposes executables and metadata so that scientific applications are discoverable. The format includes a known filesystem structure, a definition for a set of environment variables describing it, and functions for generating the variables and interacting with the libraries, metadata, and executables located within. SCIF makes it easy to expose metadata, multiple environments, installation steps, files, and entry points, rendering scientific applications consistent, modular, and discoverable. A SCIF can be installed on a traditional host or in a container technology such as Docker or Singularity. We start by reviewing the background and rationale for SCIF, followed by an overview of the specification and the different levels of internal modules ("apps") that the organizational format affords. Finally, we demonstrate that SCIF is useful by implementing and discussing several use cases that improve user interaction with, and understanding of, scientific applications. SCIF is released along with a client and an integration in the Singularity 2.4 software for quickly installing and interacting with SCIF. When used inside a reproducible container, a SCIF is a recipe for reproducibility and introspection of the functions and users that it serves.
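To make the idea of a known filesystem structure described by environment variables concrete, here is a minimal sketch in Python. It is not the SCIF client or its API. The base path, the directory layout, the `EXAMPLE_` variable names, and the helper `app_environment` are all illustrative assumptions; the actual specification should be consulted for the real names.

```python
from pathlib import PurePosixPath


def app_environment(app_name, base="/scif"):
    """Derive illustrative environment variables for one app.

    Assumption: each app lives under <base>/apps/<name> and keeps its
    data under <base>/data/<name>. The layout and the EXAMPLE_ variable
    names are hypothetical, chosen only for this sketch.
    """
    root = PurePosixPath(base) / "apps" / app_name
    data = PurePosixPath(base) / "data" / app_name
    return {
        "EXAMPLE_APPNAME": app_name,
        "EXAMPLE_APPROOT": str(root),
        "EXAMPLE_APPBIN": str(root / "bin"),   # executables
        "EXAMPLE_APPLIB": str(root / "lib"),   # libraries
        "EXAMPLE_APPDATA": str(data),          # app-specific data
    }


if __name__ == "__main__":
    # Print the variables for a hypothetical app named "hello-world".
    for key, value in sorted(app_environment("hello-world").items()):
        print(f"{key}={value}")
```

Because every path is derived from the app name alone, a tool that knows only the name can locate the app's executables, libraries, and data without inspecting the host or container, which is the sense in which a predictable layout supports discoverability.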

Results: We use SCIF to evaluate container software, provide metrics, serve scientific workflows, and execute a primary function under different contexts. To encourage collaboration and sharing of applications, we developed tools along with an open source, version-controlled, tested, and programmatically accessible web infrastructure. SCIF and associated resources are available at https://sci-f.github.io. The ease of using SCIF, especially in the context of containers, offers promise for scientists' work to be self-documenting and programmatically parseable for maximum reproducibility. SCIF provides an abstraction over underlying programming languages and packaging logic for working with scientific applications, creating new opportunities for scientific software development.

Figures

Figure 1.
Assessing “read calls” across a range of different programming language implementations of “Hello World” shows a surprising range of differences and reflects common knowledge that more extensive programs (e.g., Octave) add complexity to the seemingly simple command.

