BMC Bioinformatics. 2024 Mar 12;25(1):110.
doi: 10.1186/s12859-024-05695-9.

CREDO: a friendly Customizable, REproducible, DOcker file generator for bioinformatics applications

Simone Alessandri et al. BMC Bioinformatics.

Abstract

Background: The analysis of large and complex biological datasets in bioinformatics poses a significant challenge to achieving reproducible research outcomes due to inconsistencies and the lack of standardization in the analysis process. These issues can lead to discrepancies in results, undermining the credibility and impact of bioinformatics research and creating mistrust in the scientific process. To address these challenges, open science practices such as sharing data, code, and methods have been encouraged.

Results: CREDO, a Customizable, REproducible, DOcker file generator for bioinformatics applications, has been developed to mitigate reproducibility issues by building and distributing Docker containers with embedded bioinformatics tools. CREDO simplifies the generation of Docker images, facilitating reproducible and efficient research in bioinformatics. The crucial step in generating a Docker image is creating the Dockerfile, which requires incorporating heterogeneous packages and environments such as Bioconductor and Conda. To enhance Docker image reproducibility, CREDO stores all required package information and dependencies in a GitHub-compatible format, allowing images to be rebuilt easily from scratch. Its user-friendly GUI and its ability to generate modular Docker images make CREDO well suited to life scientists who need to create Docker images efficiently. Overall, CREDO is a valuable tool for addressing reproducibility issues in bioinformatics research and promoting open science practices.

Keywords: Bioinformatics; Docker; Open science; Reproducibility; Software sharing.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1
CREDO workflow. CREDO embraces the FAIR principles: Findable, Accessible, Interoperable, and Reusable. These principles serve as a framework for sharing data and resources in a way that maximizes their usability and impact. (1) User Customization: users customize the contents of the Docker image according to their specific requirements. (2) Docker Image Assembly: a dummy Docker container is downloaded, all files necessary for offline building are archived, and an installation script is generated. (3) Offline Building: a new Docker image is created from the instructions recorded during the Docker Image Assembly step and the downloaded files. This step ensures reproducibility and independence from internet connectivity, allowing users to build Docker images offline with all the necessary dependencies and configurations.
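The split between the online assembly step and the offline build step can be illustrated with a minimal Python sketch. It is not CREDO's implementation: the function names, the config layout, the archive path, and the example package names are all assumptions made for illustration. The only external commands it relies on are standard ones (`pip download --dest`, `pip install --no-index --find-links`, and `R CMD INSTALL`), and the sketch only generates the command text without running anything.

```python
import shlex


def download_commands(config: dict, archive_dir: str) -> list[str]:
    """Online assembly step (hypothetical): commands that fetch Python
    packages into a local archive directory for later offline use."""
    dest = shlex.quote(archive_dir)
    return [
        f"pip download --dest {dest} {shlex.quote(spec)}"
        for spec in config.get("python", [])
    ]


def offline_install_script(config: dict, archive_dir: str) -> str:
    """Offline build step (hypothetical): a shell script that installs
    only from the local archive, never from the network."""
    src = shlex.quote(archive_dir)
    lines = ["#!/bin/sh", "set -e"]
    for spec in config.get("python", []):
        # --no-index stops pip from contacting a package index;
        # --find-links points it at the archived files instead.
        lines.append(
            f"pip install --no-index --find-links {src} {shlex.quote(spec)}"
        )
    for tarball in config.get("r_archives", []):
        # R source packages are installed from the archived tarballs.
        lines.append(f"R CMD INSTALL {shlex.quote(archive_dir + '/' + tarball)}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example user customization; the package names are placeholders.
    config = {
        "python": ["numpy==1.26.4"],
        "r_archives": ["examplepkg_1.0.tar.gz"],
    }
    print("\n".join(download_commands(config, "/opt/archive")))
    print(offline_install_script(config, "/opt/archive"))
```

Keeping the download commands separate from the install script mirrors the caption's point: everything that needs the network happens once during assembly, so the later build depends only on the archived files.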
Fig. 2
CREDOengine's structured flow across layers. Each layer builds upon the previous one, sequentially enhancing the Docker object. Layer 0 provides the basic environment, requiring either Python or R modules. Layer 1 adds combined Python and R support. Layer 2 allows the implementation of a graphical interface. Layer 3 implements the ability to run Docker in Docker. Layer 4 provides the infrastructure for installing additional software beyond Python and R. This sequential flow ensures a coherent build-up of features, allowing users to develop a Dockerfile progressively tailored to their needs.
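The layered build-up described in this caption can be sketched in Python as follows. This is a hedged illustration, not CREDOengine's code: the `LAYERS` table, the `render_dockerfile` function, the `ubuntu:22.04` base image, and the apt packages chosen for each layer are all assumptions. The sketch also assumes the strictest reading of the caption, in which selecting a layer pulls in every layer below it.

```python
# Hypothetical layer table: (description, Dockerfile instructions).
# The base image and package names are placeholders, not CREDO's choices.
LAYERS = [
    ("basic environment (Python here; R would be the alternative)",
     ["FROM ubuntu:22.04",
      "RUN apt-get update && apt-get install -y python3 python3-pip"]),
    ("combined Python and R support",
     ["RUN apt-get install -y r-base"]),
    ("graphical interface",
     ["RUN apt-get install -y x11-apps"]),
    ("Docker in Docker",
     ["RUN apt-get install -y docker.io"]),
    ("additional software beyond Python and R",
     ["RUN apt-get install -y samtools"]),
]


def render_dockerfile(top_layer: int) -> str:
    """Return Dockerfile text containing layers 0..top_layer, in order."""
    if not 0 <= top_layer < len(LAYERS):
        raise ValueError(f"top_layer must be between 0 and {len(LAYERS) - 1}")
    lines = []
    for index, (description, instructions) in enumerate(LAYERS[: top_layer + 1]):
        # A comment line marks where each layer starts.
        lines.append(f"# Layer {index}: {description}")
        lines.extend(instructions)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Build up to the graphical-interface layer (layers 0, 1, and 2).
    print(render_dockerfile(2))
```

Because each layer only appends instructions after those of the layers beneath it, a Dockerfile rendered for layer 2 is a prefix of the one rendered for layer 4, which is one simple way to obtain the progressive tailoring the caption describes.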
Fig. 3
Examples of config files.
Fig. 4
Screenshot of the CREDOgui. In CREDOgui, the dependencies among the layers are more stringent than in CREDOengine: each layer depends on the previous one.
