Understanding and improving the quality and reproducibility of Jupyter notebooks
- PMID: 33994841
- PMCID: PMC8106381
- DOI: 10.1007/s10664-021-09961-9
Abstract
Jupyter Notebooks have been widely adopted by many different communities, both in science and industry. They support the creation of literate programming documents that combine code, text, and execution results with visualizations and other rich media. The self-documenting aspects and the ability to reproduce results have been touted as significant benefits of notebooks. At the same time, there has been growing criticism that the way in which notebooks are used leads to unexpected behavior, encourages poor coding practices, and makes it hard to reproduce their results. To better understand the good and bad practices used in the development of real notebooks, in prior work we studied 1.4 million notebooks from GitHub. We presented a detailed analysis of the characteristics that impact reproducibility, proposed best practices that can improve it, and discussed open challenges that require further research and development. In this paper, we extend that analysis in four ways to validate the hypotheses uncovered in our original study. First, we separated a group of popular notebooks to check whether notebooks that receive more attention exhibit higher quality and reproducibility. Second, we sampled notebooks from the full dataset for an in-depth qualitative analysis of what constitutes the dataset and which features the notebooks have. Third, we conducted a more detailed analysis, isolating library dependencies and testing different execution orders, and we report how these factors impact reproducibility rates. Finally, we mined association rules from the notebooks and discuss the patterns we discovered, which provide additional insights into notebook reproducibility. Based on our findings and the best practices we proposed, we designed Julynter, a JupyterLab extension that identifies potential issues in notebooks and suggests modifications that improve their reproducibility. We evaluated Julynter in a remote user experiment aimed at assessing its recommendations and usability.
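The association rule mining mentioned in the abstract can be sketched in miniature. The following is an illustrative example only, not the paper's actual pipeline: it treats each notebook as a "transaction" of observed features (the feature names here are hypothetical) and enumerates simple one-antecedent rules by support and confidence, the classic Apriori-style metrics.

```python
from itertools import combinations

# Toy "transactions": each set lists features observed in one notebook.
# Feature names are hypothetical, chosen only to illustrate the idea.
notebooks = [
    {"has_requirements", "executed_in_order", "reproducible"},
    {"has_requirements", "executed_in_order", "reproducible"},
    {"executed_in_order", "reproducible"},
    {"out_of_order", "not_reproducible"},
    {"has_requirements", "reproducible"},
]

def support(itemset, transactions):
    """Fraction of transactions containing every item in the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def rules(transactions, min_support=0.4, min_confidence=0.8):
    """Enumerate one-antecedent association rules A -> B that meet
    the support and confidence thresholds."""
    items = set().union(*transactions)
    found = []
    for a, b in combinations(sorted(items), 2):
        for ante, cons in ((a, b), (b, a)):
            s = support({ante, cons}, transactions)
            base = support({ante}, transactions)
            if base and s >= min_support and s / base >= min_confidence:
                found.append((ante, cons, s, s / base))
    return found

for ante, cons, s, conf in rules(notebooks):
    print(f"{ante} -> {cons} (support={s:.2f}, confidence={conf:.2f})")
```

On this toy data the enumeration surfaces rules such as `executed_in_order -> reproducible`; the study mined rules of this general form (at much larger scale, with real notebook features) to find patterns associated with reproducibility.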
Keywords: GitHub; Jupyter notebook; Lint; Quality; Reproducibility.
© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2021.
Similar articles
- Visualising data science workflows to support third-party notebook comprehension: an empirical study. Empir Softw Eng. 2023;28(3):58. doi: 10.1007/s10664-023-10289-9. Epub 2023 Mar 23. PMID: 36968214. Free PMC article.
- Computational reproducibility of Jupyter notebooks from biomedical publications. Gigascience. 2024 Jan 2;13:giad113. doi: 10.1093/gigascience/giad113. PMID: 38206590. Free PMC article.
- Reproducible Bioconductor workflows using browser-based interactive notebooks and containers. J Am Med Inform Assoc. 2018 Jan 1;25(1):4-12. doi: 10.1093/jamia/ocx120. PMID: 29092073. Free PMC article.
- Toward collaborative open data science in metabolomics using Jupyter Notebooks and cloud computing. Metabolomics. 2019 Sep 14;15(10):125. doi: 10.1007/s11306-019-1588-0. PMID: 31522294. Free PMC article. Review.
- Publishing computational research - a review of infrastructures for reproducible and transparent scholarly communication. Res Integr Peer Rev. 2020 Jul 14;5:10. doi: 10.1186/s41073-020-00095-y. eCollection 2020. PMID: 32685199. Free PMC article. Review.
Cited by
- My friend MIROSLAV: A hackable open-source hardware and software platform for high-throughput monitoring of rodent activity in the home cage. Behav Res Methods. 2025 Jun 13;57(7):198. doi: 10.3758/s13428-025-02719-x. PMID: 40514586.
- Visualising data science workflows to support third-party notebook comprehension: an empirical study. Empir Softw Eng. 2023;28(3):58. doi: 10.1007/s10664-023-10289-9. Epub 2023 Mar 23. PMID: 36968214. Free PMC article.
- Computational reproducibility of Jupyter notebooks from biomedical publications. Gigascience. 2024 Jan 2;13:giad113. doi: 10.1093/gigascience/giad113. PMID: 38206590. Free PMC article.
- BioVisReport: A Markdown-based lightweight website builder for reproducible and interactive visualization of results from peer-reviewed publications. Comput Struct Biotechnol J. 2022 Jun 8;20:3133-3139. doi: 10.1016/j.csbj.2022.06.009. eCollection 2022. PMID: 35782729. Free PMC article.