PLoS Biol. 2019 Jun 20;17(6):e3000333.
doi: 10.1371/journal.pbio.3000333. eCollection 2019 Jun.

Challenges and recommendations to improve the installability and archival stability of omics computational tools

Serghei Mangul et al. PLoS Biol. 2019.

Abstract

Developing new software tools for analysis of large-scale biological data is a key component of advancing modern biomedical research. Scientific reproduction of published findings requires running computational tools on data generated by such studies, yet little attention is presently allocated to the installability and archival stability of computational software tools. Scientific journals require data and code sharing, but none currently require authors to guarantee the continuing functionality of newly published tools. We have estimated the archival stability of computational biology software tools by performing an empirical analysis of the internet presence for 36,702 omics software resources published from 2005 to 2017. We found that almost 28% of all resources are currently not accessible through uniform resource locators (URLs) published in the paper they first appeared in. Among the 98 software tools selected for our installability test, 51% were deemed "easy to install," and 28% of the tools failed to be installed at all because of problems in the implementation. Moreover, for papers introducing new software, we found that the number of citations significantly increased when authors provided an easy installation process. We propose for incorporation into journal policy several practical solutions for increasing the widespread installability and archival stability of published bioinformatics software.
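The kind of URL accessibility check described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the category names (`accessible`, `broken`, `timeout`), the helper names, and the 10-second timeout are all hypothetical choices made for this example.

```python
import socket
import urllib.error
import urllib.request
from collections import Counter
from typing import Dict, Iterable, Optional


def classify_status(status: Optional[int], timed_out: bool = False) -> str:
    """Map an HTTP outcome to a category (category names are assumptions)."""
    if timed_out:
        return "timeout"
    if status is not None and 200 <= status < 300:
        return "accessible"
    # Any other status code, or no response at all, counts as broken.
    return "broken"


def check_url(url: str, timeout: float = 10.0) -> str:
    """Request a URL once and classify the result.

    urlopen follows redirects on its own, so the status seen here
    is the status of the final destination.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return classify_status(response.status)
    except urllib.error.HTTPError as err:
        # The server answered, but with an error code such as 404.
        return classify_status(err.code)
    except urllib.error.URLError as err:
        # Connection-level failure; a time-out is wrapped in err.reason.
        is_timeout = isinstance(err.reason, (socket.timeout, TimeoutError))
        return classify_status(None, timed_out=is_timeout)
    except (socket.timeout, TimeoutError):
        # Time-out while reading the response.
        return classify_status(None, timed_out=True)
    except ValueError:
        # Malformed URL, for example a missing scheme.
        return "broken"


def share_by_category(categories: Iterable[str]) -> Dict[str, float]:
    """Return the fraction of links that fall into each category."""
    counts = Counter(categories)
    total = sum(counts.values())
    return {name: count / total for name, count in counts.items()}
```

Calling `share_by_category(check_url(u) for u in urls)` on a list of published URLs would give per-category proportions comparable in spirit to those reported above. A single request per link is a simplification; a real survey would likely retry and record redirects separately.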

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Archival stability of 36,702 published URLs across 10 systems and computational biology journals over the span of 13 years.
An asterisk (*) denotes categories that have a difference that is statistically significant. Error bars, where present, indicate SEM. (A) Archival stability status of all links evaluated from papers published between 2005 and 2017. Percentages of each category (y-axis) are reported over a 13-year span (x-axis). (B) A line graph comparing the overall numbers (y-axis) of functional (green circles) and nonfunctional (orange squares) links observed in papers published over time (x-axis). (C) A bar chart showing the mean Altmetric “attention score” (y-axis) for papers, separated by the status of the URL (x-axis) observed in that paper. (D) A bar chart showing the mean number of mentions of papers in social media (blog posts, Twitter feeds, etc.) according to Altmetric, divided by the age of the paper in years (y-axis). Papers are separated by the status of the URL (x-axis) found in the paper. (E) A bar chart illustrating the mean Altmetric readership count per year of papers (y-axis) containing URLs in each of the categories (x-axis). (F) The proportion of unreachable links (due to connection time-out or due to error) stored on web services designed to host source code (e.g., GitHub and SourceForge) and “Other” web services. (G) A line plot illustrating the proportion (y-axis) of the total links observed in each year (x-axis) that point to GitHub or SourceForge. (H) A bar chart illustrating the proportion of links hosted on GitHub or SourceForge (vertical axis) that are no longer functional (horizontal axis) compared with links hosted elsewhere. SEM, standard error of the mean; URL, uniform resource locator.
Fig 2. Installability of 98 randomly selected published software tools across 22 life-science journals over a span of 15 years.
Error bars, where present, indicate SEM. (A) Pie chart showing the percentage of tools with various levels of installability. (B) A pie chart showing the proportion of evaluated tools that required no deviation from the documented installation procedure. (C) Tools that require no manual intervention (pass automatic installation test) exhibit decreased installation time. (D) Tools installed exhibit increased citation per year compared with tools that were not installed (Kruskal–Wallis, p-value = 0.035). (E) Tools that are easy to install include a decreased portion of undocumented commands (Not Installed versus Easy Install: Mann–Whitney U test, p-value = 0.01, Easy Install versus Complex Install: Mann–Whitney U test, p-value = 8.3 × 10⁻⁸). (F) Tools available in well-maintained package managers such as Bioconda were always installable, whereas tools not shipped via package managers were prone to problems in 32% of the studied cases. SEM, standard error of the mean.
