Review
Mol Biol Evol. 2018 May 1;35(5):1037-1046.
doi: 10.1093/molbev/msy014.

The State of Software for Evolutionary Biology

Diego Darriba et al. Mol Biol Evol.

Abstract

With Next Generation Sequencing data being routinely used, evolutionary biology is transforming into a computational science. Thus, researchers have to rely on a growing number of increasingly complex software tools. All widely used core tools in the field have grown considerably, in terms of the number of features as well as lines of code and, consequently, also with respect to software complexity. A topic that has received little attention is the software engineering quality of widely used core analysis tools. Software developers appear to rarely assess the quality of their code, and this can have negative consequences for end-users. To this end, we assessed the code quality of 16 highly cited and compute-intensive tools mainly written in C/C++ (e.g., MrBayes, MAFFT, SweepFinder) and Java (BEAST) from the broader area of evolutionary biology that are being routinely used in current data analysis pipelines. Because the software engineering quality of the tools we analyzed is rather unsatisfactory, we provide a list of best practices for improving the quality of existing tools and list techniques that can be deployed for developing reliable, high-quality scientific software from scratch. Finally, we also discuss journal as well as science policy and, more importantly, funding issues that need to be addressed for improving software engineering quality as well as ensuring support for developing new and maintaining existing software. Our intention is to raise the awareness of the community regarding software engineering quality issues and to emphasize the substantial lack of funding for scientific software development.
