PLoS Computational Biology (Nov 2021)
Ten simple rules on writing clean and reliable open-source scientific software
Abstract
Functional, usable, and maintainable open-source software is increasingly essential to scientific research, but there is a large variation in formal training for software development and maintainability. Here, we propose 10 “rules” centered on 2 best practice components: clean code and testing. These 2 areas are relatively straightforward and provide substantial utility relative to the learning investment. Adopting clean code practices helps to standardize and organize software code in order to enhance readability and reduce cognitive load for both the initial developer and subsequent contributors; this allows developers to concentrate on core functionality and reduce errors. Clean coding styles make software code more amenable to testing, including unit tests that work best with modular and consistent software code. Unit tests interrogate specific and isolated coding behavior to reduce coding errors and ensure intended functionality, especially as code increases in complexity; unit tests also implicitly provide example usages of code. Other forms of testing are geared to discover erroneous behavior arising from unexpected inputs or emerging from the interaction of complex codebases. Although conforming to coding styles and designing tests can add time to the software development project in the short term, these foundational tools can help to improve the correctness, quality, usability, and maintainability of open-source scientific software code. They also advance the principal point of scientific research: producing accurate results in a reproducible way. In addition to suggesting several tips for getting started with clean code and testing practices, we recommend numerous tools for the popular open-source scientific software languages Python, R, and Julia.