SoftwareX (Dec 2024)

Pynblint: A quality assurance tool to improve the quality of Python Jupyter notebooks

  • Luigi Quaranta,
  • Fabio Calefato,
  • Filippo Lanubile

Journal volume & issue
Vol. 28
p. 101959

Abstract

Jupyter Notebook is widely recognized as a crucial tool for data science professionals and students. Its interactive and self-documenting nature makes it particularly suitable for data-driven programming tasks. Nonetheless, it faces criticism for its limited support for software engineering best practices and its tendency to encourage bad programming habits, such as non-linear code execution. These issues often result in non-reproducible, poorly documented, and low-quality notebook code. In this paper, we introduce Pynblint, a static analyzer for Python Jupyter notebooks. Pynblint is designed to help data scientists write better notebooks that are easier to understand and reproduce. We report on how we validated Pynblint with both professional data scientists and students, receiving overall positive feedback. Additionally, we discuss the potential of Pynblint to facilitate research inquiries into computational notebooks.
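
To make the notion of non-linear code execution concrete, the following is a minimal sketch of the kind of static check a notebook linter could perform. It is not Pynblint's implementation or API; the function name `is_linearly_executed` and the sample notebooks are hypothetical. The sketch relies only on the notebook file format, in which each code cell stores an `execution_count` that is `null` when the cell has not been run, and it assumes that a linearly executed notebook shows strictly increasing counts from top to bottom.

```python
def is_linearly_executed(notebook: dict) -> bool:
    """Return True if executed code cells appear in strictly increasing
    execution order (hypothetical helper, not part of Pynblint)."""
    counts = [
        cell.get("execution_count")
        for cell in notebook.get("cells", [])
        if cell.get("cell_type") == "code"
    ]
    # Cells that were never run carry no execution count; skip them.
    executed = [count for count in counts if count is not None]
    # Compare each executed cell with the next one in document order.
    return all(a < b for a, b in zip(executed, executed[1:]))


# Hypothetical in-memory notebooks, shaped like parsed .ipynb JSON.
linear = {
    "cells": [
        {"cell_type": "code", "execution_count": 1, "source": "import math"},
        {"cell_type": "markdown", "source": "Some notes"},
        {"cell_type": "code", "execution_count": 2, "source": "math.sqrt(4)"},
    ]
}
out_of_order = {
    "cells": [
        {"cell_type": "code", "execution_count": 3, "source": "print(x)"},
        {"cell_type": "code", "execution_count": 1, "source": "x = 1"},
    ]
}

print(is_linearly_executed(linear))
print(is_linearly_executed(out_of_order))
```

A notebook saved to disk can be checked the same way after loading it with `json.load`. A check of this kind only inspects the saved file, which is what makes the analysis static: no cell is re-executed.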

Keywords