SoftwareX (Dec 2023)

PyOphidia: A Python library for High Performance Data Analytics at scale

  • Donatello Elia,
  • Cosimo Palazzo,
  • Sandro Fiore,
  • Alessandro D’Anca,
  • Andrea Mariello,
  • Giovanni Aloisio

Journal volume & issue
Vol. 24
p. 101538

Abstract

The increasing size of scientific datasets has deeply transformed the way scientific research is carried out. In multiple domains, Big Data challenges have called for novel software solutions capable of exploiting fast computing resources, workflow technologies and parallel paradigms at scale, providing a high level of abstraction while hiding the underlying infrastructural and software complexity from scientists. This paper describes PyOphidia, an open-source Python module for High Performance Data Analytics on multi-dimensional scientific datasets. PyOphidia aims to simplify the execution of parallel data analysis over scientific datacubes on High Performance Computing infrastructures. It can be easily integrated with existing Python libraries and tools across multiple data science environments.

Keywords