Journal of Open Research Software (Apr 2017)

xarray: N-D labeled Arrays and Datasets in Python

  • Stephan Hoyer,
  • Joe Hamman

DOI
https://doi.org/10.5334/jors.148
Journal volume & issue
Vol. 5, no. 1

Abstract

xarray is an open source project and Python package that provides a toolkit and data structures for N-dimensional labeled arrays. Our approach combines an application programming interface (API) inspired by pandas with the Common Data Model for self-described scientific data. Key features of the xarray package include label-based indexing and arithmetic, interoperability with the core scientific Python packages (e.g., pandas, NumPy, Matplotlib), out-of-core computation on datasets that don’t fit into memory, a wide range of serialization and input/output (I/O) options, and advanced multi-dimensional data manipulation tools such as group-by and resampling. xarray, as a data model and analytics toolkit, has been widely adopted in the geoscience community but is also used more broadly for multi-dimensional data analysis in physics, machine learning and finance.
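
As an illustration of the features named in the abstract, the following is a minimal sketch of label-based indexing, label-aligned arithmetic, group-by and resampling. It is not taken from the paper: the temperature values, the station names "a", "b", "c" and the variable names are all made up for the example, and it assumes a reasonably recent xarray, pandas and NumPy installation.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical data: 4 daily readings at 3 made-up stations.
times = pd.date_range("2000-01-30", periods=4, freq="D")
temps = xr.DataArray(
    np.arange(12.0).reshape(4, 3),
    dims=("time", "station"),
    coords={"time": times, "station": ["a", "b", "c"]},
    name="temperature",
)

# Label-based indexing: select by coordinate value, not integer position.
station_b = temps.sel(station="b")

# Label-based arithmetic: operands are aligned on shared coordinate labels
# before the operation, so only the stations present in both arrays remain.
offset = xr.DataArray(
    [10.0, 20.0], dims="station", coords={"station": ["b", "c"]}
)
shifted = temps + offset

# Group-by: average all readings that fall in the same calendar month.
monthly = temps.groupby("time.month").mean(dim="time")

# Resampling: average over consecutive two-day windows along time.
two_day = temps.resample(time="2D").mean()

print(station_b.values)
print(shifted.station.values)
print(monthly)
print(two_day)
```

Every operation here is addressed by dimension name or coordinate label ("time", "station") rather than by axis number, which is the main difference from working with a bare NumPy array.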

Keywords