Python Array API Standard: Toward Array Interoperability in the Scientific Python Ecosystem
Aaron Meurer
Athan Reines
Ralf Gommers
Yao-Lung L. Fang
John Kirkham
Matthew Barber
Stephan Hoyer
Andreas Müller
Sheng Zha
Saul Shanabrook
Stephannie Jiménez Gacha
Mario Lezcano-Casado
Thomas J. Fan
Tyler Reddy
Alexandre Passos
Hyukjin Kwon
Travis Oliphant
Consortium for Python Data API Standards
The Python array API standard specifies standardized application
programming interfaces (APIs) and behaviors for array and tensor objects
and operations as commonly found in libraries such as NumPy
[Harris2020a], CuPy [Okuta2017a], PyTorch [Paszke2019a],
JAX [Bradbury2018a], TensorFlow [Abadi2016a], Dask
[Rocklin2015a], and MXNet [Chen2015a]. The establishment and
subsequent adoption of the standard aim to reduce ecosystem fragmentation
and facilitate array library interoperability in user code and among
array-consuming libraries, such as scikit-learn [Pedregosa2011a] and
SciPy [Virtanen2020a]. A key benefit of array interoperability for
downstream consumers of the standard is device agnosticism, whereby
previously CPU-bound implementations can more readily leverage hardware
acceleration via graphics processing units (GPUs), tensor processing units
(TPUs), and other accelerator devices.
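
As a rough illustration of what array-agnostic consumer code can look like, the sketch below retrieves an array's namespace through the standard's `__array_namespace__` method and uses only functions defined by the standard. The helper names `get_namespace` and `standardize` are hypothetical and chosen for this example, and the fallback to NumPy for arrays that lack the method is an assumption of the sketch, not part of the standard.

```python
import numpy as np


def get_namespace(x):
    """Return the array API namespace that `x` belongs to.

    Hypothetical helper: arrays from conforming libraries expose
    `__array_namespace__`; falling back to NumPy is an assumption
    made here so the sketch also runs on older NumPy releases.
    """
    if hasattr(x, "__array_namespace__"):
        return x.__array_namespace__()
    return np


def standardize(x):
    """Shift and scale `x` to zero mean and unit standard deviation.

    Only `mean` and `std`, both part of the array API standard, are
    looked up on the namespace, so the function does not name any
    particular array library.
    """
    xp = get_namespace(x)
    centered = x - xp.mean(x)
    return centered / xp.std(x)


x = np.asarray([1.0, 2.0, 3.0, 4.0])
z = standardize(x)
```

Because `standardize` never refers to a specific library, the same function should in principle accept arrays from any library that implements the standard, including arrays that live on a GPU or another accelerator.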
In this paper, we first introduce the Consortium for Python Data API
Standards and define the scope of the array API standard. We then discuss
the current status of standardization and associated tooling (including a
test suite and compatibility layer). We conclude by outlining plans for
future work.
Keywords: Python, Arrays, Tensors, NumPy, CuPy, PyTorch, JAX, TensorFlow, Dask, MXNet
DOI: 10.25080/gerudo-f2bc6f59-001