Python Array API Standard: Toward Array Interoperability in the Scientific Python Ecosystem

Aaron Meurer
Quansight

Athan Reines
Quansight

Ralf Gommers
Quansight

Yao-Lung L. Fang
NVIDIA Corporation

John Kirkham
NVIDIA Corporation

Matthew Barber
Quansight

Stephan Hoyer
Google

Andreas Müller
Microsoft

Sheng Zha
Amazon

Saul Shanabrook

Stephannie Jiménez Gacha
Quansight

Mario Lezcano-Casado
Quansight

Thomas J. Fan
Quansight

Tyler Reddy
LANL

Alexandre Passos

Hyukjin Kwon
Databricks

Travis Oliphant
Quansight

Consortium for Python Data API Standards

Abstract

The Python array API standard specifies standardized application programming interfaces (APIs) and behaviors for array and tensor objects and operations as commonly found in libraries such as NumPy [Harris2020a], CuPy [Okuta2017a], PyTorch [Paszke2019a], JAX [Bradbury2018a], TensorFlow [Abadi2016a], Dask [Rocklin2015a], and MXNet [Chen2015a]. The establishment and subsequent adoption of the standard aim to reduce ecosystem fragmentation and facilitate array library interoperability in user code and among array-consuming libraries, such as scikit-learn [Pedregosa2011a] and SciPy [Virtanen2020a]. A key benefit of array interoperability for downstream consumers of the standard is device agnosticism, whereby previously CPU-bound implementations can more readily leverage hardware acceleration via graphics processing units (GPUs), tensor processing units (TPUs), and other accelerator devices.

In this paper, we first introduce the Consortium for Python Data API Standards and define the scope of the array API standard. We then discuss the current status of standardization and associated tooling (including a test suite and compatibility layer). We conclude by outlining plans for future work.
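
To make the interoperability goal concrete, the following is a minimal sketch of array-agnostic code, not an excerpt from the paper. It assumes the input array exposes the standard's `__array_namespace__` method; the helper names `get_namespace` and `standardize` are hypothetical, and the fallback to NumPy is an assumption made so the sketch also runs on NumPy releases whose arrays lack that method.

```python
import numpy as np


def get_namespace(x):
    """Return the array API namespace for x (hypothetical helper).

    Falls back to the numpy module when x does not define
    __array_namespace__; this fallback is an assumption of the sketch.
    """
    if hasattr(x, "__array_namespace__"):
        return x.__array_namespace__()
    return np


def standardize(x):
    """Rescale x to zero mean and unit standard deviation.

    Only functions defined by the array API standard (mean, std) are
    called, and always through the namespace obtained from the input.
    """
    xp = get_namespace(x)
    return (x - xp.mean(x)) / xp.std(x)


# Usage with a NumPy array; a conforming array from another library
# could be passed in the same way.
z = standardize(np.asarray([1.0, 2.0, 3.0, 4.0]))
```

Because `standardize` never names a specific library, the same function body could in principle operate on arrays from any conforming library, including those stored on accelerator devices.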

Keywords

Python, Arrays, Tensors, NumPy, CuPy, PyTorch, JAX, TensorFlow, Dask, MXNet

DOI

10.25080/gerudo-f2bc6f59-001
