Equity, Scalability, and Sustainability of Data Science Infrastructure

Anthony Suen
University of California, Berkeley

Laura Norén
New York University

Alan Liang
University of California, Berkeley

Andrea Tu
University of California, Berkeley

Abstract

We examine the current state of equity, scalability, and sustainability of data science education infrastructure in the U.S. and Canada. Our analysis of the technological, funding, and organizational structures of four types of institutions shows a growing divergence in the ability of universities across the United States to provide students with accessible data science education infrastructure, primarily JupyterHub. In general, liberal arts colleges, community colleges, and other institutions with limited IT staff and experience have greater difficulty setting up and maintaining JupyterHub than well-funded private institutions or large public research universities with a deep bench of technical IT staff. However, by leveraging existing public-private partnerships and the experience of Canada’s national JupyterHub (Syzygy), the U.S. has an opportunity to give a wider range of institutions and students access to JupyterHub.

Keywords

data science education, Jupyter, JupyterHub, higher education

DOI

10.25080/Majora-4af1f417-002
