Conference site ยป Proceedings

Monitoring Scientific Python Usage on a Supercomputer

Rollin Thomas
National Energy Research Scientific Computing Center, Lawrence Berkeley National Laboratory, 1 Cyclotron Road MS59-4010A, Berkeley, California, 94720

Laurie Stephey
National Energy Research Scientific Computing Center, Lawrence Berkeley National Laboratory, 1 Cyclotron Road MS59-4010A, Berkeley, California, 94720

Annette Greiner
National Energy Research Scientific Computing Center, Lawrence Berkeley National Laboratory, 1 Cyclotron Road MS59-4010A, Berkeley, California, 94720

Brandon Cook
National Energy Research Scientific Computing Center, Lawrence Berkeley National Laboratory, 1 Cyclotron Road MS59-4010A, Berkeley, California, 94720

Abstract

In 2021, more than 30\% of users at the National Energy Research Scientific Computing Center (NERSC) used Python on the Cori supercomputer. To determine this we have developed and open-sourced a simple, minimally invasive monitoring framework that leverages standard Python features to capture Python imports and other job data via a package called \textquotedbl{}Customs\textquotedbl{}. To analyze the data we collect via Customs, we have developed a Jupyter-based analysis framework designed to be interactive, shareable, extensible, and publishable via a dashboard. Our stack includes Papermill to execute parameterized notebooks, Dask-cuDF for multi-GPU processing, and Voila to render our notebooks as web-based dashboards. We report preliminary findings from Customs data collection and analysis. This work demonstrates that our monitoring framework can capture insightful and actionable data including top Python libraries, preferred user software stacks, and correlated libraries, leading to a better understanding of user behavior and affording us opportunity to make increasingly data-driven decisions regarding Python at NERSC.

Keywords

HPC, Python monitoring, GPUs, dashboards, parallel, Jupyter

DOI

10.25080/majora-1b6fd038-010

Bibtex entry

Full text PDF