signac: Data Management and Workflows for Computational Researchers

Bradley D. Dice
Department of Physics, University of Michigan, Ann Arbor

Brandon L. Butler
Department of Chemical Engineering, University of Michigan, Ann Arbor

Vyas Ramasubramani
Department of Chemical Engineering, University of Michigan, Ann Arbor

Alyssa Travitz
Macromolecular Science and Engineering Program, University of Michigan, Ann Arbor

Michael M. Henry
Micron School of Materials Science and Engineering, Boise State University

Hardik Ojha
Department of Chemical Engineering, Indian Institute of Technology Roorkee

Kelly L. Wang
Macromolecular Science and Engineering Program, University of Michigan, Ann Arbor

Carl S. Adorf
Department of Chemical Engineering, University of Michigan, Ann Arbor

Eric Jankowski
Micron School of Materials Science and Engineering, Boise State University

Sharon C. Glotzer
Department of Physics, University of Michigan, Ann Arbor
Department of Chemical Engineering, University of Michigan, Ann Arbor
Macromolecular Science and Engineering Program, University of Michigan, Ann Arbor
Biointerfaces Institute, University of Michigan, Ann Arbor

Abstract

The signac data management framework (https://signac.io) helps researchers execute reproducible computational studies, scales workflows from laptops to supercomputers, and emphasizes portability and fast prototyping. With signac, users can track, search, and archive data and metadata for file-based workflows and automate workflow submission on high-performance computing (HPC) clusters. We discuss recent improvements to the software's feature set, scalability, scientific applications, usability, and community. We cover newly implemented synced data structures, features for generalized workflow execution, and performance optimizations, as well as recent research using the framework and changes to the project's outreach and governance made in response to its growth.

Keywords

data management, data science, database, simulation, collaboration, workflow, HPC, reproducibility

DOI

10.25080/majora-1b6fd038-003
