cesium: Open-Source Platform for Time-Series Inference
Brett Naul
Stéfan van der Walt
Arien Crellin-Quick
Joshua S. Bloom
Fernando Pérez
Video: https://youtu.be/ZgHGCfwExw0
Abstract
Inference on time series data is a common requirement in many scientific
disciplines and internet of things (IoT) applications, yet there are few
resources available to domain scientists to easily, robustly, and repeatably
build such complex inference workflows: traditional statistical
models of time series are often too rigid to explain complex time domain
behavior, while popular machine learning packages require already-featurized
dataset inputs. Moreover, the software engineering tasks required to
instantiate the computational platform are daunting. cesium is an
end-to-end time series analysis framework, consisting of a Python library as
well as a web front-end interface, that allows researchers to featurize raw
data and apply modern machine learning techniques in a simple, reproducible,
and extensible way. Users can apply out-of-the-box feature engineering
workflows as well as save and replay their own analyses. Any steps taken in
the front end can also be exported to a Jupyter notebook, so users can
iterate between possible models within the front end and then fine-tune their
analysis using the additional capabilities of the back-end library. The
open-source packages make use of many modern Python toolkits, including
xarray, dask, Celery, Flask, and scikit-learn.
Keywords: time series, machine learning, reproducible science
DOI: 10.25080/Majora-629e541a-004
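
The abstract notes that popular machine learning packages expect already-featurized inputs, whereas raw time series are often irregularly sampled and of unequal length. The following is a minimal sketch of that featurize-then-classify idea, using only the Python standard library. It is not cesium's API: the helper names (`featurize`, `make_series`, `predict`), the three features, the simulated data, and the nearest-centroid classifier are all assumptions chosen for illustration.

```python
import math
import random
import statistics


def featurize(times, values):
    """Reduce one irregularly sampled series to a fixed-length feature vector.

    Hypothetical feature set for illustration: amplitude, standard
    deviation, and mean sampling interval.
    """
    amplitude = (max(values) - min(values)) / 2.0
    std = statistics.pstdev(values)
    mean_dt = (times[-1] - times[0]) / (len(times) - 1)
    return [amplitude, std, mean_dt]


def make_series(kind, rng, n):
    """Simulate a series with n irregular samples on [0, 10].

    'periodic' is a noisy sinusoid; 'noise' is Gaussian noise only.
    """
    times = sorted(rng.uniform(0.0, 10.0) for _ in range(n))
    scale = 2.0 if kind == "periodic" else 0.0
    values = [scale * math.sin(2 * math.pi * t) + rng.gauss(0.0, 0.3)
              for t in times]
    return times, values


def predict(features, centroids):
    """Return the label whose centroid is nearest in feature space."""
    return min(centroids, key=lambda label: math.dist(features, centroids[label]))


rng = random.Random(0)
labels = ["periodic", "noise"]

# Training: series of different lengths all map to 3-element feature vectors.
centroids = {}
for label in labels:
    rows = [featurize(*make_series(label, rng, rng.randint(30, 60)))
            for _ in range(20)]
    centroids[label] = [statistics.fmean(col) for col in zip(*rows)]

# Evaluation on freshly simulated series.
test_set = [(label, featurize(*make_series(label, rng, rng.randint(30, 60))))
            for label in labels for _ in range(10)]
correct = sum(predict(feats, centroids) == label for label, feats in test_set)
accuracy = correct / len(test_set)
print(f"accuracy on simulated data: {accuracy:.2f}")
```

The point of the sketch is the shape of the workflow: each raw series, whatever its length or sampling pattern, becomes one fixed-length row, after which any standard classifier can be applied.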