popmon: Analysis Package for Dataset Shift Detection
Simon Brugman
Tomas Sostak
Pradyot Patil
Max Baak
popmon is an open-source Python package
to check the stability of a tabular dataset.
popmon creates histograms of features binned in time-slices, and compares the stability of their profiles and distributions
using statistical tests, both over time and with respect to a reference dataset.
It works with numerical, ordinal and categorical features, on both pandas and Spark dataframes,
and the histograms can be higher-dimensional, so it can also track correlations between sets of features.
popmon can automatically detect and alert on changes observed over time, such as trends, shifts, peaks, outliers, anomalies, changing correlations, etc.,
using monitoring business rules that are either static or dynamic.
popmon results are presented in a self-contained report.
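
The workflow described above can be illustrated with a minimal sketch. It deliberately does not use popmon's API: it uses only NumPy, and all names (`time_slice_histograms`, `max_prob_diff`, `THRESHOLD`), the synthetic data, the choice of comparison statistic, and the threshold value are assumptions made for illustration. It bins one numerical feature per time slice, compares each slice's histogram against a reference slice, and applies a static rule to raise alerts.

```python
import numpy as np


def time_slice_histograms(values, slice_ids, bin_edges):
    """Build one histogram of `values` per time slice (hypothetical helper)."""
    hists = {}
    for s in np.unique(slice_ids):
        counts, _ = np.histogram(values[slice_ids == s], bins=bin_edges)
        hists[int(s)] = counts
    return hists


def max_prob_diff(counts, ref_counts):
    """Largest absolute per-bin difference between two normalized histograms."""
    p = counts / counts.sum()
    q = ref_counts / ref_counts.sum()
    return float(np.abs(p - q).max())


# Synthetic feature: six time slices, with a shift injected in the last two.
rng = np.random.default_rng(0)
n_slices, n_per_slice = 6, 2000
slice_ids = np.repeat(np.arange(n_slices), n_per_slice)
values = rng.normal(0.0, 1.0, size=n_slices * n_per_slice)
values[slice_ids >= 4] += 1.5

# Shared binning so that histograms of different slices are comparable.
bin_edges = np.linspace(-6.0, 8.0, 29)
hists = time_slice_histograms(values, slice_ids, bin_edges)

# Use the first slice as the reference and apply a static threshold (assumed value).
reference = hists[0]
THRESHOLD = 0.06
alerts = {s: max_prob_diff(h, reference) > THRESHOLD for s, h in hists.items()}

for s, flagged in alerts.items():
    print(f"slice {s}: {'ALERT' if flagged else 'ok'}")
```

A dynamic rule would replace the fixed `THRESHOLD` with a bound derived from the statistic's own recent history, for example a band around its mean in the reference period.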
Keywords: dataset shift detection, population shift, covariate shift, histogramming, profiling
DOI: 10.25080/majora-212e5952-01d