Ease.ml/ci & ease.ml/meter
Towards Data Management for Statistical Generalization

What is ease.ml/ci & ease.ml/meter?

When training a machine learning model becomes fast, and model selection and hyper-parameter tuning become automatic, will non-CS experts finally have the tool they need to build ML applications all by themselves? We at DS3Lab focus on those users who are still struggling — not because of the speed and the lack of automation of an ML system, but because it is so powerful that it is easily misused as an overfitting machine. For many of these users, the quality of their ML applications might actually decrease with these powerful tools without proper guidelines and feedback (like what software engineering provides for traditional software development). We introduce two systems, ease.ml/ci and ease.ml/meter, which we built as an early attempt at an ML system that tries to enforce the right user behavior during the development process of ML applications. The core technical challenge is how to answer adaptive statistical queries in a rigorous but practical (in terms of label complexity) way. Interestingly, both systems can be seen as a new type of data management system which, instead of managing the (relational) querying of the data, manages the statistical generalization power of the data.

Projects and Publications

Ease.ml/ci

Ease.ml/ci is a continuous integration engine for ML that gives developers a pass/fail signal for each developed ML model, depending on whether it satisfies certain predefined properties over the (unknown) true distribution.
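To make the label-complexity question concrete, the sketch below estimates how many labeled test examples a pass/fail check on accuracy would need under Hoeffding's inequality. It is an illustration only, not the estimator used by ease.ml/ci: the function names, the `epsilon`/`delta` parameters, and the conservative pass rule are assumptions made for this example, and the union bound covers only a fixed, non-adaptive set of models rather than the adaptive setting the project targets.

```python
import math


def hoeffding_sample_size(epsilon: float, delta: float, num_models: int = 1) -> int:
    """Labels needed so that every one of `num_models` accuracy estimates is
    within `epsilon` of the true accuracy with probability >= 1 - delta.

    Assumes i.i.d. test examples and a set of models fixed in advance
    (non-adaptive); hypothetical helper, not part of ease.ml/ci.
    """
    # Two-sided Hoeffding bound: P(|acc_hat - acc| > eps) <= 2 * exp(-2 * n * eps^2),
    # combined with a union bound over the num_models evaluations.
    return math.ceil(math.log(2 * num_models / delta) / (2 * epsilon ** 2))


def ci_signal(correct: list, threshold: float, epsilon: float) -> bool:
    """Conservative pass/fail: pass only if the empirical accuracy clears the
    threshold by at least the tolerance `epsilon`."""
    accuracy = sum(correct) / len(correct)
    return accuracy - epsilon >= threshold


if __name__ == "__main__":
    print(hoeffding_sample_size(epsilon=0.05, delta=0.01))                 # one model
    print(hoeffding_sample_size(epsilon=0.05, delta=0.01, num_models=32))  # 32 fixed models
    outcomes = [True] * 90 + [False] * 10   # toy per-example correctness flags
    print(ci_signal(outcomes, threshold=0.8, epsilon=0.05))
```

Because the required number of labels grows with 1/epsilon², halving the tolerance roughly quadruples the labeling cost, which is why practical label complexity matters here.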

Publications

Bojan Karlaš, Matteo Interlandi, Cedric Renggli, Wentao Wu, Ce Zhang, Deepak Mukunthu Iyappan Babu, Jordan Edwards, Chris Lauren, Andy Xu and Markus Weimer. Building Continuous Integration Services for Machine Learning. KDD 2020 (Applied Data Science, Oral Presentation 44/756).
Cedric Renggli, Bojan Karlaš, Bolin Ding, Feng Liu, Kevin Schawinski, Wentao Wu, Ce Zhang. Continuous Integration of Machine Learning Models: A Rigorous Yet Practical Treatment. SysML 2019. [Video: YouTube]

Demo

Cedric Renggli*, Frances Ann Hubis*, Bojan Karlaš, Kevin Schawinski, Wentao Wu, Ce Zhang. Ease.ml/ci and Ease.ml/meter in Action: Towards Data Management for Statistical Generalization. VLDB Demo 2019.

Ease.ml/meter

Ease.ml/meter is a system that continuously returns some notion of the degree of overfitting to the developer.
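One simple way to picture a "degree of overfitting" signal is as a discretized gap between a model's accuracy on the data the developer iterates on and its accuracy on held-out test data. The sketch below is a hypothetical illustration of that idea only; the function names, the bucket width, and the number of levels are assumptions for this example and do not describe the actual ease.ml/meter interface or algorithm.

```python
def overfitting_gap(validation_accuracy: float, test_accuracy: float) -> float:
    """Absolute gap between validation and held-out test accuracy
    (one possible notion of overfitting, assumed for this sketch)."""
    return abs(validation_accuracy - test_accuracy)


def meter_level(gap: float, bucket_width: float = 0.02, num_levels: int = 5) -> int:
    """Map a gap to a coarse level 0..num_levels-1; larger means more overfitting.

    Reporting only a coarse level, rather than the exact gap, limits how much
    information about the test set is revealed per query.
    """
    return min(int(gap / bucket_width), num_levels - 1)


if __name__ == "__main__":
    gap = overfitting_gap(validation_accuracy=0.95, test_accuracy=0.90)
    print(meter_level(gap))
```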

Publication

Frances Ann Hubis, Wentao Wu, Ce Zhang. Ease.ml/meter: Quantitative Overfitting Management for Human-in-the-loop ML Application Development. Manuscript, arXiv:1906.00299, 2019.

Demo

Cedric Renggli*, Frances Ann Hubis*, Bojan Karlaš, Kevin Schawinski, Wentao Wu, Ce Zhang. Ease.ml/ci and Ease.ml/meter in Action: Towards Data Management for Statistical Generalization. VLDB Demo 2019.

People

External Collaborators

Wentao Wu (Microsoft Research)
Bolin Ding (Alibaba)

DS3Lab Members

Cedric Renggli
Bojan Karlaš
Frances Ann Hubis (previously)
Ce Zhang

Additional Information

Contact

Cédric Renggli

Location STF G 222

Institut für Computing Platforms
Stampfenbachstrasse 114
8092 Zürich
Switzerland
