DataScope

MLDev and MLOps are data problems. The goal of DataScope is to provide a principled framework for understanding the impact of data noise and of individual data examples, which enables data cleaning for ML, data markets, and advances in ML robustness.


This webpage is currently under construction.

  • See how DataScope fits into the Ease.ML framework for MLDev and MLOps
  • See a short presentation about DataScope
  • See the publications listed below

Technical Core

Bojan Karlaš, Peng Li, Renzhi Wu, Nezihe Merve Gürel, Xu Chu, Wentao Wu, Ce Zhang. Nearest Neighbor Classifiers over Incomplete Information: From Certain Answers to Certain Predictions. VLDB 2021.

Ruoxi Jia, Xuehui Sun, Jiacen Xu, Ce Zhang, Bo Li, Dawn Song. An Empirical and Comparative Analysis of Data Valuation with Scalable Algorithms. CVPR 2021.

Peng Li, Xi Rao, Jennifer Blase, Yue Zhang, Xu Chu, Ce Zhang. CleanML: A Benchmark for Evaluating the Impact of Data Cleaning on ML Classification Tasks. ICDE 2021.

Ruoxi Jia, David Dao, Boxin Wang, Frances A. Hubis, Nick Hynes, Nezihe M. Gürel, Bo Li, Ce Zhang, Dawn Song and Costas J. Spanos. Towards Efficient Data Valuation Based on the Shapley Value. AISTATS 2019.

Ruoxi Jia, David Dao, Boxin Wang, Frances A. Hubis, Nezihe M. Gürel, Bo Li, Ce Zhang, Costas J. Spanos and Dawn Song. Efficient Task-Specific Data Valuation for Nearest Neighbor Algorithms. VLDB 2019.

Applications to Robustness

Nezihe Merve Gürel*, Xiangyu Qi*, Luka Rimanic, Ce Zhang, Bo Li. Knowledge Enhanced Machine Learning Pipeline against Diverse Adversarial Attacks. ICML 2021.

Linyi Li*, Maurice Weber*, Xiaojun Xu, Luka Rimanic, Bhavya Kailkhura, Tao Xie, Ce Zhang, Bo Li. TSS: Transformation-Specific Smoothing for Robustness Certification. CCS 2021.

Maurice Weber, Xiaojun Xu, Bojan Karlaš, Ce Zhang, Bo Li. RAB: Provable Robustness Against Backdoor Attacks. arXiv:2003.08904.

Z Yang, Z Zhao, H Pei, B Wang, B Karlas, J Liu, H Guo, B Li, C Zhang. End-to-end Robustness for Sensing-Reasoning Machine Learning Pipelines. arXiv:2003.00120.
