Algorithms for Active Classifier Selection: Maximizing Recall with Precision Constraints

  • Paul N. Bennett
  • David M. Chickering
  • Christopher Meek
  • Xiaojin Zhu

Proceedings of the 10th ACM International Conference on Web Search and Data Mining (WSDM '17)

Published by ACM


Software applications often use classification models to trigger specialized experiences for users. Search engines, for example, use query classifiers to trigger specialized “instant answer” experiences, in which information satisfying the user query is shown directly on the result page, and email applications use classification models to automatically move messages to a spam folder. When such applications have acceptable default (i.e., non-specialized) behavior, users are often more sensitive to failures in model precision than to failures in model recall. In this paper, we consider model-selection algorithms for these precision-constrained scenarios. We develop adaptive model-selection algorithms that identify, using as few samples as possible, the best classifier from among the set of classifiers that satisfy the precision constraint. We provide statistical correctness and sample complexity guarantees for our algorithms, and we show with an empirical validation that they work well in practice.
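To make the selection criterion concrete, the following is a minimal Python sketch of the underlying problem: among classifiers whose estimated precision meets a threshold, pick the one with the highest estimated recall. It is a naive, non-adaptive baseline that scores every classifier on the same fully labeled sample. It is not the paper's adaptive algorithm and carries none of its statistical guarantees. The function names, the `min_precision` parameter, and the convention that a classifier with no positive predictions has precision 0.0 are all assumptions made for illustration.

```python
from typing import Dict, Optional, Sequence, Tuple


def precision_recall(preds: Sequence[int], labels: Sequence[int]) -> Tuple[float, float]:
    """Empirical precision and recall of binary predictions (1 = positive)."""
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    # Assumption: undefined ratios are treated as 0.0, so a classifier
    # that never fires cannot qualify on precision.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def select_classifier(
    predictions: Dict[str, Sequence[int]],
    labels: Sequence[int],
    min_precision: float,
) -> Optional[str]:
    """Return the name of the highest-recall classifier whose empirical
    precision is at least min_precision, or None if none qualifies."""
    best_name: Optional[str] = None
    best_recall = -1.0
    for name, preds in predictions.items():
        precision, recall = precision_recall(preds, labels)
        if precision >= min_precision and recall > best_recall:
            best_name, best_recall = name, recall
    return best_name
```

Because this baseline labels every sample for every classifier, it illustrates only the objective. The abstract's contribution is reaching the same decision adaptively, with as few labeled samples as possible.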