Selective Sampling Based Efficient Classifier Representation in Distributed Learning Conference Paper

abstract

  • © 2016 IEEE. Communication is a bottleneck for many distributed machine learning problems with large data sets. One effective approach is to exchange the set of good classifiers among the learners. The key challenge in such an approach is how to represent the set of good classifiers, which can be viewed as a source coding problem in communication systems. A nonuniform sampling framework is proposed for this source coding problem. A metric that describes the quality of the sampled data for the target hypothesis space is proposed. An optimization problem is formulated by minimizing the upper bound of the proposed metric, and an efficient weight-based algorithm is provided to solve the optimization problem. Numerical simulations on synthetic and real-world data sets show that the learning performance based on the proposed sampling framework is close to the global optimum with high confidence, while requiring fewer samples than baseline random sampling algorithms.

name of conference

  • 2016 IEEE Global Communications Conference (GLOBECOM)

published proceedings

  • 2016 IEEE Global Communications Conference (GLOBECOM)

author list (cited authors)

  • Fan, Y., Li, H., & Tian, C.

complete list of authors

  • Fan, Yawen; Li, Husheng; Tian, Chao