On Version Space Compression

  • Shai Ben-David
  • Ruth Urner
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9925)


We study compressing labeled data samples so as to maintain version space information. While classic compression schemes [11] only ask for recovery of a sample’s labels, many applications, such as distributed learning, require compact representations of more diverse information contained in a given data sample. In this work, we propose and analyze various frameworks for compression schemes designed to allow for recovery of version spaces. We consider exact versus approximate recovery as well as compression to subsamples versus compression to subsets of the version space. For all frameworks, we provide some positive examples and sufficient conditions for compressibility while also pointing out limitations by formally establishing impossibility of compression for certain classes.
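To make the notion of exact version space recovery from a subsample concrete, here is a minimal Python sketch for one simple class: initial segments (thresholds) over the finite domain {0, ..., n-1}. The class choice, the function names `version_space` and `compress`, and the example data are illustrative assumptions, not constructions taken from the paper. For this class, keeping the largest positively labeled point and the smallest negatively labeled point appears to suffice to recover the full version space.

```python
from typing import List, Set, Tuple

# A labeled sample is a list of (point, label) pairs with label in {0, 1}.
Sample = List[Tuple[int, int]]


def version_space(sample: Sample, n: int) -> Set[int]:
    """Return all thresholds t in {0, ..., n} whose initial segment
    h_t(x) = 1 if x < t else 0 is consistent with every labeled point."""
    return {
        t
        for t in range(n + 1)
        if all((1 if x < t else 0) == y for x, y in sample)
    }


def compress(sample: Sample) -> Sample:
    """Hypothetical compression to a subsample of at most two points:
    the largest positive point and the smallest negative point."""
    positives = [x for x, y in sample if y == 1]
    negatives = [x for x, y in sample if y == 0]
    kept: Sample = []
    if positives:
        kept.append((max(positives), 1))
    if negatives:
        kept.append((min(negatives), 0))
    return kept


# Illustrative data (assumed, not from the paper).
n = 10
sample = [(1, 1), (3, 1), (4, 1), (7, 0), (9, 0)]
compressed = compress(sample)

# The version space induced by the two kept points matches the one
# induced by the full sample.
print(compressed)
print(version_space(sample, n) == version_space(compressed, n))
```

In this sketch the decompression step is simply recomputing the version space from the kept points; richer classes may not admit such small subsamples, which is the kind of limitation the abstract refers to.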


Version Space, Marginal Distribution, Initial Segment, Concept Class, Compression Scheme
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


References

  1. Balcan, M.-F., Blum, A., Fine, S., Mansour, Y.: Distributed learning, communication complexity and privacy. In: Proceedings of the 25th Annual Conference on Learning Theory (COLT), pp. 26.1–26.22 (2012)
  2. Ben-David, S.: 2 notes on classes with Vapnik-Chervonenkis dimension 1 (2015). CoRR arXiv:1507.05307
  3. Ben-David, S., Litman, A.: Combinatorial variability of Vapnik-Chervonenkis classes with applications to sample compression schemes. Discrete Appl. Math. 86(1), 3–25 (1998)
  4. Ben-David, S.: Low-sensitivity functions from unambiguous certificates. In: Electronic Colloquium on Computational Complexity (ECCC), vol. 23, no. 84 (2016)
  5. Chen, S.-T., Balcan, M.-F., Chau, D.H.: Communication efficient distributed agnostic boosting (2015). CoRR arXiv:1506.06318
  6. Floyd, S., Warmuth, M.K.: Sample compression, learnability, and the Vapnik-Chervonenkis dimension. Mach. Learn. 21(3), 269–304 (1995)
  7. Goldman, S.A., Kearns, M.J.: On the complexity of teaching. J. Comput. Syst. Sci. 50(1), 20–31 (1995)
  8. Hanneke, S., Yang, L.: Minimax analysis of active learning. J. Mach. Learn. Res. 16, 3487–3602 (2015)
  9. Kuzmin, D., Warmuth, M.K.: Unlabeled compression schemes for maximum classes. J. Mach. Learn. Res. 8, 2047–2081 (2007)
  10. Li, L., Littman, M.L., Walsh, T.J., Strehl, A.L.: Knows what it knows: a framework for self-aware learning. Mach. Learn. 82(3), 399–443 (2011)
  11. Littlestone, N., Warmuth, M.K.: Relating data compression and learnability (1986, unpublished manuscript)
  12. Mitchell, T.M.: Version spaces: a candidate elimination approach to rule learning. In: Proceedings of the 5th International Joint Conference on Artificial Intelligence, pp. 305–310 (1977)
  13. Moran, S., Shpilka, A., Wigderson, A., Yehudayoff, A.: Compressing and teaching for low VC-dimension. In: Proceedings of IEEE 56th Annual Symposium on Foundations of Computer Science (FOCS), pp. 40–51 (2015)
  14. Moran, S., Warmuth, M.K.: Labeled compression schemes for extremal classes (2015). CoRR arXiv:1506.00165
  15. Rivest, R.L., Sloan, R.H.: Learning complicated concepts reliably and usefully. In: Proceedings of the 7th National Conference on Artificial Intelligence, pp. 635–640 (1988)
  16. Samei, R., Semukhin, P., Yang, B., Zilles, S.: Sample compression for multi-label concept classes. In: Proceedings of The 27th Conference on Learning Theory (COLT), pp. 371–393 (2014)
  17. Sayedi, A., Zadimoghaddam, M., Blum, A.: Trading off mistakes and don’t-know predictions. In: Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems (NIPS), pp. 2092–2100 (2010)
  18. Shalev-Shwartz, S., Ben-David, S.: Understanding Machine Learning. Cambridge University Press, Cambridge (2014)
  19. Vapnik, V.N., Chervonenkis, A.J.: On the uniform convergence of relative frequencies of events to their probabilities. Theor. Probab. Appl. 16(2), 264–280 (1971)
  20. Wiener, Y., Hanneke, S., El-Yaniv, R.: A compression technique for analyzing disagreement-based active learning. J. Mach. Learn. Res. 16, 713–745 (2015)
  21. Zhang, C., Chaudhuri, K.: The extended Littlestone’s dimension for learning with mistakes and abstentions (2016). CoRR arXiv:1604.06162

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. University of Waterloo, Waterloo, Canada
  2. Max Planck Institute for Intelligent Systems, Stuttgart, Germany
