
A Vector-Contraction Inequality for Rademacher Complexities

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9925)

Abstract

The contraction inequality for Rademacher averages is extended to Lipschitz functions with vector-valued domains, and it is also shown that in the bounding expression the Rademacher variables can be replaced by arbitrary i.i.d. symmetric and sub-gaussian variables. Example applications are given for multi-category learning, K-means clustering, and learning-to-learn.
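
For orientation, the main inequality has the following shape (the notation below is reconstructed from the abstract and should be read as a paraphrase of the paper's statement, not a verbatim quotation): if F is a class of functions f : X → ℓ₂, each h_i : ℓ₂ → ℝ is L-Lipschitz, and x₁, …, x_n ∈ X, then

\[
\mathbb{E} \sup_{f \in F} \sum_{i=1}^{n} \varepsilon_i \, h_i(f(x_i))
\;\le\;
\sqrt{2}\, L \,
\mathbb{E} \sup_{f \in F} \sum_{i=1}^{n} \sum_{k} \varepsilon_{ik} \, f_k(x_i),
\]

where the ε_i and ε_{ik} are independent Rademacher variables and f_k(x_i) denotes the k-th coordinate of f(x_i). The classical scalar contraction inequality is recovered when f is real-valued. The abstract's second claim then allows the ε_{ik} on the right-hand side to be replaced by arbitrary i.i.d. symmetric sub-gaussian variables, up to a constant depending on their distribution, so that, for example, Gaussian variables may be used in place of Rademacher ones.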



Author information


Correspondence to Andreas Maurer.


Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Maurer, A. (2016). A Vector-Contraction Inequality for Rademacher Complexities. In: Ortner, R., Simon, H., Zilles, S. (eds) Algorithmic Learning Theory. ALT 2016. Lecture Notes in Computer Science, vol 9925. Springer, Cham. https://doi.org/10.1007/978-3-319-46379-7_1

  • DOI: https://doi.org/10.1007/978-3-319-46379-7_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-46378-0

  • Online ISBN: 978-3-319-46379-7

  • eBook Packages: Computer Science (R0)
