International Conference on Algorithmic Learning Theory

Algorithmic Learning Theory, pp. 209–223

Permutational Rademacher Complexity

A New Complexity Measure for Transductive Learning
  • Ilya Tolstikhin
  • Nikita Zhivotovskiy
  • Gilles Blanchard
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9355)

Abstract

Transductive learning considers situations in which a learner observes m labelled training points and u unlabelled test points, with the final goal of giving correct answers for the test points. This paper introduces a new complexity measure for transductive learning, called Permutational Rademacher Complexity (PRC), and studies its properties. A novel symmetrization inequality is proved, showing that PRC provides a tighter control over the expected suprema of empirical processes than is available in the standard i.i.d. setting. A number of comparison results are also provided, relating PRC to other popular complexity measures used in statistical learning theory, including Rademacher complexity and Transductive Rademacher Complexity (TRC). We argue that PRC is a more suitable complexity measure for transductive learning. Finally, these results are combined with a standard concentration argument to obtain novel data-dependent risk bounds for transductive learning.
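As a rough illustration of the kind of quantity the abstract describes (not the paper's exact definition, which fixes a specific normalisation of the two averages), the sketch below estimates a permutation-based complexity for a finite function class by Monte Carlo: the m + u points are held fixed, a random permutation splits them into a test part of size u and a training part of size m, and the supremum over the class of the gap between the two empirical averages is then averaged over permutations. The function name prc_estimate and the toy threshold class are illustrative assumptions, not notation from the paper.

```python
import numpy as np

def prc_estimate(values, m, u, n_perm=2000, rng=None):
    """Monte Carlo estimate of a permutation-based complexity of the form
    E_pi[ sup_{f in F} ( mean of f over a random u-subset
                         - mean of f over the complementary m-subset ) ],
    where `values` is a (|F|, m+u) array holding f(z_i) for every f in a
    finite class F and every point z_i of the fixed full sample.
    This is an illustrative reading of PRC; see the paper for the exact
    definition and normalisation."""
    rng = np.random.default_rng(rng)
    n_funcs, n = values.shape
    assert n == m + u, "the full sample must contain exactly m + u points"
    total = 0.0
    for _ in range(n_perm):
        perm = rng.permutation(n)
        test, train = perm[:u], perm[u:]
        # gap between test and training averages, maximised over the class F
        gaps = values[:, test].mean(axis=1) - values[:, train].mean(axis=1)
        total += gaps.max()
    return total / n_perm

# toy usage: a small finite class of threshold functions on 1-d points
if __name__ == "__main__":
    z = np.linspace(0.0, 1.0, 20)                         # m + u = 20 fixed points
    thresholds = np.linspace(0.1, 0.9, 9)
    F = (z[None, :] > thresholds[:, None]).astype(float)  # |F| x 20 value matrix
    print(prc_estimate(F, m=10, u=10))
```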

Keywords

Transductive learning · Rademacher complexity · Statistical learning theory · Empirical processes · Concentration inequalities

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Ilya Tolstikhin (1)
  • Nikita Zhivotovskiy (2, 3)
  • Gilles Blanchard (4)

  1. Max Planck Institute for Intelligent Systems, Tübingen, Germany
  2. Moscow Institute of Physics and Technology, Moscow, Russia
  3. Institute for Information Transmission Problems, Moscow, Russia
  4. Department of Mathematics, Universität Potsdam, Potsdam, Germany