Efficient Model Selection for Regularized Classification by Exploiting Unlabeled Data

  • Georgios Balikas
  • Ioannis Partalas
  • Eric Gaussier
  • Rohit Babbar
  • Massih-Reza Amini
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9385)

Abstract

Hyper-parameter tuning is a resource-intensive task when optimizing classification models. The commonly used k-fold cross-validation can become intractable in large-scale settings, where a classifier may have to learn billions of parameters. At the same time, real-world applications often involve multi-class classification scenarios with only a few labeled examples; model selection approaches offer little improvement in such cases, and the default values of the learners are typically used. We propose bounds on accuracy and on the macro measures (precision, recall, F1) for classification that motivate efficient model selection schemes and can benefit from the existence of unlabeled data. We demonstrate the advantages of these schemes by comparing them with k-fold cross-validation and hold-out estimation in the setting of large-scale classification.
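As a point of reference for the baselines mentioned above (not the paper's bound-based schemes), the following minimal sketch illustrates how k-fold cross-validation and hold-out estimation are typically used to select the regularization parameter C of a linear classifier, scored with macro-averaged F1. The synthetic data, the candidate grid, and the use of scikit-learn's LinearSVC are illustrative assumptions, not the experimental setup of the paper.

    # Illustrative sketch only: k-fold cross-validation vs. hold-out estimation
    # for choosing the regularization strength C, scored with macro-averaged F1.
    from sklearn.datasets import make_classification
    from sklearn.svm import LinearSVC
    from sklearn.model_selection import cross_val_score, train_test_split
    from sklearn.metrics import f1_score

    # Synthetic multi-class data standing in for a labeled training set.
    X, y = make_classification(n_samples=2000, n_features=50, n_informative=20,
                               n_classes=5, random_state=0)

    candidate_C = [0.01, 0.1, 1.0, 10.0]

    # (a) k-fold cross-validation: k model fits per candidate value.
    cv_scores = {C: cross_val_score(LinearSVC(C=C, max_iter=5000), X, y,
                                    cv=5, scoring="f1_macro").mean()
                 for C in candidate_C}

    # (b) Hold-out estimation: a single fit per candidate value on one split.
    X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3,
                                                random_state=0)
    ho_scores = {C: f1_score(y_val,
                             LinearSVC(C=C, max_iter=5000)
                                 .fit(X_tr, y_tr).predict(X_val),
                             average="macro")
                 for C in candidate_C}

    print("k-fold choice  :", max(cv_scores, key=cv_scores.get))
    print("hold-out choice:", max(ho_scores, key=ho_scores.get))

With k folds the first scheme trains k models per candidate value while the hold-out scheme trains one; this computational gap is what motivates cheaper model selection procedures in large-scale settings.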

Acknowledgements

This work is partially supported by the CIFRE N° 28/2015 and by the LabEx PERSYVAL-Lab (ANR-11-LABX-0025).

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Georgios Balikas (1)
  • Ioannis Partalas (2)
  • Eric Gaussier (1)
  • Rohit Babbar (3)
  • Massih-Reza Amini (1)
  1. University of Grenoble Alpes, Saint-Martin-d'Hères, France
  2. Viseo R&D, Grenoble, France
  3. Max-Planck Institute for Intelligent Systems, Tübingen, Germany