Datum-Wise Classification: A Sequential Approach to Sparsity

  • Gabriel Dulac-Arnold
  • Ludovic Denoyer
  • Philippe Preux
  • Patrick Gallinari
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6911)

Abstract

We propose a novel classification technique whose aim is to select an appropriate representation for each datapoint, in contrast to the usual approach of selecting a single representation for the whole dataset. This datum-wise representation is found by using a sparsity-inducing empirical risk, which is a relaxation of the standard L0-regularized risk. The classification problem is modeled as a sequential decision process that chooses, for each datapoint, which features to use before classifying. Datum-Wise Classification extends naturally to multi-class tasks, and we describe a specific case where our inference has complexity equivalent to that of a traditional linear classifier, while still using a variable number of features. We compare our classifier to classical L1-regularized linear models (L1-SVM and LARS) on a set of common binary and multi-class datasets and show that, for an equal average number of features used, our method achieves improved performance.
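
The sequential decision process described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `datum_wise_predict`, the linear scoring of actions, the state layout (observed values, acquisition mask, bias) and the hand-set weight matrix `W` are all assumptions chosen for illustration. In the paper the policy is learned; here it is fixed by hand so that the behaviour can be followed step by step.

```python
import numpy as np

def datum_wise_predict(x, weights, n_classes):
    """Greedy rollout of a hypothetical linear policy on one datapoint.

    State   : [observed feature values, binary mask z of acquired features, bias].
    Actions : indices 0..d-1 acquire one feature; indices d..d+C-1 emit a label.
    Returns : (predicted label, mask z of the features actually used).
    """
    d = x.shape[0]
    assert weights.shape == (d + n_classes, 2 * d + 1)
    z = np.zeros(d)          # which features have been acquired so far
    observed = np.zeros(d)   # values of acquired features, 0 elsewhere
    for _ in range(d + 1):   # at most d acquisitions, then one classification
        state = np.concatenate([observed, z, [1.0]])
        scores = weights @ state
        scores[np.flatnonzero(z)] = -np.inf  # a feature cannot be acquired twice
        action = int(np.argmax(scores))
        if action >= d:      # classification action: stop here
            return action - d, z
        z[action] = 1.0      # acquisition action: reveal one more feature
        observed[action] = x[action]
    raise RuntimeError("unreachable: a label is always emitted within d + 1 steps")

# Hand-set illustrative weights for d = 2 features and C = 2 classes (assumption).
# Rows: acquire f0, acquire f1, predict class 0, predict class 1.
W = np.array([
    [ 0.0,  0.0, 0.0, 0.0, 1.0],   # always look at feature 0 first
    [ 0.0,  0.0, 0.0, 0.0, 0.5],   # look at feature 1 only if undecided
    [-2.0, -2.0, 0.0, 0.0, 0.0],   # evidence for class 0
    [ 2.0,  2.0, 0.0, 0.0, 0.0],   # evidence for class 1
])

easy_label, easy_mask = datum_wise_predict(np.array([1.0, -3.0]), W, n_classes=2)
hard_label, hard_mask = datum_wise_predict(np.array([0.1, -3.0]), W, n_classes=2)
print(easy_label, int(easy_mask.sum()))  # label and number of features used
print(hard_label, int(hard_mask.sum()))
```

With these weights the first datapoint is classified after reading a single feature, while the second, whose first feature is close to the decision boundary, triggers the acquisition of the second feature before a label is emitted. The number of features used therefore varies from one datapoint to the next, which is the behaviour the abstract refers to as datum-wise sparsity.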

Keywords

  • Feature Selection
  • Sequential Approach
  • Reward Function
  • Empirical Risk
  • Policy Iteration
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Gabriel Dulac-Arnold
    • 1
  • Ludovic Denoyer
    • 1
  • Philippe Preux
    • 2
  • Patrick Gallinari
    • 1
  1. Université Pierre et Marie Curie - UPMC, LIP6 Case 169, Paris, France
  2. LIFL (UMR CNRS) & INRIA Lille Nord-Europe, Université de Lille, France
