Minimizing data consumption with sequential online feature selection

  • Thomas Rückstieß
  • Christian Osendorfer
  • Patrick van der Smagt
Original Article


In most real-world information processing problems, data is not a free resource: its acquisition is often expensive and time-consuming. We investigate how such cost factors can be included in supervised classification tasks by formulating classification as a sequential decision process and making it accessible to reinforcement learning. Depending on the previously selected features and the internal belief of the classifier, the next feature is chosen by a sequential online feature selection procedure that learns which features are most informative at each time step. Experiments on toy datasets and a handwritten digit classification task show a significant reduction in the data required for correct classification, while a medical diabetes prediction task illustrates variable feature cost minimization as a further property of our algorithm.
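The sequential decision loop described in the abstract can be illustrated with a small sketch. This is not the authors' method: the abstract states that the selection policy is learned with reinforcement learning, whereas the sketch below substitutes a hand-written greedy rule (expected entropy reduction per unit cost) and a naive Bayes classifier as the source of the "internal belief". The toy likelihoods, the costs, the confidence threshold, and all function names are hypothetical and chosen only for illustration.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical toy model (not from the paper): two classes, three binary features.
# LIKELIHOOD[c][i] = P(feature i == 1 | class c); feature 1 carries no information.
LIKELIHOOD = [[0.9, 0.5, 0.2], [0.1, 0.5, 0.8]]
PRIOR = [0.5, 0.5]
COST = [1.0, 1.0, 1.0]  # assumed acquisition cost per feature

Observed = Dict[int, int]  # feature index -> acquired value


def belief(observed: Observed) -> List[float]:
    """Class posterior given only the features acquired so far (naive Bayes)."""
    scores = []
    for c, prior in enumerate(PRIOR):
        p = prior
        for i, v in observed.items():
            p *= LIKELIHOOD[c][i] if v == 1 else 1.0 - LIKELIHOOD[c][i]
        scores.append(p)
    total = sum(scores)
    return [s / total for s in scores]


def entropy(p: Sequence[float]) -> float:
    return -sum(x * math.log(x) for x in p if x > 0)


def greedy_policy(observed: Observed) -> int:
    """Stand-in for a learned policy: pick the unobserved feature with the
    largest expected entropy reduction per unit cost."""
    current = belief(observed)

    def gain_per_cost(i: int) -> float:
        expected = 0.0
        for v in (0, 1):
            # Probability of seeing value v for feature i under the current belief.
            p_v = sum(
                b * (LIKELIHOOD[c][i] if v == 1 else 1.0 - LIKELIHOOD[c][i])
                for c, b in enumerate(current)
            )
            if p_v > 0:
                expected += p_v * entropy(belief({**observed, i: v}))
        return (entropy(current) - expected) / COST[i]

    candidates = [i for i in range(len(COST)) if i not in observed]
    return max(candidates, key=gain_per_cost)


def classify_sequentially(
    sample: Sequence[int],
    policy: Callable[[Observed], int],
    threshold: float = 0.9,
) -> Tuple[int, List[int], float]:
    """Acquire features one at a time until the belief is confident enough.

    Returns (predicted class, features acquired in order, total cost)."""
    observed: Observed = {}
    total_cost = 0.0
    while True:
        b = belief(observed)
        if max(b) >= threshold or len(observed) == len(sample):
            return b.index(max(b)), list(observed), total_cost
        i = policy(observed)      # decision depends on belief and past choices
        observed[i] = sample[i]   # "pay" for the feature and reveal its value
        total_cost += COST[i]


if __name__ == "__main__":
    print(classify_sequentially([1, 0, 0], greedy_policy, threshold=0.85))
    print(classify_sequentially([1, 0, 0], greedy_policy, threshold=0.95))
```

Raising the threshold makes the loop buy more features before committing to a class, which is the trade-off between data consumption and confidence that the abstract refers to. Unequal entries in `COST` would bias the policy towards cheaper features.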


Keywords: Reinforcement learning, Feature selection, Classification



Copyright information

© Springer-Verlag 2012

Authors and Affiliations

  • Thomas Rückstieß (1)
  • Christian Osendorfer (1)
  • Patrick van der Smagt (2)

  1. Institute of Computer Science VI, Technische Universität München, Garching, Germany
  2. DLR-Institute of Robotics and Mechatronics, Wessling, Germany
