Recurrent Neural Networks for Adaptive Feature Acquisition

  • Gabriella Contardo
  • Ludovic Denoyer
  • Thierry Artières
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9949)


We tackle the cost-sensitive learning problem, in which each feature is associated with a particular acquisition cost. We propose a new model with the following key properties: (i) it acquires features adaptively, (ii) it can acquire features in blocks (several at a time), which lets it handle high-dimensional data, and (iii) it relies on representation-learning ideas. We demonstrate the effectiveness of this approach in several experiments on a variety of datasets and under different cost settings.
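
The abstract does not spell out the architecture, so the following is only an illustrative sketch of what adaptive, block-wise acquisition with a recurrent state could look like. It is not the authors' model. Every name here (`init_params`, `acquire`, the weight matrices, the "score above zero" acquisition rule) is an assumption made for illustration, and the weights are random rather than learned.

```python
import math
import random


def init_params(n_features, n_hidden, seed=0):
    """Random, untrained weights (hypothetical); a real model would learn these."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

    return {
        "W_z": mat(n_hidden, n_hidden),    # state -> state
        "W_x": mat(n_hidden, n_features),  # acquired features -> state
        "W_a": mat(n_features, n_hidden),  # state -> per-feature acquisition score
        "b_a": [rng.uniform(-1.0, 1.0) for _ in range(n_features)],
    }


def matvec(W, v):
    """Plain matrix-vector product on nested lists."""
    return [sum(w * vi for w, vi in zip(row, v)) for row in W]


def acquire(x, costs, params, n_steps=3):
    """Run n_steps of block-wise acquisition on a single input x.

    Returns (final state, sorted acquired indices, total cost paid).
    """
    z = [0.0] * len(params["W_z"])  # recurrent state summarising what was seen
    acquired = set()
    total_cost = 0.0
    for _ in range(n_steps):
        # Score every feature from the current state.
        scores = [s + b for s, b in zip(matvec(params["W_a"], z), params["b_a"])]
        # Assumed rule: a block is every unseen feature with a positive score.
        block = {i for i, s in enumerate(scores) if s > 0.0 and i not in acquired}
        # Pay only for features acquired for the first time.
        total_cost += sum(costs[i] for i in block)
        acquired |= block
        # Only the newly acquired values reach the state update.
        masked = [x[i] if i in block else 0.0 for i in range(len(x))]
        pre = [a + b for a, b in zip(matvec(params["W_z"], z),
                                     matvec(params["W_x"], masked))]
        z = [math.tanh(p) for p in pre]
    return z, sorted(acquired), total_cost


params = init_params(n_features=6, n_hidden=4)
costs = [1.0, 1.0, 2.0, 2.0, 5.0, 5.0]
state, indices, cost = acquire([0.2, -1.3, 0.7, 0.0, 2.1, -0.4], costs, params)
```

In this sketch the first block depends only on the bias, so it is the same for every input. Later blocks depend on the values already acquired, which is where the adaptivity comes from. A classifier reading the final state, and a training loss trading prediction error against `cost`, are left out.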



This work was supported by the Labex SMART, which is supported by French state funds managed by the ANR within the Investissements d’Avenir programme under reference ANR-11-LABX-65. Part of this work benefited from a grant from the DGA-RAPID program, project LuxidX.



Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Gabriella Contardo (1)
  • Ludovic Denoyer (1)
  • Thierry Artières (2)
  1. Sorbonne Universités, UPMC Univ Paris 06, UMR 7606, LIP6, Paris, France
  2. Aix Marseille Univ, CNRS, Centrale Marseille, LIF, Marseille, France
