Model Selection by Linear Programming

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8690)


Budget constraints arise in many computer vision problems. Computational costs limit many automated recognition systems, while crowdsourced systems are hindered by monetary costs. We leverage the wide variability in image complexity to learn adaptive model selection policies. Our learnt policy maximizes performance under average budget constraints by selecting “cheap” models for low-complexity instances and using descriptive models only for complex ones. During training, we assume access to a set of models that use features of different costs and types. We consider a binary tree architecture in which each leaf corresponds to a different model. Internal decision nodes adaptively guide the model-selection process along paths of the tree. The learning problem can be posed as empirical risk minimization over training data with a non-convex objective function. Using hinge loss surrogates, we show that adaptive model selection reduces to a linear program, thus realizing substantial computational efficiencies and guaranteed convergence properties.
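To make the budgeted selection idea concrete, the following is a minimal sketch of the simplest case: a tree with one decision node and two leaves (a cheap model and a costly, more descriptive one). It is not the linear program described in the abstract; it swaps in a brute-force search over a single threshold, purely for illustration. The `Instance` fields, the scalar complexity score, the cost values, and the function names are all hypothetical assumptions, not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Instance:
    complexity: float     # hypothetical scalar score seen by the decision node
    cheap_correct: bool   # does the cheap model classify this instance correctly?
    costly_correct: bool  # does the costly (descriptive) model classify it correctly?


def evaluate(data: List[Instance], threshold: float,
             cheap_cost: float, costly_cost: float) -> Tuple[float, float]:
    """Return (error rate, average cost) when instances whose complexity
    exceeds the threshold are routed to the costly model."""
    errors = 0
    total_cost = 0.0
    for inst in data:
        use_costly = inst.complexity > threshold
        correct = inst.costly_correct if use_costly else inst.cheap_correct
        errors += 0 if correct else 1
        total_cost += costly_cost if use_costly else cheap_cost
    n = len(data)
    return errors / n, total_cost / n


def best_threshold(data: List[Instance], budget: float,
                   cheap_cost: float = 1.0,
                   costly_cost: float = 10.0) -> Optional[Tuple[float, float, float]]:
    """Brute-force search (an assumption, not the paper's LP) for the threshold
    with the lowest empirical error whose average cost stays within the budget.
    Returns (threshold, error rate, average cost), or None if nothing is feasible."""
    candidates = [float("-inf")] + sorted({inst.complexity for inst in data})
    best = None
    for t in candidates:
        err, cost = evaluate(data, t, cheap_cost, costly_cost)
        if cost <= budget and (best is None or err < best[1]):
            best = (t, err, cost)
    return best


# Toy training set: the cheap model fails on the more complex instances.
data = [
    Instance(0.1, True, True),
    Instance(0.2, True, True),
    Instance(0.4, True, True),
    Instance(0.6, False, True),
    Instance(0.8, False, True),
    Instance(0.9, False, False),
]

result = best_threshold(data, budget=5.0)
```

With an average budget of 5.0 on this toy data, the search routes only the two most complex instances to the costly model. A learned policy would replace the single hand-picked score with decision functions trained on features, which is where the hinge loss surrogates and the linear program come in.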


Keywords

test-time budget, adaptive model selection, cost-sensitive learning

Supplementary material

978-3-319-10605-2_42_MOESM1_ESM.pdf (299 kb): Electronic Supplementary Material (PDF)



Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. Boston University, USA
