International Journal of Computer Vision, Volume 91, Issue 1, pp 24–44

Cost-Sensitive Active Visual Category Learning

  • Sudheendra Vijayanarasimhan
  • Kristen Grauman


We present an active learning framework that predicts the tradeoff between the effort and information gain associated with a candidate image annotation, thereby ranking unlabeled and partially labeled images according to their expected “net worth” to an object recognition system. We develop a multi-label multiple-instance approach that accommodates realistic images containing multiple objects and allows the category-learner to strategically choose what annotations it receives from a mixture of strong and weak labels. Since the annotation cost can vary depending on an image’s complexity, we show how to improve the active selection by directly predicting the time required to segment an unlabeled image. Our approach accounts for the fact that the optimal use of manual effort may call for a combination of labels at multiple levels of granularity, as well as accurate prediction of manual effort. As a result, it is possible to learn more accurate category models with a lower total expenditure of annotation effort. Given a small initial pool of labeled data, the proposed method actively improves the category models with minimal manual intervention.
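The ranking idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how "net worth" is computed, so the sketch assumes it is the estimated information gain minus the predicted annotation cost, both in the same units. The `Candidate` class, its fields, and the numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A possible annotation request (hypothetical structure for illustration)."""
    image_id: str
    annotation_type: str   # e.g. a weak image-level tag or a strong full segmentation
    expected_gain: float   # assumed estimate of information gain for the classifier
    predicted_cost: float  # assumed predicted manual effort, in the same units as gain


def net_worth(candidate: Candidate) -> float:
    # Assumption: net worth is simply gain minus cost.
    return candidate.expected_gain - candidate.predicted_cost


def rank_candidates(candidates: list[Candidate]) -> list[Candidate]:
    # Highest net worth first, so the most worthwhile annotation is requested first.
    return sorted(candidates, key=net_worth, reverse=True)


# Made-up example values: a costly segmentation can rank below a cheap tag.
pool = [
    Candidate("img_a", "full segmentation", expected_gain=5.0, predicted_cost=4.0),
    Candidate("img_b", "image-level tag", expected_gain=2.0, predicted_cost=0.5),
    Candidate("img_c", "image-level tag", expected_gain=3.0, predicted_cost=1.0),
]
ranked = rank_candidates(pool)
```

Under these assumed values, the two cheap image-level tags outrank the full segmentation even though the segmentation has the largest raw gain, which is the kind of tradeoff between label types the abstract describes.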


Visual category learning · Active learning · Multi-label · Multiple-instance learning · Cost prediction · Cost sensitive learning · Object recognition





Copyright information

© Springer Science+Business Media, LLC 2010

Authors and Affiliations

  1. Department of Computer Science, University of Texas at Austin, Austin, USA
