Online Adaptation for Joint Scene and Object Classification

  • Jawadul H. Bappy
  • Sujoy Paul
  • Amit K. Roy-Chowdhury
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9912)


Recent efforts in computer vision consider joint scene and object classification, exploiting the mutual relationships between scenes and objects (often termed context) to achieve higher accuracy. There is also considerable interest in online adaptation of recognition models as new data becomes available. In this paper, we address the problem of how models for joint scene and object classification can be learned online. A major motivation for this approach is to exploit the hierarchical relationships between scenes and objects, represented as a graphical model, in an active learning framework. To select the samples on the graph that need to be labeled by a human, we use an information-theoretic approach that reduces the joint entropy of the scene and object variables. This leads to a significant reduction in manual labeling effort while achieving similar or better performance than a model trained on the full dataset. We demonstrate this through rigorous experimentation on three datasets.
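As a rough illustration of entropy-driven sample selection, the sketch below ranks unlabeled images by the joint entropy of their scene and object beliefs and picks the most uncertain ones for human labeling. It is only a minimal sketch under stated assumptions: the abstract does not specify the selection rule, so the highest-entropy heuristic used here is a common uncertainty-sampling stand-in, not necessarily the authors' criterion. The function names, the `budget` parameter, and the belief tables are hypothetical, and each image is assumed to come with an explicit joint probability table rather than beliefs inferred on a graphical model.

```python
import math


def joint_entropy(joint):
    """Shannon entropy (in nats) of a joint distribution p(scene, object).

    `joint` is a list of rows, one row per scene label and one column per
    object label; the entries are assumed to be probabilities summing to 1.
    """
    return -sum(p * math.log(p) for row in joint for p in row if p > 0.0)


def select_queries(joint_tables, budget):
    """Return indices of the `budget` samples with the highest joint entropy.

    Assumption: querying the most uncertain samples is used here as a simple
    proxy for reducing the joint entropy of scene and object variables.
    """
    ranked = sorted(
        range(len(joint_tables)),
        key=lambda i: joint_entropy(joint_tables[i]),
        reverse=True,
    )
    return ranked[:budget]


# Hypothetical beliefs for three unlabeled images (2 scenes x 2 objects).
beliefs = [
    [[0.97, 0.01], [0.01, 0.01]],  # confident prediction
    [[0.25, 0.25], [0.25, 0.25]],  # maximally uncertain
    [[0.60, 0.20], [0.10, 0.10]],  # in between
]

queries = select_queries(beliefs, budget=1)
print(queries)
```

With these made-up beliefs, the uniform table has the largest joint entropy, so the second image would be sent to the annotator first.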


Keywords: Scene classification; Object detection; Active learning



The work was partially supported by NSF grant IIS-1316934 and US Office of Naval Research contract N00014-15-C-5113 through Mayachitra, Inc.

Supplementary material

Supplementary material 1: 419983_1_En_14_MOESM1_ESM.pdf (PDF, 1764 KB)



Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Jawadul H. Bappy (1)
  • Sujoy Paul (1)
  • Amit K. Roy-Chowdhury (1)

  1. Department of ECE, University of California, Riverside, USA
