Learning Graphs to Model Visual Objects across Different Depictive Styles

  • Qi Wu
  • Hongping Cai
  • Peter Hall
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8695)


Visual object classification and detection are major problems in contemporary computer vision. State-of-the-art algorithms allow thousands of visual objects to be learned and recognized under a wide range of variations, including lighting changes, occlusion, point of view, and different object instances. Only a small fraction of the literature addresses variation in depictive style (photographs, drawings, paintings, etc.). This gap is challenging to close, but the ability to process images of all depictive styles, not just photographs, has potential value across many applications. In this paper we model visual classes using a graph with multiple labels on each node; weights on arcs and nodes indicate their relative importance (salience) to the object description. Visual class models can be learned from examples in a database that contains photographs, drawings, paintings, etc. Experiments show that our representation improves upon Deformable Part Models for detection and Bag of Words models for classification.
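
As a rough illustration of the representation described in the abstract, the following Python sketch builds a graph whose nodes each carry several alternative labels and whose nodes and arcs carry salience weights. It is a minimal sketch under stated assumptions, not the authors' implementation: the names `Node`, `ClassGraph`, `node_score`, `graph_score` and `arc_compat`, the dot-product similarity, and the additive scoring rule are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    # Several alternative appearance labels, e.g. one per depictive style.
    labels: List[List[float]]
    weight: float = 1.0  # assumed salience of this node in the class model


@dataclass
class ClassGraph:
    nodes: List[Node] = field(default_factory=list)
    # (i, j) -> assumed salience of the arc between nodes i and j
    arcs: Dict[Tuple[int, int], float] = field(default_factory=dict)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def node_score(node, feature):
    # A node matches if any one of its labels matches: keep the best label.
    return node.weight * max(dot(label, feature) for label in node.labels)


def graph_score(graph, features, arc_compat):
    """Score an assignment in which features[i] is matched to graph.nodes[i].

    arc_compat maps (i, j) to a hypothetical geometric compatibility in [0, 1].
    """
    total = sum(node_score(n, f) for n, f in zip(graph.nodes, features))
    total += sum(w * arc_compat.get(ij, 0.0) for ij, w in graph.arcs.items())
    return total


# Toy model: node 0 has a "photo" label and a "drawing" label.
model = ClassGraph(
    nodes=[
        Node(labels=[[1.0, 0.0], [0.0, 1.0]], weight=2.0),
        Node(labels=[[1.0, 1.0]], weight=0.5),
    ],
    arcs={(0, 1): 1.5},
)
score = graph_score(model, [[0.0, 1.0], [1.0, 0.0]], {(0, 1): 1.0})
```

Taking the maximum over a node's labels is one simple way to let the same part match either a photographic or a drawn appearance; the weights then scale how much each node and arc contributes to the overall match.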


Object Recognition, Deformable Models, Multi-labeled Graph, Graph Matching



Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Qi Wu (1)
  • Hongping Cai (1)
  • Peter Hall (1)
  1. Media Technology Research Centre, University of Bath, United Kingdom
