Abstract

We propose a probabilistic model that captures contextual information in the form of typical spatial relationships between regions of an image. We represent a region’s local context as a combination of the identities of neighbouring regions and the geometry of the neighbourhood. We then cluster all the neighbourhood configurations that share the same label at the focal region to obtain, for each label, a set of configuration prototypes. We propose an iterative procedure based on belief propagation to infer the labels of the regions of a new image given only the observed spatial relationships between its regions and the previously learnt prototypes. We validate our approach on a dataset of hand-segmented and labelled images of buildings. Performance compares favourably with that of a boosted, non-contextual classifier.
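To make the inference step concrete, the sketch below runs plain sum-product loopy belief propagation over a graph of neighbouring regions. It is a minimal, self-contained illustration rather than the authors’ factor-graph model: the function name infer_region_labels, the single pairwise compatibility matrix standing in for prototype-based factors, and the uniform unary scores are all assumptions made for the example.

```python
import numpy as np

def infer_region_labels(unary, adjacency, compatibility, n_iters=20):
    """Sum-product loopy belief propagation over a graph of image regions.

    unary         -- (R, L) per-region label scores; uniform if only the
                     spatial relationships are observed.
    adjacency     -- iterable of (i, j) pairs of neighbouring regions.
    compatibility -- (L, L) matrix scoring label pairs on neighbouring
                     regions; a hypothetical stand-in for scores derived
                     from the learnt configuration prototypes.
    Returns the most likely label per region and the final beliefs.
    """
    adjacency = list(adjacency)
    R, L = unary.shape
    edges = [(i, j) for i, j in adjacency] + [(j, i) for i, j in adjacency]
    messages = {e: np.full(L, 1.0 / L) for e in edges}

    neighbours = {r: [] for r in range(R)}
    for i, j in adjacency:
        neighbours[i].append(j)
        neighbours[j].append(i)

    for _ in range(n_iters):
        new_messages = {}
        for i, j in edges:
            # Belief at region i, excluding the message coming back from j.
            b = unary[i].copy()
            for k in neighbours[i]:
                if k != j:
                    b *= messages[(k, i)]
            # Marginalise over i's labels to obtain the message to j.
            msg = compatibility.T @ b
            new_messages[(i, j)] = msg / msg.sum()
        messages = new_messages

    # Combine unary scores with all incoming messages and normalise.
    beliefs = unary.copy()
    for (i, j), msg in messages.items():
        beliefs[j] *= msg
    beliefs /= beliefs.sum(axis=1, keepdims=True)
    return beliefs.argmax(axis=1), beliefs


# Toy usage: 4 regions with 3 candidate labels, connected in a chain.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    unary = np.full((4, 3), 1.0 / 3)          # no appearance cues: uniform priors
    adjacency = [(0, 1), (1, 2), (2, 3)]
    compatibility = rng.random((3, 3)) + 0.1  # placeholder for prototype-based scores
    compatibility = (compatibility + compatibility.T) / 2
    labels, beliefs = infer_region_labels(unary, adjacency, compatibility)
    print(labels)
```

In the setting described in the abstract, the pairwise compatibility matrix would be replaced by factors that compare a region’s observed neighbourhood configuration against the learnt prototypes of each candidate label.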

Keywords

Computer Vision · Object Recognition · Spatial Relationship · Focal Region · Factor Graph


Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Daniel Heesch
  • Robby Tan
  • Maria Petrou
  1. Imperial College London, UK
