Hierarchical Support Vector Random Fields: Joint Training to Combine Local and Global Features

  • Paul Schnitzspan
  • Mario Fritz
  • Bernt Schiele
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5303)


Recently, impressive results have been reported for the detection of objects in challenging real-world scenes. Interestingly, however, the underlying models vary greatly even between the most successful approaches. Methods using a global feature descriptor paired with discriminative classifiers such as SVMs achieve high levels of performance, but require large amounts of training data and typically degrade in the presence of partial occlusions. Local feature-based approaches are more robust to partial occlusions but often produce a significant number of false positives. This paper proposes a novel approach, called hierarchical support vector random field, that makes it possible 1) to combine the power of global feature-based approaches with the flexibility of local feature-based methods in one consistent multi-layer framework and 2) to automatically learn the tradeoff and the optimal interplay between local, semi-local and global feature contributions. Experiments show that both the combination of local and global features and the joint training result in improved detection performance on challenging datasets.


Keywords: Part Assignment, Evidence Aggregation, Loopy Belief Propagation, Joint Training, Newton Optimization



Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Paul Schnitzspan (1)
  • Mario Fritz (1)
  • Bernt Schiele (1)

  1. Computer Science Department, TU Darmstadt, Germany
