Random clustering ferns for multimodal object recognition

  • IBPRIA 2015
  • Published in: Neural Computing and Applications

Abstract

We propose an efficient and robust method for the recognition of objects exhibiting multiple intra-class modes, where each mode is associated with a particular object appearance. The proposed method, called random clustering ferns, synergistically combines a single real-time classifier, based on a boosted ensemble of extremely randomized trees (ferns), with an unsupervised probabilistic approach, so that it efficiently recognizes object instances in images while simultaneously discovering the most prominent appearance modes of the object through tree-structured visual words. In particular, we use boosted random ferns together with probabilistic latent semantic analysis (pLSA) to obtain a discriminative and multimodal classifier that automatically clusters the responses of its randomized trees as a function of the visual object appearance. The proposed method is validated extensively in synthetic and real experiments, showing that it can detect objects with diverse and complex appearance distributions in real time.
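To make the pipeline concrete, the following minimal Python sketch (an illustrative assumption, not the authors' implementation: the class names, patch size, and the use of plain pixel-pair comparisons are ours) shows how each fern maps an image patch to one of its \(2^S\) leaves and how the resulting leaf indices act as tree-structured visual words whose occurrence statistics can then be clustered into appearance modes with pLSA; the boosting stage and the pLSA EM iterations themselves are omitted.

    import numpy as np

    rng = np.random.default_rng(0)

    class RandomFern:
        """One fern: S random binary comparisons between pixel pairs of a patch.
        The S-bit outcome indexes one of 2**S leaves, i.e. one tree-structured
        visual word."""
        def __init__(self, patch_size, num_tests):
            # Each test compares two randomly chosen pixel positions.
            self.pairs = rng.integers(0, patch_size * patch_size, size=(num_tests, 2))

        def leaf(self, patch):
            flat = patch.ravel()
            bits = flat[self.pairs[:, 0]] > flat[self.pairs[:, 1]]
            return int(bits @ (1 << np.arange(bits.size)))  # binary code -> leaf index

    def word_histogram(ferns, patch, num_tests):
        """Occurrence vector over all fern leaves (visual words) for one patch."""
        num_leaves = 2 ** num_tests
        hist = np.zeros(len(ferns) * num_leaves)
        for f_idx, fern in enumerate(ferns):
            hist[f_idx * num_leaves + fern.leaf(patch)] = 1.0
        return hist

    # Toy usage: 10 ferns with 4 tests each, evaluated on random 16x16 patches.
    ferns = [RandomFern(patch_size=16, num_tests=4) for _ in range(10)]
    patches = [rng.random((16, 16)) for _ in range(50)]
    counts = np.stack([word_histogram(ferns, p, num_tests=4) for p in patches])
    # In the paper, word-occurrence statistics like `counts` are clustered with
    # pLSA (EM over P(word | topic) and P(topic | sample)) so that each latent
    # topic captures one appearance mode; the boosting of the ferns and the EM
    # iterations are omitted from this sketch.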


Notes

  1. We use the terms cluster and mode interchangeably to refer to a dense part of the object appearance distribution.

  2. The indicator function \({\mathbb {I}}(e)=1\) if e is true, and 0 otherwise.

  3. The EER is the point on the precision-recall curve where precision = recall (see the sketch after these notes).

  4. The score distribution (Gaussian function) is calculated using the confidences of the BRFs for all class samples.

  5. The squared Hellinger distance between two Gaussian distributions P and Q is defined as \(H^2(P,Q) = 1 -\sqrt{k_1/k_2}\exp (-0.25k_3/k_2)\), with \(k_1 = 2 \sigma _P \sigma _Q\), \(k_2=\sigma _P^2 + \sigma _Q^2\), and \(k_3 =(\mu _P - \mu _Q)^2\) (see the sketch after these notes).

  6. However, it is possible to use human assistance during learning to improve the visual skills of the classifier [40].

  7. The code is available at http://www.iri.upc.edu/people/mvillami/code.html.

  8. For this problem, only 300 visual words are activated out of 38,400 words, each one corresponding to a fern output.

  9. Since the pLSA clustering is automatic, the confusion matrix is not necessarily diagonal. However, here the labels provided by pLSA have been sorted for display purposes.

  10. The code for RCFs is available at http://www.iri.upc.edu/people/mvillami/code.html.
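For reference, here is a minimal Python sketch of the two quantities defined in notes 3 and 5; the function names and the array-based search for the EER point are illustrative choices, not part of the released code.

    import math

    def hellinger_sq(mu_p, sigma_p, mu_q, sigma_q):
        """Squared Hellinger distance between the Gaussians P = N(mu_p, sigma_p^2)
        and Q = N(mu_q, sigma_q^2), using the closed form of note 5."""
        k1 = 2.0 * sigma_p * sigma_q
        k2 = sigma_p ** 2 + sigma_q ** 2
        k3 = (mu_p - mu_q) ** 2
        return 1.0 - math.sqrt(k1 / k2) * math.exp(-0.25 * k3 / k2)

    def equal_error_rate(precision, recall):
        """EER of note 3: the value at the point of the precision-recall curve
        where precision and recall are (numerically) closest to each other."""
        i = min(range(len(precision)), key=lambda j: abs(precision[j] - recall[j]))
        return 0.5 * (precision[i] + recall[i])

    # Identical Gaussians give distance 0; well-separated means approach 1.
    assert abs(hellinger_sq(0.0, 1.0, 0.0, 1.0)) < 1e-12
    assert hellinger_sq(0.0, 1.0, 10.0, 1.0) > 0.99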

References

  1. Ali K, Saenko K (2014) Confidence-rated multiple instance boosting for object detection. In: CVPR

  2. Blockeel H, De Raedt L, Ramon J (1998) Top-down induction of clustering trees. In: ICML, pp 55–63

  3. Bosch A, Zisserman A, Muñoz X (2006) Scene classification via pLSA. In: ECCV

  4. Bosch A, Zisserman A, Muñoz X (2007) Image classification using random forests and ferns. In: ICCV, pp 1–8

  5. Breiman L (2001) Random forests. Mach Learn 45(1):5–32

  6. Criminisi A, Shotton J, Konukoglu E (2012) Decision forests: a unified framework for classification, regression, density estimation, manifold learning and semi-supervised learning. Found Trends Comput Graph Vis 7(2–3):81–227

  7. Csurka G, Dance C, Fan L, Willamowski J, Bray C (2004) Visual categorization with bags of keypoints. In: Proceedings ECCV workshop statistical learning in computer vision, pp 59–74

  8. Dalal N, Triggs B (2005) Histograms of oriented gradients for human detection. In: CVPR, pp 886–893

  9. Erhan D, Szegedy C, Toshev A, Anguelov D (2014) Scalable object detection using deep neural networks. In: CVPR

  10. Felzenszwalb PF, Girshick RB, McAllester D, Ramanan D (2010) Object detection with discriminatively trained part-based models. PAMI 32(9):1627–1645

  11. Fergus R, Perona P, Zisserman A (2003) Object class recognition by unsupervised scale-invariant learning. In: CVPR

  12. Gall J, Yao A, Razavi N, Van Gool L, Lempitsky V (2011) Hough forests for object detection, tracking, and action recognition. PAMI 33(11):2188–2202

  13. Garrell A, Villamizar M, Moreno-Noguer F, Sanfeliu A (2013) Proactive behavior of an autonomous mobile robot for human-assisted learning. In: RO-MAN

  14. Girshick R, Donahue J, Darrell T, Malik J (2014) Rich feature hierarchies for accurate object detection and semantic segmentation. In: CVPR, pp 580–587

  15. Hall D, Perona P (2014) From categories to individuals in real time: a unified boosting approach. In: CVPR

  16. Hofmann T (2001) Unsupervised learning by probabilistic latent semantic analysis. Mach Learn 42(1–2):177–196

  17. Jurie F, Triggs B (2005) Creating efficient codebooks for visual recognition. In: ICCV, pp 604–610

  18. Kalal Z, Mikolajczyk K, Matas J (2012) Tracking–learning–detection. PAMI 34(7):1409–1422

  19. Kim TK, Cipolla R (2009) McBoost: multiple classifier boosting for perceptual co-clustering of images and visual features. In: NIPS, pp 841–848

  20. Klein DA, Schulz D, Frintrop S, Cremers AB (2010) Adaptive real-time video-tracking for arbitrary objects. In: IROS

  21. Krupka E, Vinnikov A, Klein B, Hillel AB, Freedman D, Stachniak S (2014) Discriminative ferns ensemble for hand pose recognition. In: CVPR, pp 3670–3677

  22. LeCun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proc IEEE 86(11):2278–2324

  23. Liu B, Xia Y, Yu PS (2000) Clustering through decision tree construction. In: Proceedings of the ninth ACM international conference on information and knowledge management, pp 20–29

  24. Lowe DG (2004) Distinctive image features from scale-invariant keypoints. IJCV 60(2):91–110

  25. Malisiewicz T, Gupta A, Efros AA (2011) Ensemble of exemplar-SVMs for object detection and beyond. In: ICCV, pp 89–96

  26. Marée R, Geurts P, Piater J, Wehenkel L (2005) Random subwindows for robust image classification. In: CVPR, pp 34–40

  27. Moosmann F, Nowak E, Jurie F (2008) Randomized clustering forests for image classification. PAMI 30(9):1632–1646

  28. Murphy KP (2012) Machine learning: a probabilistic perspective. MIT Press, Cambridge

  29. Nister D, Stewenius H (2006) Scalable recognition with a vocabulary tree. In: CVPR, pp 2161–2168

  30. Ozuysal M, Calonder M, Lepetit V, Fua P (2010) Fast keypoint recognition using random ferns. PAMI 32(3):448–461

  31. Ozuysal M, Lepetit V, Fua P (2009) Pose estimation for category specific multiview object localization. In: CVPR, pp 778–785

  32. Perbet F, Stenger B, Maki A (2009) Random forest clustering and application to video segmentation. In: BMVC, pp 1–10

  33. Schapire RE, Singer Y (1999) Improved boosting algorithms using confidence-rated predictions. Mach Learn 37(3):297–336

  34. Sharma P, Nevatia R (2014) Multi class boosted random ferns for adapting a generic object detector to a specific video. In: WACV, pp 745–752

  35. Shotton J, Johnson M, Cipolla R (2008) Semantic texton forests for image categorization and segmentation. In: CVPR, pp 1–8

  36. Sivic J, Russell B, Efros A, Zisserman A, Freeman WT (2005) Discovering objects and their location in images. In: ICCV

  37. Torralba A, Murphy KP, Freeman WT (2007) Sharing visual features for multiclass and multiview object detection. PAMI 29(5):854–869

  38. Van der Maaten L, Hinton G (2008) Visualizing data using t-SNE. J Mach Learn Res 9(85):2579–2605

  39. Villamizar M, Andrade-Cetto J, Sanfeliu A, Moreno-Noguer F (2012) Bootstrapping boosted random ferns for discriminative and efficient object classification. Pattern Recognit 45(9):3141–3153

  40. Villamizar M, Garrell A, Sanfeliu A, Moreno-Noguer F (2012) Online human-assisted learning using random ferns. In: ICPR, pp 2821–2824

  41. Villamizar M, Garrell A, Sanfeliu A, Moreno-Noguer F (2015) Modeling robot’s world with minimal effort. In: ICRA

  42. Villamizar M, Garrell A, Sanfeliu A, Moreno-Noguer F (2015) Multimodal object recognition using random clustering trees. In: IBPRIA

  43. Villamizar M, Grabner H, Andrade-Cetto J, Sanfeliu A, Van Gool L, Moreno-Noguer F (2011) Efficient 3d object detection using multiple pose-specific classifiers. In: BMVC

  44. Villamizar M, Moreno-Noguer F, Andrade-Cetto J, Sanfeliu A (2010) Efficient rotation invariant object detection using boosted random ferns. In: CVPR, pp 1038–1045

  45. Viola P, Jones M (2001) Rapid object detection using a boosted cascade of simple features. In: CVPR, pp 511–518

  46. Wu B, Nevatia R (2007) Cluster boosted tree classifier for multi-view, multi-pose object detection. In: ICCV, pp 1–8

  47. Yan J, Lei Z, Wen L, Li SZ (2014) The fastest deformable part model for object detection. In: CVPR, pp 2497–2504

Acknowledgments

This work has been partially funded by the Spanish Ministry of Economy and Competitiveness under the ERA-Net Chistera project ViSen (PCIN-2013-047) and the projects RobInstruct (TIN2014-58178-R) and ROBOT-INT-COOP (DPI2013-42458-P), and by the EU project AEROARMS (H2020-ICT-2014-1-644271).

Author information

Corresponding author

Correspondence to M. Villamizar.

About this article

Cite this article

Villamizar, M., Garrell, A., Sanfeliu, A. et al. Random clustering ferns for multimodal object recognition. Neural Comput & Applic 28, 2445–2460 (2017). https://doi.org/10.1007/s00521-016-2284-x
