Random clustering ferns for multimodal object recognition

  • IBPRIA 2015
  • Published in: Neural Computing and Applications

Abstract

We propose an efficient and robust method for the recognition of objects exhibiting multiple intra-class modes, where each mode is associated with a particular object appearance. The proposed method, called random clustering ferns, synergistically combines a single real-time classifier, based on a boosted ensemble of extremely randomized trees (ferns), with an unsupervised probabilistic approach, so that it efficiently recognizes object instances in images while simultaneously discovering the most prominent appearance modes of the object through tree-structured visual words. In particular, we use boosted random ferns together with probabilistic latent semantic analysis (pLSA) to obtain a discriminative and multimodal classifier that automatically clusters the responses of its randomized trees as a function of the visual object appearance. The proposed method is validated extensively in synthetic and real experiments, showing that it can detect objects with diverse and complex appearance distributions in real time.
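To make the pipeline concrete, the following minimal Python sketch (an illustrative assumption, not the authors' implementation: the class names, patch size, and the use of plain pixel-pair comparisons are ours) shows how each fern maps an image patch to one of its \(2^S\) leaves and how the resulting leaf indices act as tree-structured visual words whose occurrence statistics can then be clustered into appearance modes with pLSA; the boosting stage and the pLSA EM iterations themselves are omitted.

    import numpy as np

    rng = np.random.default_rng(0)

    class RandomFern:
        """One fern: S random binary comparisons between pixel pairs of a patch.
        The S-bit outcome indexes one of 2**S leaves, i.e. one tree-structured
        visual word."""
        def __init__(self, patch_size, num_tests):
            # Each test compares two randomly chosen pixel positions.
            self.pairs = rng.integers(0, patch_size * patch_size, size=(num_tests, 2))

        def leaf(self, patch):
            flat = patch.ravel()
            bits = flat[self.pairs[:, 0]] > flat[self.pairs[:, 1]]
            return int(bits @ (1 << np.arange(bits.size)))  # binary code -> leaf index

    def word_histogram(ferns, patch, num_tests):
        """Occurrence vector over all fern leaves (visual words) for one patch."""
        num_leaves = 2 ** num_tests
        hist = np.zeros(len(ferns) * num_leaves)
        for f_idx, fern in enumerate(ferns):
            hist[f_idx * num_leaves + fern.leaf(patch)] = 1.0
        return hist

    # Toy usage: 10 ferns with 4 tests each, evaluated on random 16x16 patches.
    ferns = [RandomFern(patch_size=16, num_tests=4) for _ in range(10)]
    patches = [rng.random((16, 16)) for _ in range(50)]
    counts = np.stack([word_histogram(ferns, p, num_tests=4) for p in patches])
    # In the paper, word-occurrence statistics like `counts` are clustered with
    # pLSA (EM over P(word | topic) and P(topic | sample)) so that each latent
    # topic captures one appearance mode; the boosting of the ferns and the EM
    # iterations are omitted from this sketch.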


Notes

  1. We use the terms cluster and mode interchangeably to refer to a dense part of the object appearance distribution.

  2. The indicator function \({\mathbb {I}}(e)=1\) if e is true, and 0 otherwise.

  3. The EER is the point on the precision-recall curve where precision = recall (see the sketch after these notes).

  4. The score distribution (Gaussian function) is calculated using the confidences of the BRFs for all class samples.

  5. The squared Hellinger distance between two Gaussian distributions P and Q is defined as \(H^2(P,Q) = 1 -\sqrt{k_1/k_2}\exp (-0.25k_3/k_2)\), with \(k_1 = 2 \sigma _P \sigma _Q\), \(k_2=\sigma _P^2 + \sigma _Q^2\), and \(k_3 =(\mu _P - \mu _Q)^2\) (see the sketch after these notes).

  6. However, it is possible to use human assistance during learning to improve the visual skills of the classifier [40].

  7. The code is available at http://www.iri.upc.edu/people/mvillami/code.html.

  8. For this problem, only 300 visual words are activated out of 38,400 words, each one corresponding to a fern output.

  9. Since the pLSA clustering is automatic, the confusion matrix is not necessarily diagonal. However, here the labels provided by pLSA have been sorted for display purposes.

  10. The code for RCFs is available at http://www.iri.upc.edu/people/mvillami/code.html.
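For reference, here is a minimal Python sketch of the two quantities defined in notes 3 and 5; the function names and the array-based search for the EER point are illustrative choices, not part of the released code.

    import math

    def hellinger_sq(mu_p, sigma_p, mu_q, sigma_q):
        """Squared Hellinger distance between the Gaussians P = N(mu_p, sigma_p^2)
        and Q = N(mu_q, sigma_q^2), using the closed form of note 5."""
        k1 = 2.0 * sigma_p * sigma_q
        k2 = sigma_p ** 2 + sigma_q ** 2
        k3 = (mu_p - mu_q) ** 2
        return 1.0 - math.sqrt(k1 / k2) * math.exp(-0.25 * k3 / k2)

    def equal_error_rate(precision, recall):
        """EER of note 3: the value at the point of the precision-recall curve
        where precision and recall are (numerically) closest to each other."""
        i = min(range(len(precision)), key=lambda j: abs(precision[j] - recall[j]))
        return 0.5 * (precision[i] + recall[i])

    # Identical Gaussians give distance 0; well-separated means approach 1.
    assert abs(hellinger_sq(0.0, 1.0, 0.0, 1.0)) < 1e-12
    assert hellinger_sq(0.0, 1.0, 10.0, 1.0) > 0.99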

References

  1. Ali K, Saenko K (2014) Confidence-rated multiple instance boosting for object detection. In: CVPR

  2. Blockeel H, De Raedt L, Ramon J (1998) Top-down induction of clustering trees. In: ICML, pp 55–63

  3. Bosch A, Zisserman A, Muñoz X (2006) Scene classification via pLSA. In: ECCV

  4. Bosch A, Zisserman A, Muñoz X (2007) Image classification using random forests and ferns. In: ICCV, pp 1–8

  5. Breiman L (2001) Random forests. Mach Learn 45(1):5–32

  6. Criminisi A, Shotton J, Konukoglu E (2012) Decision forests: a unified framework for classification, regression, density estimation, manifold learning and semi-supervised learning. Found Trends Comput Graph Vis 7(2–3):81–227

  7. Csurka G, Dance C, Fan L, Willamowski J, Bray C (2004) Visual categorization with bags of keypoints. In: Proceedings ECCV workshop statistical learning in computer vision, pp 59–74

  8. Dalal N, Triggs B (2005) Histograms of oriented gradients for human detection. In: CVPR, pp 886–893

  9. Erhan D, Szegedy C, Toshev A, Anguelov D (2014) Scalable object detection using deep neural networks. In: CVPR

  10. Felzenszwalb PF, Girshick RB, McAllester D, Ramanan D (2010) Object detection with discriminatively trained part-based models. PAMI 32(9):1627–1645

  11. Fergus R, Perona P, Zisserman A (2003) Object class recognition by unsupervised scale-invariant learning. In: CVPR

  12. Gall J, Yao A, Razavi N, Van Gool L, Lempitsky V (2011) Hough forests for object detection, tracking, and action recognition. PAMI 33(11):2188–2202

  13. Garrell A, Villamizar M, Moreno-Noguer F, Sanfeliu A (2013) Proactive behavior of an autonomous mobile robot for human-assisted learning. In: RO-MAN

  14. Girshick R, Donahue J, Darrell T, Malik J (2014) Rich feature hierarchies for accurate object detection and semantic segmentation. In: CVPR, pp 580–587

  15. Hall D, Perona P (2014) From categories to individuals in real time: a unified boosting approach. In: CVPR

  16. Hofmann T (2001) Unsupervised learning by probabilistic latent semantic analysis. Mach Learn 42(1–2):177–196

  17. Jurie F, Triggs B (2005) Creating efficient codebooks for visual recognition. In: ICCV, pp 604–610

  18. Kalal Z, Mikolajczyk K, Matas J (2012) Tracking–learning–detection. PAMI 34(7):1409–1422

  19. Kim TK, Cipolla R (2009) McBoost: multiple classifier boosting for perceptual co-clustering of images and visual features. In: NIPS, pp 841–848

  20. Klein DA, Schulz D, Frintrop S, Cremers AB (2010) Adaptive real-time video-tracking for arbitrary objects. In: IROS

  21. Krupka E, Vinnikov A, Klein B, Hillel AB, Freedman D, Stachniak S (2014) Discriminative ferns ensemble for hand pose recognition. In: CVPR, pp 3670–3677

  22. LeCun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proc IEEE 86(11):2278–2324

  23. Liu B, Xia Y, Yu PS (2000) Clustering through decision tree construction. In: Proceedings of the ninth ACM international conference on information and knowledge management, pp 20–29

  24. Lowe DG (2004) Distinctive image features from scale-invariant keypoints. IJCV 60(2):91–110

  25. Malisiewicz T, Gupta A, Efros AA (2011) Ensemble of exemplar-SVMs for object detection and beyond. In: ICCV, pp 89–96

  26. Marée R, Geurts P, Piater J, Wehenkel L (2005) Random subwindows for robust image classification. In: CVPR, pp 34–40

  27. Moosmann F, Nowak E, Jurie F (2008) Randomized clustering forests for image classification. PAMI 30(9):1632–1646

  28. Murphy KP (2012) Machine learning: a probabilistic perspective. MIT Press, Cambridge

  29. Nister D, Stewenius H (2006) Scalable recognition with a vocabulary tree. In: CVPR, pp 2161–2168

  30. Ozuysal M, Calonder M, Lepetit V, Fua P (2010) Fast keypoint recognition using random ferns. PAMI 32(3):448–461

  31. Ozuysal M, Lepetit V, Fua P (2009) Pose estimation for category specific multiview object localization. In: CVPR, pp 778–785

  32. Perbet F, Stenger B, Maki A (2009) Random forest clustering and application to video segmentation. In: BMVC, pp 1–10

  33. Schapire RE, Singer Y (1999) Improved boosting algorithms using confidence-rated predictions. Mach Learn 37(3):297–336

  34. Sharma P, Nevatia R (2014) Multi class boosted random ferns for adapting a generic object detector to a specific video. In: WACV, pp 745–752

  35. Shotton J, Johnson M, Cipolla R (2008) Semantic texton forests for image categorization and segmentation. In: CVPR, pp 1–8

  36. Sivic J, Russell B, Efros A, Zisserman A, Freeman WT (2005) Discovering objects and their location in images. In: ICCV

  37. Torralba A, Murphy KP, Freeman WT (2007) Sharing visual features for multiclass and multiview object detection. PAMI 29(5):854–869

  38. Van der Maaten L, Hinton G (2008) Visualizing data using t-SNE. J Mach Learn Res 9(85):2579–2605

  39. Villamizar M, Andrade-Cetto J, Sanfeliu A, Moreno-Noguer F (2012) Bootstrapping boosted random ferns for discriminative and efficient object classification. Pattern Recognit 45(9):3141–3153

  40. Villamizar M, Garrell A, Sanfeliu A, Moreno-Noguer F (2012) Online human-assisted learning using random ferns. In: ICPR, pp 2821–2824

  41. Villamizar M, Garrell A, Sanfeliu A, Moreno-Noguer F (2015) Modeling robot’s world with minimal effort. In: ICRA

  42. Villamizar M, Garrell A, Sanfeliu A, Moreno-Noguer F (2015) Multimodal object recognition using random clustering trees. In: IBPRIA

  43. Villamizar M, Grabner H, Andrade-Cetto J, Sanfeliu A, Van Gool L, Moreno-Noguer F (2011) Efficient 3d object detection using multiple pose-specific classifiers. In: BMVC

  44. Villamizar M, Moreno-Noguer F, Andrade-Cetto J, Sanfeliu A (2010) Efficient rotation invariant object detection using boosted random ferns. In: CVPR, pp 1038–1045

  45. Viola P, Jones M (2001) Rapid object detection using a boosted cascade of simple features. In: CVPR, pp 511–518

  46. Wu B, Nevatia R (2007) Cluster boosted tree classifier for multi-view, multi-pose object detection. In: ICCV, pp 1–8

  47. Yan J, Lei Z, Wen L, Li SZ (2014) The fastest deformable part model for object detection. In: CVPR, pp 2497–2504

Acknowledgments

This work has been partially funded by the Spanish Ministry of Economy and Competitiveness under the ERA-Net Chistera project ViSen (PCIN-2013-047) and the projects RobInstruct (TIN2014-58178-R) and ROBOT-INT-COOP (DPI2013-42458-P), and by the EU project AEROARMS (H2020-ICT-2014-1-644271).

Author information

Corresponding author

Correspondence to M. Villamizar.

About this article

Cite this article

Villamizar, M., Garrell, A., Sanfeliu, A. et al. Random clustering ferns for multimodal object recognition. Neural Comput & Applic 28, 2445–2460 (2017). https://doi.org/10.1007/s00521-016-2284-x
