Weak Hypotheses and Boosting for Generic Object Detection and Recognition
Abstract
In this paper we describe the first stage of a new learning system for object detection and recognition. We propose Boosting [5] as the underlying learning technique, which allows very diverse sets of visual features to be used within a common framework: Boosting, together with a weak hypotheses finder, may choose very inhomogeneous features as the most relevant ones for combination into a final hypothesis. As a further advantage, the weak hypotheses finder may search the weak hypothesis space without explicitly calculating all available hypotheses, which reduces computation time. This contrasts with the related work of Agarwal and Roth [1], where Winnow was used as the learning algorithm and all weak hypotheses were calculated explicitly. In our first empirical evaluation we use four types of local descriptors: two basic ones, a set of gray values and intensity moments, and two high-level ones, moment invariants [8] and SIFT descriptors [12]. The descriptors are computed from local patches detected by an interest point operator. The weak hypotheses finder selects one of the local patches and one type of local descriptor and efficiently searches for the most discriminative similarity threshold. This differs from other work on Boosting for object recognition, where simple rectangular hypotheses [22] or complex classifiers [20] have been used. On relatively simple images, where the objects are prominent, our approach yields results comparable to the state of the art [3]. We also obtain very good results on more complex images, where the objects appear at arbitrary positions, poses, and scales. These results indicate that our flexible approach, which also allows the inclusion of features from segmented regions and even of spatial relationships, takes us a significant step towards generic object recognition.
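The weak-hypothesis construction described above lends itself to a compact illustration. The following is a minimal sketch in Python, not the authors' implementation: it assumes each image is represented by a matrix of local descriptors (one row per interest point), treats a weak hypothesis as a reference descriptor plus a similarity threshold, and combines such hypotheses with standard discrete AdaBoost [5]. The brute-force candidate search, the use of Euclidean distance as the (dis)similarity measure, and all function names are illustrative assumptions rather than details taken from the paper.

```python
# Sketch of Boosting over patch-similarity weak hypotheses (illustrative only).
import numpy as np

def weak_response(image_descriptors, reference, threshold):
    """Weak hypothesis: +1 if any local descriptor of the image lies within
    `threshold` (Euclidean distance) of the reference descriptor, else -1."""
    dists = np.linalg.norm(image_descriptors - reference, axis=1)
    return 1 if dists.min() <= threshold else -1

def best_weak_hypothesis(images, labels, weights, candidate_refs, thresholds):
    """Pick the (reference, threshold) pair with the lowest weighted error.
    A brute-force stand-in for the efficient search described in the paper."""
    best, best_err = None, np.inf
    for ref in candidate_refs:
        for t in thresholds:
            preds = np.array([weak_response(d, ref, t) for d in images])
            err = weights[preds != labels].sum()
            if err < best_err:
                best_err, best = err, (ref, t)
    return best, best_err

def adaboost(images, labels, candidate_refs, thresholds, rounds=10):
    """Discrete AdaBoost: labels is a NumPy array of +/-1, images is a list of
    descriptor matrices. Returns a weighted ensemble of weak hypotheses."""
    n = len(images)
    weights = np.full(n, 1.0 / n)
    ensemble = []
    for _ in range(rounds):
        (ref, t), err = best_weak_hypothesis(images, labels, weights,
                                             candidate_refs, thresholds)
        err = np.clip(err, 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)          # hypothesis weight
        preds = np.array([weak_response(d, ref, t) for d in images])
        weights *= np.exp(-alpha * labels * preds)      # reweight training images
        weights /= weights.sum()
        ensemble.append((alpha, ref, t))
    return ensemble

def classify(ensemble, image_descriptors):
    """Final hypothesis: sign of the weighted sum of weak responses."""
    score = sum(a * weak_response(image_descriptors, ref, t)
                for a, ref, t in ensemble)
    return 1 if score >= 0 else -1
```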
Keywords
Object Recognition · Training Image · Object Detection · Interest Point · Local Descriptor
References
- 1. Agarwal, S., Roth, D.: Learning a sparse representation for object detection. In: Heyden, A., Sparr, G., Nielsen, M., Johansen, P. (eds.) ECCV 2002. LNCS, vol. 2353, pp. 113–130. Springer, Heidelberg (2002)
- 2. Dorko, G., Schmid, C.: Selection of scale-invariant parts for object class recognition. In: Proc. International Conference on Computer Vision (2003)
- 3. Fergus, R., Perona, P., Zisserman, A.: Object class recognition by unsupervised scale-invariant learning. In: Proc. CVPR 2003 (2003)
- 4. Freeman, W., Adelson, E.: The design and use of steerable filters. PAMI 13(9), 891–906 (1991)
- 5. Freund, Y., Schapire, R.E.: A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences 55(1), 119–139 (1997)
- 6. Garg, A., Agarwal, S., Huang, T.S.: Fusion of global and local information for object detection. In: Proc. CVPR, vol. 2, pp. 723–726 (2002)
- 7. Gonzalez, R.C., Woods, R.E.: Digital Image Processing. Addison-Wesley, Reading (2001)
- 8. Van Gool, L., Moons, T., Ungureanu, D.: Affine / photometric invariants for planar intensity patterns. In: Buxton, B.F., Cipolla, R. (eds.) ECCV 1996. LNCS, vol. 1065, pp. 642–651. Springer, Heidelberg (1996)
- 9. Harris, C., Stephens, M.: A combined corner and edge detector. In: Proc. of the 4th ALVEY Vision Conference, pp. 147–151 (1988)
- 10. Laganiere, R.: A morphological operator for corner detection. Pattern Recognition 31(11), 1643–1652 (1998)
- 11. Lindeberg, T.: Feature detection with automatic scale selection. International Journal of Computer Vision (1996)
- 12. Lowe, D.G.: Object recognition from local scale-invariant features. In: Proc. ICCV, pp. 1150–1157 (1999)
- 13. Maass, W., Warmuth, M.: Efficient learning with virtual threshold gates. Information and Computation 141(1), 66–83 (1998)
- 14. Mahamud, S., Hebert, M., Shi, J.: Object recognition using boosted discriminants. In: Proc. CVPR, vol. 1, pp. 551–558 (2001)
- 15. Mikolajczyk, K., Schmid, C.: Indexing based on scale invariant interest points. In: Proc. ICCV, pp. 525–531 (2001)
- 16. Mikolajczyk, K., Schmid, C.: An affine invariant interest point detector. In: Heyden, A., Sparr, G., Nielsen, M., Johansen, P. (eds.) ECCV 2002. LNCS, vol. 2350, pp. 128–142. Springer, Heidelberg (2002)
- 17. Mikolajczyk, K., Schmid, C.: A performance evaluation of local descriptors. In: Proc. CVPR (2003)
- 18. Schmid, C., Mohr, R.: Local grayvalue invariants for image retrieval. PAMI 19, 530–534 (1997)
- 19. Schmid, C., Mohr, R., Bauckhage, C.: Evaluation of interest point detectors. International Journal of Computer Vision, 151–172 (2000)
- 20. Schneiderman, H., Kanade, T.: Object detection using the statistics of parts. International Journal of Computer Vision (to appear)
- 21. Shilat, E., Werman, M., Gdalyahu, Y.: Ridge's corner detection and correspondence. In: Proc. CVPR, pp. 976–981 (1997)
- 22. Viola, P., Jones, M.: Rapid object detection using a boosted cascade of simple features. In: Proc. CVPR (2001)
- 23. Weber, M.: Unsupervised Learning of Models for Object Recognition. PhD thesis, California Institute of Technology, Pasadena, CA (2000)
- 24. Weber, M., Welling, M., Perona, P.: Unsupervised learning of models for recognition. In: Vernon, D. (ed.) ECCV 2000. LNCS, vol. 1842, pp. 18–32. Springer, Heidelberg (2000)
- 25. Wuertz, R.P., Lourens, T.: Corner detection in color images by multiscale combination of end-stopped cortical cells. In: International Conference on Artificial Neural Networks, pp. 901–906 (1997)