Simultaneous Object Recognition and Segmentation from Single or Multiple Model Views

Ferrari, Vittorio; Tuytelaars, Tinne; Van Gool, Luc

doi:10.1007/s11263-005-3964-7

Published: 28 January 2006

Volume 67, pages 159–188, (2006)

International Journal of Computer Vision

Vittorio Ferrari¹,
Tinne Tuytelaars² &
Luc Van Gool¹,²

Abstract

We present a novel object recognition approach based on affine-invariant regions. It actively counters the problems caused by the limited repeatability of region detectors and by the difficulty of matching in the presence of heavy background clutter and particularly challenging viewing conditions. After producing an initial set of matches, the method gradually explores the surrounding image areas, recursively constructing more and more matching regions, increasingly far from the initial ones. This process covers the object with matches and simultaneously separates the correct matches from the wrong ones. Hence, recognition and segmentation are achieved at the same time. The approach includes a mechanism for capturing the relationships between multiple model views and exploiting them to integrate the contributions of the views at recognition time. This mechanism is based on an efficient algorithm for partitioning a set of region matches into groups lying on smooth surfaces. Integration is achieved by measuring the consistency of configurations of groups arising from different model views. Experimental results demonstrate the strength of the approach in dealing with extensive clutter, dominant occlusion, and large scale and viewpoint changes. Non-rigid deformations are explicitly taken into account, and the approximate contours of the object are produced. All presented techniques can extend any viewpoint-invariant feature extractor.
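
The exploration idea in the abstract (grow matches outward from the initial ones, then discard those that fail to gather support) can be illustrated with a toy sketch. This is not the authors' algorithm. The grid of cells, the `supports` callback standing in for an appearance check, the use of a constant displacement instead of an affine transformation, and the neighbour-agreement filter are all assumptions made purely for illustration.

```python
def neighbours(cell):
    """4-connected neighbours of a grid cell (row, col)."""
    r, c = cell
    return [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]


def explore(seeds, supports, max_rounds=50):
    """Toy expansion/contraction loop (illustrative assumption, not the paper's method).

    seeds:    dict mapping model cell -> displacement (dx, dy); may contain mismatches.
    supports: hypothetical callback (cell, displacement) -> bool that stands in
              for an appearance check between the model and the test image.
    """
    matches = dict(seeds)
    for _ in range(max_rounds):
        # Expansion: propagate each match's displacement to unmatched neighbours.
        added = {}
        for cell, disp in matches.items():
            for nb in neighbours(cell):
                if nb not in matches and nb not in added and supports(nb, disp):
                    added[nb] = disp
        matches.update(added)

        # Contraction: drop matches with no neighbouring match that agrees with them.
        removed = [
            cell
            for cell, disp in matches.items()
            if not any(matches.get(nb) == disp for nb in neighbours(cell))
        ]
        for cell in removed:
            del matches[cell]

        if not added and not removed:
            break
    return matches


# Synthetic scene: the object covers a 4x4 block of cells and is shifted by true_disp.
object_cells = {(r, c) for r in range(2, 6) for c in range(3, 7)}
true_disp = (5, -2)


def supports(cell, disp):
    # A candidate is "supported" only on the object and only with the true shift.
    return cell in object_cells and disp == true_disp


# One correct seed, one mismatch in the background, one mismatch on the object.
seeds = {(3, 4): true_disp, (0, 0): (1, 1), (5, 6): (9, 9)}
result = explore(seeds, supports)
```

In this toy setting the surviving matches cover exactly the object cells, so the set of matched cells doubles as a coarse segmentation, while both mismatched seeds are discarded because they never gain an agreeing neighbour.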

Author information

Authors and Affiliations

Computer Vision Group (BIWI), ETH Zuerich, Switzerland
Vittorio Ferrari & Luc Van Gool
ESAT-PSI, University of Leuven, Belgium
Tinne Tuytelaars & Luc Van Gool

Corresponding author

Correspondence to Vittorio Ferrari.

Additional information

This research was supported by EC project VIBES, the Fund for Scientific Research Flanders, and the IST Network of Excellence PASCAL.

About this article

Cite this article

Ferrari, V., Tuytelaars, T. & Van Gool, L. Simultaneous Object Recognition and Segmentation from Single or Multiple Model Views. Int J Comput Vision 67, 159–188 (2006). https://doi.org/10.1007/s11263-005-3964-7

Received: 21 September 2004
Revised: 04 April 2005
Accepted: 03 May 2005
Published: 28 January 2006
Issue Date: April 2006
DOI: https://doi.org/10.1007/s11263-005-3964-7
