A Distance Measure and a Feature Likelihood Map Concept for Scale-Invariant Model Matching

Laptev, Ivan; Lindeberg, Tony

doi:10.1023/A:1022947906601

Published: May 2003

Volume 52, pages 97–120 (2003)

International Journal of Computer Vision

Abstract

This paper presents two approaches for evaluating multi-scale feature-based object models. Within the first approach, a scale-invariant distance measure is proposed for comparing two image representations in terms of multi-scale features. Based on this measure, the maximisation of the likelihood of parameterised feature models allows for simultaneous model selection and parameter estimation.

The idea of the second approach is to avoid an explicit feature extraction step and to evaluate models using a function defined directly from the image data. For this purpose, we propose the concept of a feature likelihood map, a function normalised to the interval [0, 1] that approximates the likelihood of image features at all points in scale-space.

To illustrate the applicability of both methods, we consider the area of hand gesture analysis and show how the proposed evaluation schemes can be integrated within a particle filtering approach for performing simultaneous tracking and recognition of hand models under variations in the position, orientation, size and posture of the hand. The experiments demonstrate the feasibility of the approach and show that real-time performance can be obtained by pyramid implementations of the proposed concepts.


Author information

Authors and Affiliations

Computational Vision and Active Perception Laboratory (CVAP), Department of Numerical Analysis and Computer Science, KTH (Royal Institute of Technology), SE-100 44, Stockholm, Sweden
Ivan Laptev & Tony Lindeberg

About this article

Cite this article

Laptev, I., Lindeberg, T. A Distance Measure and a Feature Likelihood Map Concept for Scale-Invariant Model Matching. International Journal of Computer Vision 52, 97–120 (2003). https://doi.org/10.1023/A:1022947906601

Issue Date: May 2003
DOI: https://doi.org/10.1023/A:1022947906601
