A Distance Measure and a Feature Likelihood Map Concept for Scale-Invariant Model Matching

International Journal of Computer Vision

Abstract

This paper presents two approaches for evaluating multi-scale, feature-based object models. In the first approach, a scale-invariant distance measure is proposed for comparing two image representations in terms of multi-scale features. Based on this measure, maximising the likelihood of parameterised feature models allows for simultaneous model selection and parameter estimation.
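As a rough illustration of what scale invariance means for a distance between multi-scale features, the sketch below compares two features given by a position and a scale. The `Feature` type, the function names and the particular formula (squared position difference divided by the geometric mean of the scales, plus a squared log-ratio of the scales) are assumptions made for this example; the abstract does not state the paper's actual measure.

```python
import math
from typing import NamedTuple


class Feature(NamedTuple):
    """Hypothetical multi-scale feature: position and scale (Gaussian variance)."""
    x: float
    y: float
    t: float  # scale parameter in pixels^2 (assumed convention)


def scale_invariant_distance(a: Feature, b: Feature) -> float:
    """Illustrative distance that is unchanged when the image is rescaled.

    The position term is normalised by the geometric mean of the two scales,
    and the scale term depends only on the ratio of the scales.
    """
    mean_t = math.sqrt(a.t * b.t)
    d_pos2 = ((a.x - b.x) ** 2 + (a.y - b.y) ** 2) / mean_t
    d_scale2 = math.log(a.t / b.t) ** 2
    return math.sqrt(d_pos2 + d_scale2)


def rescale(f: Feature, s: float) -> Feature:
    """Apply a uniform image rescaling by factor s (variance scales by s^2)."""
    return Feature(f.x * s, f.y * s, f.t * s * s)
```

Under these assumptions, resizing the image by any factor `s` multiplies positions by `s` and variances by `s**2`, so both terms of the distance are unchanged, which is the property a scale-invariant matching score needs.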

The second approach avoids an explicit feature extraction step and evaluates models using a function defined directly from the image data. For this purpose, we propose the concept of a feature likelihood map, a function normalised to the interval [0, 1] that approximates the likelihood of image features at all points in scale-space.

To illustrate the applicability of both methods, we consider the area of hand gesture analysis and show how the proposed evaluation schemes can be integrated within a particle filtering approach for simultaneous tracking and recognition of hand models under variations in the position, orientation, size and posture of the hand. The experiments demonstrate the feasibility of the approach and show that real-time performance can be obtained by pyramid implementations of the proposed concepts.
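The particle filtering loop mentioned above can be sketched generically as weighting, resampling and diffusion. This is a minimal sketch, not the authors' tracker: it assumes a one-dimensional state instead of the position, orientation, size and posture of a hand, and the Gaussian `likelihood` stands in for the model evaluation schemes proposed in the paper. All function names and parameter values are hypothetical.

```python
import math
import random


def particle_filter_step(particles, likelihood, motion_sigma, rng):
    """One iteration: weight by likelihood, resample, then diffuse."""
    weights = [likelihood(p) for p in particles]
    if sum(weights) == 0.0:
        weights = [1.0] * len(particles)  # fall back to uniform weights
    # Resample with replacement in proportion to the weights.
    resampled = rng.choices(particles, weights=weights, k=len(particles))
    # Simple random-walk motion model (an assumption for this sketch).
    return [p + rng.gauss(0.0, motion_sigma) for p in resampled]


def track(true_x=5.0, n=500, steps=10, seed=0):
    """Run the filter on a static 1-D target and return the mean estimate."""
    rng = random.Random(seed)

    def likelihood(x):
        # Placeholder observation model: Gaussian around the true state.
        return math.exp(-0.5 * ((x - true_x) / 0.5) ** 2)

    particles = [rng.uniform(0.0, 10.0) for _ in range(n)]
    for _ in range(steps):
        particles = particle_filter_step(particles, likelihood, 0.2, rng)
    return sum(particles) / n
```

In a setting like the one described in the abstract, the placeholder likelihood would be replaced by a score derived from the distance measure or the feature likelihood map, and the state would carry the hand model parameters.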

About this article

Cite this article

Laptev, I., Lindeberg, T. A Distance Measure and a Feature Likelihood Map Concept for Scale-Invariant Model Matching. International Journal of Computer Vision 52, 97–120 (2003). https://doi.org/10.1023/A:1022947906601

  • DOI: https://doi.org/10.1023/A:1022947906601
