
Journal of Mathematical Imaging and Vision, Volume 10, Issue 1, pp. 27–49

VC-Dimension Analysis of Object Recognition Tasks

  • Michael Lindenbaum
  • Shai Ben-David

Abstract

We analyze the amount of data needed to carry out various model-based recognition tasks in the context of a probabilistic data collection model. We focus on objects that may be described as semi-algebraic subsets of a Euclidean space. This is a very rich class that includes polynomially described bodies, as well as polygonal objects, as special cases. The class of object transformations considered is wide, and includes perspective and affine transformations of 2D objects, and perspective projections of 3D objects.
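As a concrete illustration of the transformation classes mentioned above, the sketch below (not taken from the paper; all names are hypothetical) applies a 2D affine map to a polygonal object. Polygons, and semi-algebraic sets generally, remain in the same class under such maps, which is part of what makes this object class convenient for the analysis.

```python
# Illustrative sketch, not the paper's method: a 2D affine
# transformation acting on a polygon given by its vertices.

def affine_transform(points, a, b, c, d, tx, ty):
    """Map each vertex (x, y) to (a*x + b*y + tx, c*x + d*y + ty)."""
    return [(a * x + b * y + tx, c * x + d * y + ty) for (x, y) in points]

# A unit square under scaling by 2 and translation by (1, 1):
# the image is again a polygon, i.e., a semi-algebraic set.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
transformed = affine_transform(square, 2, 0, 0, 2, 1, 1)
print(transformed)  # [(1, 1), (3, 1), (3, 3), (1, 3)]
```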

We derive upper bounds on the number of data features (associated with non-zero spatial error) which provably suffice for drawing reliable conclusions. Our bounds are based on a quantitative analysis of the complexity of the hypotheses class that one has to choose from. Our central tool is the VC-dimension, which is a well-studied parameter measuring the combinatorial complexity of families of sets. It turns out that these bounds grow linearly with the task complexity, measured via the VC-dimension of the class of objects one deals with. We show that this VC-dimension is at most logarithmic in the algebraic complexity of the objects and in the cardinality of the model library.
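The linear dependence on VC-dimension is the same phenomenon seen in the classic PAC sample-size bound of Blumer et al. (1989). The sketch below (an illustration of that standard bound, not of this paper's specific bounds) computes the number of samples that suffice for a consistent learner over a class of VC-dimension d to achieve error at most eps with probability at least 1 - delta.

```python
import math

def vc_sample_bound(d, eps, delta):
    """Classic PAC sample-size bound (Blumer et al., 1989):
    m >= max((4/eps) * log2(2/delta), (8*d/eps) * log2(13/eps))
    samples suffice for (eps, delta)-accurate consistent learning
    over a hypothesis class of VC-dimension d."""
    term1 = (4.0 / eps) * math.log2(2.0 / delta)
    term2 = (8.0 * d / eps) * math.log2(13.0 / eps)
    return math.ceil(max(term1, term2))

# Doubling the VC-dimension roughly doubles the bound -- the linear
# growth in task complexity that the analysis above refers to.
print(vc_sample_bound(10, 0.1, 0.05))
print(vc_sample_bound(20, 0.1, 0.05))
```

Since the VC-dimension of the object classes considered here is at most logarithmic in the objects' algebraic complexity and in the library size, plugging that into a bound of this shape yields sample sizes only logarithmic in those quantities.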

Our approach borrows from computational learning theory. Both learning and recognition use evidence to infer hypotheses, but, as far as we know, this similarity has not been exploited before. We draw close relations between recognition tasks and a certain learnability framework, and then apply basic techniques of learnability theory to derive our sample-size upper bounds. We believe that other relations between learning procedures and visual tasks exist, and we hope that this work will trigger further fruitful study along these lines.

Keywords: learning theory, PAC, Vapnik-Chervonenkis dimension, localization, model-based recognition, computer vision



Copyright information

© Kluwer Academic Publishers 1999

Authors and Affiliations

  • Michael Lindenbaum
  • Shai Ben-David

  Computer Science Department, Technion, Haifa, Israel
