International Journal of Computer Vision, Volume 61, Issue 1, pp 81–101

Building Roadmaps of Minima and Transitions in Visual Models

  • Cristian Sminchisescu
  • Bill Triggs


Becoming trapped in suboptimal local minima is a perennial problem when optimizing visual models, particularly in applications like monocular human body tracking where complicated parametric models are repeatedly fitted to ambiguous image measurements. We show that trapping can be significantly reduced by building ‘roadmaps’ of nearby minima linked by transition pathways—paths leading over low ‘mountain passes’ in the cost surface—found by locating the transition state (codimension-1 saddle point) at the top of the pass and then sliding downhill to the next minimum. We present two families of transition-state-finding algorithms based on local optimization. In eigenvector tracking, unconstrained Newton minimization is modified to climb uphill towards a transition state, while in hypersurface sweeping, a moving hypersurface is swept through the space and moving local minima within it are tracked using a constrained Newton method. These widely applicable numerical methods, which appear not to be known in vision and optimization, generalize methods from computational chemistry where finding transition states is critical for predicting reaction parameters. Experiments on the challenging problem of estimating 3D human pose from monocular images show that our algorithms find nearby transition states and minima very efficiently, but also underline the disturbingly large numbers of minima that can exist in this and similar model based vision problems.
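The eigenvector-tracking idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation. The toy double-well cost, the step cap, the fixed-rate gradient descent used for the downhill slide, and all function names are assumptions made for illustration only. The sketch starts near one minimum, reverses the Newton step along the lowest-curvature eigenvector so that it climbs to the saddle, and then slides down the far side to the neighbouring minimum.

```python
import numpy as np

# Toy cost (an assumption, not from the paper): minima at (+1, 0) and (-1, 0),
# separated by a codimension-1 saddle at (0, 0).
def cost(p):
    x, y = p
    return (x**2 - 1.0)**2 + 5.0 * y**2

def grad(p):
    x, y = p
    return np.array([4.0 * x * (x**2 - 1.0), 10.0 * y])

def hess(p):
    x, _ = p
    return np.array([[12.0 * x**2 - 4.0, 0.0], [0.0, 10.0]])

def find_transition_state(p0, max_step=0.2, tol=1e-10, max_iter=200):
    """Climb along the lowest-curvature eigenvector, descend along the rest."""
    p = np.asarray(p0, dtype=float)
    for _ in range(max_iter):
        g = grad(p)
        if np.linalg.norm(g) < tol:
            break
        lam, V = np.linalg.eigh(hess(p))       # eigenvalues in ascending order
        g_modes = V.T @ g                      # gradient in the eigenbasis
        scale = np.maximum(np.abs(lam), 1e-8)  # guard against zero curvature
        step_modes = -g_modes / scale          # downhill Newton-like step
        step_modes[0] = g_modes[0] / scale[0]  # uphill along the tracked mode
        step = V @ step_modes
        length = np.linalg.norm(step)
        if length > max_step:                  # crude trust-region cap
            step *= max_step / length
        p = p + step
    return p

def slide_downhill(saddle, start, offset=0.05, rate=0.05, tol=1e-10, max_iter=5000):
    """Step across the saddle, away from the start, then follow the gradient down."""
    saddle = np.asarray(saddle, dtype=float)
    start = np.asarray(start, dtype=float)
    _, V = np.linalg.eigh(hess(saddle))
    d = V[:, 0]                                # negative-curvature direction
    if d @ (saddle - start) < 0:
        d = -d                                 # point away from the starting basin
    p = saddle + offset * d
    for _ in range(max_iter):
        g = grad(p)
        if np.linalg.norm(g) < tol:
            break
        p = p - rate * g
    return p

start = np.array([0.9, 0.2])                   # slightly displaced from the minimum at (1, 0)
saddle = find_transition_state(start)
next_minimum = slide_downhill(saddle, start)
print("transition state:", saddle, "cost:", cost(saddle))
print("next minimum:", next_minimum, "cost:", cost(next_minimum))
```

The start point is displaced from the exact minimum because the gradient vanishes there and the sketch would stop immediately. In this toy cost the softest mode at the minimum happens to lead over the pass. In general, choosing which eigenvector to follow is part of the problem.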

model based vision, global optimization, saddle points, 3D human tracking





Copyright information

© Kluwer Academic Publishers 2005

Authors and Affiliations

  • Cristian Sminchisescu (1)
  • Bill Triggs (2)
  1. Department of Computer Science, University of Toronto, Toronto, Canada
  2. GRAVIR-CNRS-INRIA, Montbonnot, France
