International Journal of Computer Vision, Volume 115, Issue 1, pp 44–67

Theory and Practice of Hierarchical Data-driven Descent for Optimal Deformation Estimation

  • Yuandong Tian
  • Srinivasa G. Narasimhan


Real-world surfaces such as clothing, water and the human body deform in complex ways. Estimating deformation parameters accurately and reliably is hard because the problem is high-dimensional and non-convex. Optimization-based approaches require good initialization, while regression-based approaches need a large amount of training data. Recently, to achieve globally optimal estimation, data-driven descent (Tian and Narasimhan in Int J Comput Vis, 98:279–302, 2012) applied nearest neighbor estimators, trained on a particular distribution of training samples, to obtain a globally optimal and dense deformation field between a template and a distorted image. In this work, we develop a hierarchical structure that first applies nearest neighbor estimators iteratively on the entire image to obtain a rough estimate, and then applies estimators with local image support to refine that estimate. Compared to its non-hierarchical version, our approach retains the theoretical guarantees with significantly fewer training samples, is faster by several orders of magnitude, provides a better metric for deciding whether a given image requires more (or fewer) samples, and can handle more complex scenes that include a mixture of global motion and local deformation. We demonstrate in both simulation and real experiments that the proposed algorithm successfully tracks a broad range of non-rigid scenes including water, clothing, and medical images, and compares favorably against several other deformation estimation and tracking approaches that do not provide optimality guarantees.
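The coarse-to-fine idea described above can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a 1-D signal in place of an image, a piecewise-constant shift per patch in place of a dense deformation field, and candidates rendered on the fly from a known template instead of a precomputed training set. The names `template`, `nn_descent` and `hierarchical_estimate`, and all radii and sample counts, are hypothetical choices made for illustration only.

```python
import numpy as np

def template(x):
    # Hypothetical 1-D "image": a smooth texture with no repeats at small shifts.
    return np.sin(9.0 * x) + 0.5 * np.sin(23.0 * x + 1.0)

def nn_descent(observed, x, start, radius, n_iters=6, n_samples=21):
    """Iteratively estimate a single shift on the support x.

    Each iteration renders candidate shifts around the current estimate,
    keeps the candidate whose rendering is nearest to the observation,
    and then halves the sampling radius.
    """
    est = start
    for _ in range(n_iters):
        candidates = est + np.linspace(-radius, radius, n_samples)
        rendered = template(x[None, :] + candidates[:, None])
        dists = np.sum((rendered - observed[None, :]) ** 2, axis=1)
        est = candidates[np.argmin(dists)]  # nearest neighbour among candidates
        radius *= 0.5
    return est

def hierarchical_estimate(observed, x, n_patches, global_radius, local_radius):
    # Stage 1: one estimator with support on the entire signal (rough estimate).
    coarse = nn_descent(observed, x, 0.0, global_radius)
    # Stage 2: one estimator per patch, each seeing only its local support.
    patches = np.array_split(np.arange(x.size), n_patches)
    fine = np.array([
        nn_descent(observed[idx], x[idx], coarse, local_radius)
        for idx in patches
    ])
    return coarse, fine

# Synthetic test case: a global motion of about 0.1 plus small local deviations.
x = np.linspace(0.0, 1.0, 400)
true_shifts = np.array([0.10, 0.13, 0.08, 0.12])
patches = np.array_split(np.arange(x.size), 4)
observed = np.concatenate(
    [template(x[idx] + s) for idx, s in zip(patches, true_shifts)]
)

coarse, fine = hierarchical_estimate(observed, x, 4, 0.2, 0.08)
print("coarse shift:", coarse)
print("per-patch shifts:", fine)
```

In this sketch the global stage only has to search a one-dimensional space, and each local stage searches a small radius around the coarse estimate, which is the intuition behind needing fewer samples than a single estimator over all patch shifts jointly.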


Deformation modeling; Globally optimal solutions; Non-rigid deformation; Data-driven approach; Non-linear optimization; Non-convex optimization; Image deformation; High-dimensional regression



This research was supported in part by ONR grant N00014-11-1-0295, a Microsoft Research PhD fellowship, a University Transportation Center T-SET grant and a gift from TONBO Imaging.


  1. Baker, S., & Matthews, I. (2004). Lucas-Kanade 20 years on: A unifying framework. International Journal of Computer Vision, 56, 221–255.
  2. Barnes, C., Shechtman, E., Finkelstein, A., & Goldman, D. (2009). Patchmatch: A randomized correspondence algorithm for structural image editing. ACM Transactions on Graphics-TOG, 28(3), 24.
  3. Barnes, C., Shechtman, E., Goldman, D. B., & Finkelstein, A. (2010). The generalized patchmatch correspondence algorithm. In ECCV, 2010 (pp. 29–43). Berlin: Springer.
  4. Beauchemin, S. S., & Barron, J. L. (1995). The computation of optical flow. ACM Computing Surveys (CSUR), 27(3), 433–466.
  5. Bookstein, F. L. (1989). Principal warps: Thin-plate splines and the decomposition of deformations. IEEE Transactions on Pattern Analysis & Machine Intelligence, 6, 567–585.
  6. Cao, X., Wei, Y., Wen, F., & Sun, J. (2012). Face alignment by explicit shape regression. In CVPR.
  7. Krizhevsky, A., Sutskever, I., & Hinton, G. (2012). Imagenet classification with deep convolutional neural networks. In NIPS.
  8. Lazebnik, S., Schmid, C., & Ponce, J. (2006). Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In CVPR.
  9. Lowe, D. (2004). Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60, 91–110.
  10. Lucas, B. D., & Kanade, T. (1981). An iterative image registration technique with an application to stereo vision. In IJCAI (pp. 674–679).
  11. Matthews, I., & Baker, S. (2004). Active appearance models revisited. International Journal of Computer Vision, 60, 135–164.
  12. Moll, M., & Gool, L. V. (2012). Optimal templates for non-rigid surface reconstruction. In ECCV.
  13. Rueckert, D., Sonoda, L., Hayes, C., Hill, D., Leach, M., & Hawkes, D. (1999). Nonrigid registration using free-form deformations: Application to breast MR images. IEEE Transactions on Medical Imaging, 18, 712–721.
  14. Salzmann, M., Hartley, R., & Fua, P. (2007). Convex optimization for deformable surface 3-d tracking. In ICCV.
  15. Salzmann, M., Moreno-Noguer, F., Lepetit, V., & Fua, P. (2008). Closed-form solution to non-rigid 3d surface registration. In ECCV.
  16. Serre, T., Wolf, L., & Poggio, T. (2005). Object recognition with features inspired by visual cortex. In CVPR (Vol. 2, pp. 994–1000).
  17. Shi, J., & Tomasi, C. (1994). Good features to track. In CVPR.
  18. Shotton, J., Fitzgibbon, A., Cook, M., Sharp, T., Finocchio, M., Moore, R., Kipman, A., & Blake, A. (2011). Real-time human pose recognition in parts from single depth images. In CVPR.
  19. Tan, D. J., Holzer, S., Navab, N., & Ilic, S. (2014). Deformable template tracking in 1 ms. In ECCV.
  20. Taylor, J., Jepson, A., & Kutulakos, K. (2010). Non-rigid structure from locally-rigid motion. In CVPR.
  21. Tian, Y., & Narasimhan, S. G. (2012). Globally optimal estimation of nonrigid image distortion. International Journal of Computer Vision, 98, 279–302.
  22. Zhang, S., Zhan, Y., Zhou, Y., Uzunbas, M., & Metaxas, D. (2012). Shape prior modeling using sparse representation and online dictionary learning. In Medical image computing and computer-assisted intervention (Lecture notes in computer science, Vol. 7512, pp. 435–442). Berlin: Springer.

Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  1. The Robotics Institute, Carnegie Mellon University, Pittsburgh, USA
