Parameterizing Object Detectors in the Continuous Pose Space
Abstract
Object detection and pose estimation are interdependent problems in computer vision. Many past works decouple these problems, either by discretizing the continuous pose and training pose-specific object detectors, or by building pose estimators on top of detector outputs. In this paper, we propose a structured kernel machine approach that treats object detection and pose estimation jointly in a mutually beneficial way. In our formulation, a unified, continuously parameterized, discriminative appearance model is learned over the entire pose space. We propose a cascaded discrete-continuous algorithm for efficient inference, and give effective online constraint generation strategies for learning our model using structural SVMs. On three standard benchmarks, our method performs better than, or on par with, state-of-the-art methods in the combined task of object detection and pose estimation.
Keywords
object detection, continuous pose estimation