Discriminatively Trained Dense Surface Normal Estimation

  • L’ubor Ladický
  • Bernhard Zeisl
  • Marc Pollefeys
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8693)

Abstract

In this work we propose a method for a rather unexplored problem in computer vision: discriminatively trained dense surface normal estimation from a single image. Our method combines contextual and segment-based cues and builds a regressor in a boosting framework by transforming the problem into the regression of coefficients of a local coding. We apply our method to two challenging data sets containing images of man-made environments, the indoor NYU2 data set and the outdoor KITTI data set. Our surface normal predictor achieves results better than initially expected, significantly outperforming the state of the art.
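The idea of regressing coefficients of a local coding, rather than the normal vector itself, can be illustrated with a minimal sketch. The abstract does not specify the coding, so everything below is an assumption made for illustration: the codebook is a hypothetical Fibonacci-sphere sampling of unit vectors, the weights are computed in the style of locally linear embedding over the `k` nearest codewords, and the function names (`fibonacci_sphere`, `encode_normal`, `decode_normal`) are invented here. It is not the authors' implementation.

```python
import numpy as np

def fibonacci_sphere(k):
    """Hypothetical codebook: k roughly uniform unit vectors on the sphere."""
    i = np.arange(k) + 0.5
    z = 1.0 - 2.0 * i / k
    phi = np.pi * (1.0 + 5.0 ** 0.5) * i
    r = np.sqrt(1.0 - z * z)
    return np.stack([r * np.cos(phi), r * np.sin(phi), z], axis=1)

def encode_normal(n, codebook, k=4, reg=1e-6):
    """Express a unit normal as sum-to-one weights over its k nearest codewords.

    Assumed LLE-style coding: minimise ||n - sum_i w_i c_i||^2 subject to
    sum_i w_i = 1, with a small ridge term for numerical stability.
    """
    n = n / np.linalg.norm(n)
    idx = np.argsort(-(codebook @ n))[:k]      # k codewords closest in angle
    diff = codebook[idx] - n                   # shape (k, 3)
    gram = diff @ diff.T
    gram = gram + reg * (np.trace(gram) + 1e-12) * np.eye(k)
    w = np.linalg.solve(gram, np.ones(k))
    w = w / w.sum()                            # enforce the sum-to-one constraint
    coeffs = np.zeros(len(codebook))
    coeffs[idx] = w                            # sparse coefficient vector
    return coeffs

def decode_normal(coeffs, codebook):
    """Map a (possibly predicted) coefficient vector back to a unit normal."""
    v = coeffs @ codebook
    return v / np.linalg.norm(v)
```

Under these assumptions, a learned regressor would predict the coefficient vector for each pixel or segment, and `decode_normal` would turn that prediction back into a unit-length normal. The normalisation step matters because a weighted combination of codewords is in general not of unit length.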

Keywords

Computer Vision, Ground Truth, Random Forest, Visual Word, Feature Representation

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • L’ubor Ladický (ETH Zürich, Switzerland)
  • Bernhard Zeisl (ETH Zürich, Switzerland)
  • Marc Pollefeys (ETH Zürich, Switzerland)