Object-Based Illumination Estimation with Rendering-Aware Neural Networks

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12360)


We present a scheme for fast environment light estimation from the RGBD appearance of individual objects and their local image areas. Conventional inverse rendering is too computationally demanding for real-time applications, and the performance of purely learning-based techniques may be limited by the meager input data available from individual objects. To address these issues, we propose an approach that uses physical principles from inverse rendering to constrain the solution, while using neural networks to expedite the more computationally expensive portions of the processing, to increase robustness to noisy input data, and to improve temporal and spatial stability. This results in a rendering-aware system that estimates the local illumination distribution at an object with high accuracy and in real time. With the estimated lighting, virtual objects can be rendered in AR scenarios with shading that is consistent with the real scene, leading to improved realism.
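
To make the inverse-rendering constraint mentioned above concrete, the following is a minimal sketch of a classic textbook formulation, not the method of this paper: it fits nine second-order spherical-harmonic irradiance coefficients to the observed shading of a Lambertian object by linear least squares. The function names `sh_basis` and `estimate_sh_lighting`, the per-pixel albedo input, and the synthetic data are all assumptions made for illustration; surface normals are assumed to have been derived from the depth channel beforehand.

```python
import numpy as np

def sh_basis(normals):
    """Real spherical harmonics up to order 2 (9 terms) at unit normals of shape (N, 3)."""
    x, y, z = normals[:, 0], normals[:, 1], normals[:, 2]
    return np.stack([
        0.282095 * np.ones_like(x),
        0.488603 * y,
        0.488603 * z,
        0.488603 * x,
        1.092548 * x * y,
        1.092548 * y * z,
        0.315392 * (3.0 * z * z - 1.0),
        1.092548 * x * z,
        0.546274 * (x * x - y * y),
    ], axis=1)

def estimate_sh_lighting(normals, intensity, albedo):
    """Least-squares fit of 9 coefficients so that albedo * (basis @ coeffs) ~= intensity."""
    design = albedo[:, None] * sh_basis(normals)  # (N, 9) linear system
    coeffs, *_ = np.linalg.lstsq(design, intensity, rcond=None)
    return coeffs

# Synthetic, noise-free check: shade random normals with known coefficients,
# then recover those coefficients from the shading alone.
rng = np.random.default_rng(0)
normals = rng.normal(size=(500, 3))
normals /= np.linalg.norm(normals, axis=1, keepdims=True)
albedo = rng.uniform(0.3, 0.9, size=500)      # hypothetical per-pixel albedo
true_coeffs = rng.normal(size=9)              # hypothetical ground-truth lighting
intensity = albedo * (sh_basis(normals) @ true_coeffs)
recovered = estimate_sh_lighting(normals, intensity, albedo)
```

A low-order fit of this kind captures only smooth, diffuse lighting and assumes known albedo and clean normals, which is one way to see why the abstract pairs physical constraints with learned components for robustness to noisy input.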

Supplementary material (zip, 78.9 MB)



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Microsoft Research Asia, Beijing, China
  2. Zhejiang University, Hangzhou, China
