Self-supervised Outdoor Scene Relighting

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12367)


Outdoor scene relighting is a challenging problem that requires a good understanding of the scene geometry, illumination and albedo. Current techniques are fully supervised, requiring high-quality synthetic renderings to train a solution. Such renderings are synthesized using priors learned from limited data. In contrast, we propose a self-supervised approach for relighting. Our approach is trained only on corpora of images collected from the internet, without any user supervision. This virtually endless source of training data allows training a general relighting solution. Our approach first decomposes an image into its albedo, geometry and illumination. A novel relighting is then produced by modifying the illumination parameters. Our solution captures shadows using a dedicated shadow prediction map, and does not rely on accurate geometry estimation. We evaluate our technique subjectively and objectively using a new dataset with ground-truth relighting. Results show the ability of our technique to produce photo-realistic and physically plausible results that generalize to unseen scenes.
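The decompose-and-recompose pipeline described above can be illustrated with a minimal conceptual sketch. This is not the paper's implementation: it assumes a Lambertian image-formation model with second-order spherical-harmonics (SH) lighting, and the function names (`sh_shading`, `relight`) are hypothetical. Relighting then amounts to keeping the estimated albedo, normals and shadow map fixed while swapping in new SH illumination coefficients.

```python
import numpy as np

def sh_shading(normals, sh_coeffs):
    """Lambertian shading from 2nd-order spherical-harmonics lighting.

    normals:   (H, W, 3) array of unit surface normals
    sh_coeffs: (9,) SH lighting coefficients (single channel for brevity)
    """
    nx, ny, nz = normals[..., 0], normals[..., 1], normals[..., 2]
    # Unnormalised 2nd-order SH basis evaluated at each normal direction.
    basis = np.stack([
        np.ones_like(nx),        # Y_00 (constant)
        ny, nz, nx,              # Y_1-1, Y_10, Y_11
        nx * ny, ny * nz,        # Y_2-2, Y_2-1
        3 * nz**2 - 1, nx * nz,  # Y_20, Y_21
        nx**2 - ny**2,           # Y_22
    ], axis=-1)                  # shape (H, W, 9)
    return basis @ sh_coeffs     # shape (H, W)

def relight(albedo, normals, sh_coeffs, shadow_map):
    """Recompose an image under new lighting:
    image = albedo * shading(normals, illumination) * shadow."""
    shading = sh_shading(normals, sh_coeffs)[..., None]  # (H, W, 1)
    return albedo * shading * shadow_map[..., None]      # (H, W, 3)
```

In this sketch the shadow map is predicted separately rather than ray-traced from geometry, mirroring the abstract's point that shadows are handled by a dedicated prediction map instead of relying on accurate geometry estimation.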


Keywords: Neural rendering · Image relighting · Inverse rendering

Supplementary material

Supplementary material 1 (pdf 20975 KB)

Supplementary material 2 (mp4 18488 KB)

Supplementary material 3 (mp4 22490 KB)

Supplementary material 4 (mp4 2136 KB)

Supplementary material 5 (mp4 5481 KB)



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. University of York, York, UK
  2. Max Planck Institute for Informatics, Saarland Informatics Campus, Saarbrücken, Germany
