
Crowdsampling the Plenoptic Function

  • Conference paper
  • In: Computer Vision – ECCV 2020 (ECCV 2020)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12346)

Abstract

Many popular tourist landmarks are captured in a multitude of online, public photos. These photos represent a sparse and unstructured sampling of the plenoptic function for a particular scene. In this paper, we present a new approach to novel view synthesis under time-varying illumination from such data. Our approach builds on the recent multi-plane image (MPI) format for representing local light fields under fixed viewing conditions. We introduce a new DeepMPI representation, motivated by observations on the sparsity structure of the plenoptic function, that allows for real-time synthesis of photorealistic views that are continuous in both space and across changes in lighting. Our method can synthesize the same compelling parallax and view-dependent effects as previous MPI methods, while simultaneously interpolating along changes in reflectance and illumination with time. We show how to learn a model of these effects in an unsupervised way from an unstructured collection of photos without temporal registration, demonstrating significant improvements over recent work in neural rendering. More information can be found at crowdsampling.io.
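For readers unfamiliar with the MPI format referenced above (Zhou et al. [68]): a scene is stored as a stack of fronto-parallel RGBA planes at fixed depths, and a novel view is rendered by warping each plane into the target camera and alpha-compositing the stack back to front. The NumPy sketch below illustrates only that compositing step; the homography warps and the DeepMPI appearance features introduced in this paper are omitted, and all names and shapes are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of MPI-style rendering: back-to-front "over" compositing of
# RGBA planes assumed to be already warped into the target view.
import numpy as np

def composite_mpi(rgba_planes: np.ndarray) -> np.ndarray:
    """Composite D RGBA planes, ordered back to front, into one RGB image.

    rgba_planes: float array of shape (D, H, W, 4) with values in [0, 1],
                 where plane 0 is the farthest and plane D-1 the nearest.
    Returns an (H, W, 3) RGB image.
    """
    out = np.zeros(rgba_planes.shape[1:3] + (3,), dtype=np.float64)
    for plane in rgba_planes:  # iterate back to front
        rgb, alpha = plane[..., :3], plane[..., 3:4]
        out = rgb * alpha + out * (1.0 - alpha)  # standard "over" operator
    return out

# Example: a random 32-plane MPI at 64x64 resolution.
mpi = np.random.rand(32, 64, 64, 4)
image = composite_mpi(mpi)
print(image.shape)  # (64, 64, 3)
```

Because compositing is a handful of multiply-adds per plane, rendering stays fast once the planes are predicted, which is what makes MPI-family representations suited to the real-time synthesis the abstract describes.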

Notes

  1. [1] describes the plenoptic function as 7D, but we can reduce this to a 4D color light field supplemented by time by applying the later observations of [33].
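For concreteness, the reduction described in this note can be written out as follows. The notation here is ours, not the paper's: τ denotes time so it does not collide with the usual two-plane light-field coordinate t, and the wavelength λ is sampled as discrete RGB channels.

```latex
% 7D plenoptic function of Adelson and Bergen [1]:
% position (x, y, z), view direction (\theta, \phi), wavelength \lambda, time \tau
P(x, y, z, \theta, \phi, \lambda, \tau)

% In free space, radiance is constant along rays [33], so at any fixed time the
% function collapses to a 4D light field per color channel c (two-plane form);
% restoring time gives the quantity that internet photos sparsely sample:
L_c(u, v, s, t) \quad\longrightarrow\quad L_c(u, v, s, t, \tau), \qquad c \in \{R, G, B\}
```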

References

  1. Adelson, E.H., Bergen, J.R.: The plenoptic function and the elements of early vision. In: Computational Models of Visual Processing, pp. 3–20. MIT Press (1991)

  2. Buehler, C., Bosse, M., McMillan, L., Gortler, S., Cohen, M.: Unstructured lumigraph rendering. In: Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques, pp. 425–432 (2001)

  3. Chai, J.X., Tong, X., Chan, S.C., Shum, H.Y.: Plenoptic sampling. In: Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH 2000, pp. 307–318. ACM Press/Addison-Wesley Publishing Co., USA (2000). https://doi.org/10.1145/344779.344932

  4. Chaurasia, G., Duchene, S., Sorkine-Hornung, O., Drettakis, G.: Depth synthesis and local warps for plausible image-based navigation. ACM Trans. Graph. 32(3), 1–12 (2013)

  5. Chen, L.C., Papandreou, G., Schroff, F., Adam, H.: Rethinking atrous convolution for semantic image segmentation. arXiv preprint arXiv:1706.05587 (2017)

  6. Chen, Q., Koltun, V.: Photographic image synthesis with cascaded refinement networks. In: Proceedings of the International Conference on Computer Vision (ICCV), pp. 1511–1520 (2017)

  7. Chen, Z., et al.: A neural rendering framework for free-viewpoint relighting. arXiv preprint arXiv:1911.11530 (2019)

  8. Choi, I., Gallo, O., Troccoli, A., Kim, M.H., Kautz, J.: Extreme view synthesis. In: Proceedings of the International Conference on Computer Vision (ICCV), pp. 7781–7790 (2019)

  9. Davis, A., Levoy, M., Durand, F.: Unstructured light fields. Comput. Graph. Forum 31, 305–314 (2012)

  10. Debevec, P.E., Taylor, C.J., Malik, J.: Modeling and rendering architecture from photographs: a hybrid geometry- and image-based approach. In: Proceedings of the 23rd Annual Conference on Computer Graphics and Interactive Techniques, pp. 11–20 (1996)

  11. Eslami, S.A., et al.: Neural scene representation and rendering. Science 360(6394), 1204–1210 (2018)

  12. Flynn, J., et al.: DeepView: view synthesis with learned gradient descent. In: Proceedings of the Computer Vision and Pattern Recognition (CVPR), pp. 2367–2376 (2019)

  13. Flynn, J., Neulander, I., Philbin, J., Snavely, N.: DeepStereo: learning to predict new views from the world’s imagery. In: Proceedings of the Computer Vision and Pattern Recognition (CVPR), pp. 5515–5524 (2016)

  14. Garg, R., Du, H., Seitz, S.M., Snavely, N.: The dimensionality of scene appearance. In: Proceedings of the International Conference on Computer Vision (ICCV), pp. 1917–1924. IEEE (2009)

  15. Gatys, L.A., Ecker, A.S., Bethge, M.: Image style transfer using convolutional neural networks. In: Proceedings of the Computer Vision and Pattern Recognition (CVPR), pp. 2414–2423 (2016)

  16. Goodfellow, I., et al.: Generative adversarial nets. In: Neural Information Processing Systems, pp. 2672–2680 (2014)

  17. Gortler, S.J., Grzeszczuk, R., Szeliski, R., Cohen, M.F.: The lumigraph. In: Proceedings of the 23rd Annual Conference on Computer Graphics and Interactive Techniques, pp. 43–54 (1996)

  18. Gulrajani, I., Ahmed, F., Arjovsky, M., Dumoulin, V., Courville, A.C.: Improved training of Wasserstein GANs. In: Neural Information Processing Systems, pp. 5767–5777 (2017)

  19. Hauagge, D.C., Wehrwein, S., Upchurch, P., Bala, K., Snavely, N.: Reasoning about photo collections using models of outdoor illumination. In: Proceedings of the British Machine Vision Conference (BMVC) (2014)

  20. He, K., Gkioxari, G., Dollár, P., Girshick, R.: Mask R-CNN. In: Proceedings of the International Conference on Computer Vision (ICCV), pp. 2961–2969 (2017)

  21. Hedman, P., Alsisan, S., Szeliski, R., Kopf, J.: Casual 3D photography. ACM Trans. Graph. 36, 234:1–234:15 (2017)

  22. Hedman, P., Philip, J., Price, T., Frahm, J.M., Drettakis, G., Brostow, G.: Deep blending for free-viewpoint image-based rendering. ACM Trans. Graph. 37(6), 1–15 (2018)

  23. Huang, X., Belongie, S.: Arbitrary style transfer in real-time with adaptive instance normalization. In: Proceedings of the International Conference on Computer Vision (ICCV), pp. 1501–1510 (2017)

  24. Huang, X., Liu, M.-Y., Belongie, S., Kautz, J.: Multimodal unsupervised image-to-image translation. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11207, pp. 179–196. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01219-9_11

  25. Isola, P., Zhu, J.Y., Zhou, T., Efros, A.A.: Image-to-image translation with conditional adversarial networks. In: Proceedings of the Computer Vision and Pattern Recognition (CVPR), pp. 1125–1134 (2017)

  26. Kalantari, N.K., Wang, T.C., Ramamoorthi, R.: Learning-based view synthesis for light field cameras. ACM Trans. Graph. 35(6), 1–10 (2016)

  27. Karras, T., Aila, T., Laine, S., Lehtinen, J.: Progressive growing of GANs for improved quality, stability, and variation. arXiv preprint arXiv:1710.10196 (2017)

  28. Karras, T., Laine, S., Aila, T.: A style-based generator architecture for generative adversarial networks. In: Proceedings of the Computer Vision and Pattern Recognition (CVPR), pp. 4401–4410 (2019)

  29. Laffont, P.Y., Bousseau, A., Paris, S., Durand, F., Drettakis, G.: Coherent intrinsic images from photo collections. ACM Trans. Graph. 31, 202:1–202:11 (2012)

  30. Ledig, C., et al.: Photo-realistic single image super-resolution using a generative adversarial network. In: Proceedings of the Computer Vision and Pattern Recognition (CVPR), pp. 4681–4690 (2017)

  31. Lee, H.-Y., Tseng, H.-Y., Huang, J.-B., Singh, M., Yang, M.-H.: Diverse image-to-image translation via disentangled representations. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11205, pp. 36–52. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01246-5_3

  32. Levin, A., Durand, F.: Linear view synthesis using a dimensionality gap light field prior. In: Proceedings of the Computer Vision and Pattern Recognition (CVPR), pp. 1831–1838 (2010)

  33. Levoy, M., Hanrahan, P.: Light field rendering. In: Proceedings of the 23rd Annual Conference on Computer Graphics and Interactive Techniques, pp. 31–42 (1996)

  34. Li, Z., et al.: Learning the depths of moving people by watching frozen people. In: Proceedings of the Computer Vision and Pattern Recognition (CVPR), pp. 4521–4530 (2019)

  35. Li, Z., Snavely, N.: MegaDepth: learning single-view depth prediction from internet photos. In: Proceedings of the Computer Vision and Pattern Recognition (CVPR), pp. 2041–2050 (2018)

  36. Lombardi, S., Simon, T., Saragih, J., Schwartz, G., Lehrmann, A., Sheikh, Y.: Neural volumes: learning dynamic renderable volumes from images. ACM Trans. Graph. 38(4), 65 (2019)

  37. Mao, X., Li, Q., Xie, H., Lau, R.Y., Wang, Z., Paul Smolley, S.: Least squares generative adversarial networks. In: Proceedings of the International Conference on Computer Vision (ICCV), pp. 2794–2802 (2017)

  38. Martin-Brualla, R., Gallup, D., Seitz, S.M.: 3D time-lapse reconstruction from internet photos. In: Proceedings of the International Conference on Computer Vision (ICCV), pp. 1332–1340 (2015)

  39. Martin-Brualla, R., Gallup, D., Seitz, S.M.: Time-lapse mining from internet photos. ACM Trans. Graph. 34(4), 1–8 (2015)

  40. Matzen, K., Snavely, N.: Scene chronology. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8695, pp. 615–630. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10584-0_40

  41. Meshry, M., et al.: Neural rerendering in the wild. In: Proceedings of the Computer Vision and Pattern Recognition (CVPR), pp. 6871–6880 (2019)

  42. Mildenhall, B., et al.: Local light field fusion: practical view synthesis with prescriptive sampling guidelines. ACM Trans. Graph. 38(4), 1–14 (2019)

  43. Park, T., Liu, M.Y., Wang, T.C., Zhu, J.Y.: Semantic image synthesis with spatially-adaptive normalization. In: Proceedings of the Computer Vision and Pattern Recognition (CVPR), pp. 2337–2346 (2019)

  44. Penner, E., Zhang, L.: Soft 3D reconstruction for view synthesis. ACM Trans. Graph. 36(6), 1–11 (2017)

  45. Philip, J., Gharbi, M., Zhou, T., Efros, A.A., Drettakis, G.: Multi-view relighting using a geometry-aware network. ACM Trans. Graph. 38(4), 1–14 (2019)

  46. Sangkloy, P., Lu, J., Fang, C., Yu, F., Hays, J.: Scribbler: controlling deep image synthesis with sketch and color. In: Proceedings of the Computer Vision and Pattern Recognition (CVPR), pp. 5400–5409 (2017)

  47. Schonberger, J.L., Frahm, J.M.: Structure-from-motion revisited. In: Proceedings of the Computer Vision and Pattern Recognition (CVPR), pp. 4104–4113 (2016)

  48. Shan, Q., Adams, R., Curless, B., Furukawa, Y., Seitz, S.M.: The visual Turing test for scene reconstruction. In: International Conference on 3D Vision (3DV), pp. 25–32 (2013)

  49. Sheng, L., Lin, Z., Shao, J., Wang, X.: Avatar-Net: multi-scale zero-shot style transfer by feature decoration. In: Proceedings of the Computer Vision and Pattern Recognition (CVPR), pp. 1–9 (2018)

  50. Shi, L., Hassanieh, H., Davis, A., Katabi, D., Durand, F.: Light field reconstruction using sparsity in the continuous Fourier domain. ACM Trans. Graph. 34, 12:1–12:13 (2014)

  51. Shi, L., Hassanieh, H., Davis, A., Katabi, D., Durand, F.: Light field reconstruction using sparsity in the continuous Fourier domain. ACM Trans. Graph. 34(1) (2015). https://doi.org/10.1145/2682631

  52. Simon, I., Snavely, N., Seitz, S.M.: Scene summarization for online image collections. In: Proceedings of the International Conference on Computer Vision (ICCV), pp. 1–8. IEEE (2007)

  53. Sitzmann, V., Thies, J., Heide, F., Nießner, M., Wetzstein, G., Zollhöfer, M.: DeepVoxels: learning persistent 3D feature embeddings. In: Proceedings of the Computer Vision and Pattern Recognition (CVPR), pp. 2437–2446 (2019)

  54. Sitzmann, V., Zollhöfer, M., Wetzstein, G.: Scene representation networks: continuous 3D-structure-aware neural scene representations. In: Neural Information Processing Systems, pp. 1119–1130 (2019)

  55. Snavely, N., Seitz, S.M., Szeliski, R.: Photo tourism: exploring photo collections in 3D. ACM Trans. Graph. 25(3), 835–846 (2006)

  56. Srinivasan, P.P., Tucker, R., Barron, J.T., Ramamoorthi, R., Ng, R., Snavely, N.: Pushing the boundaries of view extrapolation with multiplane images. In: Proceedings of the Computer Vision and Pattern Recognition (CVPR), pp. 175–184 (2019)

  57. Srinivasan, P.P., Wang, T., Sreelal, A., Ramamoorthi, R., Ng, R.: Learning to synthesize a 4D RGBD light field from a single image. In: Proceedings of the International Conference on Computer Vision (ICCV), pp. 2243–2251 (2017)

  58. Szeliski, R., Golland, P.: Stereo matching with transparency and matting. Int. J. Comput. Vis. 32, 45–61 (1998)

  59. Thies, J., Zollhöfer, M., Nießner, M.: Deferred neural rendering: image synthesis using neural textures. ACM Trans. Graph. 38(4), 1–12 (2019)

  60. Ulyanov, D., Vedaldi, A., Lempitsky, V.: Improved texture networks: maximizing quality and diversity in feed-forward stylization and texture synthesis. In: Proceedings of the Computer Vision and Pattern Recognition (CVPR), pp. 6924–6932 (2017)

  61. Vagharshakyan, S., Bregovic, R., Gotchev, A.P.: Light field reconstruction using shearlet transform. IEEE Trans. Pattern Anal. Mach. Intell. 40(1), 133–147 (2018)

  62. Wang, T.C., Liu, M.Y., Zhu, J.Y., Tao, A., Kautz, J., Catanzaro, B.: High-resolution image synthesis and semantic manipulation with conditional GANs. In: Proceedings of the Computer Vision and Pattern Recognition (CVPR), pp. 8798–8807 (2018)

  63. Wang, X., Gupta, A.: Generative image modeling using style and structure adversarial networks. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9908, pp. 318–335. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46493-0_20

  64. Xian, W., et al.: TextureGAN: controlling deep image synthesis with texture patches. In: Proceedings of the Computer Vision and Pattern Recognition (CVPR), pp. 8456–8465 (2018)

  65. Xu, Z., Bi, S., Sunkavalli, K., Hadap, S., Su, H., Ramamoorthi, R.: Deep view synthesis from sparse photometric images. ACM Trans. Graph. 38(4) (2019)

  66. Yu, Y., Smith, W.A.: InverseRenderNet: learning single image inverse rendering. In: Proceedings of the Computer Vision and Pattern Recognition (CVPR), pp. 3155–3164 (2019)

  67. Zhang, R., Isola, P., Efros, A.A., Shechtman, E., Wang, O.: The unreasonable effectiveness of deep features as a perceptual metric. In: Proceedings of the Computer Vision and Pattern Recognition (CVPR), pp. 586–595 (2018)

  68. Zhou, T., Tucker, R., Flynn, J., Fyffe, G., Snavely, N.: Stereo magnification: learning view synthesis using multiplane images. ACM Trans. Graph. 37, 1–12 (2018)

  69. Zhou, T., Tulsiani, S., Sun, W., Malik, J., Efros, A.A.: View synthesis by appearance flow. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9908, pp. 286–301. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46493-0_18

  70. Zhu, J.Y., Park, T., Isola, P., Efros, A.A.: Unpaired image-to-image translation using cycle-consistent adversarial networks. In: Proceedings of the International Conference on Computer Vision (ICCV), pp. 2223–2232 (2017)

  71. Zhu, J.Y., et al.: Toward multimodal image-to-image translation. In: Neural Information Processing Systems, pp. 465–476 (2017)

  72. Zitnick, C.L., Kang, S.B., Uyttendaele, M., Winder, S.A.J., Szeliski, R.: High-quality video view interpolation using a layered representation. In: SIGGRAPH 2004 (2004)

Acknowledgements

We thank Kai Zhang, Jin Sun, and Qianqian Wang for helpful discussions. This research was supported in part by the generosity of Eric and Wendy Schmidt by recommendation of the Schmidt Futures program.

Author information

Corresponding author

Correspondence to Zhengqi Li.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (zip 47350 KB)

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Li, Z., Xian, W., Davis, A., Snavely, N. (2020). Crowdsampling the Plenoptic Function. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.M. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science, vol. 12346. Springer, Cham. https://doi.org/10.1007/978-3-030-58452-8_11

  • DOI: https://doi.org/10.1007/978-3-030-58452-8_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58451-1

  • Online ISBN: 978-3-030-58452-8

  • eBook Packages: Computer Science, Computer Science (R0)
