
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis

  • Conference paper
  • Computer Vision – ECCV 2020 (ECCV 2020)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12346)

Abstract

We present a method that achieves state-of-the-art results for synthesizing novel views of complex scenes by optimizing an underlying continuous volumetric scene function using a sparse set of input views. Our algorithm represents a scene using a fully-connected (non-convolutional) deep network, whose input is a single continuous 5D coordinate (spatial location \((x, y, z)\) and viewing direction \((\theta ,\phi )\)) and whose output is the volume density and view-dependent emitted radiance at that spatial location. We synthesize views by querying 5D coordinates along camera rays and use classic volume rendering techniques to project the output colors and densities into an image. Because volume rendering is naturally differentiable, the only input required to optimize our representation is a set of images with known camera poses. We describe how to effectively optimize neural radiance fields to render photorealistic novel views of scenes with complicated geometry and appearance, and demonstrate results that outperform prior work on neural rendering and view synthesis. View synthesis results are best viewed as videos, so we urge readers to view our supplementary video for convincing comparisons.
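
To make the rendering procedure described in the abstract concrete, the following is a minimal NumPy sketch (not the authors' released implementation) of the classic emission-absorption volume rendering quadrature it refers to: densities and view-dependent colors queried from a 5D scene function along one camera ray are composited into a single pixel color. The `radiance_field` callable stands in for the trained MLP, and the near/far bounds and sample count are illustrative assumptions rather than values taken from the paper.

```python
# A minimal sketch, not the authors' released code: compositing densities and
# view-dependent colors sampled along a camera ray into one pixel color using
# classic emission-absorption volume rendering.
# `radiance_field`, the near/far bounds, and the sample count are illustrative
# assumptions, not values from the paper.
import numpy as np

def composite_ray(sigmas, colors, deltas):
    """Composite per-sample densities/colors along a ray into one RGB value.

    sigmas: (N,) volume densities at the sampled points
    colors: (N, 3) view-dependent emitted RGB at the sampled points
    deltas: (N,) distances between adjacent samples
    """
    alphas = 1.0 - np.exp(-sigmas * deltas)           # opacity of each ray segment
    trans = np.cumprod(1.0 - alphas + 1e-10)          # accumulated transmittance
    trans = np.concatenate([[1.0], trans[:-1]])       # light reaching segment i from the camera
    weights = trans * alphas                          # contribution of each sample
    return (weights[:, None] * colors).sum(axis=0)    # composited pixel color

def render_pixel(radiance_field, origin, direction, near=2.0, far=6.0, n_samples=64):
    """Query a (hypothetical) radiance field along one ray and composite the result."""
    t = np.linspace(near, far, n_samples)             # sample depths along the ray
    points = origin + t[:, None] * direction          # 3D sample locations
    sigmas, colors = radiance_field(points, direction)  # MLP query: density + RGB
    deltas = np.diff(t, append=t[-1] + (t[-1] - t[-2]))  # spacing between samples
    return composite_ray(sigmas, colors, deltas)
```

Because every step of this quadrature is differentiable, rendering pixels this way and minimizing their squared error against the observed images is sufficient to optimize the network weights from posed photographs alone, as the abstract states.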

B. Mildenhall, P. P. Srinivasan, and M. Tancik contributed equally to this work.


Author information

Corresponding author

Correspondence to Ben Mildenhall.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 1880 KB)

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Mildenhall, B., Srinivasan, P.P., Tancik, M., Barron, J.T., Ramamoorthi, R., Ng, R. (2020). NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.M. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science, vol. 12346. Springer, Cham. https://doi.org/10.1007/978-3-030-58452-8_24

  • DOI: https://doi.org/10.1007/978-3-030-58452-8_24

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58451-1

  • Online ISBN: 978-3-030-58452-8

