
1 Introduction

With the recent rapid growth of interest in stereoscopic head-mounted displays (HMDs) such as the Oculus Rift [1], content for HMDs is being created at a steadily increasing rate. One important application of HMD content is the creation of virtual reality (VR) and augmented reality (AR) experiences. In particular, AR is a live, direct or indirect view of a real-world environment whose elements are augmented by computer-generated content such as 3D virtual objects. To maximize the immersive experience of AR content on HMDs, seamless composition of 3D virtual objects with the real-world environment is important. Image based lighting (IBL) [2, 3] has been used to emulate global illumination (GI) within a real-world scene, where the distant ambient and directional lighting is stored in a single image acting as the radiance map. This method provides high-quality, realistic lighting useful for photo-realistic rendering and for composition with real-world scenes in live-action films and augmented reality. The image used for IBL is created by capturing high dynamic range (HDR) 360-degree panoramic images from photographs taken at various angles and exposure levels. A 360-degree panoramic image also provides an ideal and intuitive format for HMDs, covering the whole range of viewpoints arising from motion of the viewer’s head. Because of this, hardware for capturing panoramic images and video has become readily available, and as a result many 360\(^{\circ }\) videos (video captured with a full spherical \(4\pi \) steradian field of view) can be found on popular video sharing websites such as YouTube [4]. These videos provide a high level of immersion when viewed via HMDs, at a relatively low cost of content production.

Fig. 1.

IBL rendering for stereoscopic output using 360\(^{\circ }\) video via our pipeline (left), and the result in a head-mounted display (right).

The use of 360-degree panoramic images for virtual object lighting in augmented reality has already been proposed [5] and studied in previous research [6]. However, when using conventional 360\(^{\circ }\) video as the radiance map for IBL, we need to consider some additional challenges. Firstly, IBL requires an HDR image of the scene environment, which is captured at various angles and exposure levels and assembled in a user-guided post-processing step. Although it is possible to create HDR 360\(^{\circ }\) video using a special device setup such as in [7, 8], conventional 360\(^{\circ }\) videos [4] are captured with low dynamic range sensors and do not provide enough dynamic range for direct use in IBL. Secondly, the IBL technique must support real-time rendering of a pair of stereo images at high frame rate; HMDs often require stereo rendering at 60 to 90 frames per second (FPS) to prevent visual discomfort [9]. Finally, when considering a live 360\(^{\circ }\) video stream for immersive AR applications, precomputation of the radiance map is difficult and of limited use [10].

In this paper we present a novel pipeline that addresses the above challenges practically. Recent perceptual studies [11, 12] have shown that a suitable inverse tone-mapping algorithm can reconstruct, from low dynamic range (LDR) input, the dynamic range required for IBL to a level at which the human visual system (HVS) cannot perceive the difference. Chalmers et al. [11] provide a threshold on image resolution that maintains the seamlessness of the final illumination composition. We adapt these perceptual thresholds of the HVS to optimize our pipeline. The dynamic range of the LDR 360\(^{\circ }\) video is expanded to HDR using an inverse tone-mapping operator. By using low-resolution versions of the panoramic video in lighting calculations, for which the result is perceptually similar, we are able to emulate various common material properties in real-time. A mipmap-based specular sampling scheme provides fast rendering even for glossy specular objects. Since our pipeline requires no precomputation, it can support a live 360-degree panoramic video stream as the radiance map, and the process fits easily into a standard GPU rasterization pipeline. The resulting pipeline reliably provides framerates of over 75 Hz, as required for comfortable viewing on stereo HMDs. The result is IBL of 3D objects whose illumination mixes seamlessly with the 360-degree panoramic video backdrop. To the best of our knowledge, this is the first practical system that provides interactive IBL from an LDR 360\(^{\circ }\) live video stream suitable for HMDs. An overview of the system pipeline is shown in Fig. 2, and examples of results for some test video frames can be seen in Fig. 8.

Fig. 2.

System overview

2 Background and Related Work

Image based lighting was described early on by Miller and Hoffman [2] and popularized by Debevec [3], who, among other applications, used it to convincingly render virtual objects into real-life photographs [13]. Heidrich and Seidel [14] described real-time use of IBL, precomputing diffuse and glossy material lighting integrals to efficiently render these materials by looking up the precomputed values according to surface normal or reflection direction. Kautz and McCool [15] simulated more complex material types by approximating their reflectance properties as a linear combination of glossy reflections from multiple directions. These techniques form the core of many real-time IBL applications (see for example [5, 16]).

Ramamoorthi and Hanrahan [17] described an efficient way to represent diffuse radiance maps, showing that the precomputed lighting integrals for diffuse materials can be described using only a very small number of coefficients in a spherical harmonics basis. King [10] showed that these coefficients can be calculated in real-time on a dedicated graphics processor, allowing real-time IBL of diffuse materials without precomputation.

An alternative method of calculating material lighting, which has become more practical as graphics hardware has improved, is to approximate the lighting integral directly by sampling from many points on the environment image. Recently this has been used by Kronander et al. [8] to render a virtual object into video in real-time (around 25 frames per second on an Nvidia GeForce 770). For real-time performance it is necessary to use an importance sampling technique (such as [18]) to get the most accurate result with the smallest number of samples. To obtain their environment image they used a special device: an additional HDR video camera, mounted underneath the primary LDR video camera shooting the scene, which recorded the environment via an attached light probe [3].

Recently Michiels et al. [19] presented a method for IBL using 360\(^{\circ }\) video to provide lighting for virtual objects. They analyze video from a moving 360\(^{\circ }\) camera in order to determine the position of the camera at each frame, and then use the frames to vary the lighting on a virtual object as it is moved around a reconstructed virtual environment. Besides having a different focus, their technique also differs from ours: they calculate lighting in terms of spherical radial basis functions [20], which requires precomputing the lighting for each video frame in that basis. To the best of our knowledge, no prior work achieves real-time IBL rendering using conventional LDR 360\(^{\circ }\) video as the radiance map without a special device or precomputation.

The main difference in our technique stems from recent perceptual studies. Chalmers et al. [11] observed that the resolution of an environment image used for lighting can be greatly reduced without causing any noticeable difference to the rendered scene. Akyüz et al. [12] found that a simple inverse tone-mapping operator is often adequate for believably converting LDR images to HDR. We use these results to perform lighting calculations in real-time, reducing the resolution of our input and expanding LDR to HDR as appropriate.

3 Real-Time IBL from 360-Degree Video

We present a real-time IBL system using 360\(^{\circ }\) video. The input video frame can be used directly to represent mirror-like specular reflection, and together with techniques for representing glossy and diffuse surfaces, a large number of real-world opaque materials can be simulated. To provide real-time diffuse illumination, we generate a new diffuse radiance map per frame, which can be done in real-time at a perceptually optimized reduced resolution [11]. For specular material types, including mirror-like and glossy specular, we sample light from the radiance map according to the reflectance function, at an appropriately reduced resolution using mipmaps [21]. We test our method using three different material setups: diffuse reflection, pure specular reflection, and glossy specular reflection. These properties can be combined to simulate a wide range of believable materials.

Fig. 3.

Diffuse radiance maps computed at various resolutions. Generation time on an Nvidia GeForce 730 is given in parentheses. Maps are displayed and sampled using hardware bilinear filtering.

3.1 Diffuse Illumination

Diffuse lighting is calculated by generating a diffuse radiance map, as in [2], and sampling from it according to the surface normal direction. Assuming a fixed aspect ratio, generating the diffuse radiance map is an \(O(N^4)\) operation in the environment image height, since every output direction integrates contributions from every input pixel. Even for fairly low resolution environments this quickly becomes prohibitive. We find, however, that when viewed on HMDs the diffuse lighting remains visually similar to the full-resolution version even when calculated at resolutions as low as 32 by 16 pixels and sampled with bilinear filtering, as shown in Fig. 3. Most of the output images in this paper were generated using 32 by 16 diffuse maps; a comparison including generation times can be seen in Fig. 3. Even at 128 by 64 pixels the maps can be generated on a low-end graphics card faster than standard video framerates require, and as Chalmers et al. [11] show, lighting using diffuse maps down to 80 by 40 pixels can be perceptually indistinguishable from lighting using full-resolution maps. Once the diffuse radiance map for a frame is generated, the diffuse lighting calculation for any point on an object’s surface consists of a single texture lookup.
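As an illustration, the following is a minimal CPU sketch of this cosine-weighted convolution for an equirectangular radiance map; the helper names and the Lambertian normalization by \(\pi \) are illustrative choices, not the exact GPU implementation.

```python
import numpy as np

def equirect_dirs(h, w):
    """Unit direction vectors at the pixel centres of an h x w equirectangular map (+Y up)."""
    theta = (np.arange(h) + 0.5) / h * np.pi            # polar angle from +Y
    phi = (np.arange(w) + 0.5) / w * 2.0 * np.pi        # azimuth
    st, ct = np.sin(theta)[:, None], np.cos(theta)[:, None]
    dirs = np.stack([st * np.cos(phi)[None, :],
                     np.broadcast_to(ct, (h, w)),
                     st * np.sin(phi)[None, :]], axis=-1)
    return dirs, st                                      # st = sin(theta), used for solid angles

def diffuse_radiance_map(env, out_h=16, out_w=32):
    """Cosine-convolve a linear-radiance equirectangular map env (H, W, 3) into a
    low-resolution diffuse (irradiance) map of size out_h x out_w."""
    H, W, _ = env.shape
    in_dirs, sin_t = equirect_dirs(H, W)
    d_omega = (np.pi / H) * (2.0 * np.pi / W) * sin_t    # solid angle of each input pixel
    weighted = env * d_omega[..., None]
    out_dirs, _ = equirect_dirs(out_h, out_w)
    out = np.zeros((out_h, out_w, 3))
    for i in range(out_h):                               # one output texel per normal direction
        for j in range(out_w):
            n = out_dirs[i, j]
            cosine = np.clip(in_dirs @ n, 0.0, None)     # clamped cosine lobe
            out[i, j] = (weighted * cosine[..., None]).sum(axis=(0, 1)) / np.pi
    return out
```

On the GPU the same convolution parallelizes naturally over output texels, which is what makes regenerating the 32 by 16 map every video frame feasible.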

Fig. 4.

Specular and glossy specular reflection. The environment is sampled at and around the direction of specular reflection according to the size of the glossy lobe, which is determined by a material roughness parameter.

3.2 Specular Illumination

Pure specular reflection is easily achieved by using the input video frame as an environment map. To calculate the specular component of the material colour, we take the vector from the surface to the camera, reflect it across the surface normal, and sample the environment map in that direction, as shown in Fig. 4a. Generating accurate glossy specular reflection is computationally expensive. It can, however, be approximated by computing glossy environment maps in a similar manner to the diffuse environment map. This works well for very rough surfaces, where the glossy lighting calculation is similar to the diffuse one, but it becomes prohibitive as the surface roughness decreases and the gloss level approaches mirror reflection: the closer to mirror reflection the gloss level is, the higher the resolution of the glossy environment map needed to describe it.
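For reference, a pure specular lookup amounts to one vector reflection and one equirectangular texture fetch; the sketch below uses a nearest-neighbour lookup and illustrative helper names, not the exact shader code.

```python
import numpy as np

def reflect(view, normal):
    """Reflect the (unit) surface-to-camera vector across the (unit) surface normal."""
    return 2.0 * np.dot(view, normal) * normal - view

def dir_to_equirect_uv(d):
    """Map a unit direction to (v, u) texture coordinates in [0, 1) for an
    equirectangular image with +Y up."""
    u = (np.arctan2(d[2], d[0]) / (2.0 * np.pi)) % 1.0
    v = np.arccos(np.clip(d[1], -1.0, 1.0)) / np.pi
    return v, u

def sample_equirect(img, d):
    """Nearest-neighbour lookup of direction d in an equirectangular image (H, W, 3)."""
    v, u = dir_to_equirect_uv(d)
    H, W, _ = img.shape
    return img[min(int(v * H), H - 1), min(int(u * W), W - 1)]

# Pure specular shading of a fragment is then a single lookup in the video frame:
#   colour = sample_equirect(env_frame, reflect(view_dir, surface_normal))
```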

Fig. 5.

To approximate glossy reflection we sample from an appropriate mipmap level of the input environment map, according to the size of the glossy lobe. The number of samples remains constant.

An alternative approach is to sample directly from the specular radiance map (in our case the original video frame), which we do efficiently using a technique similar to that of Colbert and Křivánek [21]. We take a small number of samples in a radius around the specular direction, choosing a mipmap level appropriate to the angular distance between samples. The sampling radius depends on the surface roughness parameter: for lower roughness a higher-resolution mipmap level is sampled, but the decreasing radius of the glossy reflection lobe (see Figs. 4b and 5) means the same number of samples is required regardless of roughness. In this way glossy specular lighting can be approximated using a fixed number of texture lookups per rendering fragment. In our tests we found that taking about 18 samples inside the primary glossy lobe and 18 samples outside it, weighted by a simple Phong model [22], gave fast and believable glossy surface lighting. While this is sufficient for demonstrating the validity of our pipeline, more complex or efficient techniques such as [21, 23] could easily be substituted (Fig. 6).
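A sketch of this ring-based sampling scheme follows; the Phong-exponent-from-roughness mapping, the exact ring layout, and the assumed 2048-pixel-wide base mip level are illustrative choices consistent with the description here and in Sect. 5, not the shader used in the paper.

```python
import numpy as np

def glossy_sample_dirs(refl, roughness, rings_in=3, rings_out=3, per_ring=6):
    """Generate sample directions and Phong weights in concentric rings around the
    specular reflection direction, plus a suggested (fractional) mip level."""
    refl = refl / np.linalg.norm(refl)
    # Build an orthonormal basis (t, b, refl) around the reflection direction.
    up = np.array([0.0, 1.0, 0.0]) if abs(refl[1]) < 0.9 else np.array([1.0, 0.0, 0.0])
    t = np.cross(up, refl); t /= np.linalg.norm(t)
    b = np.cross(refl, t)

    # Phong exponent from roughness; the lobe radius is the angle at which the
    # Phong weight falls to 0.5 (as in Sect. 5).
    n_phong = max(2.0 / max(roughness, 1e-3) ** 2 - 2.0, 1.0)
    lobe_radius = np.arccos(0.5 ** (1.0 / n_phong))

    dirs, weights = [], []
    for r in range(1, rings_in + rings_out + 1):
        angle = lobe_radius * r / rings_in               # rings 1..3 inside the lobe, 4..6 outside
        for k in range(per_ring):
            az = 2.0 * np.pi * k / per_ring
            d = (np.cos(angle) * refl
                 + np.sin(angle) * (np.cos(az) * t + np.sin(az) * b))
            dirs.append(d)
            weights.append(max(np.cos(angle), 0.0) ** n_phong)   # Phong weighting [22]
    dirs, weights = np.array(dirs), np.array(weights)
    weights /= weights.sum()

    # Pick the mip level whose texel spacing roughly matches the sample spacing.
    sample_spacing = 2.0 * np.pi / per_ring * np.sin(lobe_radius)  # radians between ring samples
    base_texel = 2.0 * np.pi / 2048.0                              # level-0 texel size (assumed width)
    mip_level = max(float(np.log2(sample_spacing / base_texel)), 0.0)
    return dirs, weights, mip_level
```

The shaded glossy colour is then the weighted sum of lookups of these directions at the chosen mip level, with trilinear filtering between levels on the GPU.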

Fig. 6.

Glossy specular reflection for various lobe sizes determined by a roughness parameter of, from left to right: 0.01, 0.05, 0.1, 0.2, 0.5.

4 Inverse Tone-Mapping from LDR to HDR

In a typical IBL setup, all lighting calculations will assume HDR lighting input. The difference between scenes rendered using LDR and HDR IBL is immediately apparent (see Fig. 7), with HDR lighting greatly increasing the realism of scenes when compared with LDR lighting. It turns out, however, that simple automatic LDR to HDR conversions can be sufficient for creating believable lighting effects [11, 12] when targeting the human visual system. As such we are able to believably light virtual objects using only LDR video.

Fig. 7.

Comparison of lighting using an LDR environment image (left), an LDR-HDR tonemapped image (middle), and an HDR environment image (right). The simple tonemapping method provides believable results when viewed independently, although it may not perfectly match the HDR lighting result (especially for very high contrast scenes such as the bottom scene here).

The inverse tone-mapping operator we use is independent of varying frame properties, applying the same transform to each pixel individually. As such it is easily and efficiently implemented on the GPU. We chose this operator as a compromise between those in [11, 12], and our experimentation suggests that other simple inverse tone-mapping operators would work just as well.

Given an input RGB value we first determine input luminosity as

$$\begin{aligned} L_i = 0.3 \cdot R_i + 0.59 \cdot G_i + 0.11 \cdot B_i\text{, } \end{aligned}$$
(1)

where \(R_i\), \(G_i\) and \(B_i\) are the red, green and blue components of the input image, as values between 0.0 and 1.0, and \(L_i\) is the calculated input luminosity. We then calculate a desired scaling factor \(L_s\) based on this input luminosity as

$$\begin{aligned} L_s = 10 \cdot L_i^{10} + 1.8\text{. } \end{aligned}$$
(2)

The output red, green and blue components are then determined by

$$\begin{aligned}{}[R_o, G_o, B_o] = L_s \cdot [R_i, G_i, B_i]\text{. } \end{aligned}$$
(3)

Parameters here were determined by experiment, and those in Eq. (2) have been used for all examples in this paper. After this operation, the converted image is used as an HDR radiance map in the following rendering steps.
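For concreteness, Eqs. (1)-(3) amount to the following per-pixel expansion (a NumPy sketch; in the pipeline this runs as a GPU shader):

```python
import numpy as np

def inverse_tonemap(ldr):
    """Expand an LDR frame (values in [0, 1], last axis = RGB) to an HDR radiance
    map using Eqs. (1)-(3)."""
    R, G, B = ldr[..., 0], ldr[..., 1], ldr[..., 2]
    L_i = 0.3 * R + 0.59 * G + 0.11 * B      # Eq. (1): input luminosity
    L_s = 10.0 * L_i ** 10 + 1.8             # Eq. (2): per-pixel scaling factor
    return ldr * L_s[..., None]              # Eq. (3): scaled HDR output
```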

5 GPU Implementation

We implemented our method on the GPU and tested it on various consumer-grade video cards, including Nvidia GeForce 690, 770 and 980 as well as AMD Radeon 270. The pipeline is laid out as follows.

While GPU instructions are being queued, a separate thread loads and decodes the next video frame, which is passed to the GPU using a pixel buffer object for asynchronous memory transfer. As our target display refresh rate is much higher than typical video framerates (a minimum of 75 frames per second for the HMD versus a typical 25 frames per second for video), one frame is used for display while the next is being loaded, giving smooth video playback.
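The decode-while-display arrangement can be sketched as a simple producer-consumer pattern; the code below is a hypothetical illustration of that idea only and does not reproduce the pixel buffer object upload.

```python
import queue
import threading

def start_frame_prefetcher(decode_next, buffer_size=2):
    """Decode video frames on a worker thread while the renderer keeps displaying
    the most recent one.  decode_next() returns the next frame, or None at end of
    stream.  The GPU upload via a pixel buffer object is outside this sketch."""
    frames = queue.Queue(maxsize=buffer_size)

    def worker():
        while True:
            frame = decode_next()
            frames.put(frame)
            if frame is None:                 # end of stream
                return

    threading.Thread(target=worker, daemon=True).start()
    return frames

# Render loop idea: poll frames.get_nowait() once per display refresh and keep
# reusing the last decoded frame, so a 75+ Hz display rate is decoupled from
# the 24-30 Hz video rate.
```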

The main pipeline works as depicted in Fig. 2, and the basic procedure is described in Algorithm 1. The diffuse radiance map is upscaled simply by enabling bilinear texture filtering. For head-mounted stereoscopic display, the display process is executed twice, once for each eye (see Fig. 1 for example output). Object lighting is performed in a GPU fragment shader. Specular and diffuse lighting consist simply of texture lookups into the specular and diffuse radiance maps. Glossy specular lighting uses a somewhat more involved scheme of sampling in a fixed pattern from lower-resolution mipmap levels of the specular map: samples are taken in concentric rings around the specular direction, with six samples per ring, three rings inside the glossy lobe (see Fig. 4b) and three rings outside it. The mipmap level to sample from is chosen by taking the distance between samples in a ring and selecting the level with an equivalent distance between pixels. Samples are weighted by a Phong model [22], and the radius of the glossy lobe is defined to be the distance at which the weighting is 0.5. To reduce discretization artifacts, hardware trilinear filtering is enabled.
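A rough outline of the per-frame preprocessing behind this procedure, assuming NumPy arrays and a 2x box filter standing in for hardware mipmap generation; the diffuse map and the per-fragment lookups are produced as sketched in Sects. 3.1 and 3.2.

```python
import numpy as np

def prepare_frame(ldr_frame, mip_levels=6):
    """Per-frame preprocessing before shading: expand the LDR video frame to HDR
    (Eqs. (1)-(3)) and build a mip chain of the result for glossy sampling.
    A 2x box filter stands in for the GPU's mipmap generation."""
    R, G, B = ldr_frame[..., 0], ldr_frame[..., 1], ldr_frame[..., 2]
    hdr = ldr_frame * (10.0 * (0.3 * R + 0.59 * G + 0.11 * B) ** 10 + 1.8)[..., None]
    mips = [hdr]
    for _ in range(mip_levels):
        m = mips[-1]
        h, w = m.shape[0] // 2, m.shape[1] // 2
        mips.append(m[:2 * h, :2 * w].reshape(h, 2, w, 2, 3).mean(axis=(1, 3)))
    # The 32 x 16 diffuse radiance map is then generated from hdr as in Sect. 3.1,
    # and each eye's view is rendered with per-fragment lookups into these maps.
    return hdr, mips
```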


6 Results

We tested our method on several 360\(^{\circ }\) videos obtained from a popular video sharing website [4]. Video resolutions were between 1920\(\,\times \,\)960 and 2048\(\,\times \,\)1024, with framerates between 24 and 30 Hz. We tested various untextured 3D objects, including Teapot (6320 triangles) and Bunny (69451 triangles). Five objects were displayed at once, each using a different combination of the material setups described in Sect. 3. Lighting virtual models in this way, with nothing other than LDR 360\(^{\circ }\) panoramic video as input, gives believable results over a wide variety of input lighting conditions (see Fig. 8). We rendered a single camera view at 1280\(\,\times \,\)720 resolution. By subjective visual comparison, the lighting of the virtual objects appears to match that of the background video in all tested cases.

Fig. 8.

Output of our pipeline for some of our test video frames. The input 360\(^{\circ }\) video frame is displayed on the top row, and the generated diffuse radiance map on the second.

Rendering with LDR frames expanded to HDR by inverse tone-mapping is quite sufficient for believable lighting (see Fig. 7), in agreement with [11, 12]. Rendering using only the LDR frames for lighting appears dull and does not match the background scene. The GPU-based IBL is efficient, executing at around 90 Hz (11 ms per frame) on a GeForce 690 and around 500 Hz (2 ms per frame) on a GeForce 980, leaving plenty of time for additional rendering tasks. Additionally, we tested our method with an Oculus DK2 HMD: using an Nvidia GeForce 690 we were able to render stereo output at 1182\(\,\times \,\)1464 resolution for each eye (see Fig. 1) comfortably at 75 Hz.

7 Conclusion

This paper presents an effective approach to rendering virtual 3D objects using real-time image based lighting (IBL) with conventional 360\(^{\circ }\) panoramic video. Using only low dynamic range 360\(^{\circ }\) panoramic video as input, 3D virtual objects can be convincingly composited into the panoramic video using IBL rendering. This form of video-based object lighting can be done entirely in real-time, with no precomputation required.

The visual quality of our result is similar to that of previous IBL techniques requiring precomputation and HDR environment images. This is achieved using the perceptual observation that low-resolution environment maps and LDR images expanded to HDR can be sufficient for believable lighting.

One aspect of virtual object rendering we do not consider is that of shadowing. A possible extension of this work would be to incorporate an existing real-time shadowing technique such as that of [6]. There is also room for improvement in the compositing technique used to blend the rendered virtual objects with a perspective view of the input video. A potential improvement might be to render the virtual objects directly into the video frame, which we have not yet explored.

Our main contribution is a fully self-contained real-time system for lighting virtual objects so as to match an environment provided via LDR 360\(^{\circ }\) panoramic video content. The novelty of our work is that our system requires no prior analysis of the video stream, can simulate complex materials efficiently, and can work using only readily-available LDR content. The system is efficient enough to render in real-time at the high framerates and resolutions required for immersive head-mounted display. While each component of the pipeline could be improved using more sophisticated methods, a balanced trade-off between perceptible visual quality and sufficient rendering performance is required for practical applications.