A method for estimating the errors in many-light rendering with supersampling

In many-light rendering, a variety of visual and illumination effects, including anti-aliasing, depth of field, volumetric scattering, and subsurface scattering, are combined to create a number of virtual point lights (VPLs). This is done in order to simplify computation of the resulting illumination. Naive approaches that sum the direct illumination from many VPLs are computationally expensive; scalable methods can be computed more efficiently by clustering VPLs, and then estimating their sum by sampling a small number of VPLs. Although significant speed-up has been achieved using scalable methods, clustering leads to uncontrollable errors, resulting in noise in the rendered images. In this paper, we propose a method to improve the estimation accuracy of many-light rendering involving such visual and illumination effects. We demonstrate that our method can improve the estimation accuracy by a factor of 2.3 over the previous method.


Introduction
Many-light rendering methods simplify the complex computation of global illumination into a simple summation of the contributions from many virtual point lights (VPLs) [1]. In many-light rendering, the incident radiance at a point to be shaded (the shading point) is calculated using VPLs. Because the number of VPLs used is generally quite large, previous methods [2][3][4] have clustered VPLs to streamline the computation. These methods, however, cannot control the errors produced by clustering, resulting in noise in the rendered images. To eliminate noise in such systems, users must adjust clustering parameters in tedious trial-and-error processes.
To address this problem, Nabata et al. [5] proposed a method to estimate errors in the pixel values due to VPL clustering. This method, however, estimates each pixel value using a single shading point and cannot be applied to visual effects such as anti-aliasing and depth-of-field (DOF), which require multiple shading points to estimate the pixel value. Walter et al. [6] proposed a method called multidimensional lightcuts (MDLC ), which attempts to control the clustering errors in many-light rendering of various visual effects. This method, however, does not estimate the error in each pixel value, which appears as noise in the rendered images. This paper proposes a method to improve the accuracy of pixel values for visual effects such as anti-aliasing, DOF, and volumetric effects, as shown in Fig. 1. Our method automatically partitions clusters and continues sampling them until the relative errors in the pixel values are smaller than a user-specified threshold.

Scalable VPL rendering
Recent advances in many-light rendering have demonstrated that global illumination effects can be more than adequately approximated using many virtual lights [1,7]. Keller [8] introduced the instant radiosity method, which calculates the indirect illumination from virtual point lights (VPLs). Walter et al. [2,6] proposed a scalable solution to many-light methods using a hierarchical representation of VPLs, called cuts. Hašan et al. [9] represented many-light rendering via a large matrix, and explored the matrix structure by using row-column sampling. Ou and Pellacini [3] clustered the shading points into groups called slices and performed matrix rowcolumn sampling for each slice. Georgiev et al. [10] proposed an importance sampling method for VPLs by recording the contributions of the VPLs at each cache point. Yoshida et al. [11] proposed an adaptive cache insertion method to improve VPL sampling. Wu and Chuang [12] proposed the VisibilityCluster algorithm, which approximates the visibilities between each cluster of VPLs and those of the shading points by estimating the averaged visibility. Huo et al. [4] proposed a matrix samplingand-recovery method to efficiently gather contributions from the VPLs by sampling a small number of them. Although these methods can significantly accelerate many-light rendering, the results rely on user-specified parameters, and finding the optimal parameter values remains a challenging task.

Participating media
VPL rendering suffers from splotches that stem from the singularity of the geometry term relating shading points and VPLs. Clamping is often used to avoid these artifacts. Engelhardt et al. [13] proposed a method to compensate for energy loss due to clamping and proposed a rendering method for participating media using VPLs. Novák et al. [14] proposed virtual ray lights (VRLs) to alleviate this singularity, and several groups have proposed acceleration methods [15,16] that use VPLs and VRLs. These methods, however, do not estimate the errors due to clustering of the VPLs and VRLs.

Subsurface scattering
Arbree et al. [17] proposed a scalable rendering method for translucent materials using VPLs. Walter et al. [18] proposed bidirectional lightcuts that supports complex illumination effects, including subsurface scattering. Wu et al. [19] formulated a radiance computation method for subsurface scattering by sampling light-to-surface and surface-tocamera matrices. Although these methods can render translucent materials efficiently by clustering VPLs, they cannot control the errors due to clustering.
To address this problem, we provide an error estimation method that can handle a wide variety of illumination and visual effects. After specifying the relative error tolerance and the probability α, our method stochastically estimates the relative errors with probability α, i.e., using our method, the relative errors of a proportion α of the pixels in the rendered image is likely to be less than .

Background
In many-light rendering, the outgoing radiance L(x, x v ) at shading point x toward the viewpoint x v is calculated by where L is the set of VPLs, I(y) is the intensity of VPL y, and f (y, x, x v ) is the material function that encompasses the bidirectional reflectance distribution function (BRDF) on the surface and the phase function within the volume. V (y, x) and G(y, x) are the visibility and geometry terms [1,7] relating y and x, respectively. To estimate the pixel value for the rendering of the participating media, antialiasing, and DOF, the outgoing radiances from multiple shading points are required, as shown in Fig. 2. For example, when multiple shading points are generated via supersampling for anti-aliasing, the pixel values are calculated using the weighted sum of the outgoing radiances from the shading points. The pixel value I produced by supersampling is calculated as where G is the set of shading points, and W is the weighting function. The weighting function W and material function f depend on the visual effects and the rendered objects. Details of W and f are provided in Sections 4.4-4.6.
The computational cost of Eq. (2) is proportional to the product of the number of VPLs |L| and the number of shading points |G|. Since, in general, many VPLs are used, the computational cost of Eq. (2) is prohibitive. MDLC is an efficient scalable many-light rendering method that clusters VPLs and shading points [6]. MDLC implicitly represents the hierarchy of clusters with a product graph that consists of pairs of clusters for VPLs and shading points. MDLC estimates the upper bound for clustering error, but does not estimate the error in the pixel value (i.e., the sum of the errors due to clustering). We improve the estimation accuracy of the pixel value by combining a method for estimating errors [5] with MDLC [6]. Figure 3 shows an overview of our method, which estimates image pixel values by clustering VPLs and shading points. Clusters of VPLs and shading points are referred to as VPL clusters and shading clusters, respectively. VPL clusters and shading clusters are represented by binary trees, whose leaf nodes correspond to VPLs (or shading points), and whose inner nodes correspond to clusters of VPLs (or shading points). As shown in Fig. 3(b), we refer to the binary trees for the VPL clusters and the shading clusters as the light tree and the shading tree, respectively. Following MDLC, we denote a cluster pair consisting of a VPL cluster C L and a shading cluster C G by (C L , C G ).

Overview
We create the VPL clusters and build the light tree, used for all pixels, in a preprocess. For each pixel, we first generate shading points and build the shading tree by clustering the shading points. We prepare a priority queue Q that stores the cluster pairs in descending order of standard deviation of each pair. Q is initialized with the pair (C L r , C G r ) of root nodes of the light tree and the shading tree. By using this pair, the estimateÎ of the pixel value and (c) Two VPLs and shading points from each cluster are sampled from the pair (C L r , C G r ), then the pixel valueÎ, error ΔÎ, and standard deviation are estimated. (d) One of the clusters is subdivided if the estimated error ΔÎ is greater than tolerance Î . A cluster is selected based on the length of the diagonal of its bounding box weighted by the numbers of VPLs and shading points in each cluster. Here, the VPL cluster C L r is selected and replaced with two child nodes C L 1 and C L 2 ; two new pairs (C L 1 , C G r ) and (C L 2 , C G r ) are created. The pixel value is estimated from the two pairs (C L 1 , C G r ) and (C L 2 , C G r ).
the estimated error ΔÎ = |Î − I| are initialized, and the standard deviation σ is calculated (see Section 4.2). Then we repeat the following processes: 2. Subdivide one of the two clusters, C L i or C G i , and move down one step in the corresponding binary tree. Cluster selection is described in Section 4.3. Replace the selected cluster with two child nodes and create two new pairs.
3. UpdateÎ and ΔÎ, and calculate the standard deviations for the two new pairs. Push the two new pairs into Q.
4. Terminate the process if the estimated error, ΔÎ, is smaller than the tolerance, Î , where is the relative error tolerance. Otherwise, return to Step 1.

Estimating I and ΔI
Outgoing radiances I i due to the shading cluster C G i illuminated from VPL cluster C L i are calculated by (3) As summing over all the VPLs and shading points in clusters C L i and C G i is computationally expensive, our method estimates I i by sampling a small number of VPLs and shading points as follows: where x k and y k are the k-th sample of the shading point and VPL, respectively, K is the number of samples, and p G and p L are probability mass functions for sampling, calculated as follows: Substituting these functions into Eq. (4),Î i is calculated from: Computing the error ΔI = |Î − I| necessitates knowing the true value of I in Eq. (2), but the computational cost of obtaining this value is high. Therefore, our method estimates ΔÎ using the following equation [5]: where t α is the α quantile of the t-distribution, N is the number of pairs, and s 2 i is the sample variance of I i for the i-th pair of (C L i , C G i ); Eq. (6) is derived in the Appendix.
To select a pair of clusters to be subdivided, the standard deviation σ i for each pair (C L i , C G i ) is required. Although the sample variance s 2 i can be used to estimate σ i , the accuracy of this estimate is low, since our method estimatesÎ i with K = 2 samples, as does the previous method [5]. Instead of using the sample variance, our method calculates the standard deviation σ i for the i-th pair using the upper bounds of f and G, as follows: where f ub and G ub are the upper bounds of f and G, respectively, within the VPL cluster C L i and shading cluster C G i . Var [V ] is the standard deviation of the visibility function; a maximum value of 0.5 is used. f ub and G ub for C L i and C G i are calculated in a similar way to those in MDLC.

Cluster selection
We now determine which cluster of the two clusters in the pair with maximum standard deviation is to be subdivided. In MDLC, the cluster is chosen by a refinement heuristic; it basically selects the cluster with the largest bounding box. Although this works well in some cases, it predominantly selects the VPL cluster C L , since VPLs are usually distributed throughout a scene, whereas the shading points are distributed locally, as shown by Fig. 2. In addition, the numbers of VPLs and shading points are often substantially different (e.g., the number of VPLs in Fig. 1 is 35.1k while the number of shading points is only 256). These facts cause unequal subdivisions between VPL clusters and shading clusters, which results in lower estimation accuracy. To address this problem, we propose a cluster selection method that accounts for the size difference between the bounding boxes of the root nodes of the VPL cluster and the shading cluster. We also consider the number of VPLs and shading points when selecting the cluster to be subdivided. Specifically, the bounding box for each shading cluster is scaled using the following two coefficients: where C L r and C G r are the root nodes of the VPL and shading clusters, respectively, l C L r and l C G r are the diagonal lengths of the bounding boxes for C L r and C G r , respectively, and |L| and |G| are the numbers of VPLs and shading points, respectively.

Anti-aliasing and DOF
In many-light rendering with anti-aliasing or DOF, N spp viewing rays are generated via sampling on the screen pixel or camera lens, and the point of intersection between each viewing ray and the surface in the scene is calculated. The material function f at each shading point x i is represented by a BRDF f r . As we use a simple box filter, the weighting function is calculated by W (x i ) = 1/N spp ; other filters (e.g., Gaussian filters) can also be applied.
Here, for simplicity, we describe methods for rendering translucent materials and participating media with a single ray per pixel; however, with simple modifications, our method can render these materials using multiple rays, as shown in Fig. 8.

Rendering translucent materials
The outgoing radiance at x o on the surface of a translucent material in viewing direction ω o due to multiple scattered light within the translucent material is calculated using the diffuse bidirectional scattering surface reflectance distribution function (BSSRDF) R d [20]: subsurface scattering of light at x j in viewing direction ω j is given by where T η is the Fresnel transmittance, x i and n i are a point and the normal to the surface of the translucent material at x i , respectively, r ij = x i − x j , A is the set of points on the surface of the translucent material, and Ω is a set of directions over the hemisphere. Our method traces N spp rays through each pixel and defines the point of intersection between the j-th ray and the surface of the translucent material as x j . The pixel value due to the subsurface scattering of light is calculated by where ω i = (y − x i )/ y − x i . Our method samples |G| shading points around x j using the probability density function p d , which is as proportional as possible to the diffuse BSSRDF R d [21]. Using importance-sampled shading points, Eq. (10) becomes By comparing Eqs. (11) and (2), the weighting function W and material function f are represented as follows:

.6 Rendering participating media
Like MDLC, our method assumes homogeneous participating media and an isotropic phase function. To calculate a pixel value in the presence of homogeneous participating media, the scattered radiances are integrated along the viewing ray, as shown in Fig. 2(b). Let x o be the point of intersection of the viewing ray and the surface in the scene. Thus, the outgoing radiance from x o is calculated using the sum of the reflected radiance L s at x o and the scattered radiance L m along the viewing ray, as follows:

I(y)f p (y, x(t), x v )V (y, x(t))G(y, x(t))
where f p is the (isotropic) phase function.
To compute the integral in Eq. (13), shading points x i are generated by uniformly subdividing the viewing ray with step size Δt; L m is calculated by summing over the shading points and VPLs as follows: Comparing this equation with Eq. (2), the material function f is represented by the phase function f p , and the weighting function W (x i ) is calculated using Table 1 shows the numbers of VPLs and shading points, as well as the computational time for our method and MDLC measured using a PC with an Intel Xeon E5-2650 v4 2.20 GHz CPU. In all the calculations, the relative tolerance was set to 2%, and the α quantile for Eq. (6) was set to 95%. To measure the estimation accuracy, we defined R as the percentage of pixels satisfying Î − I < I; the estimation is accurate when R is close to α (95% in our experiments). Figure 1 shows results of our rendering method with anti-aliasing, where the insets compare (a) the reference solution, (b) our method, and (c) MDLC. The reference solution is rendered by summing up all the contributions from the VPLs at all the shading points to calculate the pixel values. Insets (d) and (e) illustrate the relative errors and R for our method and MDLC. This figure demonstrates that our method can improve upon the estimation accuracy achieved using MDLC, with corresponding improvements in noise, as  Fig. 1(c). Figure 4 shows a San Miguel scene rendered with DOF. Figure 5 shows a Cornell box scene filled with homogeneous participating media. As shown in insets (c) of Figs. 4 and 5, the stochastic noise is perceptible when using MDLC, whereas it is imperceptible using our error estimation method (see insets (b)). Figure 6 shows a Sponza scene filled with homogeneous participating media. Figure 7 shows a Cornell box scene where the two boxes consist of translucent material. Figure 8 shows a human head model rendered with a diffuse BSSRDF and anti-aliasing. In all of these experiments, our method showed improved estimation accuracy R over MDLC.     In Figs. 9 and 10, cluster selections without scaling (the second column), scaling by c l (the third column), and scaling by c d (the fourth column) are compared. Scaling by c d yields the best estimation accuracy R for five out of six of these scenes. Moreover, comparing results for these scenes without scaling with those with scaling demonstrates that scaling by c d consistently improves the estimation accuracy. In Fig. 9(a), the first row of the San Miguel scene with anti-aliasing shows relative errors greater than 5%, especially around the ivy-covered wall in the first column, but are reduced in the third column. In Fig. 9(d), the fourth row of the kitchen scene with DOF, relative errors greater than 5% can be seen, especially around the shadow boundaries and outlines. We think that these low estimation accuracies arise from the uneven cluster subdivisions (i.e., shading clusters are not subdivided, whereas VPL clusters are predominantly subdivided). Comparing scalings by c l and c d shows that estimation accuracies in the three scenes deteriorate when using c l , although c l yields the best estimation accuracy in the San Miguel scene with DOF (c). Based on these experiments, we used the scaling coefficient c d to select the clusters.

Figures 1 and 4-8 show our results.
While our method outperformed MDLC in estimation accuracy, its computational time was from 1.11 to 8.55 times greater. We attempted to adjust  Cluster selection without scaling (the second column), with scaling by c l (the third column), and scaling by c d (the fourth column) for subsurface scattering (above) and participating media (below). Although scaling by c l provides better estimation accuracy R than not scaling in the human head scene (above), relative errors greater than 5% can be seen in the neck. Scaling using c d provides the best estimation accuracy and relative errors greater than 5% are not seen. In the Cornell-box scene with homogeneous participating media (below), since VPLs and shading points are distributed throughout the scene, the improvements due to the use of c d are subtle, but its use does not impair the estimation accuracy.
the parameter so that the estimation accuracy of MDLC was similar to our result in Fig. 5. After several trial-and-error processes and re-renderings, R of MDLC became 91.2% with = 0.5%. The computational time for MDLC with = 0.5% was 33.5 s, which is comparable to our result (35.6 s). However, our method does not require the tedious trial-and-error processes needed for MDLC.

Conclusions and future work
We have presented a scalable many-light rendering method that can improve the estimation accuracy for various visual and illumination effects. Our method automatically partitions VPL and shading clusters so that pixel errors are smaller than the relative tolerance .
Currently, our method is limited to homogeneous participating media with isotropic scattering. We would like to lift this limitation in future work. Moreover, we intend to apply our method to motion blur.
If the samples in each pair follow a normal distribution and Neyman allocation is used for the number of samples for each pair, their statistics T follow a tdistribution with (n − N ) degrees of freedom: where N is the number of pairs, n i is the number of samples for the i-th pair (C L i , C G i ), n is the total number of samples (i.e., n = N i=1 n i ), and s i is the sample variance of the i-th pair. The error ΔI =Î−I is calculated using the α quantile in the t-distribution, t α , as To estimate the pixel value, it is known to be more efficient to use a large number of pairs with a small number of samples than a small number of pairs with many samples. Since at least two samples for each pair is required to calculate the sample variance s i , we sample two VPLs and shading points from the pair (K = 2 in Eq. (4)). By substituting n i = 2 and n = 2N in Eq. (15) The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Other papers from this open access journal are available free of charge from http://www.springer.com/journal/41095. To submit a manuscript, please go to https://www. editorialmanager.com/cvmj.