Abstract
Photometric camera calibration is often required in physics-based computer vision. There have been a number of studies to estimate camera response functions (gamma functions) and vignetting effects from images. However, less attention has been paid to camera spectral sensitivities and white balance settings. This is unfortunate, since those two properties significantly affect image colors. Motivated by this, a method to estimate camera spectral sensitivities and the white balance setting jointly from images with sky regions is introduced. The basic idea is to use the sky regions to infer the sky spectra. Given sky images as the input, and assuming the sun direction with respect to the camera viewing direction can be extracted, the proposed method estimates the turbidity of the sky by fitting the image intensities to a sky model. Subsequently, it calculates the sky spectra from the estimated turbidity. Having the sky \(RGB\) values and their corresponding spectra, the method estimates the camera spectral sensitivities together with the white balance setting. Precomputed basis functions of camera spectral sensitivities are used in the method for robust estimation. The whole method is novel and practical since, unlike existing methods, it uses sky images without additional hardware, assuming the geolocation of the captured sky is known. Experimental results using various real images show the effectiveness of the method.
1 Introduction
Photometrically calibrating a camera is necessary, particularly when applying physics-based computer vision methods, such as photometric stereo (Woodham 1980; Ikeuchi 1981), shape from shading (Ikeuchi and Horn 1981; Zhang et al. 1999), color constancy (Hordley 2006; Weijer et al. 2007; Tan et al. 2004; Kawakami and Ikeuchi 2009), illumination estimation (Sato et al. 2003; Li et al. 2003; Lalonde et al. 2009), and surface reflectance estimation (Debevec et al. 2000; Hara et al. 2005; Haber et al. 2009). There have been a number of studies on automatic calibration of camera response functions (or gamma functions) and vignetting correction (Lin et al. 2004; Takamatsu et al. 2008; Kuthirummal et al. 2008). These methods produce images whose intensity values are strictly proportional to the scene radiance.
In the computer vision literature, less attention has been paid to estimating camera spectral sensitivities and white balance settings,^{Footnote 1} despite the fact that both are crucial for color calibration between different types of cameras. The lack of attention is because physics-based methods usually assume images are captured by the same camera; thus, the color space used in the whole process is the same. However, when different types of cameras are used, color calibration becomes vital, since without it, identical scene radiance results in different color values.
To highlight the effects of camera spectral sensitivities and white balance settings, Fig. 1 shows the same scene captured by three different consumer cameras. As shown in the figure, the colors of a scene vary due to the different camera spectral sensitivities and white balance settings. To further emphasize the differences, the caption in the figure also includes comparisons in terms of the chromaticity values.
The intensity formation of colored images can be modeled as:

\(I_c = \int _{\varOmega } q_c(\lambda )\, L(\lambda )\, d\lambda ,\)    (1)
where \(I_c\) is the intensity at channel \(c,\) with \(c \in \{r,\,g,\,b\},\) \(\varOmega \) is the range of the visible wavelengths, and \(L\) is the incoming spectral radiance.^{Footnote 2} Considering the von Kries model in computational color constancy, we assume that for different white balance settings, cameras automatically multiply the intensity of each color channel by a different scaling factor (\(k_c\)), namely, \(q_c = k_c q_c^\prime ,\) where \(q_c^\prime \) and \(k_c\) are the spectral sensitivity and the white balance scaling for color channel \(c,\) respectively.
Based on the last equation, our goal is to estimate \(q_c\) from given \(I_c.\) This means that given image intensity values, we intend to estimate the camera spectral sensitivities and white balance setting together, without any intention to separate them (\(q_c = k_c q^\prime _c\)). Note that, \(k_c\) is estimated up to a scale, and thus its relative value can be obtained from \(q_c.\)
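To make the model concrete, the sketch below discretizes Eq. (1) and applies the von Kries channel gains; the wavelength grid, spectra, and sensitivities are synthetic placeholders, not measured data.

```python
import numpy as np

# Discretized Eq. (1): I_c = sum_lambda q_c(lambda) * L(lambda) * d_lambda,
# with q_c = k_c * q'_c (von Kries white-balance gains).
wavelengths = np.arange(400, 701, 10)            # visible range, 10 nm steps
d_lambda = 10.0
L = np.exp(-((wavelengths - 550) / 80.0) ** 2)   # toy incoming spectral radiance

def toy_sensitivity(center, width=40.0):
    """Placeholder spectral sensitivity q'_c(lambda) for one channel."""
    return np.exp(-((wavelengths - center) / width) ** 2)

q_prime = np.stack([toy_sensitivity(600), toy_sensitivity(540),
                    toy_sensitivity(460)])       # r, g, b rows
k = np.array([0.8, 1.0, 1.2])                    # white-balance gains k_c
q = k[:, None] * q_prime                         # q_c = k_c * q'_c

I = q @ L * d_lambda                             # I_c for c in {r, g, b}
```

Because \(k_c\) only scales each row of \(q,\) estimating \(q_c\) directly absorbs the white balance setting, which is exactly the combined quantity the method recovers.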
In the literature, one of the basic techniques to achieve the goal is to use a monochromator (Vora et al. 1997), a special device that can transmit a selected narrow band of wavelengths of light. The method provides accurate estimation, and hence is commonly used. Other methods that do not use a monochromator require both input images and the corresponding scene spectral radiances (Hubel et al. 1994; Sharma and Trussell 1993; Finlayson et al. 1998; Barnard and Funt 2002; Ebner 2008; Thomson and Westland 2001).
Unlike the existing methods, in this paper, we introduce a novel method that uses only images without requiring additional devices. The basic idea of our method is, first, to estimate the sky spectral radiance \(L(\lambda )\) through a sky image \(I_c,\) and then to obtain the mixture of the spectral sensitivities and white balance setting, \(q_c(\lambda ),\) by solving Eq. (1). To our knowledge, this approach is novel, particularly the use of images with sky regions.
To estimate the sky spectral radiance, we calculate the turbidity of the sky from image intensities, assuming the sun direction with respect to the camera viewing direction can be extracted. The calculated turbidity provides the CIE chromaticities that can then be converted to the spectral radiance using the formulas of the Judd daylight phases (Judd et al. 1964).
Having the input sky image and its corresponding spectra, we estimate the camera spectral sensitivities by solving the linear system derived from Eq. (1). However, this solution can be unstable if the variances of the input colors are small, which is the case for sky images. To overcome the problem, we utilize precomputed basis functions.
The main contribution of this paper is a novel method of spectral sensitivity and white balance estimation from images. Other contributions are as follows. First, the improvement of the sky turbidity estimation (Lalonde et al. 2010), where a wide variety of cameras that have different spectral sensitivities and white balance settings can be handled without calibration. Second, a publicly available camera spectral sensitivity database that consists of twelve different cameras (Zhao 2013). Third, the application of the estimated camera spectral sensitivities for physicsbased color correction in outdoor scenes, which according to our experiment, produces better results than those of a color transfer method (Reinhard et al. 2001).
There are a few assumptions used in our method. First, it assumes the presence of the sky in the input images. Ideally, the sky is clear; however, the method performs quite robustly even when the sky is hazy or partially cloudy. Second, it assumes that the sun direction with respect to the camera viewing direction can be extracted. If we have the camera at hand, we can arrange it so that this information can be extracted from the image. If we do not have the camera at hand (e.g., when we utilize pre-stored collections of images, such as those available on the Internet or in old albums), the EXIF tag (the time when the image was taken), the geolocation of the site, and the pose of a reference object in the site can be used to determine the camera viewing and sun directions. While the requirement of a known geolocation and a reference object sounds restrictive, such information can normally be obtained for landmark objects (such as the Statue of Liberty or the Eiffel Tower). Moreover, online services like Google Earth or Google Maps can also be used to determine the geolocation of the site. Third, we share the assumption used in the sky model (Preetham et al. 1999): the atmosphere can be modeled using sky turbidity, which is the ratio of the optical thickness of haze versus that of molecules.^{Footnote 3}
The rest of the paper is organized as follows: Sect. 2 briefly reviews the related work. Section 3 describes the sky turbidity estimation and the calculation from turbidity to spectral radiance. Section 4 explains the estimation of camera spectral sensitivity using basis functions. Section 5 provides the detailed implementation. Section 6 shows the experimental results, followed by Sect. 7, which introduces an application that corrects colors between different cameras based on the estimated camera spectral sensitivities. Section 8 discusses the limitations and the accuracy of the method. Finally, Sect. 9 concludes our paper.
2 Related Work
Most of the existing methods of camera spectral sensitivity estimation (Barnard and Funt 2002; Thomson and Westland 2001) solve the linear system derived from Eq. (1), given a number of spectra and their corresponding \(RGB\) values. However, such estimation is often unstable, since spectral representations of materials and illumination live in a low-dimensional space (Slater and Healey 1998; Parkkinen et al. 1989), which implies that the dimension of the spectra is insufficient to recover high-dimensional camera spectral sensitivity information. To make the estimation stable, further constraints are required in the optimization process, and the existing methods mostly differ in the constraints they use.
Pratt and Mancill (1976) impose a smoothing matrix on pseudo-matrix inversion, compare it with Wiener estimation, and claim that Wiener estimation produces better results. Hubel et al. (1994) later confirm that Wiener estimation does provide smoother results than those of pseudo-matrix inversion. Sharma and Trussell (1993) use a formulation based on set theory and introduce a few constraints on camera spectral sensitivities, such as non-negativity, smoothness, and error variance. Finlayson et al. (1998) represent camera spectral sensitivities by a linear combination of the first 9 or 15 Fourier basis functions, and use the constraint that a camera spectral sensitivity must be unimodal or bimodal. Barnard and Funt (2002) use all of these constraints, replace the absolute intensity error with the relative intensity error, and estimate the camera spectral sensitivities and response function at once. Ebner (2008) uses an evolution strategy along with the positivity and smoothness constraints. Thomson and Westland (2001) use the Gram–Charlier expansion (Frieden 1983) for basis functions to reduce the dimensionality of the camera spectral sensitivities; nonlinear fitting is performed in their method.
The main limitation of the above methods is the requirement of known spectral radiance, which is problematic if the camera is not at hand, or if no additional devices (such as a monochromator or a spectrometer) are available. In contrast, our method does not require known scene spectral radiance. Moreover, in computing the camera spectral sensitivities, we do not use the iterative techniques required by the existing methods, although in the sub-process of estimating turbidity, we do use an optimization process.
It should be noted that several computer vision methods have utilized the radiometric sky model. Yu and Malik (1998) use Perez et al.'s sky model (1993) to calculate the sky radiance from photographs, in the context of recovering the photometric properties of architectural scenes. Lalonde et al. (2010) exploit the visible portion of the sky and estimate turbidity to localize clouds in sky images, which is similar to the technique we use. However, their method cannot be used directly for our purpose, since its optimization is based on the \(xyY\) color space. To convert image \(RGB\) into \(xyY,\) a linear matrix must be estimated from known camera sensitivities and white balance setting, which are obviously unknown in our case. Thus, instead of using \(xyY,\) we assume that the relative intensity (i.e., the ratio of a sample point's intensity over a reference point's intensity) is independent of cameras up to a global scale factor. By fitting the relative intensity between pixels to that of the sky model, our method can estimate the turbidity, which we consider to be an improvement over the method of Lalonde et al. (2010).
3 Estimating Sky Spectra
Camera spectral sensitivities can be estimated if both the image \(RGB\) values and the corresponding spectra, respectively \(I_c\) and \(L(\lambda )\) in Eq. (1), are known. However, in our problem setting, only the pixel values \(I_c\) are known. To overcome this, our idea is to infer the spectra \(L(\lambda )\) from pixel values using sky images: from sky images, turbidity can be estimated, and from turbidity, sky spectra can be obtained. This section focuses on this process. Later, having obtained the sky spectra and the corresponding \(RGB\) values, the camera spectral sensitivities can be calculated from Eq. (1).
3.1 Sky Turbidity Estimation
The appearance of the sky, e.g., its color and clearness, is determined by the scattering and absorption of the solar irradiance caused by air molecules, aerosols, ozone, water vapor, and mixed gases, some of which change according to the climate conditions (Chaiwiwatworakul and Chirarattananon 2004). Aerosols are attributed to many factors, such as volcanic eruptions and forest fires, and are difficult to characterize precisely. However, a single heuristic parameter, namely turbidity, has been studied and used in the atmospheric sciences (Preetham et al. 1999). Higher turbidity implies more scattering and thus a whiter sky.
To estimate turbidity, our basic idea is to match the brightness distribution between an actual image and the sky model proposed by Preetham et al. (1999). The model describes the correlation between the brightness distribution and the sky turbidity based on the simulations of various sun positions and turbidity values. According to it, the luminance \(Y\) of the sky in any viewing direction with respect to the luminance at the zenith \({Y_z}\) is given by:

\(\frac{Y}{Y_z} = \frac{\mathcal{F}(\theta ,\,\gamma )}{\mathcal{F}(0,\,\theta _s)},\)    (2)
where \(\mathcal{F}(., .)\) is the sky brightness distribution function of turbidity developed by Perez et al. (1993), \(\theta _s\) is the zenith angle of the sun, \({\theta }\) is the zenith angle of the viewing direction, and \({\gamma }\) is the angle of the sun direction with respect to the camera viewing direction, as shown in Fig. 2. More details of calculating the sky luminance are provided in Appendix A.
Hence, to estimate turbidity (\(T\)), our method minimizes the following error function:

\(\mathrm{err}(T) = \sum _{i=1}^{n} \left( \frac{J_i}{J_{ref}} - \frac{Y_i}{Y_{ref}} \right) ^2,\)    (3)
where \(n\) represents the number of sample points and \({Y}/Y_{ref}\) is the luminance ratio of the sky, which can be calculated from \(\mathcal{{F}}(\theta ,\,\gamma )/ \mathcal{{F}}(\theta _{ref},\,\gamma _{ref}),\) given the sun direction and the turbidity. \({Y_{ref}}\) is the luminance of a reference point, and we found that it can be the zenith as in Eq. (2), or any other point in the visible sky portion. \(J\) is the total intensity of a pixel:

\(J = I_r + I_g + I_b,\)    (4)
where \(I_c\) is the image intensity defined in Eq. (1). \(J_{ref}\) is the total intensity of a reference pixel. Since we assume the camera gamma function is linear, the image intensity ratio (\(J_i/J_{ref}\)) is proportional to the luminance ratio of the sky (\(Y_i/Y_{ref}\)), regardless of the camera sensitivities and white balance setting. The error function is minimized by Particle Swarm Optimization (Kennedy and Eberhart 1995), which is generally more robust than the Levenberg–Marquardt algorithm when there are several local minima.
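This fitting step can be sketched as follows, using Perez's brightness formula with the turbidity-dependent luminance coefficients published by Preetham et al. (1999); a simple grid search stands in for the Particle Swarm Optimization used in the paper, and the coefficient values should be checked against the original publication.

```python
import numpy as np

def perez_F(theta, gamma, coeff):
    """Perez sky brightness distribution F(theta, gamma)."""
    a, b, c, d, e = coeff
    return ((1 + a * np.exp(b / np.cos(theta))) *
            (1 + c * np.exp(d * gamma) + e * np.cos(gamma) ** 2))

def luminance_coeffs(T):
    """Luminance coefficients as linear functions of turbidity T
    (values as published by Preetham et al. 1999)."""
    return (0.1787 * T - 1.4630,
            -0.3554 * T + 0.4275,
            -0.0227 * T + 5.3251,
            0.1206 * T - 2.5771,
            -0.0670 * T + 0.3703)

def fit_turbidity(theta, gamma, J, ref=0, T_grid=np.linspace(1.5, 20, 371)):
    """Estimate T by matching image intensity ratios J_i/J_ref to the model's
    luminance ratios (Eq. 3). Grid search stands in for the paper's
    Particle Swarm Optimization."""
    best_T, best_err = None, np.inf
    for T in T_grid:
        F = perez_F(theta, gamma, luminance_coeffs(T))
        err = np.sum((J / J[ref] - F / F[ref]) ** 2)
        if err < best_err:
            best_T, best_err = T, err
    return best_T
```

Because only intensity ratios enter the error, the global scale of \(J\) (and thus the unknown camera gain) cancels out, which is what makes the fit camera independent.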
To minimize the error function, the sun direction and the camera viewing direction are required. In this respect, we consider two cases:

(1) The easier case is when a single image is taken using a fisheye lens or an omnidirectional camera. Assuming the optical axis of the camera is perpendicular to the ground, we can fit an ellipse to the saturated pixels and take its center as the sun position in the sky hemisphere shown in Fig. 2.

(2) The harder case is when the sky is captured by a normal lens and the sun is not visible in the input image. In this circumstance, we search for images that include a reference object with a known pose and geolocation. The pose and geolocation of a reference object (particularly a landmark object) are in many cases searchable on the Internet. The camera viewing direction can then be recovered from a few images that include the reference object by using SfM (structure from motion). The sun position is estimated from the time stamp in the EXIF tag and the geolocation of the object. The details of calculating the sun direction when the sun is not visible in the image are given in Appendix B.
Aside from the two cases above, when clouds are present in the input image, the turbidity estimation tends to be erroneous. To tackle this, our method employs a RANSAC-type approach: it estimates the turbidity from sampled sky pixels, repeats this procedure, and keeps the turbidity that has the largest number of inliers with the smallest error.
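The RANSAC-style selection can be written generically; the fit and residual functions used in the test below are toy stand-ins (an assumption of this sketch) for the turbidity fit and the per-pixel sky-model residual.

```python
import numpy as np

def ransac_turbidity(pixels, fit_fn, err_fn, n_iter=200, n_sample=20,
                     tol=0.05, rng=None):
    """Fit a model parameter from random subsets of sky pixels, count inliers,
    and keep the candidate with the most inliers (ties broken by the smaller
    inlier error). `fit_fn(subset)` returns a candidate turbidity; `err_fn`
    returns per-pixel residuals against the model."""
    if rng is None:
        rng = np.random.default_rng(0)
    best_T, best_score = None, (-1, -np.inf)
    for _ in range(n_iter):
        subset = rng.choice(len(pixels), size=n_sample, replace=False)
        T = fit_fn(pixels[subset])
        residuals = err_fn(pixels, T)
        inliers = residuals < tol
        score = (int(inliers.sum()), -float(residuals[inliers].sum()))
        if score > best_score:
            best_T, best_score = T, score
    return best_T
```

With cloud pixels acting as outliers, subsets drawn entirely from clear-sky pixels yield the largest inlier counts, so the returned turbidity ignores the clouds.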
3.2 Sky Spectra from Turbidity
Preetham et al. (1999) also introduce the correlation of turbidity and the CIE chromaticity (\({x}\) and \({y}\)). The CIE chromaticity can be calculated as follows:

\(x = x_z\, \frac{\mathcal{F}_x(\theta ,\,\gamma )}{\mathcal{F}_x(0,\,\theta _s)}, \quad y = y_z\, \frac{\mathcal{F}_y(\theta ,\,\gamma )}{\mathcal{F}_y(0,\,\theta _s)},\)    (5)
where \({x_z}\) and \({y_z}\) represent the zenith chromaticities, and are functions of turbidity. For computing \({x}\) and \({y}\) in detail, see Appendix C.
Having obtained \(x\) and \(y\) in Eq. (5), the sky spectra can be calculated using the known basis functions of daylights (Judd et al. 1964; Wyszecki and Stiles 1982). The sky spectrum \({S_D(\lambda )}\) is given by a linear combination of the mean spectrum and the first two eigenvector functions:

\(S_D(\lambda ) = S_0(\lambda ) + M_1 S_1(\lambda ) + M_2 S_2(\lambda ),\)    (6)
where scalar coefficients \({M_1}\) and \({M_2}\) are determined by chromaticity values \({x}\) and \({y}.\) Computing \({M_1}\) and \({M_2}\) from \({x}\) and \({y}\) is also given in Appendix C. Three basis functions \(S_0(\lambda ),\,S_1(\lambda )\) and \(S_2(\lambda )\) can be found in Judd et al. (1964), Wyszecki and Stiles (1982).
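This step can be sketched as follows; the \(M_1,\,M_2\) expressions are the standard CIE daylight coefficient formulas, and the tabulated Judd basis functions \(S_0,\,S_1,\,S_2\) (not reproduced here) must be supplied on a common wavelength grid.

```python
import numpy as np

def daylight_spectrum(x, y, S0, S1, S2):
    """Eq. (6): S_D(lambda) = S0 + M1*S1 + M2*S2, with M1 and M2 computed
    from the CIE chromaticity (x, y) via the standard CIE daylight formulas.
    S0, S1, S2: the tabulated daylight basis functions (numpy arrays)."""
    M = 0.0241 + 0.2562 * x - 0.7341 * y
    M1 = (-1.3515 - 1.7703 * x + 5.9114 * y) / M
    M2 = (0.0300 - 31.4424 * x + 30.0717 * y) / M
    return S0 + M1 * S1 + M2 * S2
```

Since \(M_1\) and \(M_2\) are fully determined by \((x, y),\) the reconstructed spectrum lives in the three-dimensional span of \(S_0, S_1, S_2,\) which is exactly the rank limitation discussed in the next section.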
4 Estimating Camera Spectral Sensitivity
Given a number of input image \(RGB\) values and the corresponding spectra, the camera spectral sensitivities can be computed using Eq. (1), of which the matrix notation is expressed as

\(\mathbf I = \mathbf q ^{\top } \mathbf L ,\)    (7)
where \(\mathbf I \) is a \(3\times n\) pixel matrix, \(\mathbf L \) is a \(w \times n\) spectral matrix, and \(\mathbf q \) is a \(w\times 3\) camerasensitivity matrix. \(n\) represents the number of pixels, and \(w\) represents the number of wavelengths. Provided sufficient data for \(\mathbf I \) and \(\mathbf L ,\) we can estimate \(\mathbf q \) by operating \(\mathbf I \mathbf L ^{+},\) where \(\mathbf L ^+\) is the pseudoinverse of \(\mathbf L .\)
Unfortunately, the rank of the matrix \(\mathbf L \) has to be at least \(w,\) to calculate the pseudoinverse \(\mathbf L ^+\) stably. In our case, the representation of the sky spectral radiance is three dimensional (3D) since we calculate the spectral radiance using the basis functions in Eq. (6). This means that the direct matrix inversion method would produce erroneous results.
To solve the problem, we propose to use a set of basis functions computed from known camera spectral sensitivities (Zhao et al. 2009). In many cases, camera spectral sensitivities have different distribution functions but the variances will not be extremely large, meaning that their representation may lie in a lowdimensional space, similar to the illumination basis functions (Slater and Healey 1998). Since basis functions can reduce dimensionality and thus the number of unknowns, this method generally provides robust and more accurate results than the direct matrix inversion method.
4.1 Estimation Using Basis Functions
Representing the camera spectral sensitivities using a linear combination of basis functions is expressed as:

\(q_c(\lambda ) = \sum _{i=1}^{d} b_i^c B_i^c(\lambda ),\)    (8)
where \({d}\) is the number of the basis functions, \({b_i^c}\) is the coefficient and \({B_i^c(\lambda )}\) is the basis function with \(c \in \{r,\,g,\,b\}.\) By substituting this equation into Eq. (1), we can have:

\(I_c = \sum _{i=1}^{d} b_i^c \int _{\varOmega } L(\lambda ) B_i^c(\lambda )\, d\lambda .\)    (9)
By using \({E_i^c}\) to describe the multiplication of the spectral radiance and the basis function of a camera spectral sensitivity: \({E_i^c} = {\int _{\varOmega } L(\lambda ) B_i^c(\lambda ) d\lambda },\) we can obtain

\(I_c = \sum _{i=1}^{d} b_i^c E_i^c.\)    (10)
Now, let us suppose that we have \({n}\) sets of data (\(n\) image pixels and the corresponding spectral radiance); then, we can describe the last equation as \(\mathbf I = \mathbf b \mathbf E ,\) where \(\mathbf I \) is a \(1\) by \({n}\) matrix, \(\mathbf b \) is a 1 by \({d}\) coefficient matrix, and \(\mathbf E \) is a \({d}\) by \({n}\) matrix. Consequently, this coefficient matrix \(\mathbf b \) can be expressed as follows: \(\mathbf b =\mathbf I \mathbf E ^{+},\) where \(\mathbf E ^{+}\) is the pseudoinverse of \(\mathbf E .\)
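A minimal per-channel sketch of this estimation, with random stand-ins for the sky spectra and the sensitivity basis:

```python
import numpy as np

def estimate_channel_sensitivity(I_c, L, B_c, d_lambda=10.0):
    """Recover one channel's sensitivity q_c via basis coefficients (Eq. 10).
    I_c: (n,) pixel intensities; L: (n, w) spectra, one row per pixel;
    B_c: (w, d) basis functions as columns."""
    E = (B_c.T @ L.T) * d_lambda      # E[i, j] = integral of L_j * B_i
    b = I_c @ np.linalg.pinv(E)       # b = I E^+
    return B_c @ b                    # q_c(lambda) = sum_i b_i * B_i(lambda)
```

Because only \(d\) coefficients are estimated instead of \(w\) wavelength samples, the pseudo-inverse stays well conditioned even though the sky spectra themselves are low dimensional.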
4.2 Basis Functions from a Database
The rank of the multiplication matrix (\(\mathbf E \)) has to be at least the number of the basis functions (\({d}\)) to make the estimation robust. Since the estimated spectral radiance is at most rank three, we use 3D basis functions for the camera spectral sensitivity estimation.
To extract the basis functions, we collected several digital cameras to make a database and measured their spectral sensitivities, including a few spectral sensitivities taken from the literature (Vora et al. 1997; Buil 2005). Cameras included in the database are Sony DXC 930, Kodak DCS 420, Sony DXC 9000, Canon 10D, Nikon D70, and Kodak DCS 460. Those used for testing are not included. This camera spectral sensitivity database is publicly available at our website (Zhao 2013). We obtain the basis functions from the database by using the principal component analysis.
The percentages of eigenvalues for each color channel are shown in Table 1. The sum of the first three eigenvalues is about \({93\,\%}\) for all three channels; thus, the first three eigenvectors cover about \({93\,\%}\) of the information in the database. Based on this, the first three eigenvectors are used as basis functions, which are shown in Fig. 3.
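The basis extraction can be sketched with an SVD-based principal component analysis on a toy database; whether the sensitivities are mean-centered before the decomposition is a design choice not specified above (the sketch uses uncentered data).

```python
import numpy as np

def sensitivity_basis(Q, d=3):
    """Extract d basis functions for one channel from a database Q of
    measured sensitivities, shape (n_cameras, w). Returns the basis
    (rows of Vt) and the fraction of energy each component explains."""
    U, s, Vt = np.linalg.svd(Q, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    return Vt[:d], explained[:d]
```

On a database whose sensitivities concentrate in a few shapes, the first few `explained` values sum close to one, mirroring the roughly 93 % figure reported in Table 1.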
5 Implementation
The flowchart of the algorithm is shown in Fig. 4. If the input image is captured by an omnidirectional camera, where the optical axis is perpendicular to the ground, and the sun appears in the image, then an ellipse is fitted to the saturated pixels, and the sun position is considered to be the center of that ellipse. Subsequently, the angle between the sun and the camera is computed. However, if the image is taken using an ordinary camera, and we cannot directly know the sun position in the image, then the sun position and the camera viewing direction need to be estimated.
The sun position is estimated through a known geolocation (e.g., using Google Earth) and the EXIF tag (the time stamp). The camera viewing direction is estimated using Bundler (Snavely et al. 2006) and the pose of a reference object. The pose of a reference object can be estimated using Google Earth, where the orientation angle is calculated by drawing a line between two specified points, as shown in Fig. 5. However, this estimation is less accurate than an actual on-site measurement (which, for some landmark objects, is available on the Internet). The inaccuracy in the camera viewing angle is 6\(^{\circ }\) in this case, which, in turn, decreases the accuracy of the estimated camera spectral sensitivities by approximately 3 %.
To estimate the sun position in an image, one might consider Lalonde et al.'s method (2009). However, for a typical image shown in Fig. 11g, the method produced angular errors of 10\(^{\circ }\) for \(\theta \) and 16\(^{\circ }\) for \(\phi ,\) which are considerably large. Therefore, instead of using that method, we used the geolocation and time stamp to estimate the sun position. Note that, according to the psychophysical test by Lopez-Moreno et al. (2010), the human visual system cannot spot an anomalously lit object with respect to the rest of the scene when the divergence between the coherent and the anomalous light is up to 35\(^{\circ }.\) Thus, the error of Lalonde et al. (2009) may be tolerable in some applications.
We also evaluated how many images are needed for consistent estimation of the viewing angles of a specific camera by Bundler (Snavely et al. 2006). The result is shown in Fig. 6, where we tried as many as 300 images; the SfM algorithm produced consistent results once approximately 50 images were used.
Having determined the sun position and camera viewing direction, a few pixels in the sky, which correspond to points in the sky hemisphere, are sampled uniformly. To ensure uniformity, a geodesic dome is used to partition the sky hemisphere equally, and points are sampled in each partition. For omnidirectional images, the corresponding sky pixels can be obtained directly from the sample points generated using the geodesic dome. For perspective images, we first calculate the camera's field of view from the image dimensions and focal length. Then, the sample geodesic points lying within the camera's field of view are used to calculate the coordinates of the corresponding sky pixels.
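For illustration, a Fibonacci-spiral sampler gives a near-uniform set of directions on the hemisphere; it is a simple stand-in for the geodesic-dome construction described above, not the paper's exact sampler.

```python
import numpy as np

def sample_sky_hemisphere(n):
    """Near-uniform sample directions on the sky hemisphere (Fibonacci
    spiral). Returns zenith angles theta and azimuths phi in radians."""
    i = np.arange(n)
    z = 1.0 - (i + 0.5) / n              # cos(theta), evenly spread in (0, 1)
    golden = np.pi * (3.0 - np.sqrt(5.0))
    phi = (i * golden) % (2 * np.pi)
    theta = np.arccos(z)
    return theta, phi
```

Spreading samples evenly in \(\cos \theta \) makes them uniform in solid angle, which matches the intent of the equal-area geodesic partition.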
Turbidity is estimated from the intensity ratios of these sampled sky pixels using Particle Swarm Optimization. RANSAC is used to remove the outliers. The spectral radiance is converted from chromaticity values (\(x\) and \(y\)), which are calculated from the turbidity. Finally, the camera spectral sensitivities together with the white balance setting are estimated using the precomputed basis functions, the calculated sky spectra, and their corresponding \(RGB\) values.
6 Experimental Results
In our experiments, we used both raw images, which are affected by minimal builtin color processing, and images downloaded from the Internet. We assume those images were taken with the gamma function off or have been radiometrically calibrated.
Before evaluating our method, we verified the assumption used in the sky turbidity estimation, namely, that image intensities are proportional to the sky luminance. We used two cameras, a Nikon D1x and a Canon 5D, each attached with a fisheye lens. The images are shown in Fig. 8d, g. We sampled about 120 points uniformly distributed on the sky hemisphere. Figure 7 shows the results, where the image intensities of both cameras are linear with respect to the sky luminance values of the sky model.
6.1 Raw Images
6.1.1 Omnidirectional Images
A number of clear sky images were taken almost at the same time using three different cameras: a Ladybug2, a Canon 5D, and a Nikon D1x, where the latter two cameras were attached with a fisheye lens, as shown in Fig. 8a, d, g. To show the effectiveness of the proposed method, we compared it with Barnard and Funt (2002). In the comparison, we used the same inputs, i.e., the estimated sky spectra and the corresponding sky pixels. Figure 9a, d, g show the estimated results. The ground truth of these cameras was measured by using a monochromator. The proposed method was able to estimate the same sky turbidity, around \(2.2 \pm 0.02,\) across different cameras with different \(RGB\) values.
The mean error and RMSE of both the proposed and Barnard and Funt's methods are shown in Table 2. Here, the maximum values of the estimated camera spectral sensitivities were normalized to \(1.0.\) The largest mean error of the proposed method was less than \(3.5\) %, while that of Barnard and Funt's was \(7\) %. The proposed method also had a smaller standard deviation.
The method was also evaluated for different sky conditions, as shown in Fig. 8: (b) partially cloudy sky, (c) thin cloudy sky, (e) hazy sky, and (h) significantly cloudy sky. For Fig. 8b, c, RANSAC was used to exclude the outliers (cloud pixels). For the other images, we estimated the sky turbidity from the sampled sky pixels using Particle Swarm Optimization. The estimated turbidities for those weather conditions were about 2, 3, 4, and 12, respectively. The recovered camera spectral sensitivities are shown in Fig. 9b, c, e, h. A large error occurs in (h), because the whole sky was covered by thick clouds, which do not fit Preetham et al.'s model.
In the experiment, we also verified whether the proposed method is effective in estimating the white balance settings by using two images taken from the same camera (thus with the same camera spectral sensitivities) but different white balance settings. Figure 8e, f show such images. The estimated camera spectral sensitivities are shown in Fig. 9e, f. As expected, the shapes of the camera spectral sensitivities were the same and differed only in magnitude.
6.1.2 Perspective Images
We tested our method on perspective images (images rectified from omnidirectional images) and images taken with ordinary cameras. First, to show that a narrower field of view also works with the method, we used the rectified spherical image shown in Fig. 8i. This image is part of Fig. 8a. The recovered sensitivity is shown in Fig. 9i. The performance did not change significantly compared to Fig. 8a, although only a part of the sky was visible. We tested three different directions in Fig. 8a and obtained similar results. The estimated sun position in Fig. 8a was used here.
Second, to show that the method can handle images where the sun is not visible and the camera poses are unknown, we captured images with a reference object without knowing its pose and geolocation, shown in Fig. 10a. We captured 16 images in total, and recovered each camera pose with respect to the reference object. The sun position was estimated from the time stamp on the EXIF tag. The estimated camera spectral sensitivities are shown in Fig. 10b.
6.2 In-camera Processed Images
General images, such as those available on the Internet, are much more problematic compared with the images we tested in Sect. 6.1, since the gamma function has to be estimated and the images are usually taken by cameras that have built-in color processing (Ramanath et al. 2005).
Nevertheless, we evaluated our method with such images, which were captured by three different cameras: a Canon EOS Rebel XTi, a Canon 5D, and a Canon 5D Mark II. Figure 11 shows the images of the Statue of Liberty downloaded from a photo-sharing site. These images were JPEG compressed and taken with internal camera processing. Chakrabarti et al. (2009) introduce an empirical camera model, which converts a JPEG image back to a raw image. We implemented the method to photometrically calibrate the camera (i.e., to estimate the response function and the internal color processing). The camera pose and the sun direction were estimated in the same manner as in the previous experiment (Fig. 10a). As many as 187 images were used. The method was also evaluated under different sky conditions: clear sky (Fig. 11a, g, i), cloudy sky (Fig. 11c, e), and hazy sky (Fig. 11k).
The estimated camera spectral sensitivities are also shown in Fig. 11. The error evaluation is summarized in Table 3. The mean error for \(RGB\) channels is larger than the results from omnidirectional images because of the residual errors of the internal color processing, the estimation of the response function, and the data compression.
We used the Macbeth color chart to evaluate the accuracy of the estimated camera spectral sensitivities. Specifically, we captured the spectral radiance of the first 18 color patches and used the estimated camera spectral sensitivities to predict the image intensities. The predicted and captured image intensities are plotted onto a 2D space. We found that if the error of the estimated camera spectral sensitivities is less than 5 %, then the plotted data forms an almost perfect straight line.
7 Application: Color Correction
One of the applications of estimating camera spectral sensitivities and white balance setting is to correct the colors between different cameras. The purpose of this color correction is similar to the color transfer (Reinhard et al. 2001). Hence, we compared the results of color correction using our estimated camera spectral sensitivities and white balance with those of the color transfer.
Before showing the comparisons, we briefly describe our color correction technique. By discretizing Eq. (1) and using matrix notation, we can rewrite it as
\(\mathbf{I} = \mathbf{L}\mathbf{Q}\mathbf{B} = \mathbf{E}\mathbf{B},\)
where \(\mathbf{I}\) is the intensity matrix, \(\mathbf{L}\) is the matrix of the spectral radiance, \(\mathbf{Q}\) is the matrix of the basis functions for the camera spectral sensitivities, \(\mathbf{B}\) is the coefficient matrix, and \(\mathbf{E}\) is the product of \(\mathbf{L}\) and \(\mathbf{Q}\). Note that the basis functions used here are different from those extracted in Sect. 4.2; here we use the same basis for the three color channels. \(n\) is the number of surfaces, and \(w\) is the number of sampled wavelengths.
Suppose we have an image captured by one camera, denoted as \(\mathbf{I}_1 = \mathbf{E}\mathbf{B}_1;\) then the same scene captured by another camera is expressed as
\(\mathbf{I}_2 = \mathbf{E}\mathbf{B}_2.\)
Since \(\mathbf B _1\) and \(\mathbf B _2\) are computable if both camera spectral sensitivities are known, the color conversion from one image to another is possible via the last equation. Figure 12 shows the extracted basis functions that are common for the three channels.
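A minimal numerical sketch of this conversion, assuming three basis functions so that \(\mathbf{B}_1\) is an invertible \(3\times 3\) matrix; the matrix names mirror the notation above, and the random data is purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
n, w, k = 100, 31, 3             # surfaces, sampled wavelengths, basis functions

L = rng.uniform(0, 1, (n, w))    # spectral radiance of the scene
Q = rng.uniform(0, 1, (w, k))    # basis functions, shared by both cameras
B1 = rng.uniform(0, 1, (k, 3))   # coefficients of camera 1
B2 = rng.uniform(0, 1, (k, 3))   # coefficients of camera 2

E = L @ Q                        # E = LQ, shared by both cameras
I1 = E @ B1                      # image of camera 1
I2 = E @ B2                      # image of camera 2

# Since E = I1 B1^{-1}, camera-1 colors map to camera-2 colors through a
# fixed 3x3 transformation, computable once both sensitivities are known.
M = np.linalg.inv(B1) @ B2
I2_converted = I1 @ M
```

In this idealized noise-free setting `I2_converted` reproduces `I2` exactly; with real data the transformation would be fitted in a least-squares sense.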
The color correction result of the Statue of Liberty is shown in Fig. 13. In the figure, (a) and (b) show the source and target images, and (d) is the result of the proposed method. We also implemented Reinhard et al.’s color transfer algorithm (2001) to see how a color transfer method performs for color correction. Their method transforms one color cluster in \(RGB\) space into another by a combination of translation and scaling, assuming that both clusters follow a Gaussian distribution. The result of Reinhard et al.’s method is shown in Fig. 13c.
Since the proposed method is based on the cameras’ physical characteristics, and it uses the affine transformation shown in Eq. (12), which is more general than a combination of translation and scaling, it produces visually better results, e.g., in the chest area or on the platform of the statue, as shown in Fig. 13b–d. The proposed method can determine the transformation once the camera sensitivities are obtained, which is beneficial for color correction applications.
The quantitative evaluation is shown in Fig. 13e–g. We sampled six pixels as shown in Fig. 13a, and compared the chromaticity of those pixels in the three images (b)–(d). In those figures, “target image,” “color transfer,” and “our method” represent the chromaticity of the target image, the result of color transfer, and the proposed color correction, respectively. The chromaticity values of the proposed method are close to those of the target image, except for point 4, which lies in the shadow region of the target image.
Note that, while Fig. 13b was captured only 1 h later than Fig. 13a, their colors appear significantly different. Assuming that the illumination did not change significantly, the difference should be caused by camera properties, such as spectral sensitivities and white balance settings. Thus, the proposed method would be useful for applications where color calibration between cameras is necessary.
Two other examples of color correction are shown in Figs. 14 and 15. In both figures, (a) and (b) are the gamma-corrected images of Fig. 1, and (c) shows the result of the proposed color correction for two different cameras. The quantitative evaluations are also shown in the figures. “Casio,” “Pana,” “Pana2Casio,” “Canon” and “Canon2Casio” represent the chromaticity values of Casio, Panasonic, color corrected from Panasonic to Casio, Canon, and color corrected from Canon to Casio, respectively. The performance was evaluated on the four sampled pixels shown in (d)–(f) of each figure.
8 Discussion
8.1 Accuracy of the Sky Model
The sky model (Preetham et al. 1999) might pose an accuracy issue in estimating camera spectral sensitivities; we therefore evaluated it by comparing the intensities produced by the model with the actual sky intensities.
The result is shown in Fig. 16, where (a) shows the actual sky image captured by the Canon 5D camera and (b) the simulated sky image. The image intensity in (b) was adjusted so that its average became equal to that of (a); the part marked by the red ellipse was excluded from the averaging, since we considered it to be affected by scattering at the aperture. We took six sample pixels and compared their chromaticity values, which are summarized in Fig. 16c–e.
8.2 Robustness to the Sun Direction Estimation
We tested the robustness of the camera spectral sensitivity estimation by adding noise to the sun direction. With 5\(^{\circ },\) 10\(^{\circ },\) and 15\(^{\circ }\) errors, the mean errors over the three channels of the Nikon D1x and Canon 5D were about 3, 7, and 11 %, respectively. This implies that the error increases roughly linearly with the angular error of the sun direction.
8.3 Comparison of Two Sky Turbidity Estimation Methods
We compared the proposed sky turbidity estimation method with that of Lalonde et al. (2010). Our method is based on brightness, while Lalonde et al.’s is based on the \(xyY\) color space. Suppose we capture the same scene with two different cameras or with two different white balance settings; the calculated \(xyY\) values then differ according to the different \(RGB\) values. Therefore, using Lalonde et al. (2010), the estimated sky turbidity values are different, which cannot be correct since the scene is exactly the same. The proposed method handles this problem by assuming only that the image brightness or intensity remains proportional to the scene radiance. We conducted an experiment to verify this. Using the two methods, we fitted the sky model to images and estimated the sky turbidity values. The result is shown in Fig. 17, where (a) and (b) are the input images simulated from the sky model, whose sky turbidity was manually set to \(2.0\) and whose white balance settings were set to “Daylight” and “Fluorescent,” respectively. The estimated sky turbidity by the proposed method is \(2.03\) for both input images, while the values by Lalonde et al.’s method are \(2.32\) and \(1.41.\) The simulated sky appearances from the estimated sky turbidity values are shown in Fig. 17c–f. The proposed method can estimate turbidity independently of the white balance settings.
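The brightness-based fitting can be sketched as follows. The luminance coefficients are transcribed from Preetham et al. (1999); the pixel sampling and the brute-force scan over candidate turbidities are simplified stand-ins for the actual implementation. Because only proportionality of brightness is assumed, the unknown scale is eliminated per candidate turbidity by a closed-form least-squares fit:

```python
import numpy as np

def perez(theta, gamma, A, B, C, D, E):
    # Perez et al. (1993) sky radiance distribution F(theta, gamma).
    return ((1 + A * np.exp(B / np.cos(theta))) *
            (1 + C * np.exp(D * gamma) + E * np.cos(gamma) ** 2))

def lum_coeffs(T):
    # Luminance (Y) coefficients, linear in turbidity T
    # (transcribed from Preetham et al. 1999).
    return (0.1787 * T - 1.4630, -0.3554 * T + 0.4275,
            -0.0227 * T + 5.3251,  0.1206 * T - 2.5771,
            -0.0670 * T + 0.3703)

def relative_luminance(theta, gamma, T):
    return perez(theta, gamma, *lum_coeffs(T))

# Synthetic sky samples: zenith angle and angle to the sun per pixel.
theta = np.linspace(0.1, 1.4, 50)
gamma = np.linspace(0.2, 2.0, 50)

true_T = 2.0
scale = 3.7              # unknown overall gain (camera / white balance)
observed = scale * relative_luminance(theta, gamma, true_T)

def residual(T):
    model = relative_luminance(theta, gamma, T)
    s = (model @ observed) / (model @ model)   # optimal scale for this T
    return np.sum((observed - s * model) ** 2)

best_T = min(np.arange(1.5, 10.0, 0.01), key=residual)
```

The recovered `best_T` matches the true turbidity regardless of the global scale, which illustrates why the brightness-based fit is insensitive to white balance.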
8.4 Limitations of the Proposed Method
Many images, particularly those available on the Internet, have been processed further with image editing software such as Adobe Photoshop. To probe the limitations of the method, we created such modified images. We changed the color balance of the first image (by multiplying each color channel by a constant), manually adjusted the hue of the second image (by increasing the pixel values of the green channel to make it greenish), and then estimated the camera spectral sensitivities from them. The result is shown in Fig. 18, where (a) shows the original image, (b) the manually color-balanced image, (c) the manually hue-adjusted image, and (d) and (e) the estimated results. The camera spectral sensitivities estimated from Fig. 18b were close to the ground truth. However, those estimated from Fig. 18c had large errors compared to the ground truth, since the sky turbidity estimate was biased by the hue modification. Some operations performed on images in Photoshop conflict with camera spectral sensitivity estimation, and in future work we will consider how to automatically filter out such contaminated images.
9 Conclusion
In this paper, we have proposed a novel method to estimate camera spectral sensitivities and white balance setting from images with sky regions. The proposed method could significantly benefit physics-based computer vision, or computer vision in general, particularly for future research in which images on the Internet become a valuable resource. To conclude, our contributions in this paper are (1) the novel method that uses images for camera spectral sensitivity and white balance estimation, (2) the publicly available database of camera spectral sensitivities, (3) the improved sky turbidity estimation that handles a wide variety of cameras, and (4) the camera spectral sensitivity-based color correction between different cameras.
Notes
While a few papers propose auto-white-balancing methods, this paper inversely estimates the white balance setting from images to which an auto-white-balancing method has already been applied. We also differentiate color constancy from white balance parameter estimation, since the former deals with illumination color while the latter deals only with camera settings.
Equation (1) ignores the camera gain.
Here, molecules refer to particles less than \(0.1 \lambda \) in diameter, whose scattering can be modeled by Rayleigh scattering. The term haze, often referred to as a haze aerosol, is used for much bigger particles, whose scattering is modeled by Mie scattering (McCartney 1976).
References
Barnard, K., & Funt, B. (2002). Camera characterization for color research. Color Research and Application, 27(3), 153–164.
Buil, C. (2005). Comparative test between Canon 10D and Nikon D70. Retrieved May 31, 2013 from http://www.astrosurf.com/buil/d70v10d/eval.htm.
Chaiwiwatworakul, P., & Chirarattananon, S. (2004). An investigation of atmospheric turbidity of Thai sky. Energy and Buildings, 36, 650–659.
Chakrabarti, A., Scharstein, D., & Zickler, T. (2009). An empirical camera model for internet color vision. In Proceedings of British Machine Vision Conference.
Debevec, P., Hawkins, T., Tchou, C., Duiker, H. P., Sarokin, W., & Sagar, M. (2000). Acquiring the reflectance field of a human face. In ACM Transactions on Graphics (SIGGRAPH).
Ebner, M. (2007). Estimating the spectral sensitivity of a digital sensor using calibration targets. In Proceedings of Conference on Genetic and Evolutionary Computation (pp. 642–649).
Finlayson, G., Hordley, S., & Hubel, P. (1998). Recovering device sensitivities with quadratic programming. In Proceedings of Color Science, System, and Application (pp. 90–95).
Frieden, B. R. (1983). Probability, statistical optics and data testing: A problem solving approach. Springer.
Haber, T., Fuchs, C., Bekaert, P., Seidel, H. P., Geoesele, M., & Lensch, H. P. A. (2009). Relighting objects from image collections. In Proceedings of Computer Vision and Pattern Recognition (pp. 627–634).
Hara, K., Nishino, K., & Ikeuchi, K. (2005). Light source position and reflectance estimation from a single view without the distant illumination assumption. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(4), 493–505.
Hordley, S. D. (2006). Scene illuminant estimation: Past, present, and future. Color Research and Application, 31(4), 303–314.
Hubel, P. M., Sherman, D., & Farrell, J. E. (1994). A comparison of methods of sensor spectral sensitivity estimation. In Proceedings of Color Science, System, and Application (pp. 45–48).
Ikeuchi, K. (1981). Determining surface orientations of specular surfaces by using the photometric stereo method. IEEE Transactions on Pattern Analysis and Machine Intelligence, 3(6), 661–669.
Ikeuchi, K., & Horn, B. K. P. (1981). Numerical shape from shading and occluding boundaries. Artificial Intelligence, 17(1–3), 141–184.
Judd, D. B., Macadam, D. L., Wyszecki, G., Budde, H. W., Condit, H. R., Henderson, S. T., et al. (1964). Spectral distribution of typical daylight as a function of correlated color temperature. Journal of the Optical Society of America, 54(8), 1031–1036.
Kawakami, R., & Ikeuchi, K. (2009). Color estimation from a single surface color. In Proceedings of Computer Vision and Pattern Recognition (pp. 635–642).
Kennedy, J., & Eberhart, R. (1995). Particle swarm optimization. In Proceedings of IEEE International Conference on Neural Networks (pp. 1942–1948).
Kuthirummal, S., Agarwala, A., Goldman, D. B., & Nayar, S. K. (2008). Priors for large photo collections and what they reveal about cameras. In Proceedings of European Conference on Computer Vision (pp. 74–87).
Lalonde, J. F., Efros, A. A., & Narasimhan, S. G. (2009). Estimating natural illumination from a single outdoor image. In Proceedings of International Conference on Computer Vision (pp. 183–190).
Lalonde, J. F., Narasimhan, S. G., & Efros, A. A. (2010). What do the sun and the sky tell us about the camera? International Journal of Computer Vision, 88(1), 24–51.
Li, Y., Lin, S., Lu, H., & Shum, H. Y. (2003). Multiplecue illumination estimation in textured scenes. In Proceedings of International Conference on Computer Vision (pp. 1366–1373).
Lin, S., Gu, J. W., Yamazaki, S., & Shum, H. Y. (2004). Radiometric calibration using a single image. In Proceedings of Computer Vision and Pattern Recognition (pp. 938–945).
Lopez-Moreno, J., Hadap, S., Reinhard, E., & Gutierrez, D. (2010). Compositing images through light source detection. Computers & Graphics, 34(6), 698–707.
McCartney, E. J. (1976). Optics of the atmosphere: Scattering by molecules and particles. New York: Wiley.
Parkkinen, J. P. S., Hallikainen, J., & Jaaskelainen, T. (1989). Characteristic spectra of Munsell colors. Journal of the Optical Society of America, 6(2), 318–322.
Perez, R., Seals, R., & Michalsky, J. (1993). An all weather model for sky luminance distribution. Solar Energy, 50(3), 235–245.
Pratt, W. K., & Mancill, C. E. (1976). Spectral estimation techniques for the spectral calibration of a color image scanner. Applied Optics, 15(1), 73–75.
Preetham, A. J., Shirley, P., & Smits, B. (1999). A practical analytic model for daylight. In ACM Transactions on Graphics (SIGGRAPH).
Ramanath, R., Snyder, W., Yoo, Y., & Drew, M. (2005). Color image processing pipeline. IEEE Signal Processing Magazine, 22(1), 34–43.
Reinhard, E., Adhikhmin, M., Gooch, B., & Shirley, P. (2001). Color transfer between images. Computer Graphics and Application, 21(5), 34–41.
Sato, I., Sato, Y., & Ikeuchi, K. (2003). Illumination from shadows. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(3), 290–300.
Sharma, G., & Trussell, H. J. (1993). Characterization of scanner sensitivity. In Proceedings of Transforms and Transportability of Color (pp. 103–107).
Slater, D., & Healey, G. (1998). What is the spectral dimensionality of the illumination functions in outdoor scenes? In Proceedings of Computer Vision and Pattern Recognition (pp. 105–110).
Snavely, N., Seitz, S., & Szeliski, R. (2006). Photo tourism: Exploring image collections in 3D. In ACM Transactions on Graphics (SIGGRAPH).
Takamatsu, J., Matsushita, Y., & Ikeuchi, K. (2008). Estimating camera response functions using probabilistic intensity similarity. In Proceedings of Computer Vision and Pattern Recognition.
Tan, R. T., Nishino, K., & Ikeuchi, K. (2004). Color constancy through inverse intensitychromaticity space. Journal of the Optical Society of America, 21(3), 321–334.
Thomson, M., & Westland, S. (2001). Colour-imager characterization by parametric fitting of sensor response. Color Research and Application, 26(6), 442–449.
van de Weijer, J., Gevers, T., & Gijsenij, A. (2007). Edge-based color constancy. IEEE Transactions on Image Processing, 16(9), 2207–2214.
Vora, P. L., Farrell, J. E., Tietz, J. D., & Brainard, D. H. (1997). Digital color cameras. 2. Spectral response. Technical Report HPL-97-54.
Woodham, R. J. (1980). Photometric method for determining surface orientation from multiple images. Optical Engineering, 19(1), 139–144.
Wyszecki, G., & Stiles, W. S. (1982). Color science. New York: Wiley Interscience Publication.
Yu, Y., & Malik, J. (1998). Recovering photometric properties of architectural scenes from photographs. In ACM Transactions on Graphics (SIGGRAPH).
Zhang, R., Tsai, P., Cryer, J. E., & Shah, M. (1999). Numerical shape from shading and occluding boundaries. IEEE Transactions on Pattern Analysis and Machine Intelligence, 21(8), 690–706.
Zhao, H. (2013). Spectral sensitivity database. Retrieved May 31, 2013 from http://www.cvl.iis.u-tokyo.ac.jp/~zhao/database.html.
Zhao, H., Kawakami, R., Tan, R. T., & Ikeuchi, K. (2009). Estimating basis functions for spectral sensitivity of digital cameras. In Meeting on Image Recognition and Understanding.
Acknowledgments
This work was in part supported by the Japan Society for the Promotion of Science (JSPS) through the “Funding Program for Next Generation World-Leading Researchers (NEXT Program),” initiated by the Council for Science and Technology Policy (CSTP).
Appendices
Appendix A: Calculating the Sky Luminance (\(Y\)) from Sky Turbidity (\(T\)) (Preetham et al. 1999)
The sky luminance is calculated by using Eq. (2), where \(\mathcal{F}(\theta ,\,\gamma )\) is Perez et al.’s sky radiance distribution function (1993), described as
\(\mathcal{F}(\theta ,\,\gamma ) = \left(1 + A\,e^{B/\cos \theta }\right)\left(1 + C\,e^{D\gamma } + E\cos ^2\gamma \right),\)
where \({A,\,B,\,C,\,D,}\) and \({E}\) are the five distribution coefficients, and \(\theta \) and \(\gamma \) are shown in Fig. 2. The coefficients \({A,\,B,\,C,\,D,}\) and \({E}\) are linearly related to the turbidity \(T,\) according to Preetham et al. (1999), while each of the linear transformations depends on \(x,\,y\) and \(Y.\) The coefficients for \(Y\) are as follows:
\(A_Y = 0.1787\,T - 1.4630,\; B_Y = -0.3554\,T + 0.4275,\; C_Y = -0.0227\,T + 5.3251,\; D_Y = 0.1206\,T - 2.5771,\; E_Y = -0.0670\,T + 0.3703.\)
The ratio of sky luminance between a viewing direction and the reference direction in Eq. (3) is calculated as
\(\frac{Y(\theta ,\,\gamma )}{Y(\theta _r,\,\gamma _r)} = \frac{\mathcal{F}(\theta ,\,\gamma )}{\mathcal{F}(\theta _r,\,\gamma _r)}.\)
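Appendix A can be expressed compactly in code. This is a sketch rather than the authors' implementation; the luminance coefficients are transcribed from Preetham et al. (1999) and should be verified against the paper before use:

```python
import math

def perez_F(theta, gamma, A, B, C, D, E):
    # Perez et al. (1993) sky radiance distribution function.
    return ((1 + A * math.exp(B / math.cos(theta))) *
            (1 + C * math.exp(D * gamma) + E * math.cos(gamma) ** 2))

def luminance_coeffs(T):
    # Y-channel distribution coefficients, linear in turbidity T
    # (transcribed from Preetham et al. 1999).
    return (0.1787 * T - 1.4630,
            -0.3554 * T + 0.4275,
            -0.0227 * T + 5.3251,
             0.1206 * T - 2.5771,
            -0.0670 * T + 0.3703)

def luminance_ratio(theta, gamma, theta_r, gamma_r, T):
    # Ratio of sky luminance between a viewing direction and a reference
    # direction; the zenith luminance cancels in the ratio.
    c = luminance_coeffs(T)
    return perez_F(theta, gamma, *c) / perez_F(theta_r, gamma_r, *c)
```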
Appendix B: Sun Position from Perspective Image
For completeness, we include the formulas derived in Preetham et al. (1999). The sun direction, denoted by the zenith angle (\(\theta _s\)) and azimuth angle (\(\phi _s\)), can be computed from the following equations:
\(\theta _s = \frac{\pi }{2} - \arcsin \left(\sin l \sin \delta - \cos l \cos \delta \cos \frac{\pi t}{12}\right),\)
\(\phi _s = \arctan \left(\frac{-\cos \delta \,\sin \frac{\pi t}{12}}{\cos l \sin \delta - \sin l \cos \delta \cos \frac{\pi t}{12}}\right),\)
where \(l\) is the site latitude in radians, \(\delta \) is the solar declination in radians, and \(t\) is the solar time in decimal hours. \(\delta \) and \(t\) are calculated as follows:
\(\delta = 0.4093 \sin \left(\frac{2\pi (J - 81)}{368}\right),\)
\(t = t_s + 0.170\sin \left(\frac{4\pi (J - 80)}{373}\right) - 0.129\sin \left(\frac{2\pi (J - 8)}{355}\right) + \frac{12(SM - L)}{\pi },\)
where \(J\) is the Julian date, the day of the year as an integer in the range from 1 to 365, and \(t_s\) is the standard time in decimal hours. \(J\) and \(t_s\) are derived from the time stamp in the image. \(SM\) is the standard meridian for the time zone in radians, and \(L\) is the site longitude in radians. The latitude \(l,\) longitude \(L,\) and the standard meridian \(SM\) can be given either from the reference object location or from the GPS information in the image.
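The sun-position formulas above can be sketched as follows. The numerical constants are transcribed from Preetham et al. (1999); this is an illustrative transcription, not the authors' code:

```python
import math

def sun_position(J, t_s, lat, lon, SM):
    """Sun zenith and azimuth angles (Preetham et al. 1999).

    J: Julian date (day of year, 1..365); t_s: standard time in decimal
    hours; lat, lon, SM: site latitude, longitude, and standard meridian,
    all in radians.
    """
    # Solar declination and solar time.
    delta = 0.4093 * math.sin(2 * math.pi * (J - 81) / 368)
    t = (t_s + 0.170 * math.sin(4 * math.pi * (J - 80) / 373)
             - 0.129 * math.sin(2 * math.pi * (J - 8) / 355)
             + 12 * (SM - lon) / math.pi)
    h = math.pi * t / 12                      # hour-angle term

    theta_s = math.pi / 2 - math.asin(
        math.sin(lat) * math.sin(delta)
        - math.cos(lat) * math.cos(delta) * math.cos(h))
    phi_s = math.atan2(
        -math.cos(delta) * math.sin(h),
        math.cos(lat) * math.sin(delta)
        - math.sin(lat) * math.cos(delta) * math.cos(h))
    return theta_s, phi_s
```

As a sanity check, at the spring equinox (\(\delta \approx 0\)) and solar noon, the zenith angle of the sun reduces to the site latitude.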
Appendix C: Calculating the Sky Chromaticity from Sky Turbidity (Preetham et al. 1999)
The correlation between the five distribution coefficients for the sky chromaticity values (\(x\) and \(y\)) and the turbidity \(T\) are as follows:
The zenith chromaticity \(x_z\) and \(y_z\) can also be determined by turbidity \(T\) as:
where \(\theta _s\) is the zenith angle of the sun. Thus, the sky chromaticity \(x\) and \(y\) can be calculated only from the turbidity and the sun direction using Eq. (5). \(T\) usually ranges from 2.0 to 30.0.
The parameters \(M_1\) and \(M_2,\) which determine the spectra from the CIE chromaticity \(x\) and \(y,\) can be calculated as follows (Judd et al. 1964):
\(M_1 = \frac{-1.3515 - 1.7703\,x + 5.9114\,y}{0.0241 + 0.2562\,x - 0.7341\,y},\qquad M_2 = \frac{0.0300 - 31.4424\,x + 30.0717\,y}{0.0241 + 0.2562\,x - 0.7341\,y}.\)
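These relations are the standard CIE daylight reconstruction formulas (Judd et al. 1964); in code form:

```python
def daylight_M(x, y):
    """CIE daylight reconstruction parameters from chromaticity (x, y).

    The daylight spectrum is then S(lambda) = S0 + M1*S1 + M2*S2, using
    the tabulated CIE daylight basis spectra S0, S1, S2.
    """
    denom = 0.0241 + 0.2562 * x - 0.7341 * y
    M1 = (-1.3515 - 1.7703 * x + 5.9114 * y) / denom
    M2 = (0.0300 - 31.4424 * x + 30.0717 * y) / denom
    return M1, M2
```

For the D65 chromaticity \((x, y) = (0.3127, 0.3291)\) this yields the familiar values \(M_1 \approx -0.295\) and \(M_2 \approx -0.688\).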
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.
Cite this article
Kawakami, R., Zhao, H., Tan, R. T., et al. Camera Spectral Sensitivity and White Balance Estimation from Sky Images. International Journal of Computer Vision, 105, 187–204 (2013). https://doi.org/10.1007/s11263-013-0632-1