1 Introduction

Mapping the raw RGBs measured by a camera to either display coordinates (such as sRGB) or XYZ tristimulus values (a colour space referenced to the human visual system) is called colour correction. Colour correction is an essential step in the camera pipeline because cameras do not “see” the world as humans do. Fundamentally, this is because the spectral sensitivities of a camera are not a linear transform of the human colour matching functions; that is, the Luther condition is not satisfied [1].

Many different algorithms have been developed to solve the colour correction problem. The most common approach is to apply a linear transform mapping camera RGBs to XYZs (or display RGBs). While linear correction generally works well, it can still fail in many cases, especially for saturated colours. To address this issue, polynomial regression methods [2,3,4,5], look-up-table methods [6] and artificial neural networks [7] have been proposed. However, most non-linear methods are not invariant to exposure change. Finlayson et al. [8] developed a root-polynomial method that is exposure invariant: given two input RGBs \(\underline{p}\) and \(k\underline{p}\), an exposure-invariant correction returns outputs \(\underline{q}\) and \(k\underline{q}\).
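
This exposure-invariance property is easy to verify numerically. The short sketch below (an illustration of ours, not code from [8]) applies an arbitrary \(3 \times 3\) linear correction and a simple second-order polynomial correction to an RGB before and after an exposure scaling by k; the linear output scales by k, the polynomial output does not.

```python
import numpy as np

rng = np.random.default_rng(0)

def poly_expand(rgb):
    # second-order polynomial terms of an RGB vector (illustrative choice of terms)
    r, g, b = rgb
    return np.array([r, g, b, r * r, g * g, b * b, r * g, r * b, g * b])

M_lin = rng.random((3, 3))    # stand-in 3x3 linear correction matrix
M_poly = rng.random((9, 3))   # stand-in polynomial correction matrix

p = rng.random(3)             # a camera RGB
k = 2.0                       # exposure change

# Linear correction commutes with exposure scaling ...
print(np.allclose((k * p) @ M_lin, k * (p @ M_lin)))                             # True
# ... but polynomial correction does not (the squared terms scale as k^2)
print(np.allclose(poly_expand(k * p) @ M_poly, k * (poly_expand(p) @ M_poly)))   # False
```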

Another way of achieving better colour fidelity is to take more than three measurements [9]. However, when more sensors are used, acquisition is generally more complex and suffers from problems such as reduced resolution (when a sensor filter mosaic is used) or registration (when multiple pictures are captured). Multispectral camera systems are also much more expensive than conventional cameras and are not widely deployed.

An alternative approach to increasing the dimensionality of a camera system is to take two pictures of every scene, with and without a coloured filter [10]. Of course, this approach requires the two images to be registered (a far from easy problem to solve). Finlayson et al. [11] also proposed a prefiltering solution, but the aim there was not to increase the dimensionality of capture. Rather, a filter was found such that the device sensitivities, multiplied by the filter and then linearly transformed by a \(3 \times 3\) matrix, were as close as possible to the XYZ colour matching functions. We call this method spectral-based colorimetric filter design. Surprisingly, for some cameras there exists a filter that makes them almost colorimetric. In [11] a filter is sought that allows a camera to capture colorimetric data for all possible spectra. Yet we know that the spectra we measure in the world are not arbitrary. In particular, surface reflectances are smooth and can be well approximated by low-dimensional models [12].

In this paper we wish to find the filter that makes a camera as colorimetric as possible for a given set of measured lights and surfaces. Figure 1 illustrates our approach. Here we see a standard D65 illuminant lighting a colour target with known reflectances. Given these spectra, the spectral sensitivities of the camera and the XYZ colour matching functions, we can calculate the camera RGB responses and the XYZ triplets respectively. In our optimisation we seek a colour filter (red ellipse in the figure) in combination with a \(3 \times 3\) colour correction transform.

Mathematically, we will show that the simultaneous calculation of the colour filter and the colour correction matrix is a bilinear optimisation problem. We show that this can be solved using Alternating Least-Squares (ALS). We also constrain the optimisation so that we can control the shape of the filter (e.g. its transmittance).

Experiments validate our approach. For a large corpus of data we solve for the best filter for a wide range of cameras. We show that a filter can always be found with which the camera system becomes much more colorimetric.

The paper is organized as follows. In Sect. 2, we present background on image formation and linear colour correction. Section 3 formulates and solves for the filter and the transform matrix. In Sect. 4 the colorimetric performance is evaluated against two other methods. The paper concludes in Sect. 5.

Fig. 1. Schematic diagram of colour measurement for an object viewed under a given illuminant. We seek a filter, placed in front of the camera, such that the filtered RGB outputs, after a linear mapping, match the perceptual XYZ tristimulus values. Note that in practice the human eye and the camera system should be placed at the same viewing geometry. (Color figure online)

2 Background

Suppose a light \(E(\lambda )\) strikes a surface with reflectance \(S(\lambda )\). Then, under the Lambertian model of image formation, the reflected colour signal \(C(\lambda )\) is proportional to \(E(\lambda )S(\lambda )\). Given a set of three spectral sensitivity functions \(\underline{Q}(\lambda )\), the sensor response is defined as:

$$\begin{aligned} \underline{\rho } = \int _{\omega } C(\lambda ) \underline{Q}(\lambda ) d\lambda \end{aligned}$$
(1)

where the integral is taken over the visible spectrum \(\omega \). Similarly, the colour response of the human visual system can be defined as

$$\begin{aligned} \underline{x} = \int _{\omega } C(\lambda ) \underline{\chi }(\lambda ) d\lambda \end{aligned}$$
(2)

where \(\underline{\chi }(\lambda )\) represents the observer colour matching functions (for the long-, medium- and short-wavelength channels).

In practice, spectral data are measured by sampling across the visible spectrum, typically from 400 nm to 700 nm at a 10 nm interval (31 samples). Given this discrete representation of our data, the integrals shown above can be replaced by vector-matrix multiplications.

$$\begin{aligned} \underline{\rho } = Q^{t}\underline{c} \end{aligned}$$
(3)
$$\begin{aligned} \underline{x} = \chi ^{t}\underline{c} \end{aligned}$$
(4)

\(\underline{c}\) denotes one colour signal spectrum as a \(31\times 1\) vector, while Q and \(\chi \) are \(31 \times 3\) matrices. The camera response \(\underline{\rho }\) and the visual system response \(\underline{x}\) are \(3 \times 1\) vectors.

Given a \( 31 \times N\) matrix C of colour signal spectra (one spectrum per column), the camera responses and the XYZ tristimuli are, respectively, \(N\times 3\) matrices written as

$$\begin{aligned} P = {C}^{t}Q \end{aligned}$$
(5)
$$\begin{aligned} X = {C}^{t}\chi \end{aligned}$$
(6)

In linear colour correction we solve for the \(3\times 3\) matrix M that best maps camera RGBs to XYZ tristimuli. That is, we minimize:

$$\begin{aligned} \min \limits _{M} \parallel PM - X \parallel \end{aligned}$$
(7)

The matrix M can be solved for in closed form (using the Moore-Penrose inverse)

$$\begin{aligned} M = P^{+}X = [P^{t}P]^{-1}P^{t}X \end{aligned}$$
(8)

where the superscript \(^{+}\) and \(^{t}\) denote the pseudo-inverse and transpose operation respectively.
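
As a concrete illustration, the discrete pipeline of Eqs. 3-8 amounts to a few lines of linear algebra. In the sketch below Q, \(\chi \) and C are stand-in random matrices of the stated sizes; in practice they would hold the measured camera sensitivities, the XYZ colour matching functions and the measured colour signal spectra.

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in spectral data sampled 400-700 nm at 10 nm (31 samples)
Q   = rng.random((31, 3))     # camera spectral sensitivities
chi = rng.random((31, 3))     # XYZ colour matching functions
C   = rng.random((31, 100))   # N = 100 colour signal spectra, one per column

P = C.T @ Q                   # camera responses, N x 3   (Eq. 5)
X = C.T @ chi                 # XYZ tristimuli,   N x 3   (Eq. 6)

# Closed-form least-squares colour correction matrix (Eq. 8)
M = np.linalg.pinv(P) @ X     # equivalently: np.linalg.lstsq(P, X, rcond=None)[0]
print(np.linalg.norm(P @ M - X))
```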

Finally, in the next section we are interested in designing a filter that makes a camera more colorimetric. How, then, can we model the effect of a filter given the linear algebra formulation of colour formation developed in this section? Suppose \(f(\lambda )\) denotes a transmissive filter and \(C(\lambda )\) a colour signal spectrum. Physically, the light passing through the filter equals the product of the spectra, \(f(\lambda )C(\lambda )\). In the discrete domain our spectral functions are represented by the 31-vectors \(\underline{f}\) and \(\underline{c}\). Component-wise multiplication of vectors is not a standard linear-algebra operation, so instead we re-express \(\underline{f}\) as a diagonal matrix:

$$\begin{aligned} D({\underline{f}})=diag(\underline{f}) {\left\{ \begin{array}{ll} D({\underline{f}})_{ij}=0 &{} \text {if}\,i\ne j\\ D({\underline{f}})_{ij}=f_{i} &{} \text {otherwise} \end{array}\right. } \end{aligned}$$
(9)

Now, \(D({\underline{f}})\underline{c}\) equals the component-wise multiplication of \(\underline{f}\) and \(\underline{c}\).
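
In code, the filter model of Eq. 9 is simply a diagonal matrix, and multiplying by it reproduces the component-wise product:

```python
import numpy as np

rng = np.random.default_rng(2)
f = rng.random(31)    # filter transmittance samples
c = rng.random(31)    # a colour signal spectrum

# D(f)c equals the component-wise multiplication of f and c
print(np.allclose(np.diag(f) @ c, f * c))   # True
```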

3 Optimisation-Based Filter Design

Let us return to Fig. 1. For a given set of measured colour signal spectra and camera sensitivities, we can calculate the camera RGBs and the corresponding tristimuli. Now we wish to find a transmissive filter, to be placed in front of the camera, that allows the RGBs to be corrected more accurately. That is, after a least-squares regression the filtered RGBs map closer to the ground-truth XYZs.

A high-level mathematical formulation of the optimisation - finding the optimal filter in support of colour correction - can be written as:

$$\begin{aligned} \min \limits _{\underline{f}, M} \parallel C^{t}diag(\underline{f})QM - C^{t}\chi \parallel \end{aligned}$$
(10)

As before, C denotes the matrix of N colour signal spectra. Q and \(\chi \) are the \(31 \times 3\) matrices encoding, respectively, the spectral sensitivities of the camera and the XYZ colour matching functions. The colour filter is denoted by the \(31\times 1\) vector \(\underline{f}\). Recall that the function diag() turns a vector into a diagonal matrix whose diagonal holds the filter components (see the end of the Background section for a description). Finally, M denotes a \(3\times 3\) colour correction matrix.

The form of Eq. 10 is bilinear. That is, we are solving for both \(\underline{f}\) and M, and if either one is held fixed the problem becomes a simple linear optimisation. We exploit this insight to solve the overall optimisation problem. See Algorithm 1 below for the details.

Algorithm 1. Alternating least-squares estimation of the colour filter \(\underline{f}\) and the correction matrix M

Most of the optimisation shown in Algorithm 1 is straightforward. In particular, Step 5 is an ordinary linear regression (and can be solved with the Moore-Penrose inverse). Solving for the filter \(\underline{f}\) in Step 4 is more involved. It is still, ultimately, a linear problem that can be solved using the Moore-Penrose inverse, but some ‘book-keeping’ (equation rearranging) is needed first.

First, let us rewrite \(diag(\underline{f})\) in the following way

$$\begin{aligned} diag(\underline{f}) = f_1D_{1} + f_2D_{2} + ... + f_{31}D_{31} \end{aligned}$$
(11)

where \(D_{i}\) is a sparse matrix whose only non-zero entry is a 1 at the \(i^{th}\) diagonal position. Using this decomposition, the product \(C^{t}diag(\underline{f})QM\) can be expanded as follows

$$\begin{aligned} C^{t}diag(\underline{f})QM = f_1C^{t}D_{1}QM + f_2C^{t}D_{2}QM + ... + f_{31}C^{t}D_{31}QM \end{aligned}$$
(12)

Let us define the vector \(\underline{V}_i=vec(C^{t}D_{i}QM)\), where the vec() function stacks the entries of a matrix into a single vector. A new matrix \(V = [\underline{V}_1, \underline{V}_2, ..., \underline{V}_{31}]\) can now be constructed column by column (the vector \(\underline{V}_1\) is placed in the first column, followed by \(\underline{V}_2\), then \(\underline{V}_3\), etc.). Similarly, we vectorise the target responses and define \(X = vec(C^{t}\chi )\). Step 4 of our algorithm can now be reformulated as:

$$\begin{aligned} \min \limits _{\underline{f}} \parallel V\underline{f} - X \parallel \end{aligned}$$
(13)

The filter can now be solved by least-squares regression as \(\underline{f} = V^{+}X\).

Using this newly calculated filter \(\underline{f}\), we can next solve for the mapping matrix M as

$$\begin{aligned} M = (C^{t}diag(\underline{f})Q)^{+}(C^{t}\chi ) \end{aligned}$$
(14)

Because solving for the filter and solving for the colour correction matrix are both least-squares problems, the error cannot increase at any stage of the optimisation. Further, it is well known that Alternating Least-Squares procedures (of which Algorithm 1 is a particular case) converge [13].
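
The sketch below is a minimal, unconstrained reading of this alternation, using stand-in random data. The filter step builds the matrix V of Eqs. 12 and 13 column by column and solves for \(\underline{f}\) by least-squares; the matrix step is the ordinary regression of Eq. 14; initialising with an all-pass filter is one reasonable choice rather than a detail taken from Algorithm 1.

```python
import numpy as np

rng = np.random.default_rng(3)

# Stand-in spectral data (replace with measured sensitivities and spectra in practice)
Q   = rng.random((31, 3))      # camera spectral sensitivities
chi = rng.random((31, 3))      # XYZ colour matching functions
C   = rng.random((31, 100))    # colour signal spectra, one per column

b = (C.T @ chi).ravel()        # vec(C^t chi); any consistent vectorisation works

def solve_filter(M):
    # Filter step (Eqs. 11-13): column i of V is vec(C^t D_i Q M) = vec(outer(C[i], Q[i] @ M))
    V = np.column_stack([np.outer(C[i], Q[i] @ M).ravel() for i in range(C.shape[0])])
    return np.linalg.lstsq(V, b, rcond=None)[0]

def solve_matrix(f):
    # Matrix step (Eq. 14): least-squares colour correction of the filtered camera responses
    return np.linalg.pinv(C.T @ np.diag(f) @ Q) @ (C.T @ chi)

f = np.ones(31)                # start from an all-pass filter
for _ in range(50):            # alternate; each step cannot increase the error
    M = solve_matrix(f)
    f = solve_filter(M)

print(np.linalg.norm(C.T @ np.diag(f) @ Q @ M - C.T @ chi))
```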

It is important to note that, from a physical perspective, the transmittance of the filter must lie in the range [0, 100%]. Therefore Eq. 13 is solved subject to \(0 \le \underline{f} \le 1\). This linear constraint can be handled with quadratic programming (the least-squares problem of Eq. 13 is easily converted into a quadratic program) where we apply upper and lower bounds on the parameters [14].
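
One way to impose such bounds in practice (not necessarily the quadratic-programming routine used here) is a box-constrained least-squares solver such as scipy.optimize.lsq_linear, which accepts the [0, 1] limits directly. A sketch of a bounded filter step, under the same vectorisation as above:

```python
import numpy as np
from scipy.optimize import lsq_linear

def solve_filter_bounded(C, Q, chi, M, lo=0.0, hi=1.0):
    # Box-constrained filter step:  min || V f - vec(C^t chi) ||  subject to  lo <= f <= hi
    V = np.column_stack([np.outer(C[i], Q[i] @ M).ravel() for i in range(C.shape[0])])
    b = (C.T @ chi).ravel()
    return lsq_linear(V, b, bounds=(lo, hi)).x
```

Replacing the unconstrained filter step in the previous sketch with this bounded version leaves the rest of the alternation unchanged; tighter bounds (e.g. [0.5, 1]) are imposed the same way.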

4 Experimental Results

In this work, we find the best colour prefilter for a set of 28 digital cameras [15]. The colour signals tested here are the combination of the CIE standard illuminant D65 [16] with the SFU-1995 reflectance data set [17]. The filter optimisation seeks the best mapping between the RGBs (after filtering and linear correction) and the reference XYZs, as formulated in Eq. 10.

4.1 Spectral Transmission of Filter

The filter and the corresponding transform matrix for each camera, given the test colour signal inputs, are calculated with Algorithm 1. Note that, in order to simulate a physically realisable filter, we constrain its values to the range [0, 100%]; the experimental results presented here (in Table 1) are based on this constraint. For the Canon D50, the filter found by bilinear least-squares is shown at the top of Fig. 2a, with transmittance within [0, 100%]. Using the quadratic programming technique, the bounds on the filter values can be easily adjusted. In Fig. 2b we also show a variant where the filter transmittance is further constrained to lie between 50% and 100% (this can be regarded as a high-transmittance filter, which results in fewer noise issues).

Fig. 2. Filter results for the Canon D50 under different constraints

Table 1. Comparison of colour correction results between different methods
Fig. 3. Overall colour correction performance in terms of mean, median and 95-percentile colour differences, with error bars.

4.2 Colour Evaluation

We evaluate our method against simple least-squares correction and the previous “spectral-based” colorimetric filter design method of [11]. Our results for all 28 cameras are summarized in Table 1.

In the first three columns of the table, we record the mean, median and 95-percentile colour difference errors, in terms of CIELAB \(\varDelta E_{ab}^{*}\), for linear colour correction of the SFU-1995 reflectance data set viewed under a CIE D65 illuminant. In the second set of three columns we record the performance of the prior filter design method (which seeks a filter so that the camera best satisfies the Luther condition). Finally, in the last three columns we record the colour correction performance of our new method. The overall colour correction performance of the three methods (from left to right) is plotted in Fig. 3.

Averaged over the whole camera set, the current filtering method achieves a mean error as small as \(0.98 \pm 0.28 {\varDelta E_{ab}^{*}}\). The overall median error is even smaller, at \(0.59 \pm 0.17 {\varDelta E_{ab}^{*}}\). Clearly, our new method finds filters that support a step change in our ability to correct camera colour responses. According to the mean, median and 95-percentile error measures, the errors recorded by the current method are much lower than those of linear colour correction. Compared to our earlier spectral-based colorimetric filter design based on the Luther condition [11], the current method performs better overall, with a particularly significant improvement for the Nikon cameras. Among the camera set, the Sony Nex5N gives the best results, all of which are below the Just Noticeable Difference [18].

5 Conclusion

In this article, we have developed an optimisation-based method for finding the best filter (to be placed in front of a camera) to make a device as colorimetric as possible. Experiments show that this method provides a dramatic improvement over direct linear correction operating on raw, unfiltered RGBs. Compared to normal linear correction, the errors (calculated as mean, median or 95-percentile \(\varDelta E_{ab}^{*}\)) are reduced by 20% to 70% on average.