Color palette modification for pictures/frames is often required in professional photograph editing as well as the video postproduction. Artists usually choose a desired target picture as a reference and manipulate the other pictures to make their color palette similar to the reference. This process is known as photo-realistic color transfer. Figure 1 shows an example of photo-realistic color transfer between a target image and a source image. This color transfer process is a complex task that requires artists to carefully adjust for multiple properties such as exposure, brightness, white point, and color mapping. These adjustments are also interdependent, i.e., the alignment for a single property can cause the other previously aligned properties to become misaligned. Some artifacts due to nonlinear image processing (e.g., JPEG block edges) may also appear after color adjustment.

Fig. 1
figure 1

Pipeline of our color transfer decomposition. The “target” image was used by the “original color transfer” algorithms to produce the “original color transfer” output image, but it is not used by the proposed color transfer re-coding algorithm. The orange dashed line divides the pipeline into two steps: (1) color space mapping: The RGBs of the source image (drawn as red edge circles) and the original color transfer image (by [16] with the target image as the reference, black edge circles) are matched according to their locations (e.g., the blue matching lines), from which we estimate a 3D homography matrix H and use H to transfer the source image colors and (2) mean intensity mapping: The image mean intensity values (mean values of R, G, and B) are aligned by estimating the per-pixel shading between the color space-mapped result and the original color transfer result by least squares. The final result is a visually close color transfer approximation

One of the first photo-realistic color transfer methods was introduced by Reinhard et al. [23]. Their method proposed that the mean and variance of the source image, in a specially chosen color space, should be manipulated to match those of a target. More recent methods [1, 16, 19,20,21] might adopt more aggressive color transfers—e.g., color distribution force matches [19, 20]—and yet, these aggressive changes often do not preserve the original intensity gradients and new spatial type artifacts may be introduced into an image (e.g., JPEG blocks become visible or there is false contouring). In addition, the complexity of a color transfer method usually leads to longer processing time. To address these issues, previous methods [11, 13, 18] were proposed to approximate the color change produced by a color transfer, such that an original complicated color transfer can be re-formulated as a simpler and faster algorithm with an acceptable level of accuracy and some introduced artifacts.

In this paper, we propose a simple and general model for re-coding (approximating) an unknown photo-realistic color transfer which provides leading accuracy and the color transfer algorithm can be decomposed into meaningful parts. Our model is extended from a recent planar color homography color transfer model [11] to the 3D domain, as opposed to the original 2D planar domain. In our improved model, we decompose an unknown color transfer into 3D color perspective transform and mean intensity mapping components. Based on [11], we make two new contributions: (1) a 3D color mapping model that better re-codes color change by relating two homogeneous color spaces and (2) a monotonic mean intensity mapping method that prevents artifacts without adding unwanted blur. Our experiments show significant improvements in color transfer re-coding accuracy. We demonstrate three applications of the proposed method for color transfer artifact fixing, color transfer acceleration, and color-robust image stitching.

Throughout the paper, we denote the source image by \(I_\mathrm{s}\) and the original color transfer result by \(I_\mathrm{t}\). Given \(I_\mathrm{s}\) and \(I_\mathrm{t}\), we re-code the color transfer with our color homography model which approximates the original color transfer from \(I_\mathrm{s}\) to \(I_\mathrm{t}\). Figure 1 shows the pipeline of our color transfer decomposition.

Our paper is organized as follows. We review the leading prior color transfer methods and the previous color transfer approximation methods in Sect. 2. Our color transfer decomposition model is described in Sect. 3. We present a color transfer re-coding method for two corresponding images in Sect. 4. In addition, we demonstrate its applications in Sect. 5. Finally, we conclude in Sect. 6.


In this section, we briefly review the existing work on photo-realistic color transfer, the methods for re-coding such a color transfer, and the concept of Color Homography.

Photo-realistic color transfer

Example-based color transfer was first introduced by Reinhard et al. [23]. Their method aligns the color distributions of two images in a specially chosen color space via 3D scaling and shift. Pitie et al. [19, 20] proposed an iterative color transfer method that distorts the color distribution by random 3D rotation and per-channel histogram matching until the distributions of the two images are fully aligned. This method makes the output color distribution exactly the same as the target image’s color distribution. However, the method introduces spatial artifacts. By adding a gradient preservation constraint, these artifacts can be mitigated or removed at the cost of more blurry artifacts [20]. Pouli and Reinhard [21] adopted a progressive histogram matching in L*a*b* color space. Their method generates image histograms at different scales. From coarse to fine, the histogram is progressively reshaped to align the maxima and minima of the histogram, at each scale. Their algorithm also handles the difference in dynamic ranges between two images. Nguyen et al. [16] proposed an illuminant-aware and gamut-based color transfer. They first eliminate the color cast difference by a white-balancing operation for both images. A luminance alignment is later performed by histogram matching along the “gray” axis of RGB. They finally adopt a 3D convex hull mapping to limit the color-transferred RGBs to the gamut of the target RGBs. Other approaches (e.g., [1, 25, 28]) solve for several local color transfers rather than a single global color transfer. As most non-global color transfer methods are essentially a blend of several single color transfer steps, a global color transfer method is extendable for multi-transfer algorithms.

Photo-realistic color transfer re-coding

Various methods have been proposed for approximating an unknown photo-realistic color transfer for better speed and naturalness. Pitie et al. [18] proposed a color transfer approximation by a 3D similarity transform (\(\hbox {translation}+\hbox {rotation}+\hbox {scaling}\)) which implements a simplification of the earth mover’s distance. By restricting the form of a color transfer to a similarity transform model, some of the generality of the transfer can be lost such that the range of color changes it can account for is more limited. In addition, a color transfer looks satisfying only if the tonality looks natural and this is often not the case with the similarity transformation. Ilie and Welch [13] proposed a polynomial color transfer which introduces higher-degree terms of the RGB values. This encodes the non-linearity of color transfer better than a simple \(3\times 3\) linear transform. However, the nonlinear polynomial terms may over-manipulate a color change and introduce spatial gradient artifacts. Similarly, this method also does not address the tonal difference between the input and output images. Gong et al. [11] proposed a planar color homography model which re-codes a color transfer effect as a combination of 2D perspective transform of chromaticity and shading adjustments. Compared with [18], it requires less parameters to represent a non-tonal color change. The model’s tonal adjustment also further improves color transfer re-coding accuracy. However, the assumption of a 2D chromaticity distortion limits the range of color transfer it can represent. Their [11] tonal adjustment (mean intensity-to-shading mapping) also does not preserve image gradients and the original color rank. Another important work is probabilistic moving least squares [12] which calculates a largely parameterized transform of color space. Its accuracy is slightly better than [13]. However, due to its high complexity, it is unsuitable for real-time use. In this paper, we only benchmark against the color transfer re-coding methods with a real-time performance.

2D color homography

The color homography theorem [7, 8] shows that chromaticities across a change in capture conditions (light color, shading, and imaging device) are a homography apart. Suppose that we have an RGB vector \({\rho = [\mathrm{R},\mathrm{G},\mathrm{B}]^\intercal }\), its r and g chromaticity coordinates are written as \({r=\mathrm{R}/\mathrm{B}},~{g=\mathrm{G}/\mathrm{B}}\) which can be interpreted as a homogeneous coordinate vector \(\underline{c}\) and we have:

$$\begin{aligned} \underline{c}\propto \left[ \begin{array}{lll} r&g&1 \end{array} \right] ^\intercal . \end{aligned}$$

when the shading is uniform and the scene is diffuse, it is well known that across a change in illumination or a change in device, the corresponding RGBs are, to a reasonable approximation, related by a \(3 \times 3\) linear transform \(H_{3 \times 3}\):

$$\begin{aligned} \underline{\rho }'^\intercal = \underline{\rho }^\intercal H_{3 \times 3} \end{aligned}$$

where \(\underline{\rho }'\) is the corresponding RGBs under a second light or captured by a different camera [14, 15]. Due to different shading, the RGB triple under a second light is written as

$$\begin{aligned} \underline{c}'^\intercal =\alpha \underline{c}^\intercal H_{3 \times 3} \end{aligned}$$

where \(H_{3 \times 3}\) here is a color homography color correction matrix, \(\alpha \) denotes an unknown scaling. Without loss of generality, let us interpret c as a homogeneous coordinate, i.e., assume its third component is 1. Then, rg chromaticities \(\underline{c}^\intercal \) and \(\underline{c}'^\intercal \) are a homography apart.

The 2D color homography model decomposes a color change into a 2D chromaticity distortion and a 1D tonal mapping, which successfully approximates a range of physical color changes. However, the degrees of freedom of a 2D chromaticity distortion may not accurately capture more complicated color changes applied in photograph editing.

3D color homography model for photo-realistic color transfer re-coding

The color homography theorem reviewed in Sect. 2.3 states that the same diffuse scene under an illuminant (or camera) change will result in two images whose chromaticities are a 2D (planar) homography apart. The 2D model works best for RAW-to-RAW linear color mappings and can also approximates the nonlinear mapping from RAW to sRGB [8]. In this paper, we extend the planar color homography color transfer to the 3D spatial domain. A 3D perspective transformation in color may be interpreted as a perspective distortion of color space (e.g., Fig. 2). Compared with the 2D homography model, the introduction of a 3D perspective transform can model a higher degree of non-linearity in the color mapping. We propose that a global color transfer can be decomposed into a 3D perspective transform and a mean intensity alignment.

Fig. 2
figure 2

Homogeneous color space mapping. The left 3 homogeneous RGB cubes are equivalent (up to a scale). The left RGB space cubes can be mapped to the right by a \(4 \times 4\) 3D homography transform H

We start with the outputs of the previous color transfer algorithms. We represent a 3D RGB intensity vector by its 4D homogeneous coordinates (i.e., appending an additional element “1” to each RGB vector). Assuming we relate \(I_\mathrm{s}\) to \(I_\mathrm{t}\) with a pixel-wise correspondence, we represent the RGBs of \(I_\mathrm{s}\) and \(I_\mathrm{t}\) as two \(n\times 3\) matrices \(R_\mathrm{s}\) and \(R_\mathrm{t}\), respectively, where n is the number of pixels. We also denote their corresponding \(n\times 4\) homogeneous RGB matrices as \(\dot{R_\mathrm{s}}\) and \(\dot{R_\mathrm{t}}\). For instance, \(\dot{R_\mathrm{s}}\) can be converted to \(R_\mathrm{s}\) by dividing its first 3 columns by its 4th column. Our 3D color homography color transfer model is proposed as:

$$\begin{aligned} \dot{R_\mathrm{t}}&\approx D\dot{R_\mathrm{s}}H_{4 \times 4} \end{aligned}$$
$$\begin{aligned} {R_\mathrm{t}}&\approx D'{} \mathbf h (\dot{R_\mathrm{s}}H_{4 \times 4}) \end{aligned}$$

where D is a \(n\times n\) diagonal matrix of scalars (i.e., exposures, but only used for estimating \(H_{4 \times 4}\)) and \(H_{4 \times 4}\) is a \(4\times 4\) perspective color space mapping matrix, \(\mathbf h ()\) is a function that converts a \(n\times 4\) homogeneous RGB matrix to a \(n\times 3\) RGB matrix (i.e., it divides the first 3 columns of \(\dot{R_\mathrm{s}}\) by its 4th column and removes the 4th column such that \(R_\mathrm{s} = \mathbf h (\dot{R_\mathrm{s}})\)), \(D'\) is another \(n\times n\) diagonal matrix of scalars (i.e., shadings) for mean intensity mapping. A color transfer is decomposed into a 3D homography matrix \(H_{4 \times 4}\) and a mean intensity mapping diagonal matrix \(D'\). The effect of applying the matrix H is essentially a perspective color space distortion (e.g., Fig. 2). That is, we have a homography for RGB colors (rather than just chromaticities – \((\mathrm{R}/\mathrm{B},\mathrm{G}/\mathrm{B})\)). \(D'\) adjusts mean intensity values (by modifying the magnitudes of the RGB vectors) to cancel the tonal difference between an input image and its color-transferred output image (e.g., the right half of Fig. 1).

Image color transfer re-coding

In this section, we describe the steps for decomposing a color transfer between two registered images into the 3D color homography model components.

Perspective color space mapping

We solve for the 3D homography matrix \(H_{4 \times 4}\) in Eq. 4 by using alternating least squares (ALS) [9] as illustrated in Algorithm 1 where i indicates the iteration number.

figure e

In each iteration, Step 4 (Algorithm 1) keeps D fixed (which was updated in Step 5 of a previous iteration) and finds a better \(H_{4 \times 4}\). Step 5 finds a better D using the updated \(H_{4 \times 4}\) fixed. The minimization in Steps 1, 4, and 5 are achieved by linear least squares. After these alternating estimation steps, we get a decreasing evaluation error for \(\left||\dot{R}_\mathrm{s}^{i}-\dot{R}_\mathrm{s}^{i-1} \right||_F\). The assured convergence of ALS has been discussed in [26]. Practically, we may limit the number of iterations to \(n=20\). (Empirically, the error is not significant after 20 iterations.)

Mean intensity mapping

The tonal difference between an original input image and the color-transferred image is caused by the nonlinear operations of a color transfer process. We cancel this tonal difference by adjusting mean intensity (i.e., scaling RGBs by multiplying \(D'\) in Eq. 5). To determine the diagonal scalars in \(D'\), we first propose a universal mean intensity-to-mean intensity mapping function g() which is a smooth and monotonic curve fitted to the per-pixel mean intensity values (i.e., mean values of RGB intensities) of the two images. As opposed to the unconstrained mean intensity-to-shading mapping adopted in [11], we enforce monotonicity and smoothness in our optimization which avoids halo-like artifacts (due to sharp or non-monotonic tone change [6]). The mapping function g is fitted by minimizing the following function:

$$\begin{aligned} \begin{aligned}&\underset{g}{\text {argmin}}&\left||\underline{y}-g(\underline{x}) \right||^2 + p\int {\left||g''(t) \right||^2}\mathrm{d}t\\&\text {subject to}&g'(t) \ge 0 \quad \text {and} \quad 0 \le g(t) \le 1. \end{aligned} \end{aligned}$$

where the first term minimizes the least-squares fitting error and the second term enforces a smoothness constraint for the fitted curve. \(\underline{x}\) and \(\underline{y}\) are assumed in [0, 1]. The curve is smoother when the smoothness factor p is larger (\(p = 10^{-5}\) by default). \(\underline{x}\) and \(\underline{y}\) are the input and reference mean intensity vectors. The mapping function g() is implemented as a lookup table which is solved by quadratic programming [4]. (See “Appendix” 1.) Figure 1 shows an example of the computed function g().

Let the operator \(\mathbf diag (\underline{x})\) return a diagonal matrix with components of \(\underline{x}\) along its diagonal. Given the estimated mean intensity mapping function g(), the diagonal scalar matrix \(D'\) (in Eq. 5) is updated as follows:

$$\begin{aligned} \begin{aligned} \underline{b}&= \frac{1}{3} \mathbf h (\dot{R_\mathrm{s}}H_{4 \times 4}) \begin{bmatrix}1\\1\\1 \end{bmatrix} \\ D'&= \mathbf diag (g(\underline{b}))\mathbf diag ^{-1}(\underline{b}) \end{aligned} \end{aligned}$$

where \(\underline{b}\) is the input mean intensity vector of the 3D color space-mapped RGB values (i.e., \(\mathbf h (\dot{R_\mathrm{s}}H_{4 \times 4})\)). Because this step only changes the magnitude of an RGB vector, the physical chromaticity and hue are preserved.

Mean intensity mapping noise reduction

Perhaps because of the greater compression ratios in images and videos, we found that, even though the mean intensity is reproduced as a smooth and monotonic tone curve, some minor spatial artifacts could—albeit rarely—be introduced. We found that the noise can be amplified in the mean intensity mapping step and the component \(D'\) (in Eq. 5) absorbs most of the spatial artifacts (e.g., JPEG block edges). To reduce the potential gradient artifacts, we propose a simple noise reduction. Because \(D'\) scales each pixel individually, we can visualize \(D'\) as an image denoted as \(I_{D'}\). We remove the spatial artifacts by applying a joint bilateral filter [17] which spatially filters the scale image \(I_{D'}\) guided by the source image \(I_\mathrm{s}\) such that the edges in \(I_{D'}\) are similar to the edges in the mean intensity channel of \(I_\mathrm{s}\). Figure 3 shows the effect of the monotonic smooth mean intensity mapping and its mapping noise reduction. Although it is often not necessary to apply this noise reduction step, we have always included it as a “safety guarantee.”

Fig. 3
figure 3

Minor noise reduction: Some JPEG block artifacts of our initial approximation are reduced via mean intensity mapping noise reduction step. Compared with our final approximation result (middle), the output of [11] contains significant blur artifacts at the boundary of the hill. Row 2 shows the magnified area in Row 1. Row 3 shows the shading image \(I_{D'}\) of the magnified area


In our experiments, we assume that we have an input image and an output produced by a color transfer algorithm. Because the input and output are matched (i.e., they are in perfect registration), we can apply Algorithm 1 directly.

In the experiments which follow, we call our method—3D homography color transform\(~+~\)mean intensity mapping—“3D-H.” Similarly, the 2D homography approach for color transfer re-coding [11] is denoted as “2D-H.” We first show some visual results of color transfer approximations of [16, 20, 21, 23] in Fig. 4. Our 3D color homography model offers a visually closer color transfer approximation.

Fig. 4
figure 4

Visual result of 4 color transfer approximations (rightmost 4 columns). The original color transfer results are produced by the methods cited at the top right of the images in the first column. The original input images are shown at the bottom right of the first column. Please also check our supplementary material for more examples (

Although a public dataset for color transfer re-coding was published in [11], it contains a limited sample size. In this work, we use a new datasetFootnote 1 In Table 1, we quantitatively evaluate the approximation accuracy of the 3 state-of-the-art algorithms [11, 13, 18] by the error between the approximation result and the original color transfer result. The results are the averages over the 200 test images. The method [13] is tested by using a common second-order polynomial (which avoids over-fitting). We adopt PSNR (peak signal-to-noise ratio) and SSIM (structural similarity) [27] as the error measurements. Acceptable values for PSNR and SSIM are usually considered to be, respectively, over 20  dB and 0.9. Table 1 shows that 3D-H is generally better than the other compared methods for both PSNR and SSIM metrics. To further investigate the statistical significance of the evaluation result, we run a one-way ANOVA to verify that the choice of our model has a significant and positive effect on the evaluation metrics (i.e., PSNR and SSIM). In our test, we categorize all evaluation numbers into four groups according to associated color transfer re-coding method. Table 2 shows the post hoc tests for one-way ANOVA where the choice of color transfer approximation method is the only variable. We obtained the overall p-values \(<0.001\) for both PSNR and SSIM which indicate the choice of color transfer re-coding method has a significant impact on the color transfer approximation result. In addition, we run a post hoc analysis and found near 0 p-values when comparing 3D-H with all 3 other methods. This further confirms that the difference in performance of 3D-H is significant. Our test also shows that the difference between 2D-H [11] and Poly [13] is not significant. Our 3D color homography model produces the best result overall.

Table 1 Mean errors between the original color transfer result and its approximations by 4 popular color transfer methods
Table 2 Post hoc tests for one-way ANOVA on errors between the original color transfer result and its approximations


In this section, we demonstrate three applications of our color transfer re-coding method.

Fig. 5
figure 5

Color transfer acceleration. The color transfers [16, 20] of an original image (a) are accelerated by our 3D-H color transfer re-coding method and a state-of-the-art method transform recipe (TR) [10]. The top right labels in (b) and (e) show the original color transfer time. The top right labels in (c, d) and (f, g) show the improved processing time and the measurements of similarity (PSNR) to the original color transfer output. “App. Time” indicates time for color transfer approximation only

Fig. 6
figure 6

Artifact reduction comparison based on an original image (b) and its defective color transfer output (a). c is an artifact reduction result by a state-of-the-art method— transportation map regularization (TMR) [22]. d is our re-coding result where the artifacts are indirectly smoothed at the layer of shading scales. e is our alternative enhancement result which makes its higher-frequency detail similar to the original input image. f, g CIE 1976 Color Difference \(\Delta E\) [24] visualization of ce where the average color difference is labeled at the top right corner. A lower \(\Delta E\) indicates a closer approximation to (a)

Color transfer acceleration

More recent color transfer methods usually produce higher-quality outputs, however, at the cost of more processing time. Methods that produce high-quality images and are fast include the work of Gharbi et al. [10] who proposed a general image manipulation acceleration method—named transform recipe (TR)—designed for cloud applications. Based on a downsampled pair of input and output images, their method approximates the image manipulation effect according to changes in luminance, chrominance, and stack levels. Another fast method by Chen et al. [3] approximates the effect of many general image manipulation procedures by convolutional neural networks (CNNs). While their approach significantly reduces the computational time for some complex operations, it requires substantial amounts of samples for training a single image manipulation. In this subsection, we demonstrate that our re-coding method can be applied as an alternative to accelerate a complex color transfer by approximating its color transfer effect at a lower scale. We approximate the color transfer in the following steps: (1) We supply a thumbnail image (\(40 \times 60\) in our experiment) to the original color transfer method and obtain a thumbnail output; (2) Given the pair of lower-resolution input and output images, we estimate a color transfer model that approximates the color transfer effect; (3) We then process the higher-resolution input image by using the estimated color transfer model and obtain a higher-resolution output which looks very close to the original higher-resolution color transfer result without acceleration.

In our experiment, we choose two computationally expensive methods [16, 20] as the inputs and we compare our performance (MATLAB implementation) with a state-of-the-art method TR [10] (Python implementation). Figure 5 shows the output comparisons between the original color transfer results and the acceleration results. The results indicate that our re-coding method can significantly reduce the computational time (\(25\times \) to \(30\times \) faster depending on the speed of original color transfer algorithm and the input image resolution) for these complicated color transfer methods while preserving color transfer fidelity. Compared with TR [10], our method produces similar quality of output for global color transfer approximation, however, at a much reduced cost of computation (about \(10\times \) faster). Although TR [10] is less efficient, it is also worth noting that TR supports a wider range of image manipulation accelerations which include non-global color change.

Color transfer artifact reduction

Color transfer methods often produce artifacts during the color matching process. Here we show that our color transfer re-coding method can be applied as an alternative to reduce some undesirable artifacts (e.g., JPEG compression block edges). As discussed in Sect. 4.3, the color transfer artifacts are empirically “absorbed” in the shading scale component. Therefore, we can simply filter the shading scale layer by using the de-noising step described in Sect. 4.3. Figure 6 shows our artifact reduction result where we also compare with a state-of-the-art color transfer artifact reduction method—transportation map regularization (TMR) [22]. Compared with the original color transfer output, our processed result better preserves its contrast and color similarity (e.g., the grass). Meanwhile, it also removes the undesirable artifacts. And, depending on an individual user’s personal preference, the result of TMR could also be preferred since TMR makes the contrast of its result more similar to the original input. While one of our goals is to make the processed colors close to the original output image’s colors, it is also possible to transfer the details of the original input image to our result using the detail transfer method suggested in [5]. The result of this input detail recovery is shown in Figure 6e.

Fig. 7
figure 7

Color-robust image stitching. a, b are the original images taken by 2 different cameras. c is the color-transferred results from Image 2 to Image 1. d shows the original image stitching result without color transfer. e, f are the image stitching results matched to the colors of Image 2 by using different color alignment algorithms (The color transition in f) is better, which is especially visible in the sky.)

Color-robust image stitching

The input images for image stitching are not always taken by the same camera or under the same illumination conditions. The camera’s automatic image processing pipeline also modifies the colors. Direct image stitching without color correction may therefore leave imperfections in the final blending result. Since the color change between images of different views is unknown but photo-realistic, our color transfer approximation model can be applied to address this color inconsistency problem. Figure 7 shows an example of a color-robust image stitching using our color transfer re-coding method where 2 input images taken by 2 different cameras and in different illuminations are supplied for image stitching. In our color-robust procedure, we first register these two images and find the overlapping pixels. With the per-pixel correspondences, we estimate a 3D color homography color transfer model that transfers the colors of the first image to the second image’s. We then apply the estimated color transfer model to correct the first image. Finally, the corrected first input image and the original second image are supplied to the image stitching software AutoStitch [2]. Although the multi-band blending proposed in [2] provides a smooth transition at the stitching boundary, the color difference between the two halves is still noticeable (especially for the sky and tree colors) in the original stitching result. After our color alignment process, the colors of the two halves look more homogeneous. We also compare our method with a local color correction method—gain compensation [2].


In this paper, we have shown that a global color transfer can be approximated by a combination of 3D color space mapping and mean intensity mapping. Our experiments show that the proposed color transfer model approximates well for many photo-realistic color transfer methods as well as unknown global color change in images. We have demonstrated three applications for color transfer acceleration, color transfer artifact reduction, and color-robust image stitching.