1 Introduction

The total appearance of an object is described by its color, gloss, translucency and texture, according to Hunter [1] and a subsequent CIE technical report on total appearance [2]. Within this context, the gloss of a material gives numerous cues about the object. In particular, it helps to understand the structure of the illumination (e.g. spectral distribution as a cue for color constancy, direction of light for scene understanding, and geometric properties that may help to assess invisible parts of the scene) and provides information on object properties (e.g. 3D shape, size and interaction within the scene). Although gloss most probably must be combined with other attributes for material identification and scene understanding, it is an important determining factor [3].

The physical correlate of gloss perception is expressed within the bidirectional reflectance distribution function (BRDF). Huge progress has been achieved in the measurement of this correlate during the last decade. However, it is not yet well understood how to relate this measurement to appearance perception. According to Hunter and others, six visual criteria may be necessary to evaluate the perception of gloss: specular gloss, contrast gloss (luster), sheen, absence-of-bloom gloss (haze), distinctness-of-image gloss and surface-uniformity gloss [4,5,6].

Contrast gloss relates to the contrast perceived by the observer between the specular highlights and the diffuse area of the same object under the same illumination condition. We propose that this edge between diffuse and specular areas may be partially representative of the perception of gloss, and that it would be an indicator and a descriptor of the near-specular area of the material. This indicator would necessarily depend on size, magnitude and other components of the scene. Consequently, we suggest that it could be characterized by a perceptual local image contrast measure of the specific area of an image where specular reflection happens. This measure could be related to the BRDF measurement of the object and to the gloss perceived by observers.

We first introduce image contrast measurement and select a priori relevant metrics from the state of the art (Sect. 2). We then generate simple objects that exhibit different surface properties (Sect. 3). The focus is on the roughness of the object, which should significantly influence contrast gloss; we use an isotropic Ward model for this purpose. In Sect. 4, we describe the visual experiment and categorization we performed, and in Sect. 5, we relate the physical model, the contrast measures and the perceptual categories to one another. Results suggest that some contrast measures may represent perceived gloss well, when gloss is perceived. However, it is not yet clear whether only contrast gloss is evaluated, nor how robust this indication is.

2 Measures of Contrast for Digital Images

Studies have identified contrast as one of the fundamental perceptual attributes for describing the quality of an image [7]. There are still several definitions of contrast, depending on the field of research and the target application. For example, in vision, contrast can be defined as the physical differences in luminance and color, as well as the perception of these differences [8], while in photography, contrast is typically related to the degree of information visible in the shadow areas [9].

In a century of contrast studies, different definitions have led to the development of different contrast measures. The very first contrast measures, referred to as global formulae, are based on the highest and the lowest luminance in the scene [10,11,12]. Later versions of these measures embed more advanced global image statistics. Of particular interest for this work is the measure proposed by King-Smith and Kulikowski [13], defined as \(C^{KK}=\frac{L_{max}-L_{avg}}{L_{avg}}\), where \(L_{max}\) is the maximum luminance of the image and \(L_{avg}\) is the average luminance of the image.
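As a minimal sketch, the global measure \(C^{KK}\) can be computed directly from the pixel luminances. The function name and the toy image below are illustrative assumptions and are not taken from [13].

```python
def king_smith_kulikowski_contrast(luminance):
    """C^KK = (L_max - L_avg) / L_avg for a 2-D luminance image (list of rows)."""
    pixels = [value for row in luminance for value in row]
    l_avg = sum(pixels) / len(pixels)
    if l_avg == 0:
        raise ValueError("average luminance is zero; contrast is undefined")
    return (max(pixels) - l_avg) / l_avg


# Hypothetical 3x3 image: a dark background with one bright highlight pixel.
image = [[10, 10, 10],
         [10, 100, 10],
         [10, 10, 10]]
c_kk = king_smith_kulikowski_contrast(image)

# A uniform image has L_max == L_avg, so its contrast is zero.
c_flat = king_smith_kulikowski_contrast([[50, 50], [50, 50]])
```

Because the measure is normalized by the average, a small highlight on a dark background yields a large score, which is the sensitivity to bright pixels that matters for specular highlights.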

Later studies have shown that perceived contrast can vary across the image due to different spatial frequencies [14, 15] and to the presence of gloss and glare [16]. As a consequence, contrast measures based on a local description of the image have emerged in the last three decades, approaching the problem in various ways. We briefly recall a selection of them here.

In 1983, Frankle and McCann [17], followed by Adelson et al. [18], proposed using a multilevel representation as a feature to mimic the human visual system. The image is represented by a set of low-pass or band-pass copies, each representing information at a different scale. From this feature came the pioneering contrast measure of Peli [15] in 1990. This measure is commonly used as a benchmark of local contrast measurement, including in the following.

Peli’s local band-limited contrast c at each pixel location (x,y) is defined by \(c_{i}(x,y)=\frac{a_{i}(x,y)}{l_{i}(x,y)}\), where \(a_{i}(x,y)\) is the corresponding local luminance image and \(l_{i}(x,y)\) is a low-pass-filtered version of the image containing all energy below the band i. The contrast image is then the sum of \(c_{i}(x,y)\) over all levels, and \(C^{P}\) is computed as the average of the contrast image.
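The aggregation step described above can be sketched as follows, assuming the band images \(a_{i}\) and the low-pass images \(l_{i}\) have already been computed. The filter bank itself is not reproduced here, and both the function name and the rule of skipping pixels where the low-pass value is near zero are assumptions made for illustration.

```python
def peli_contrast(band_images, lowpass_images, eps=1e-12):
    """Average over pixels of the contrast image sum_i a_i(x, y) / l_i(x, y).

    band_images and lowpass_images are lists of equally sized 2-D images
    (lists of rows), one pair per band i.
    """
    rows, cols = len(band_images[0]), len(band_images[0][0])
    total = 0.0
    for a_i, l_i in zip(band_images, lowpass_images):
        for x in range(rows):
            for y in range(cols):
                # Assumption: a pixel whose low-pass value is ~0 contributes nothing.
                if abs(l_i[x][y]) > eps:
                    total += a_i[x][y] / l_i[x][y]
    return total / (rows * cols)


# Hypothetical precomputed inputs for two bands of a 2x2 image.
bands = [[[2.0, 4.0], [6.0, 8.0]],
         [[1.0, 1.0], [1.0, 1.0]]]
lowpasses = [[[20.0, 20.0], [20.0, 20.0]],
             [[10.0, 10.0], [10.0, 10.0]]]
c_p = peli_contrast(bands, lowpasses)
```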

An optimization was later proposed by Lubin [19]. Following the multilevel representation, Iordache et al. [20] and Rizzi et al. [21] each proposed a local contrast measure based on a weighted 8-neighborhood mask. Tadmor and Tolhurst [22] proposed a local contrast measure by modifying and adapting the Difference of Gaussians (DOG) model from neurophysiological studies. Later, Boccignone and Ferraro [23, 24] defined local contrast as a set of thermodynamical variables.

The local contrast measure proposed by Simone et al. [25], named Weighted Level Framework (WLF), combines and extends the measures of Rizzi et al. [21] and Tadmor and Tolhurst [22]. WLF has been shown to correlate highly with observers' perceived contrast, and it is flexible enough to work in different color spaces and with color images. We consider this metric to be a potentially good candidate for representing the visual effect of luster locally.

The WLF measure of contrast for greyscale images is defined by \(C^{WLF}=\beta \cdot C\), where \(\beta \) is a scaling factor that is defined according to the image content and C is the final multilevel average contrast of the image. The latter is defined as \(C=\frac{1}{N_{l}}{\sum ^{N_{l}}_{l=1}}\lambda _{l}\cdot \overline{c}_{l}\), where \(N_{l}\) is the number of levels, \(\overline{c}_{l}\) is the average contrast at level l and \(\lambda _{l}\) is the weight assigned to level l, defined according to the image content. As in Rizzi et al. [21], the number of levels \(N_{l}\) is image size independent, and each level l is created by reducing the previous level by a factor of 2, starting from the original image size.
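The combination of levels can be sketched as below, assuming the per-level average contrasts \(\overline{c}_{l}\) are already available. The function name, its arguments and the numeric values are hypothetical; only the formula is taken from the text.

```python
def wlf_contrast(level_contrasts, level_weights=None, beta=1.0):
    """C^WLF = beta * (1 / N_l) * sum_l lambda_l * mean_contrast_l."""
    n_levels = len(level_contrasts)
    if n_levels == 0:
        raise ValueError("at least one level is required")
    if level_weights is None:
        # Uniform weighting: lambda_l = 1 for every level.
        level_weights = [1.0] * n_levels
    if len(level_weights) != n_levels:
        raise ValueError("one weight per level is required")
    c = sum(w * c_bar for w, c_bar in zip(level_weights, level_contrasts)) / n_levels
    return beta * c


# Hypothetical average contrasts for three pyramid levels.
mean_contrasts = [0.08, 0.10, 0.12]
c_uniform = wlf_contrast(mean_contrasts)
c_weighted = wlf_contrast(mean_contrasts, level_weights=[2.0, 1.0, 0.0], beta=0.5)
```

Keeping \(\lambda _{l}\) and \(\beta \) as plain arguments makes it easy to switch between fixed values and values derived from image statistics.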

At each level l, the contrast c at each pixel location (x,y) is calculated using the DOG model proposed by Tadmor and Tolhurst [22] as \(c(x,y)=\frac{R_{c}(x,y)-R_{s}(x,y)}{R_{c}(x,y)+R_{s}(x,y)}\), where \(R_{c}\) and \(R_{s}\) are the outputs of the center and surround components, respectively.

In the DOG model, the center component is described by a two-dimensional Gaussian, \(Center(x,y)=\exp \left[ -\left( x/r_{c}\right) ^{2}-\left( y/r_{c}\right) ^{2}\right] \), where \(r_{c}\) is the radius of the center component. The surround component is represented by another Gaussian, with a larger radius \(r_{s}\), given by

$$Surround(x,y)=0.85\left( r_{c}/r_{s}\right) ^{2}\exp \left[ -\left( x/r_{s}\right) ^{2}-\left( y/r_{s}\right) ^{2}\right] .$$

When the central component is placed at location (x,y), the output is calculated as \(R_{c}(x,y)=\sum _{i=x-3r_{c}}^{i=x+3r_{c}}\sum _{j=y-3r_{c}}^{j=y+3r_{c}}Center(i-x,j-y)I(i,j)\). When the surround component is placed at location (x,y), the output is calculated as \(R_{s}(x,y)=\sum _{i=x-3r_{s}}^{i=x+3r_{s}}\sum _{j=y-3r_{s}}^{j=y+3r_{s}}Surround(i-x,j-y)I(i,j)\). In both cases I(i,j) is the image pixel value at position (i,j).
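The DOG contrast at a single pixel can be sketched directly from these definitions. This is an illustrative implementation only: integer radii are assumed so that the window bounds \(\pm 3r\) are integers, and pixels falling outside the image are simply ignored, which is an assumption not stated in the text.

```python
import math


def center(x, y, r_c):
    return math.exp(-(x / r_c) ** 2 - (y / r_c) ** 2)


def surround(x, y, r_c, r_s):
    return 0.85 * (r_c / r_s) ** 2 * math.exp(-(x / r_s) ** 2 - (y / r_s) ** 2)


def response(image, x, y, radius, kernel):
    """Kernel-weighted sum of pixel values in the +/- 3*radius window around (x, y)."""
    half = 3 * radius
    total = 0.0
    for i in range(x - half, x + half + 1):
        for j in range(y - half, y + half + 1):
            # Assumption: pixels outside the image contribute nothing.
            if 0 <= i < len(image) and 0 <= j < len(image[0]):
                total += kernel(i - x, j - y) * image[i][j]
    return total


def dog_contrast(image, x, y, r_c=1, r_s=2):
    """c(x, y) = (R_c - R_s) / (R_c + R_s), with image[i][j] playing the role of I(i, j)."""
    r_center = response(image, x, y, r_c, lambda dx, dy: center(dx, dy, r_c))
    r_surround = response(image, x, y, r_s, lambda dx, dy: surround(dx, dy, r_c, r_s))
    denominator = r_center + r_surround
    return 0.0 if denominator == 0 else (r_center - r_surround) / denominator


# Hypothetical 13x13 test images: a uniform patch, and the same patch with a bright spot.
flat = [[50] * 13 for _ in range(13)]
spot = [row[:] for row in flat]
spot[6][6] = 255
flat_c = dog_contrast(flat, 6, 6)
spot_c = dog_contrast(spot, 6, 6)
```

On a uniform region the two responses differ only through the factor 0.85, so the contrast stays small (close to 0.15/1.85), while a bright pixel under the center component raises it markedly.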

The rules of thumb for automatically choosing the various parameters are presented in Simone et al. [25]. In our experiment, we used two sets of parameters for WLF, referred to as uniform (WLF-U, with \(r_{c}=1\), \(r_{s}=2\), \(\lambda _{l}=1\) and \(\beta =1\)) and optimal (WLF-O, with \(r_{c}=2\), \(r_{s}=4\), \(\lambda _{l}=\) variance of the level l and \(\beta =\) variance of the image).

3 Experimental Data

We propose to use one basic object, a consistent setup and a simple model to investigate our proposal. We selected achromatic spheres whose surface is characterized by a BRDF defined by an isotropic Ward model [26]. In this sense, the only difference between the images lies in material properties defined by a single parameter. Images were generated with the software BRDF Explorer.

The Ward model for isotropic materials is defined as:

$$\rho _{isotropic}(\theta _i,\phi _i; \theta _r, \phi _r)= \frac{\rho _d}{\pi } + \rho _s\frac{1}{\sqrt{\cos \theta _i\cos \theta _r}}\frac{\exp (-\tan ^2\delta /\alpha ^2)}{4\pi \alpha ^2},$$

where \(\rho _d\) is the diffuse reflectance, \(\rho _s\) is the specular reflectance, and \(\delta \) is the angle between the surface normal and the half vector between the illumination and viewing directions. The viewing geometry remains the same for all images. We set the incident angle \(\theta _i\) to 45\(^\circ \). The planar incident angle \(\phi _i\) also remains fixed at 45\(^\circ \). This places the highlight in the upper right part of the sphere. The illumination remains the same, and the gamma value for the imaging simulation was 2.2.

The only parameter that varies, \(\alpha \), the standard deviation of the surface slope, corresponds to material coarseness in our case. \(\rho _d\) and \(\rho _s\) are kept constant and simply set to 1. Ward mentions that \(\alpha \) is meaningful as long as it is not much greater than 0.2. We therefore provide a span of values from 0 to 1 to investigate the contrast metrics, but perform visual experiments only in the range where Ward’s model is more likely to represent some physical reality, arbitrarily chosen between 0.1 and 0.4.
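To illustrate how \(\alpha \) shapes the highlight, the isotropic Ward model can be evaluated numerically. The sketch below assumes angles in radians, a surface normal along +z, and \(\alpha > 0\) (the formula is undefined at \(\alpha = 0\)); the function names are hypothetical and this is not the code used by BRDF Explorer.

```python
import math


def unit_direction(theta, phi):
    """Unit vector for polar angle theta and azimuth phi, with the normal along +z."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))


def ward_isotropic(theta_i, phi_i, theta_r, phi_r, alpha, rho_d=1.0, rho_s=1.0):
    """Isotropic Ward BRDF for alpha > 0 and both directions above the surface."""
    w_i = unit_direction(theta_i, phi_i)
    w_r = unit_direction(theta_r, phi_r)
    half = [a + b for a, b in zip(w_i, w_r)]
    norm = math.sqrt(sum(c * c for c in half))
    cos_delta = half[2] / norm  # cosine of the angle between normal and half vector
    tan2_delta = max(0.0, 1.0 - cos_delta ** 2) / cos_delta ** 2
    specular = (rho_s / math.sqrt(math.cos(theta_i) * math.cos(theta_r))
                * math.exp(-tan2_delta / alpha ** 2) / (4.0 * math.pi * alpha ** 2))
    return rho_d / math.pi + specular


# Incidence at theta_i = phi_i = 45 degrees, viewed in the mirror direction.
theta = math.radians(45.0)
phi_in, phi_mirror = math.radians(45.0), math.radians(225.0)
peak_glossy = ward_isotropic(theta, phi_in, theta, phi_mirror, alpha=0.1)
peak_matte = ward_isotropic(theta, phi_in, theta, phi_mirror, alpha=0.4)
```

In the mirror direction \(\delta = 0\), so the specular term reduces to \(\rho _s / (4\pi \alpha ^2 \sqrt{\cos \theta _i\cos \theta _r})\): the peak grows quickly as \(\alpha \) decreases, which is why small values of \(\alpha \) can saturate the sensor.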

Examples of generated BRDFs are shown in Fig. 3. When \(\alpha =0.001\), the sphere is highly specular; we do not use this value afterwards, since only one pixel in the image would be white, which is quite unrealistic. Beyond \(\alpha =0.3\), the specular direction no longer saturates the virtual sensor, and beyond 0.5, a more diffuse behavior is clearly observed. This is exemplified in Fig. 1. Images of the spheres used in the visual experiment are shown in Fig. 2.

Fig. 1. Images of spheres showing BRDF properties defined by the Ward model. The parameter \(\alpha \) defines how diffuse the material is. Here, \(\alpha \in [0.05, 1.00]\).

Fig. 2. Images of spheres showing BRDF properties defined by the Ward model. The parameter \(\alpha \) defines how diffuse the material is. Here, \(\alpha \in [0.10, 0.40]\). Note that for the experiment, we added \(\alpha =0.05\) and \(\alpha =1.00\), as explained in Sect. 4.

Fig. 3. Isotropic BRDFs considered in this work as instantiations of the Ward model, based on one parameter \(\alpha \).

4 Visual Experiment

We gathered a committee of four experts, including two of the authors, to rank and rate the images. The committee sat together in a dim room, and the images of the spheres were displayed all at the same time on a screen by a video projection system. Note that the projected background was white, which may be important in the following. One of the authors chaired the discussion and gave the participants a large degree of freedom to redesign the experiment and reach a unanimous decision on each task. The participants were able to interact freely with the images, to move around and to discuss, so that a collective decision could be made.

The first task was to rank the images of Fig. 2 from least glossy to most glossy. Images were first presented in random order. The committee completed this task quickly and found it relatively easy, and the ranking coincided perfectly with the ordering of the \(\alpha \) parameter. There was no disagreement between the committee members. Since only one physical parameter varies, it is not surprising that it was easy to rank the samples along a single dimension.

The next task was to rate the images. Through discussion, references of highly matte and highly specular images were included in the set-up (\(\alpha =1.00\) and \(\alpha =0.05\)) to provide potential anchors for the judgment. Even with this addition, it was not possible to agree on rating values. However, it was possible to categorize the images. A scale of seven steps was used to classify the images: very matte (VM), matte (M), somewhat matte (SM), neither glossy nor matte (N), somewhat glossy (SG), glossy (G), and very glossy (VG). The only image judged to be very matte was the sample that is very close to a perfect diffuser (\(\alpha =1\)). The next three samples (\(\alpha =0.40, 0.38, 0.36\)) were considered somewhat matte. The next three (\(\alpha =0.34, 0.32, 0.30\)) were considered somewhat glossy. The next six (\(\alpha =0.28, 0.26, 0.24, 0.22, 0.20, 0.18\)) were considered glossy, and the remaining ones, including the other reference (\(\alpha =0.16, 0.14, 0.12, 0.10, 0.05\)), were considered very glossy. These categories are shown later in the results using a color code (Fig. 5). No sphere was considered to be “neither glossy nor matte”, and no sphere was considered “matte” within the range of our investigation. The scale was accepted within the context of this experiment, but doubt was expressed about the existence of a single axis going from matte to glossy samples, as the two appear not to be perfectly correlated in some studies (e.g. [27]).
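For later analysis, the committee's categorization can be recorded as a simple lookup table. The sketch below only transcribes the assignments listed above; the variable names are hypothetical, and the table is not meant as a predictive rule for untested values of \(\alpha \).

```python
from collections import Counter

# Category assigned by the committee to each tested value of alpha.
CATEGORY_BY_ALPHA = {
    1.00: "VM",
    0.40: "SM", 0.38: "SM", 0.36: "SM",
    0.34: "SG", 0.32: "SG", 0.30: "SG",
    0.28: "G", 0.26: "G", 0.24: "G", 0.22: "G", 0.20: "G", 0.18: "G",
    0.16: "VG", 0.14: "VG", 0.12: "VG", 0.10: "VG", 0.05: "VG",
}

# Number of images per category; M and N were never used by the committee.
counts = Counter(CATEGORY_BY_ALPHA.values())
```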

No specific questions were asked about contrast gloss/luster, as we consider \(\alpha \) to be representative only of luster, thanks to the stability of the scene and the absence of differences in lighting conditions. This may limit our analysis, as \(\alpha \) would not be solely the correlate of contrast gloss in more complex situations. However, in this study, especially when \(\alpha \le 0.3\), only contrast gloss seems to change, according to the definition of contrast gloss.

5 Results

One aspect to take into account in our analysis is that the images with \(\alpha \le 0.3\) are most likely to contain a saturated pixel, so they would all have the same dynamic range. In order to identify the presence of a saturated pixel, which would represent the dynamic range of the scene, we introduce a saturation index in addition to the contrast measures. This index is defined as \(SI=\frac{L_{max}}{255}\), since our images are encoded in 8 bits.
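The saturation index is straightforward to compute. The following is a minimal sketch for an 8-bit greyscale image stored as a list of rows; the function name and the example images are assumptions for illustration.

```python
def saturation_index(image, max_code=255):
    """SI = L_max / 255 for an 8-bit image; SI == 1 means at least one saturated pixel."""
    l_max = max(max(row) for row in image)
    return l_max / max_code


# Hypothetical images: one with a saturated highlight, one without.
si_saturated = saturation_index([[12, 40], [255, 90]])
si_unsaturated = saturation_index([[12, 40], [204, 90]])
```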

Results for this indicator are shown in Fig. 4(i) and (j). These results are somewhat surprising in the sense that once \(\alpha \) exceeds 0.5, new brighter pixels appear at the edge of the sphere, which then shows some sheen effects, i.e. gloss at grazing angles. This is outside the accepted validity of Ward’s model and shows an aspect of gloss that we do not focus on in this paper, so it is hard to either incorporate these data into the discussion or discard them on solid ground.

We separate our analysis into two parts: first, we look at the behavior of the image contrast measures as the BRDF parameter varies, and next we evaluate closely how this can be related to the visual categorization.

5.1 Physical Aspects

We analyze the correlation between the Ward parameter \(\alpha \) and a set of contrast measures, in addition to the saturation index just described. We chose the global image contrast measure proposed by King-Smith and Kulikowski [13] and two local image contrast measures, that of Peli [15] and the WLF of Simone et al. [25].

The results of the different contrast measures investigated are shown in Fig. 4. On the right, the full range of samples is investigated, while on the left, only the oversampled area of visual interest is drawn.

The saturation index shows that no saturated pixels remain in the image when \(\alpha \ge 0.3\). The dynamic range decreases until about \(\alpha = 0.5\), and then increases again until \(\alpha =0.7\), which reveals the sheen effect.

A similar behavior is shown by the global contrast measure of King-Smith and Kulikowski, with more sensitivity to the amount of bright pixels due to the normalization anchored on the average. However, the score drops steadily as the parameter increases, and a change of slope is then observed around \(\alpha =0.3\).

Peli’s local contrast measure does not seem to show a strong trend and appears to be rather unstable. The noisy appearance of the curves must be related to the narrow range of scores, approximately [0.235, 0.255]. Thus, Peli’s local contrast measure does not seem to be highly sensitive to the changes in material properties. A similar conclusion can be drawn for the WLF-U local contrast measure (when the WLF parameters are fixed).

In contrast, the WLF-O local contrast measure (WLF with optimal parameters, see Sect. 2) seems to rank the images rather well for \(0 \le \alpha \le 0.3\). For values greater than 0.3, a change in the function occurs, and it follows a descending behavior.

Fig. 4. Contrast metric results versus the Ward parameter \(\alpha \).

5.2 Visual Aspects

We now incorporate the observers' categorization into the analysis of the image contrast measures. We remind the reader that five images were categorized as very glossy, \(\alpha \le 0.16\) (magenta), six as glossy, \(0.18 \le \alpha \le 0.28\) (green), three as somewhat glossy, \(0.30 \le \alpha \le 0.34\) (yellow), three as somewhat matte, \(0.36 \le \alpha \le 0.40\) (light blue), and one as very matte, \(\alpha = 1\) (dark blue). This color code corresponds to Fig. 5.

As can be seen from Fig. 5(d), at about the point where the saturation index starts to fall below 1, meaning that there are no more saturated pixels in the image, the observers rate samples as somewhat glossy. This behavior may be influenced by the white background and some adaptation process, which already provides a reference for the highest radiance in the visual field. However, this observation is consistent with other results, and whatever the reason, the metric indicates a similar trend to that of the observers: gloss sensation decreases.

The global measure of King-Smith and Kulikowski is shown in Fig. 5(c). In this case, observers perceived gloss up to a point similar to the one where the metric shows an inflexion. The interesting behavior here is that the values keep decreasing until \(\alpha =0.5\), which provides a potential ranking from very glossy to somewhat matte. However, once some sheen appears, matte materials would show increasing values.

Fig. 5. Perceptual categorization plotted as a function of contrast measure score and the Ward parameter. The legend stands for very matte (VM, dark blue), somewhat matte (SM, light blue), somewhat glossy (SG, yellow), glossy (G, green), and very glossy (VG, magenta). (Color figure online)

Peli’s local image contrast measure does not seem to follow the observers' categorization, as the categories exhibit similar scores, as shown in Fig. 5(b). It seems that the change in perceived contrast gloss/luster due to changes in material properties cannot be adequately predicted in terms of perceived contrast by Peli’s measure in this case.

Of particular interest is the behavior of the WLF-O local image contrast measure. This measure shows a smooth curve, which increases within the range of gloss perception. When samples are rated somewhat glossy, the curve has changed its slope and started to decrease, which correlates with the observers' categorization. This agreement seems to be explained by the optimal parameters, which are not fixed but retrieved from statistics of the image, where variance has shown high correlation with perceived contrast [25, 28].

In this study, variance seems to link the perception of gloss (potentially luster specifically) with the perception of contrast, and to agree with the observers' categories. However, even though we can agree that ranking is possible, we have no indication regarding rating. Moreover, the parameter \(\alpha \) of the Ward model is not perceptually uniform. Thus, the behavior of the curve (linear, smooth, etc.) does not provide any information on perceptual uniformity, which would be one of the targets of further work if our general observations are confirmed by further investigations. These results should be treated with caution, since the measure may only be good at fitting the parameter \(\alpha \). Whether the measure follows \(\alpha \) rather than the perceived contrast gloss in the general case remains to be investigated.

6 Conclusion

In this work, we have evaluated image contrast metrics in relation to BRDF and gloss perception. Two data sets were created with the intention to focus on contrast gloss/luster driven by a single parameter in this specific case.

Results suggest that the WLF-O contrast measure increases with increasing contrast gloss, when gloss is perceived. When gloss is not perceived, the measure reverses its trend. This second statement has to be confirmed by an extension of the experiment.

It is not yet clear whether the contrast measure follows the physical parameter or the perception of contrast gloss in the general case. More investigations are required in this direction. Further work would be performed with a more sophisticated model for BRDF and light simulation, which would allow more parameters to be varied. The methodology for the visual experiment provides interesting qualitative results, but does not allow gloss perception to be rated quantitatively. To this end, a more traditional psychometric methodology is required.