Encyclopedia of Color Science and Technology

2016 Edition
Editors: Ming Ronnier Luo

High Dynamic Range Imaging

  • Tania Pouli
Reference work entry
DOI: https://doi.org/10.1007/978-1-4419-8071-7_177

Synonyms

Definition

High-dynamic-range imaging (HDRI or HDR) is a collection of hardware and software technologies that allow the capture, processing, and display of image and video content containing a wide range of intensities between the darkest and brightest areas of an image. Lighting conditions in both natural and man-made environments can range from starlight to artificial illumination to bright sunlight. Traditional imaging techniques typically store information using one byte per pixel for each channel, allowing for 256 distinct steps per channel. Consequently, they can only represent a narrow range of this illumination, often resulting in over- or underexposed regions in the image. In contrast, HDR technologies represent image content using floating-point numbers and thus both allow for a much larger number of intensity steps to be encoded and reduce the distance between consecutive steps in image intensity.

Overview

Natural environments contain a vast range of illumination that extends from bright sunlight to starlight. In fact, lighting conditions in natural scenes can extend over more than 14 orders of magnitude. Human eyes are not able to perceive this enormous range simultaneously, but mechanisms in the visual system allow adaptation to the prevailing lighting conditions and surroundings. A recent study has evaluated the simultaneous range of the human visual system under extended luminance levels and has found that under specific circumstances, it can reach (and theoretically exceed) 3.7 orders of magnitude [1].

Traditionally, digital images are stored using one byte (8 bits) per pixel to represent the values for each of the three channels, allowing for 256 distinct levels between the darkest and brightest value within each channel. Consequently, standard imaging techniques can capture and store roughly 2 orders of magnitude between black and white, separated by distinct steps. Any information exceeding the available range appears over- or underexposed, while values between intensity levels are mapped to one of the 256 available levels, introducing steps into smooth gradients.
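The arithmetic behind these figures can be checked with a short Python sketch. It assumes a linear 8-bit encoding (real image formats usually apply a gamma curve, which changes the exact figure), and the function name `quantize_8bit` is hypothetical, introduced only for illustration.

```python
import math

def quantize_8bit(value, white=1.0):
    """Map a linear intensity to an integer code in 0..255, clipping out-of-range values."""
    scaled = max(0.0, min(1.0, value / white))
    return round(scaled * 255)

# Ratio between the brightest code and the darkest non-zero code,
# expressed in orders of magnitude (log10). Assumes a linear encoding.
orders_of_magnitude = math.log10(255 / 1)

# A smooth ramp of 10,000 samples collapses onto the 256 available codes,
# which is what introduces visible steps into smooth gradients.
ramp = [i / 9999 for i in range(10000)]
distinct_codes = {quantize_8bit(v) for v in ramp}

# Anything brighter than the chosen white point is clipped to code 255.
clipped = quantize_8bit(37.5)
```

Under these assumptions the range comes out slightly above 2 orders of magnitude, consistent with the rough figure quoted above.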

This is a familiar problem for photographers, as many cases exist where this range cannot adequately capture the full scene. For instance, when capturing the interior of a room, if there is a window with bright sunlight outdoors, the region covered by that window would be significantly brighter than the interior. The photographer would then have to choose whether to expose for the room interior or the window to be captured correctly, losing the remaining information of the scene. Such a solution could be sufficient in an artistic context; however, when images are intended as a realistic and complete representation of a scene, the photographer’s choices and limitations of the equipment can affect the results. Useful information may not be captured or images may include bias, as the photographer needs to select an exposure level. Figure 1 shows two photographs of the same scene exemplifying this problem.
High Dynamic Range Imaging, Fig. 1

Both photographs capture the same scene but using different exposure levels. The foliage and colors of the tree are well exposed in the left image but the sky has been overexposed and has no details visible. The right image on the other hand shows the details of the cloudy sky much better but at the expense of most information in the rest of the image. Problems such as these can easily be overcome with the use of high-dynamic-range technologies.

High-dynamic-range imaging (HDRI) is a collection of tools and algorithms that allow the capture, storage, processing, and display of images with a wider dynamic range than that afforded by traditional imaging techniques. High-dynamic-range (HDR) images use floating-point numbers to represent values in pixels, leading to two main differences compared to traditional, 8-bit images. First, a much wider range of values can be encoded, far exceeding the dynamic range occurring in nature. Second, the use of floating-point representation means HDR images effectively have no quantization since consecutive steps in intensity can be much smaller.

Imaging Pipeline

These inherent differences between HDR and traditional imagery, in addition to allowing more information to be captured and opening up many possibilities in terms of applications, also mean that all stages of an imaging pipeline need to be modified. From the capture of HDR images and video, to storage solutions, to processing, and finally to display, many existing algorithms make assumptions that do not apply to images and video content with a wider dynamic range.

Capture

To capture the full dynamic range of a scene such as the one shown in Fig. 1, a camera would need to be able to capture both bright and darker areas simultaneously. Although this is not currently possible using a single camera sensor, an HDR image can be constructed by capturing a scene at multiple exposure levels, which are combined into a single image [2].

Typically, multiple exposures are captured sequentially, with all camera settings apart from the exposure time fixed to minimize changes (many recent cameras include “Auto-Exposure Bracketing” settings to simplify this process). One example of a series of exposures is shown in Fig. 2. Although this is often an effective solution, as consecutive exposures are taken at slightly different times, fast-changing scenes (e.g., moving people or tree branches) can lead to misaligned exposures or ghosting artifacts [3]. To counter these issues, differently exposed versions of the scene can be captured concurrently using specialized cameras, where multiple sensors record the same scene at different exposure levels [4].
High Dynamic Range Imaging, Fig. 2

An HDR image (here shown tonemapped) constructed by combining the seven exposures shown on the left. Each individual exposure cannot fully capture all the detail in the scene.
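The combination of bracketed exposures can be sketched in a few lines of Python. This is a simplified variant of the idea in [2], not the published algorithm: it assumes a linear sensor response (so no response curve is recovered), averages in the linear rather than the logarithmic domain, and uses a triangle weight to discount clipped samples. The function names and the simulated scene are hypothetical.

```python
def hat_weight(z):
    """Triangle weight favouring mid-range pixel values z in [0, 1]."""
    return 1.0 - abs(2.0 * z - 1.0)

def merge_exposures(exposures, times):
    """Merge flat lists of normalised pixel values into relative radiance.

    exposures: one list of values in [0, 1] per exposure, all the same length
    times:     exposure time in seconds for each exposure
    Assumes a linear sensor response (a simplification).
    """
    shortest = min(range(len(times)), key=lambda i: times[i])
    longest = max(range(len(times)), key=lambda i: times[i])
    radiance = []
    for p in range(len(exposures[0])):
        num = den = 0.0
        for img, t in zip(exposures, times):
            w = hat_weight(img[p])
            num += w * img[p] / t   # each sample estimates radiance as value / time
            den += w
        if den > 0.0:
            radiance.append(num / den)
        else:
            # Every sample is clipped: use the shortest exposure for saturated
            # pixels (a lower bound) and the longest one for black pixels.
            idx = shortest if exposures[shortest][p] >= 1.0 else longest
            radiance.append(exposures[idx][p] / times[idx])
    return radiance

# Simulated capture of three scene radiances with a sensor that clips at 1.0.
times = [1.0, 1 / 16, 1 / 256]
scene = [0.5, 20.0, 200.0]
exposures = [[min(1.0, e * t) for e in scene] for t in times]
recovered = merge_exposures(exposures, times)
```

In this toy setting no single exposure holds all three values unclipped and well above the noise floor, yet the merged result recovers each of them.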

When the HDR image itself is not required, multiple exposures can be directly combined into an 8-bit image, bypassing the floating-point representation. This process is known as exposure fusion [5].
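A heavily reduced illustration of the fusion idea follows. It keeps only a per-pixel "well-exposedness" weight and a plain weighted average; the method in [5] additionally uses contrast and saturation measures and blends across multiple resolutions, none of which is reproduced here. The function names and the value of `sigma` are assumptions made for this sketch.

```python
import math

def well_exposedness(z, sigma=0.2):
    """Gaussian weight that peaks at mid-grey (0.5) and falls off toward 0 and 1."""
    return math.exp(-((z - 0.5) ** 2) / (2.0 * sigma ** 2))

def fuse_exposures(exposures):
    """Blend exposures per pixel and return 8-bit codes directly.

    exposures: one flat list of values in [0, 1] per exposure, all the same length.
    No exposure times and no floating-point HDR image are needed.
    """
    fused = []
    for samples in zip(*exposures):
        weights = [well_exposedness(z) for z in samples]
        total = sum(weights)  # always positive, since the Gaussian never reaches zero
        fused.append(sum(w * z for w, z in zip(weights, samples)) / total)
    return [round(255 * v) for v in fused]

# Two-pixel example: each pixel is well exposed in only one of the two inputs.
dark = [0.02, 0.5]
bright = [0.5, 0.98]
result = fuse_exposures([dark, bright])
```

Each output pixel is dominated by whichever input exposed it best, which is the behaviour exposure fusion relies on.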

Storage

A direct consequence of the higher fidelity afforded by HDR is the increased storage requirement. Existing image file encodings are not readily amenable to the storage of HDR images as they often rely on the well-defined range of values encountered in traditional imagery. As such, several new file formats and extensions to existing formats have been developed for the storage of HDR images [3]. Video formats, although a more recent area of study, have also successfully been developed [6].

Display

The final stage within an HDR image pipeline is the display of still and video content, which can be achieved in two different ways. A number of display devices capable of an extended dynamic range have been developed, albeit mostly at a prototype level. The most common solution combines a spatially varying backlight with an LCD front panel, extending the dynamic range compared to conventional displays both toward the dark and the bright ends of the spectrum. In one design, a DLP projector placed behind an LCD panel serves as the backlight [7]; alternatively, a matrix of LED light sources can replace the projector [8].

Although HDR display technologies are becoming more feasible, they are still only capable of displaying a fraction of the dynamic range available in nature. As such, to display images of arbitrary dynamic ranges on hardware with more restricted capabilities (including commercially available monitors), algorithms that can compress the range of the image to that of the display are necessary. Many such techniques exist and are known as tonemapping operators, which will be discussed in the following section.

Tone Reproduction

Different devices are capable of displaying varying dynamic ranges. To ensure that HDR images and content are compatible both with existing, more limited displays and with future ones, solutions allowing the HDR content to be mapped to the capabilities of a particular display are necessary. Since the output of such a mapping should still be useful for visual consumption, the form of this mapping is an important consideration.

In the simplest scenario, one could linearly scale an HDR image to the given display range. Such a solution though would not be adequate for many scenes. Luminance values in HDR images are not linearly distributed across the available range. Instead, the distribution of luminance values in HDR imagery is highly kurtotic: light sources and highlights only cover a small number of pixels in the image but are significantly brighter than the majority of the scene, while most pixels represent a relatively small range of intensities. In practice, the consequence of this is that a nonlinear mapping is necessary to ensure that sufficient detail in the scene remains visible.
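The failure of naive linear scaling is easy to reproduce. In the sketch below, the luminance values are invented for illustration (six pixels spanning a modest range plus one light source), and `linear_scale` is a hypothetical helper that maps the scene maximum to code 255.

```python
def linear_scale(luminances):
    """Scale luminances linearly so that the scene maximum maps to code 255."""
    peak = max(luminances)
    return [round(255 * L / peak) for L in luminances]

# Most of the scene lies between 0.5 and 20; a single light source sits at 10,000.
scene = [0.5, 1.0, 2.0, 5.0, 10.0, 20.0, 10000.0]
codes = linear_scale(scene)

# Number of distinct codes left for everything except the light source.
codes_for_scene = len(set(codes[:-1]))
```

With these values, every pixel other than the light source lands on code 0 or 1, so all detail in the bulk of the scene is lost.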

The aim of tonemapping techniques can then be seen as the compression of the range of an HDR image such that it can be displayed on devices of a lower dynamic range, while preserving the visible detail in the scene [9]. Often, this is also combined with a transition from floating point to integer steps in intensity. Compressing the dynamic range of an image leads inevitably to the loss of some information. Consequently, tonemapping operators (TMOs) need to decide which information to keep and which to discard. The human visual system faces a similar challenge, as it is only capable of seeing a considerably smaller range than that available in nature. It should not come as a surprise then that dynamic range reduction is often inspired by aspects of human vision. Before discussing the different tonemapping techniques and their respective merits, a better understanding of the human eye and of the processes relevant to tone reproduction is necessary.

Perceptual Background

Human eyes are one of the means for sensing the world. Figure 3 shows an annotated cross section of the human eye. Light enters the eye through the pupil, travels through the ocular media, and finally hits the retina. The retina is a layer of neural sensors lining the back of the eye, which transduce light into a signal that is transmitted to the brain. Although several layers of neurons are present at the retina, it is the photoreceptors that are sensitive to light. Specifically, rod photoreceptors are sensitive to low light conditions, while cone photoreceptors are sensitive to brighter conditions.
High Dynamic Range Imaging, Fig. 3

A simplified cross section of the human eye

One of the most important properties of the human visual system with respect to dynamic range reduction is its ability to adapt to a wide range of illumination. This effect is experienced on a daily basis: it is sufficient to walk into a dark room from bright sunlight for the eyes to adjust to see under the new conditions. Effectively, the human visual system adapts to the prevailing lighting conditions so that contrasts in the scene can still be distinguished [10].

It is not a single process within the human visual system that is responsible for this effect, but rather a combination of factors. One familiar to most is the adjustment of the diameter of the pupil depending on the illumination. Another set of mechanisms occurring in the photoreceptors, which prevents them from saturating while adapting to the enormous range of illumination in nature, is their nonlinear response to illumination. This nonlinearity was first studied by Naka and Rushton [11] and provides a very effective mechanism for the compression of dynamic range. Intensities in the middle of the range lead to approximately logarithmic responses, while as the intensity is increased, the response tails off to a maximum. After that maximum is reached, further increases in intensity will not lead to a corresponding increase in response of the photoreceptors, ensuring that saturation will not occur [12].

High Dynamic Range Imaging, Fig. 4

Photoreceptors respond nonlinearly to increasing luminance levels. The Naka-Rushton equation models this nonlinearity, producing S-shaped curves, known as sigmoids, when plotted on log-linear axes.

Figure 4 shows the nonlinear response of photoreceptors to increasing luminance values, which can be modeled with what is known as the Naka-Rushton equation:
$$ \frac{V}{V_{\max}} = \frac{I^n}{I^n + \sigma^n} $$
(1)
where V is the response at an intensity I, Vmax is the peak response at saturation, and σ is the intensity necessary for the half-maximum response. The exponent n controls the slope of the function and is generally reported to be in the range of 0.7–1.0 [12]. This functional form is known as a sigmoid because it forms an S-shaped curve when plotted on log-linear axes. This aspect of light adaptation of the visual system has played an instrumental role in dynamic range compression, which will be discussed in the following section.
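Eq. 1 can be evaluated directly, as in the following Python sketch. The default exponent of 0.8 is simply a value picked from the reported 0.7–1.0 range, and the function name and sample intensities are illustrative assumptions.

```python
def naka_rushton(intensity, sigma, n=0.8, v_max=1.0):
    """Photoreceptor response V at intensity I according to Eq. 1.

    sigma: intensity producing the half-maximum response
    n:     slope exponent (0.8 is an arbitrary choice within 0.7-1.0)
    v_max: peak response at saturation
    """
    i_n = intensity ** n
    return v_max * i_n / (i_n + sigma ** n)

# Responses over eight orders of magnitude of intensity, adapted to sigma = 5.
intensities = [10.0 ** e for e in range(-3, 6)]
responses = [naka_rushton(i, sigma=5.0) for i in intensities]

# At I = sigma the response is exactly half of the maximum.
half = naka_rushton(5.0, sigma=5.0)
```

The responses rise monotonically but stay below `v_max`, which is the compressive behaviour that tone reproduction borrows from the photoreceptors.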

Perceptually Motivated Tone Reproduction

Many tonemapping operators use models of photoreceptor adaptation to effectively reduce the dynamic range of input HDR images. Typically this involves a forward step where the dynamic range of the image is compressed, often taking into account the lighting conditions and appearance of the scene, followed by an inverse step that considers the parameters of the display where the image will be viewed, as shown in Fig. 5. If the forward step is derived from a model of the human visual system, it transforms the image from luminance values to perceived values, essentially modeling what the original scene would be perceived as if the viewer were present. The reverse step then takes the compressed (now perceived) values back to luminance values, allowing them to be correctly displayed [3]. Note however that many tonemapping operators do not employ the reverse step in order to maximize compression.
High Dynamic Range Imaging, Fig. 5

The typical steps in the tonemapping process

To approximate the compression afforded by photoreceptors, several functions have been proposed. At their simplest, linear scaling functions have been shown to be effective if the scaling factor is chosen appropriately, depending on the properties of the image and the intended display. Although such a linear compression scheme can create realistic results in less extreme cases, some information will be clamped for the image to fit within the displayable range. A better approximation of light adaptation behavior of photoreceptors can be achieved using a logarithmic mapping. Further, to better adapt this mapping to the image content, the base of the logarithms can be modified on a per-pixel basis in order to appropriately adjust the compression [3]. A perceptual evaluation of many of these methods can be found in [13]. More akin to photoreceptor behavior are sigmoid models, which have been used extensively in tone reproduction [9, 14, 15].
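Two of the mappings mentioned above can be sketched as global operators in Python. The logarithmic curve is a generic normalised form, and the sigmoid follows the common pattern of scaling by a "key" value over the log-average luminance before applying L / (1 + L), in the spirit of the global operator in [15]. The function names, the key of 0.18, and the sample scene are assumptions made for this illustration, and neither function includes a display model.

```python
import math

def log_tonemap(luminances):
    """Logarithmic mapping: L_d = log(1 + L_w) / log(1 + L_max), in [0, 1]."""
    l_max = max(luminances)
    return [math.log1p(L) / math.log1p(l_max) for L in luminances]

def sigmoid_tonemap(luminances, key=0.18):
    """Global sigmoid: scale by key / log-average luminance, then apply L / (1 + L)."""
    delta = 1e-6  # small offset so that black pixels do not produce log(0)
    log_avg = math.exp(
        sum(math.log(delta + L) for L in luminances) / len(luminances)
    )
    scaled = [key * L / log_avg for L in luminances]
    return [L / (1.0 + L) for L in scaled]

# A scene whose luminances span more than four orders of magnitude.
scene = [0.5, 1.0, 2.0, 5.0, 10.0, 20.0, 10000.0]
log_codes = [round(255 * v) for v in log_tonemap(scene)]
sig_codes = [round(255 * v) for v in sigmoid_tonemap(scene)]
```

Both curves keep the darker pixels on distinct codes while still accommodating the brightest value, which a single linear scale factor cannot do for a scene like this.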

More complete models of photoreceptor physiology and light adaptation have been repeatedly used in the tone reproduction literature, offering perceptually accurate reproductions but at the cost of increased computational complexity. An early and fairly complete model that incorporates luminance, color, and pattern representations to create a perceptually accurate compressed image employs a multi-scale processing scheme [16]. Another notable example that lends itself to HDR compression is the retinex model, which considers aspects of both the retina and the brain in order to model complex perceptual phenomena such as color constancy [17].

Multi-scale Tone Reproduction

Features in images occur at different scales, and many tone reproduction techniques employ mechanisms that treat images at multiple scales, improving the appearance of the final result as a different amount of compression can be applied to different features in the image. This ensures that high-contrast features are sufficiently compressed, while local edges (e.g., texture) are not proportionally scaled as that would reduce their visibility. To decompose the image into different scales, a form of edge-preserving filtering is typically used, such as the bilateral filter [3].
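A two-scale version of this idea can be sketched on a one-dimensional signal. The Python code below splits log luminance into a base layer (from a brute-force bilateral filter) and a detail layer, compresses only the base, and recombines them. It is an illustrative sketch rather than any specific published operator; all function names, filter parameters, and the target contrast of 100:1 are assumptions.

```python
import math

def bilateral_filter_1d(signal, radius=3, sigma_s=2.0, sigma_r=0.4):
    """Edge-preserving smoothing: weights fall off with distance and with value difference."""
    out = []
    for i, center in enumerate(signal):
        num = den = 0.0
        for j in range(max(0, i - radius), min(len(signal), i + radius + 1)):
            w = math.exp(
                -((j - i) ** 2) / (2.0 * sigma_s ** 2)
                - ((signal[j] - center) ** 2) / (2.0 * sigma_r ** 2)
            )
            num += w * signal[j]
            den += w
        out.append(num / den)
    return out

def two_scale_tonemap(luminances, target_contrast=100.0):
    """Compress the base layer of log10 luminance and keep the detail layer unchanged.

    luminances must be strictly positive. Output is relative, with the
    brightest base value mapped to 1.
    """
    log_l = [math.log10(L) for L in luminances]
    base = bilateral_filter_1d(log_l)
    detail = [l - b for l, b in zip(log_l, base)]
    base_range = max(base) - min(base)
    scale = 1.0
    if base_range > 0.0:
        scale = min(1.0, math.log10(target_contrast) / base_range)
    top = max(base)
    return [10.0 ** ((b - top) * scale + d) for b, d in zip(base, detail)]

# A dark textured region next to a bright textured region (a 1000:1 edge).
scene = [1.0, 1.2] * 4 + [1000.0, 1200.0] * 4
mapped = two_scale_tonemap(scene)
```

The large edge between the two regions is compressed, while the small texture ratio within each region is left almost unchanged.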

Display and Viewing-Adaptive Tonemapping

Most tone reproduction algorithms discussed so far aim to compress the range of a given image to a displayable range, typically between 0 and 255, but with no consideration of any further properties of the device or the environment where the image will be viewed. Although many detailed models exist that take into account aspects of the environment where the image was captured as well as where and how it will be displayed, such models tend to require additional input that is usually not available in a typical tonemapping setting.

In addition to the parameters of the viewing environment, different display devices offer varying capabilities that need to be considered. Mantiuk et al. [18] recently formulated tonemapping as an optimization problem that strives to reduce the dynamic range of an image while minimizing the visible distortions that will inevitably be introduced in the process. To detect visible distortions, a model of the human visual system is used, and to ensure that the resulting image is suitable for the target device, a display model is incorporated. Such a scheme allows for images to be tonemapped even for unconventional devices and viewing conditions.

Dynamic Range Expansion

Tonemapping techniques discussed so far compress the dynamic range of images to prepare them for displays with more limited capabilities. With the advent of HDR displays, the inverse problem arises. To fully take advantage of the extended dynamic range available on such a display, it may be desirable to expand the intensity range of traditional image and video content, a process known as dynamic range expansion or inverse tonemapping. This can be achieved by applying an expansion function to the intensity values of the image to map them to the range of the display. Both nonlinear (such as the inverse of Eqs. 2 and 4) and linear schemes have been proposed [19], but a recent psychophysical study suggested that linear expansion leads to the most plausible results [20]. Alternatively, an observation that has been exploited in inverse tonemapping is that in many scenes, luminance distribution is highly kurtotic, with most pixels covering a relatively small part of the range, while most dynamic range is allocated to light sources and highlights. To simulate this when expanding the dynamic range of images, bright areas of the image can be enhanced more [21, 22].
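A generic expansion function of this kind might look as follows in Python. The display limits (0.05 and 3000 cd/m²) are hypothetical values, the input codes are assumed to be already linear in luminance, and the `gamma` parameter is included only to show where a nonlinear variant would differ from the linear case.

```python
def expand_range(codes, display_min=0.05, display_max=3000.0, gamma=1.0):
    """Map 8-bit codes to display luminance in cd/m^2.

    gamma = 1.0 gives a linear expansion; gamma > 1.0 pushes more of the
    available range toward the bright end. Display limits are hypothetical.
    """
    span = display_max - display_min
    return [display_min + span * (c / 255.0) ** gamma for c in codes]

codes = [0, 64, 128, 192, 255]
linear = expand_range(codes)
boosted = expand_range(codes, gamma=2.2)
```

Both variants map code 0 to the display black level and code 255 to its peak; they differ only in how the intermediate codes are distributed.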

A final issue to be considered in dynamic range expansion is that of under- or overexposed areas in the image. Even in a well-exposed image, light sources and highlights are likely to clip to white, leading to loss of detail in those areas. When expanding such images, it is desirable to reintroduce some of the lost detail in these areas, which can be achieved through hallucination techniques, where overexposed regions are filled in using information from surrounding areas of the image [23].

Color Management in Tone Reproduction

Most of the tone reproduction operators discussed so far operate on a luminance representation of the image. Typically, a tonemapping operator will compress the world luminance L_w, which is computed from the RGB values in the image, to a display luminance L_d. To form the resulting tonemapped image, the RGB values will have to be scaled in such a way that the ratio between the three channels is kept the same before and after the compression [14]. This can be achieved as follows:
$$ \left[\begin{array}{c} R_d \\ G_d \\ B_d \end{array}\right] = \left[\begin{array}{c} L_d \frac{R_w}{L_w} \\ L_d \frac{G_w}{L_w} \\ L_d \frac{B_w}{L_w} \end{array}\right] $$
(2)
This process preserves the ratios between channels but it is not sufficient to maintain the appearance of colors in the image. Saturation depends on both chromatic information and intensity values, and thus, luminance compression through tonemapping can lead to an oversaturated appearance (or undersaturated in the case of dynamic range expansion). To control the saturation of the tonemapped image, an exponent s can be used when reconstructing the three color channels from the adjusted luminance [24] (here shown only for the red channel):
$$ {R}_d={L}_d{\left(\frac{R_w}{L_w}\right)}^s $$
(3)
Although this approach is frequently employed as part of tonemapping operators, it may cause undesirable luminance shifts. Alternatively, the three channels can be recomputed from the adjusted luminance as follows:
$$ R_d = \left(\left(\frac{R_w}{L_w} - 1\right)s + 1\right) L_d $$
(4)
Both solutions offer simple saturation control but rely on manual selection for the parameter s. To automate the parameter selection for global tonemapping operators, a recent series of psychophysical experiments has derived a link between the tonemapping curve and the saturation parameter for both correction formulas [25]. Despite this improvement though, such post hoc corrections cannot guarantee that the color appearance of the image will be preserved.
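The two corrections can be compared numerically with a small Python sketch. Luminance is computed here with Rec. 709 weights, which is an assumption (the text does not specify the RGB space), and Eq. 4 is implemented in its luminance-preserving form, with L_d applied once as a multiplicative factor. The function names and sample values are hypothetical.

```python
def luminance(r, g, b):
    """Relative luminance from linear RGB, assuming Rec. 709 primaries."""
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def recolor_power(rgb_w, l_d, s=1.0):
    """Eqs. 2 and 3: C_d = L_d * (C_w / L_w) ** s. With s = 1 channel ratios are preserved."""
    l_w = luminance(*rgb_w)
    return tuple(l_d * (c / l_w) ** s for c in rgb_w)

def recolor_linear(rgb_w, l_d, s=1.0):
    """Eq. 4, luminance-preserving form: C_d = ((C_w / L_w - 1) * s + 1) * L_d."""
    l_w = luminance(*rgb_w)
    return tuple(((c / l_w - 1.0) * s + 1.0) * l_d for c in rgb_w)

# A saturated reddish pixel, tonemapped to a display luminance of 0.5,
# with the saturation parameter reduced to 0.6.
pixel = (4.0, 1.0, 0.5)
target = 0.5
power_result = recolor_power(pixel, target, s=0.6)
linear_result = recolor_linear(pixel, target, s=0.6)

# Luminance of each result, to compare against the requested value.
power_lum = luminance(*power_result)
linear_lum = luminance(*linear_result)
```

For this pixel the power-law form ends up with a luminance noticeably below the requested 0.5, illustrating the luminance shift mentioned above, whereas the linear form reproduces the requested luminance.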

If additional information is available about both the image itself and the target viewing environment and device, accurate treatment of luminance in conjunction with color is possible using color appearance models (CAMs). Most such models are designed with accurate color reproduction as their primary aim and operate on a limited range of intensities. The increasing availability of HDR devices and content, however, has led to the development of several models capable of handling content of extended dynamic ranges, effectively combining tone reproduction and color appearance modeling. In contrast to the operators discussed so far, CAMs process each channel of the image separately rather than only compress luminance [26]. Typically, such models employ both a forward and an inverse step, taking the scene and viewing parameters into account, although it was recently shown that the inverse step may be bypassed, while still accurately reproducing color appearance [27].

Cross-References

References

  1. Kunkel, T., Reinhard, E.: A reassessment of the simultaneous dynamic range of the human visual system. In: Proceedings of the 7th Symposium on Applied Perception in Graphics and Visualization. ACM, New York (2010)
  2. Debevec, P., Malik, J.: Recovering high dynamic range radiance maps from photographs. In: SIGGRAPH ’97: Proceedings of the 24th Annual Conference on Computer Graphics and Interactive Techniques. ACM, New York (1997)
  3. Reinhard, E., Ward, G., Pattanaik, S., Debevec, P., Heidrich, W., Myszkowski, K.: High Dynamic Range Imaging: Acquisition, Display and Image-Based Lighting. Morgan Kaufmann Publishers, San Francisco (2010)
  4. Tocci, M.D., Kiser, C., Tocci, N., Sen, P.: A versatile HDR video production system. ACM Trans. Graph. 30(4), Article 41 (2011)
  5. Mertens, T., Kautz, J., Van Reeth, F.: Exposure fusion: a simple and practical alternative to high dynamic range photography. Comput. Graph. Forum 28(1), 161–171 (2009)
  6. Myszkowski, K., Mantiuk, R., Krawczyk, G.: High dynamic range video. Synthesis Lect. Comput. Graph. Animation 2(1), 1–158 (2008)
  7. Seetzen, H., Heidrich, W., Stuerzlinger, W., Ward, G., Whitehead, L., Trentacoste, M., Ghosh, A., Vorozcovs, A.: High dynamic range display systems. ACM Trans. Graph. 23(3), 760–768 (2004)
  8. Seetzen, H., Whitehead, L.A., Ward, G.: 54.2: A High Dynamic Range Display Using Low and High Resolution Modulators. SID Symp. Dig. Tech. Pap. 34, 1450–1453 (2003)
  9. Tumblin, J., Rushmeier, H.: Tone reproduction for computer generated images. IEEE Comput. Graph. Appl. 13(6), 42–48 (1993)
  10. Wandell, B.A.: Foundations of Vision. Sinauer Associates, Sunderland (1995)
  11. Naka, K.I., Rushton, W.A.H.: S-potentials from luminosity units in the retina of fish (Cyprinidae). J. Physiol. 185, 587–599 (1966)
  12. Dowling, J.E.: The Retina: An Approachable Part of the Brain. Belknap, Cambridge (1987)
  13. Čadík, M., Wimmer, M., Neumann, L., Artusi, A.: Evaluation of HDR tone mapping methods using essential perceptual attributes. Comput. Graph. 32(3), 330–349 (2008)
  14. Schlick, C.: Quantization techniques for the visualization of high dynamic range pictures. In: Proceedings of the Photorealistic Rendering Techniques. Springer, Berlin/Heidelberg/New York (1994)
  15. Reinhard, E., Stark, M., Shirley, P., Ferwerda, J.: Photographic tone reproduction for digital images. ACM Trans. Graph. 21(3), 267–276 (2002)
  16. Pattanaik, S.N., Ferwerda, J.A., Fairchild, M.D., Greenberg, D.P.: A multiscale model of adaptation and spatial vision for realistic image display. In: Proceedings of the 25th Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH '98), pp. 287–298. ACM, New York (1998). doi:10.1145/280814.280922
  17. McCann, J.J., Rizzi, A.: The Art and Science of HDR Imaging. Wiley, Hoboken (2011)
  18. Mantiuk, R., Daly, S., Kerofsky, L.: Display adaptive tone mapping. ACM Trans. Graph. 27(3), 68 (2008)
  19. Banterle, F., Artusi, A., Debattista, K., Chalmers, A.: Advanced High Dynamic Range Imaging: Theory and Practice. AK Peters/CRC Press, Natick (2011)
  20. Akyüz, A.O., Fleming, R., Riecke, B.E., Reinhard, E., Bülthoff, H.H.: Do HDR displays support LDR content?: a psychophysical evaluation. ACM Trans. Graph. 26(3), Article 38 (2007). doi:10.1145/1276377.1276425
  21. Didyk, P., Mantiuk, R., Hein, M., Seidel, H-P.: Enhancement of bright video features for HDR displays. In: Computer Graphics Forum, vol. 27, no. 4, pp. 1265–1274. Blackwell (2008)
  22. Rempel, A.G., Trentacoste, M., Seetzen, H., Young, H.D., Heidrich, W., Whitehead, L., Ward, G.: Ldr2hdr: on-the-fly reverse tone mapping of legacy video and photographs. ACM Trans. Graph. 26(3), 39 (2007)
  23. Wang, L., Wei, L-Y., Zhou, K., Guo, B., Shum, H-Y.: High dynamic range image hallucination. In: Proceedings of the 18th Eurographics Conference on Rendering Techniques, pp. 321–326. Eurographics Association (2007)
  24. Tumblin, J., Hodgins, J.K., Guenter, B.K.: Two methods for display of high contrast images. ACM Trans. Graph. 18(1), 56–94 (1999)
  25. Mantiuk, R., Mantiuk, R., Tomaszewska, A., Heidrich, W.: Color correction for tone mapping. Comput. Graph. Forum (Proc. Eurographics ’09) 28(2), 193–202 (2009)
  26. Fairchild, M.D.: Color Appearance Models. Addison-Wesley, Reading (2005)
  27. Reinhard, E., Pouli, T., Kunkel, T., Long, B., Ballestad, A., Damberg, G.: Calibrated image appearance reproduction. ACM Trans. Graph. 31(6), Article 201 (2012)

Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  1. Technicolor, Cesson-Sévigné, France