High Dynamic Range Imaging
High-dynamic-range imaging (HDRI or HDR) is a collection of hardware and software technologies that allow the capture, processing, and display of image and video content containing a wide range of intensities between the darkest and brightest areas of an image. Lighting conditions in both natural and man-made environments can range from starlight to artificial illumination to bright sunlight. Traditional imaging techniques typically store information using one byte per pixel for each channel, allowing for 256 distinct steps per channel. Consequently, they can represent only a narrow slice of this range of illumination, often resulting in over- or underexposed regions in the image. In contrast, HDR technologies represent image content using floating-point numbers, which both allows a much larger range of intensities to be encoded and reduces the distance between consecutive steps in image intensity.
Natural environments contain a vast range of illumination that extends from bright sunlight to starlight. In fact, lighting conditions in natural scenes can extend over more than 14 orders of magnitude. Human eyes are not able to perceive this enormous range simultaneously, but mechanisms in the visual system allow adaptation to the prevailing lighting conditions and surroundings. A recent study has evaluated the simultaneous range of the human visual system under extended luminance levels and has found that under specific circumstances, it can reach (and theoretically exceed) 3.7 orders of magnitude.
Traditionally, digital images are stored using one byte (8 bits) per pixel to represent the values for each of the three channels, allowing for 256 distinct levels between the darkest and brightest value within each channel. Consequently, standard imaging techniques can capture and store roughly 2 orders of magnitude between black and white, separated by distinct steps. Any information exceeding the available range appears over- or underexposed, while values between intensity levels are mapped to one of the 256 available levels, introducing steps into smooth gradients.
High-dynamic-range imaging (HDRI) is a collection of tools and algorithms that allow the capture, storage, processing, and display of images with a wider dynamic range than that afforded by traditional imaging techniques. High-dynamic-range (HDR) images use floating-point numbers to represent pixel values, leading to two main differences compared to traditional, 8-bit images. First, a much wider range of values can be encoded, far exceeding the dynamic range occurring in nature. Second, the floating-point representation means that consecutive steps in intensity can be made much smaller, so HDR images exhibit effectively no visible quantization.
These inherent differences between HDR and traditional imagery, in addition to allowing more information to be captured and opening up many possibilities in terms of applications, also mean that all stages of an imaging pipeline need to be modified. From the capture of HDR images and video, to storage solutions, to processing, and finally to display, many existing algorithms make assumptions that do not apply to images and video content with a wider dynamic range.
To capture the full dynamic range of a scene such as the one shown in Fig. 1, a camera would need to be able to capture both bright and darker areas simultaneously. Although this is not currently possible using a single camera sensor, an HDR image can be constructed by capturing a scene at multiple exposure levels, which are combined into a single image.
When the HDR image itself is not required, multiple exposures can be directly combined into an 8-bit image, bypassing the floating-point representation. This process is known as exposure fusion.
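The multi-exposure merge described above can be sketched as a weighted average of per-exposure radiance estimates, in the spirit of Debevec and Malik's method. This is a minimal sketch that assumes the input images are already linearized (camera response recovery is omitted); the function name and the hat-shaped weight are illustrative choices, not a specific published implementation:

```python
import numpy as np

def merge_exposures(images, times):
    """Merge linearized exposures (floats in [0, 1]) into one HDR radiance map.

    images: list of same-shaped float arrays, already linearized
    times:  corresponding exposure times in seconds
    """
    num = np.zeros_like(images[0])
    den = np.zeros_like(images[0])
    for img, t in zip(images, times):
        # Hat-shaped weight: trust well-exposed mid-range pixels,
        # downweight under- and overexposed ones.
        w = img * (1.0 - img)
        num += w * (img / t)   # per-exposure estimate of scene radiance
        den += w
    # Guard against pixels that are clipped in every exposure.
    return num / np.maximum(den, 1e-8)
```

Because each pixel's radiance estimate is dominated by the exposures in which that pixel was well exposed, the merged image recovers both the dark and the bright regions that no single exposure captured.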
A direct consequence of the higher fidelity afforded by HDR is an increased storage requirement. Existing image file encodings are not readily amenable to the storage of HDR images, as they often rely on the well-defined range of values encountered in traditional imagery. As such, several new file formats and extensions to existing formats have been developed for the storage of HDR images. Video formats, although a more recent area of study, have also been developed successfully.
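One widely used HDR encoding is Greg Ward's RGBE format (the basis of Radiance .hdr files), which stores three 8-bit mantissas plus a shared 8-bit exponent per pixel, giving a wide range at only 32 bits per pixel. A sketch of the pixel-level conversion (the file container and header handling are omitted):

```python
import math

def float2rgbe(r, g, b):
    """Encode a floating-point RGB triple as (r, g, b, e) bytes."""
    v = max(r, g, b)
    if v < 1e-32:
        return (0, 0, 0, 0)
    # v = m * 2**e with m in [0.5, 1); spread m over the 8-bit mantissa range.
    m, e = math.frexp(v)
    scale = m * 256.0 / v
    return (int(r * scale), int(g * scale), int(b * scale), e + 128)

def rgbe2float(r, g, b, e):
    """Decode (r, g, b, e) bytes back to floating-point RGB."""
    if e == 0:
        return (0.0, 0.0, 0.0)
    f = math.ldexp(1.0, e - (128 + 8))
    return (r * f, g * f, b * f)
```

The shared exponent exploits the fact that the three channels of a pixel rarely differ by many orders of magnitude, trading a small amount of color precision for a compact representation.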
The final stage within an HDR image pipeline is the display of still and video content, which can be achieved in two different ways. A number of display devices capable of an extended dynamic range have been developed, albeit mostly at a prototype level. The most common solution combines a spatially varying backlight with an LCD front panel, extending the dynamic range of conventional displays toward both the dark and the bright ends of the spectrum. In existing solutions, the backlight is provided either by a DLP projector placed behind the LCD panel or by a matrix of LED light sources replacing the projector.
Although HDR display technologies are becoming more feasible, they are still only capable of displaying a fraction of the dynamic range available in nature. As such, to display images of arbitrary dynamic ranges on hardware with more restricted capabilities (including commercially available monitors), algorithms that can compress the range of the image to that of the display are necessary. Many such techniques exist and are known as tonemapping operators, which will be discussed in the following section.
Different devices are capable of displaying varying dynamic ranges. To ensure that HDR images and content are compatible both with existing, more limited displays and with future ones, solutions allowing the HDR content to be mapped to the capabilities of a particular display are necessary. Since the output of such a mapping should still be useful for visual consumption, the form of this mapping is an important consideration.
In the simplest scenario, one could linearly scale an HDR image to the given display range. Such a solution, however, would be inadequate for many scenes. Luminance values in HDR images are not distributed linearly across the available range. Instead, the distribution of luminance values in HDR imagery is highly kurtotic: light sources and highlights cover only a small number of pixels in the image but are significantly brighter than the majority of the scene, while most pixels represent a relatively small range of intensities. In practice, the consequence of this is that a nonlinear mapping is necessary to ensure that sufficient detail in the scene remains visible.
The aim of tonemapping techniques can then be seen as the compression of the range of an HDR image such that it can be displayed on devices of a lower dynamic range, while preserving the visible detail in the scene. Often, this is also combined with a transition from floating-point values to integer steps in intensity. Compressing the dynamic range of an image inevitably leads to the loss of some information. Consequently, tonemapping operators (TMOs) need to decide which information to keep and which to discard. The human visual system faces a similar challenge, as it is only capable of seeing a considerably smaller range than that available in nature. It should not come as a surprise, then, that dynamic range reduction is often inspired by aspects of human vision. Before discussing the different tonemapping techniques and their respective merits, a better understanding of the human eye and the processes relevant to tone reproduction is necessary.
One of the most important properties of the human visual system with respect to dynamic range reduction is its ability to adapt to a wide range of illumination. This effect is experienced on a daily basis: it is sufficient to walk into a dark room from bright sunlight for the eyes to adjust to see under the new conditions. Effectively, the human visual system adapts to the prevailing lighting conditions so that contrasts in the scene can still be distinguished.
It is not a single process within the human visual system that is responsible for this effect, but rather a combination of factors. One familiar to most is the adjustment of the diameter of the pupil depending on the illumination. Another mechanism occurs in the photoreceptors: their nonlinear response to illumination prevents them from saturating while adapting to the enormous range of illumination in nature. This nonlinearity was first studied by Naka and Rushton and provides a very effective mechanism for the compression of dynamic range. Intensities in the middle of the range elicit approximately logarithmic responses, while as the intensity increases further, the response tails off toward a maximum. Once that maximum is reached, further increases in intensity do not lead to a corresponding increase in the response of the photoreceptors, ensuring that saturation does not occur.
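The Naka–Rushton response is commonly written as a sigmoid of the following form (one common formulation; symbols as typically used in the tone-reproduction literature):

```latex
\frac{R(I)}{R_{\max}} = \frac{I^{n}}{I^{n} + \sigma^{n}}
```

where $R(I)$ is the photoreceptor response to intensity $I$, $R_{\max}$ is the maximum response, $\sigma$ is the semi-saturation constant (the intensity eliciting half the maximum response), and $n$ is a sensitivity exponent, typically close to 1. For $I \ll \sigma$ the response rises steeply, around $I \approx \sigma$ it is approximately logarithmic, and for $I \gg \sigma$ it flattens toward $R_{\max}$, which is exactly the saturation behavior described above.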
Perceptually Motivated Tone Reproduction
To approximate the compression afforded by photoreceptors, several functions have been proposed. At their simplest, linear scaling functions have been shown to be effective if the scaling factor is chosen appropriately for the properties of the image and the intended display. Although such a linear compression scheme can create realistic results in less extreme cases, some information will be clamped for the image to fit within the displayable range. A better approximation of the light adaptation behavior of photoreceptors can be achieved using a logarithmic mapping. Further, to better adapt this mapping to the image content, the base of the logarithm can be modified on a per-pixel basis in order to appropriately adjust the compression. Perceptual evaluations of many of these methods are available in the literature. More akin to photoreceptor behavior are sigmoidal models, which have been used extensively in tone reproduction [9, 14, 15].
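A global sigmoidal operator of this kind can be sketched in a few lines. The following follows the overall shape of Reinhard et al.'s global photographic operator, although the specific `key` default and the small stabilizing constant are illustrative choices:

```python
import numpy as np

def tonemap_sigmoid(lum, key=0.18):
    """Compress HDR luminance to [0, 1) with a global sigmoid.

    lum: array of positive luminance values
    key: target for the log-average luminance ("middle gray")
    """
    # Log-average luminance characterizes the overall brightness of the scene.
    log_avg = np.exp(np.mean(np.log(lum + 1e-6)))
    # Scale so the log-average maps near the chosen key value.
    scaled = key * lum / log_avg
    # Sigmoid: approximately linear for small values, saturating for large ones.
    return scaled / (1.0 + scaled)
```

Like the photoreceptor response, this mapping is monotonic and bounded: mid-range intensities retain most of their contrast, while arbitrarily bright highlights are compressed asymptotically toward white instead of clipping.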
More complete models of photoreceptor physiology and light adaptation have been repeatedly used in the tone reproduction literature, offering perceptually accurate reproductions but at the cost of increased computational complexity. An early and fairly complete model that incorporates luminance, color, and pattern representations to create a perceptually accurate compressed image employs a multi-scale processing scheme. Another notable example that lends itself to HDR compression is the retinex model, which considers aspects of both the retina and the brain in order to model complex perceptual phenomena such as color constancy.
Multi-scale Tone Reproduction
Features in images occur at different scales, and many tone reproduction techniques employ mechanisms that treat images at multiple scales, improving the appearance of the final result as a different amount of compression can be applied to different features in the image. This ensures that high-contrast features are sufficiently compressed, while local edges (e.g., texture) are not proportionally scaled as that would reduce their visibility. To decompose the image into different scales, a form of edge-preserving filtering is typically used, such as the bilateral filter.
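This base/detail decomposition can be sketched in the style of Durand and Dorsey's bilateral-filter tonemapping: log-luminance is split into a bilateral-filtered base layer, which is compressed, and a detail layer, which is preserved. The brute-force filter below is illustrative only (the parameter values and the `compression` factor are assumptions, and the nested loops are far too slow for real images):

```python
import numpy as np

def bilateral(img, sigma_s=2.0, sigma_r=0.4, radius=4):
    """Brute-force bilateral filter on a small 2D array."""
    h, w = img.shape
    out = np.zeros_like(img)
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    spatial = np.exp(-(xs**2 + ys**2) / (2 * sigma_s**2))
    pad = np.pad(img, radius, mode='edge')
    for y in range(h):
        for x in range(w):
            win = pad[y:y + 2 * radius + 1, x:x + 2 * radius + 1]
            # Range weight: nearby pixels count only if similar in value,
            # so strong edges are preserved rather than blurred.
            rng = np.exp(-(win - img[y, x])**2 / (2 * sigma_r**2))
            wgt = spatial * rng
            out[y, x] = np.sum(wgt * win) / np.sum(wgt)
    return out

def bilateral_tonemap(lum, compression=0.5):
    """Compress the base layer of log-luminance, keep the detail layer."""
    log_l = np.log10(lum + 1e-6)
    base = bilateral(log_l)     # large-scale luminance variations
    detail = log_l - base       # texture and local edges
    return 10 ** (compression * base + detail)
```

Because only the base layer is scaled, large luminance differences between regions are compressed while local texture contrast survives at full strength.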
Display and Viewing-Adaptive Tonemapping
Most tone reproduction algorithms discussed so far aim to compress the range of a given image to a displayable range, typically between 0 and 255, but with no consideration of any further properties of the device or the environment where the image will be viewed. Although many detailed models exist that take into account aspects of the environment where the image was captured as well as where and how it will be displayed, such models tend to require additional input that is usually not available in a typical tonemapping setting.
In addition to the parameters of the viewing environment, different display devices offer varying capabilities that need to be considered. Mantiuk et al. recently formulated tonemapping as an optimization problem that strives to reduce the dynamic range of an image while minimizing the visible distortions that will inevitably be introduced in the process. To detect visible distortions, a model of the human visual system is used, and to ensure that the resulting image is suitable for the target device, a display model is incorporated. Such a scheme allows for images to be tonemapped even for unconventional devices and viewing conditions.
Dynamic Range Expansion
Tonemapping techniques discussed so far compress the dynamic range of images to prepare them for displays with more limited capabilities. With the advent of HDR displays, the inverse problem arises. To fully take advantage of the extended dynamic range available on such a display, it may be desirable to expand the intensity range of traditional image and video content, a process known as dynamic range expansion or inverse tonemapping. This can be achieved by applying an expansion function to the intensity values of the image to map them to the range of the display. Both nonlinear (such as the inverse of Eqs. 2 and 4) and linear schemes have been proposed, but a recent psychophysical study suggested that linear expansion leads to the most plausible results. Alternatively, an observation that has been exploited in inverse tonemapping is that in many scenes, the luminance distribution is highly kurtotic, with most pixels covering a relatively small part of the range, while most of the dynamic range is allocated to light sources and highlights. To simulate this when expanding the dynamic range of images, bright areas of the image can be enhanced more [21, 22].
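The two families of expansion functions mentioned above can be sketched as follows. Both functions are illustrative rather than specific published operators: `l_max` stands for the peak luminance of a hypothetical HDR display, and the particular exponent used in the nonlinear variant is an assumption:

```python
import numpy as np

def expand_linear(ldr, l_max=3000.0):
    """Linearly map normalized LDR luminance [0, 1] to the display peak."""
    return ldr * l_max

def expand_bright_boost(ldr, l_max=3000.0, exponent=2.2):
    """Nonlinear expansion: allocate proportionally more of the output
    range to bright pixels (highlights), mimicking the kurtotic
    luminance distribution of natural scenes."""
    return (ldr ** exponent) * l_max
```

Under the power-function variant, mid-range pixels land well below their linearly expanded values while the brightest pixels still reach the display peak, concentrating the expansion in highlights.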
A final issue to be considered in dynamic range expansion is that of under- or overexposed areas in the image. Even in a well-exposed image, light sources and highlights are likely to clip to white, leading to loss of detail in those areas. When expanding such images, it is desirable to reintroduce some of the lost detail in these areas, which can be achieved through hallucination techniques, where overexposed regions are filled in using information from surrounding areas of the image.
Color Management in Tone Reproduction
If additional information is available about both the image itself and the target viewing environment and device, accurate treatment of luminance in conjunction with color is possible using color appearance models (CAMs). Most such models are designed with accurate color reproduction as their primary aim and operate on a limited range of intensities. The increasing availability of HDR devices and content, however, has led to the development of several models capable of handling content of extended dynamic range, effectively combining tone reproduction and color appearance modeling. In contrast to the operators discussed so far, CAMs process each channel of the image separately rather than compressing only the luminance. Typically, such models employ both a forward and an inverse step, taking the scene and viewing parameters into account, although it was recently shown that the inverse step may be bypassed while still accurately reproducing color appearance.
References
- 1. Kunkel, T., Reinhard, E.: A reassessment of the simultaneous dynamic range of the human visual system. In: Proceedings of the 7th Symposium on Applied Perception in Graphics and Visualization. ACM, New York (2010)
- 2. Debevec, P., Malik, J.: Recovering high dynamic range radiance maps from photographs. In: SIGGRAPH '97: Proceedings of the 24th Annual Conference on Computer Graphics and Interactive Techniques. ACM, New York (1997)
- 3. Reinhard, E., Ward, G., Pattanaik, S., Debevec, P., Heidrich, W., Myszkowski, K.: High Dynamic Range Imaging: Acquisition, Display and Image-Based Lighting. Morgan Kaufmann Publishers, San Francisco (2010)
- 4. Tocci, M.D., Kiser, C., Tocci, N., Sen, P.: A versatile HDR video production system. ACM Trans. Graph. 30(4), Article 41 (2011)
- 10. Wandell, B.A.: Foundations of Vision. Sinauer Associates, Sunderland (1995)
- 12. Dowling, J.E.: The Retina: An Approachable Part of the Brain. Belknap, Cambridge (1987)
- 14. Schlick, C.: Quantization techniques for the visualization of high dynamic range pictures. In: Photorealistic Rendering Techniques. Springer, Berlin/Heidelberg/New York (1994)
- 16. Pattanaik, S.N., Ferwerda, J.A., Fairchild, M.D., Greenberg, D.P.: A multiscale model of adaptation and spatial vision for realistic image display. In: Proceedings of the 25th Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH '98), pp. 287–298. ACM, New York (1998). doi:10.1145/280814.280922
- 20. Oğuz Akyüz, A., Fleming, R., Riecke, B.E., Reinhard, E., Bülthoff, H.H.: Do HDR displays support LDR content?: a psychophysical evaluation. ACM Trans. Graph. 26(3), Article 38 (2007). doi:10.1145/1276377.1276425
- 21. Didyk, P., Mantiuk, R., Hein, M., Seidel, H.-P.: Enhancement of bright video features for HDR displays. In: Computer Graphics Forum, vol. 27, no. 4, pp. 1265–1274. Blackwell (2008)
- 23. Wang, L., Wei, L.-Y., Zhou, K., Guo, B., Shum, H.-Y.: High dynamic range image hallucination. In: Proceedings of the 18th Eurographics Conference on Rendering Techniques, pp. 321–326. Eurographics Association (2007)
- 26. Fairchild, M.D.: Color Appearance Models. Addison-Wesley, Reading (2005)