1 Introduction

Core computer vision tasks, such as navigation and obstacle detection, are complicated by the presence of shadows, which can be misinterpreted as objects or obstacles. Several methods have been proposed to address shadow detection and removal in various application domains, including satellite imaging, traffic monitoring, and obstacle detection in navigation systems [4, 28, 52, 57, 59]. Figure 1 shows example images illustrating objects in shadowed environments.

Fig. 1 Example images illustrating objects in shadowed environments (a-d) from various publicly available shadow datasets considered in this study [4, 16, 17]

The shadow problem is commonly treated using the Shadow Model Theory (SMT) proposed by Barrow et al. [6], which enables the calculation of the intensity of light reflected at a given point on a surface. It is a physics-based model that decomposes illumination into two components: direct and ambient illumination. According to Guo et al. [23], the illumination model for an RGB image is given by:

$${I}_i=\left({t}_i\cos {\theta}_i\ {L}_d+{L}_e\right){R}_i$$
(1)

where Ii is the color intensity of pixel i in the R, G and B color channels, Ri is the surface reflectance at pixel i, Ld and Le are the light intensities associated with the direct light source and the ambient sources respectively, θi is the angle between the direct lighting direction and the surface normal, and ti ∈ [0, 1] is a variable denoting the attenuation of the direct light reaching pixel i (ti = 1 in fully lit areas and ti = 0 in the umbra).
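To make the role of Eq. (1) concrete, the following short derivation shows how a relighting coefficient arises from the model; this is a standard consequence of SMT, as exploited e.g. in [23], rather than a result specific to this work. A fully lit observation of the same surface point corresponds to ti = 1, so

$${I}_i^{lit}=\left(\cos {\theta}_i\ {L}_d+{L}_e\right){R}_i\kern1em \Rightarrow \kern1em \frac{I_i^{lit}}{I_i}=\frac{\cos {\theta}_i\ {L}_d+{L}_e}{t_i\cos {\theta}_i\ {L}_d+{L}_e}$$

Multiplying the intensity of a shadowed pixel by this ratio cancels the unknown reflectance Ri, which is why SMT-based methods reduce shadow removal to estimating an illumination coefficient per pixel or per region.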

Early shadow removal methods, such as [7, 17,18,19, 36, 41], were also based on models relying on a number of assumptions, and were usually evaluated on datasets with a small number of images, focusing mainly on qualitative aspects. Later, there was a decisive shift towards supervised methods, such as deep neural networks, which brought new trends, but also the requirement for large training sets. New benchmark datasets were created to meet this requirement, and evaluations began to also encompass quantitative aspects [33, 46, 50, 53]. A drawback is that the shadows in these datasets must be manually annotated, which is time-consuming and usually costly. Furthermore, deep neural networks are usually computationally expensive and demanding in terms of computational resources.

The state-of-the-art shadow detection and removal approaches can be divided into two categories: a) unsupervised methods, and b) supervised methods. Most methods of both categories build on the SMT and use a coefficient to indicate the intensity in the shadow areas based on Eq. (1). This coefficient describes the brightness reduction in relation to the shadow-free areas of the image [15, 35, 47]. The goal is to find the right parameters that, when multiplied by the intensity of each pixel in the shadow region, recover the initial illumination of the shadowed areas. Unsupervised methods are usually based on intrinsic image features, such as color and texture, and on strategies that enable the recovery of detail and luminosity in shadow regions [5, 12, 16,17,18,19, 22, 25, 29, 31, 36, 41, 60, 63]. Supervised methods are mainly based on complex deep learning architectures, and they usually provide results of higher quality than the unsupervised ones [1, 9, 14, 26, 27, 33, 34, 37, 42, 53, 56, 64].

This work aims to address the need for both efficient and effective shadow removal in computer vision workflows, by the following contributions:

  • A novel, fully unsupervised shadow removal method, named Simple Unsupervised Shadow Removal (SUShe), which is based solely on color image features. Unlike previous methods, it is very simple, both in terms of implementation and computational complexity, while obtaining results that are comparable to state-of-the-art deep learning-based methods.

  • A unique shadow segmentation approach efficiently combining a physics-inspired optimization algorithm, superpixel segmentation, and histogram matching, in the context of a lightweight pre-processing function.

  • An extensive experimental study on various publicly available benchmark datasets, highlighting the trade-off between efficiency and effectiveness that the method offers.

The remainder of this paper is organized into six sections. Section 2 provides an overview of related work, and Section 3 details the proposed methodology. Section 4 provides information on the experimental setup and the evaluation framework. Section 5 presents results of the proposed method in comparison with the most relevant state-of-the-art methods. Section 6 provides a perspective in terms of computational complexity. Section 7 provides a discussion of the experimental results, and the main conclusions of this work as well.

2 Related work

Several shadow detection and removal methods have been based on SMT. This theory is the foundation of most relighting methods published in the last decade. Still, it is incomplete, in the sense that it fails to accurately model umbra regions and thus to enable correct relighting in the proximity of shadow borders. Apart from those based on the SMT, several other works have been proposed for shadow removal. These include model-based unsupervised methods, such as the method proposed in [19], where each RGB image is projected onto a 1D invariant direction, in order to recover hues by means of a 2D chromaticity feature space. In [47], a pyramid-based process is employed for shadow removal with user assistance. In [36], the main aim was shadow removal in a way that is robust against texture variations. Later methods, such as [60], were based on classical machine learning techniques that use engineered features, such as texture elements, and aim to match regions under different lighting conditions [22, 23]. In [12], clustering-based shadow detection relied on color and texture features, whereas in [41] a shadow removal method was proposed for images with uniform background, in which shadow and lit regions were separated by ignoring the low-frequency image details. Also, an unsupervised shadow removal method using differential operations within a recent osmosis model was proposed in [7]. However, most of the aforementioned methods have been tested only on subsets of benchmark datasets, because they are usually applicable only to images with specific types of textures and features.

Recently, the focus of research on shadow detection and removal has turned to supervised deep learning-based architectures, such as Convolutional Neural Networks (CNNs) and Generative Adversarial Networks (GANs). In [33], a shadow-free image is generated using two deep networks, SP-Net and M-Net. In [34], a GAN-based framework was trained with patches extracted from shadow images, following the physics-inspired model of [33]. In [1], Channel Attention GAN detects and removes shadows using two networks, which consider physical shadow properties and equipment parameters. Another network, called G2R-ShadowNet, consists of three subnetworks and requires a small number of images for training [38]. The Stacked Conditional Generative Adversarial Network (ST-CGAN) [53] combines two stacked conditional GANs, which provide generators for the shadow detection mask and for the shadow-free image. In [20], shadow removal was treated as an image fusion problem via FusionNet, a network that generates weight maps facilitating the fusion process. Feature fusion was also employed in [9], integrated with multiple dictionary learning. In recent years, several studies have also aimed to increase the effectiveness of shadow removal on the benchmark shadow datasets. In [26], Hu et al. proposed a shadow removal architecture aiming to learn direction-aware and spatial characteristics of the images at various levels, using a CNN. Additionally, Hu et al. proposed a weighted cross-entropy loss to train a neural network for shadow detection; that method addressed color and luminosity inconsistencies in the training pairs for shadow removal by applying a color transfer function. In [62], a framework named RIS-GAN was proposed to investigate residual images and illumination estimation with GANs for shadow removal. Indirect shadow-removal images were created by estimating negative residual images and inverse illumination maps, which were combined with the coarse shadow-removal image to refine the final shadow-free result. In [11], the shadow removal problem was approached in two ways. Firstly, a dual hierarchical aggregation network was proposed to carefully learn the border artifacts in a shadowed image; without any down-sampling, a foundation of dilated convolutions was considered for attention estimation, using multi-context information. Secondly, taking into account that training on a small dataset limits the network's ability to recognize textural differences, resulting in color inconsistencies in the shadowed region, the authors developed a dataset synthesis method based on shadow matting. In [10], a two-stage context-aware network, called CANet, was proposed for shadow removal; the shadow regions receive contextual information from the corresponding non-shadowed regions, and an encoder-decoder is then used to enhance the results. Mask-ShadowNet was proposed in [24], where a masked adaptive instance normalization method along with embedded aligners was applied to remove shadows, considering illumination uniformity and the different feature statistics in the shadow and non-shadow areas. In [65], a Bidirectional Mapping Network was presented, combining the learning processes of shadow removal and shadow generation into a unified parameter-sharing framework.
In [51], a style-guided shadow removal network was proposed to address the issue of visually disharmonious images after shadow removal, and to ensure better image style coherence. The training of all these deep learning-based methods is associated with a) a high computational cost, b) non-trivial hardware specifications, and c) a requirement for a large number of annotated images.

Shadow removal is useful in a variety of computer vision applications, such as the detection of moving objects and pedestrians in indoor or outdoor environments [29, 31, 63], navigation-aid systems [44], and the recognition of regions of interest in remote sensing images [26, 27]. In [49], an automatic shadow mask estimation approach was introduced, aiming to replace manual labeling in a supervised context, using known solar angles and 3D point clouds. Shadow removal can be an essential component of remote sensing object detection algorithms, helping to cope with several challenges, such as complex backgrounds and variations of scale and density [39, 55]. Wang et al. [54] proposed an automatic cloud shadow screening mechanism, which was utilized for PlanetScope, a constellation of over 130 small satellites that can regularly image the entire surface of the Earth. In an unsupervised context, a statistical method based on decision trees [3] was proposed for aerial imaging.

Unlike current shadow removal methods, whether unsupervised or supervised, this work provides a very simple methodology for automatic shadow removal, based on a novel combination of superpixel segmentation with a strategy for matching shadow and lit regions.

3 Proposed methodology

The proposed methodology is based on a simple, yet very effective strategy. Initially, the shadow mask is extracted using an evolutionary physics-inspired algorithm. Next, shadowed and non-shadowed image regions that are coherent in terms of texture and color are identified, and shadow/non-shadow pairs of neighboring superpixels adjacent to shadow borders are determined. The shadowed part of each pair is relighted by means of histogram matching.

3.1 Shadow detection

Shadow detection refers to the segmentation of a natural image, in either indoor or outdoor settings, in order to extract the shadowed region. Algorithm 1 summarizes the shadow detection stage in pseudocode, and Fig. 2 presents a visual summary of this algorithm; a simplified sketch of the pipeline is also given after Algorithm 1. The Electromagnetism-like Optimization (EMO) algorithm [32, 45, 48] is employed for multilevel segmentation, aiming to cope with the issue of computational complexity. Initially, the color space used to represent the input image is converted from RGB to HSV (line 2). The Hue (H) component of the input image is segmented using the EMO-based method described in [8], considering that the H component is invariant to changes in lighting. A set of k images hi, i = 1, 2, …, k, is the output of this operation, representing roughly hue-homogeneous regions (line 3). As a next step, the Value (V) component of the HSV image is multiplied by each image hi, i = 1, 2, …, k, resulting in a series of new images vi = hi · V, i = 1, 2, …, k, which represent regions with weighted intensities (line 5). This weighting is performed because the generated image regions have lower intensities in shadowed areas; thus, the subsequent bilevel thresholding step is facilitated. Bilevel thresholding is performed using EMO on each vi, i = 1, 2, …, k, with only one threshold to be optimized. The result of this operation is a set of k binary images, bi, i = 1, 2, …, k (line 6). In these images, the pixels corresponding to lower intensities (potential shadowed regions) are set to white, and the remaining pixels are set to black. As a final step, the binary masks bi, i = 1, 2, …, k, obtained from each vi, are aggregated to create a mask B representing the shadowed regions of the input image (line 7).

Fig. 2 Outline of the shadow detection algorithm

Algorithm 1 Shadow detection pseudocode
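The following is a minimal sketch of this detection pipeline, assuming an RGB input in [0, 1]. Since EMO has no standard library implementation, multi-Otsu and Otsu thresholding stand in for the EMO-based multilevel and bilevel steps; all function and variable names are illustrative, not taken from the original code.

```python
import numpy as np
from skimage.color import rgb2hsv
from skimage.filters import threshold_multiotsu, threshold_otsu

def detect_shadow_mask(rgb, k=4):
    hsv = rgb2hsv(rgb)                          # line 2: RGB -> HSV
    H, V = hsv[..., 0], hsv[..., 2]

    # Line 3: multilevel segmentation of the Hue channel into k regions
    # (multi-Otsu as a stand-in for the EMO-based method of [8]).
    thresholds = threshold_multiotsu(H, classes=k)
    labels = np.digitize(H, bins=thresholds)    # region index per pixel

    B = np.zeros(V.shape, dtype=bool)
    for i in range(k):
        region = labels == i                    # i-th hue-homogeneous region
        if not region.any():
            continue
        # Lines 5-6: restricting V to the region plays the role of the
        # weighting vi = hi * V; Otsu stands in for single-threshold EMO.
        t = threshold_otsu(V[region])
        b_i = region & (V < t)                  # darker pixels: shadow candidates
        B |= b_i                                # line 7: aggregate the masks
    return B
```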

3.2 Superpixel matching strategy

Following the application of the shadow detection algorithm, SUShe performs unsupervised shadow removal on image regions with approximately uniform color features. These regions, known as superpixels, are obtained using the SLIC superpixel segmentation algorithm. A superpixel matching strategy is then applied to identify superpixels in the shadow areas that are similar to superpixels in the non-shadow areas. Relighting is performed by transforming the histogram of each shadow superpixel, so that it matches the histogram of the respective non-shadow superpixel.

SLIC Superpixel segmentation

The Simple Linear Iterative Clustering (SLIC) superpixel segmentation algorithm performs local clustering of pixels, considering both color and spatial information, by means of a metric proposed in [2]. The algorithm takes as input an image and the number of superpixels K into which the input image should be divided. The input RGB image is divided into K roughly equal grid cells in the xy plane, with grid interval S. Each pixel i has spatial coordinates (xi, yi) and color coordinates (Li, ai, bi) in the CIE-Lab color space. Each grid cell is assigned a superpixel center Ck = [Lk, ak, bk, xk, yk], so that the superpixels are similar in size. For each pixel (xi, yi, Li, ai, bi), the distances to the superpixel center Ck given by Eqs. (2) and (3) are calculated, in order to define the metric provided in Eq. (4):

$${d}_{Lab}=\sqrt{{\left({L}_k-{L}_i\right)}^2+{\left({a}_k-{a}_i\right)}^2+{\left({b}_k-{b}_i\right)}^2}$$
(2)
$${d}_{xy}=\sqrt{{\left({x}_k-{x}_i\right)}^2+{\left({y}_k-{y}_i\right)}^2}$$
(3)
$${D}_s=\sqrt{d_{Lab}^2+\frac{m^2}{S^2}{d}_{xy}^2}$$
(4)

where Ds is the final metric, which combines the Euclidean color distance dLab (in CIE-Lab) and the Euclidean spatial distance dxy, normalized by the grid interval S; m is a parameter of the SLIC algorithm that controls the compactness of the superpixels, set to its default value of 10 according to [2]. Each cluster center Ck is assigned the best-matching pixels from the surrounding 2S × 2S area, according to the distance metric Ds. This process is iterated until convergence.
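As a quick illustration, the snippet below computes the distance of Eqs. (2)-(4) for a single pixel/center pair, and notes the equivalent off-the-shelf call; this is a hedged sketch with illustrative names, not the paper's implementation.

```python
import numpy as np
from skimage.segmentation import slic

def slic_distance(pixel, center, S, m=10.0):
    """pixel, center: arrays [L, a, b, x, y]; S: grid interval."""
    d_lab = np.linalg.norm(pixel[:3] - center[:3])       # Eq. (2)
    d_xy = np.linalg.norm(pixel[3:] - center[3:])        # Eq. (3)
    return np.sqrt(d_lab**2 + (m**2 / S**2) * d_xy**2)   # Eq. (4)

# In practice, the full iterative clustering is available directly, e.g.:
# labels = slic(rgb_image, n_segments=K, compactness=10)
```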

SUShe: Simple unsupervised shadow removal

The proposed methodology combines two very simple techniques for region segmentation and for relighting the shadowed areas. The SLIC superpixel algorithm is used to segment the input image into many small regions that are approximately uniform with respect to color. Algorithm 2 summarizes the shadow removal stage in pseudocode, and Fig. 3 presents a visual summary.

  • Initially, the binary shadow mask B (obtained in Subsection 3.1) is used to split the input image I (line 1) into a shadowed region IS and a lit region IL (line 2). These regions are obtained as IS = B · I and IL = (1 − B) · I.

  • Next, SLIC superpixel segmentation is applied to IS and IL separately (line 3): IS and IL are partitioned into the shadowed superpixels (ISLIC(S)) and the lit superpixels (ISLIC(L)), with respect to the color and spatial features of each region.

  • The spatial gravity centers \(G{C}_{shado{w}_i}, i=1,2,\dots, K\), of the shadow superpixels \(I_{SLIC(S)_i}\) and the corresponding gravity centers \(G{C}_{li{t}_j}, j=1,2,\dots, K\), of the lit superpixels are calculated (line 4), where K is the number of superpixels (line 1).

  • In each channel of the RGB color space (line 6), for each gravity center \(G{C}_{shado{w}_i}\) of \(I_{SLIC(S)_i}\), the Euclidean spatial distance \(d\equiv d\left({GC}_{shado{w}_i},{GC}_{li{t}_j}\right)\) to each lit gravity center \(G{C}_{li{t}_j}\) of superpixel \(I_{SLIC(L)_j}\) is calculated (line 8), in order to find the minimum one (line 9). The minimum spatial distance identifies the lit superpixel that will relight the shadowed superpixel located at \({GC}_{shado{w}_i}\).

  • The lit superpixel pairi that has the minimum distance d from the shadowed one, \(I_{SLIC(S)_i}\), is considered the optimal counterpart of the respective shadow superpixel.

  • Next, the histograms and the corresponding cumulative distribution functions (cdfs) of the shadow superpixel \(I_{SLIC(S)_i}\) and of its pairi are calculated (line 10).

  • Histogram matching [40] is performed on \(I_{SLIC(S)_i}\) to transform the shadow histogram, so that it matches the corresponding lit cdf of pairi. The shadow superpixels are thus relighted, using the color values of the lit counterpart pairi (line 11).

  • These steps are performed iteratively for all shadow superpixels \(I_{SLIC(S)_i}, i=1,\dots, K\).

  • Finally, all the relighted shadow superpixels are merged to form the relighted region IR (line 12). After completing these iterations, the relighted region IR is merged with the initial lit region IL (line 13).

Fig. 3 Illustration of the proposed SUShe shadow removal framework

The entire process is repeated three times, for the R, G, and B channels, and the results are concatenated (line 14) to produce the final shadow-free image Inon-shadow (line 15). A simplified end-to-end sketch follows Algorithm 2.

Algorithm 2 Simple Unsupervised Shadow Removal (SUShe) pseudocode
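The sketch below outlines the whole removal stage under simplifying assumptions: the shadow mask B comes from the detection stage, masked SLIC approximates the separate segmentation of IS and IL, and skimage's match_histograms stands in for the per-channel cdf matching of [40]. All names are illustrative, not the authors' code.

```python
import numpy as np
from skimage.segmentation import slic
from skimage.exposure import match_histograms

def sushe_removal(image, B, K=70):
    """image: (H, W, 3) RGB array; B: (H, W) boolean shadow mask."""
    out = image.astype(float).copy()

    # Lines 2-3: superpixels computed separately on the shadowed and lit
    # regions (label 0 marks pixels outside each mask).
    labels_s = slic(image, n_segments=K, compactness=10, mask=B)
    labels_l = slic(image, n_segments=K, compactness=10, mask=~B)

    # Line 4: spatial gravity centers of the lit superpixels.
    ys, xs = np.indices(B.shape)
    lit_ids = [l for l in np.unique(labels_l) if l != 0]
    if not lit_ids:
        return image.copy()             # no lit region to match against
    lit_gc = {l: np.array([ys[labels_l == l].mean(), xs[labels_l == l].mean()])
              for l in lit_ids}

    for s in np.unique(labels_s):
        if s == 0:
            continue
        sp = labels_s == s
        gc = np.array([ys[sp].mean(), xs[sp].mean()])
        # Lines 8-9: the nearest lit superpixel is the optimal counterpart.
        pair = min(lit_ids, key=lambda l: np.linalg.norm(gc - lit_gc[l]))
        ref = labels_l == pair
        # Lines 6, 10-11: per-channel histogram matching relights the superpixel.
        for c in range(3):
            out[sp, c] = match_histograms(image[sp, c].astype(float),
                                          image[ref, c].astype(float))
    # Lines 12-15: relighted superpixels were written in place, so `out`
    # already merges the relighted region IR with the lit region IL.
    return out.astype(image.dtype)
```

With K = 70, the sketch mirrors the configuration that performs best overall in the experiments of Section 5.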

4 Evaluation

4.1 Experimental setup and datasets

The proposed methodology and all the experiments have been implemented in MATLAB R2019a, on an AMD Ryzen 7 5800H at 3.2 GHz, with 16 GB RAM. The experimental evaluation has been based on three benchmark datasets, namely the Image Shadow Triplets Dataset (ISTD), the Adjusted Image Shadow Triplets Dataset (AISTD), and the Shadow Removal Dataset (SRD). ISTD is the most challenging shadow dataset employed in state-of-the-art works. It consists of 2410 triplets, each comprising the initial RGB image with shadows, the shadow mask, and the ground truth RGB shadow-free image. ISTD is divided into two subsets: the first is composed of 1870 images for training, and the second contains 540 images for testing. Each image has a size of 480 × 640 pixels. Another well-recognized dataset is AISTD, an improved version of ISTD described in [33], with 1870 training and 540 testing images. Experiments were also performed using the SRD dataset, proposed in [46], which consists of 3088 images in total, of which 2680 are used for training and 408 for testing. SRD includes images of various scenes, illumination conditions and object types, in order to enable the investigation of various shadow and reflectance phenomena.

4.2 Evaluation metrics

The results of the proposed method have been evaluated both quantitatively and qualitatively. The quantitative evaluation was based on the Root Mean-Squared Error (RMSE) and the Peak Signal-to-Noise Ratio (PSNR). The RMSE between two given images has been calculated for the shadowed area, the non-shadow area, and all areas, using the evaluation code proposed in [21], which has also been used in major state-of-the-art works, such as [33, 34, 53]. In that code, the RMSE is computed as:

$$RMSE=\sqrt{\frac{1}{n}\sum_{i=1}^n{\left(G{T}_i- Outpu{t}_i\right)}^2}$$
(5)

where GT is the ground truth image, Output is the predicted shadow-free image, i = 1, …, n is the index of each pixel in the area of interest (i.e., shadow, non-shadow, or all areas), and n is the total number of pixels in that area.

The Mean-Squared Error (MSE) between the output of the shadow removal and the ground truth without shadows is calculated by:

$$MSE=\frac{1}{n}\sum_{i=1}^n{\left(G{T}_i- Outpu{t}_i\right)}^2$$
(6)

PSNR has been calculated from the MSE of Eq. (6) as:

$$PSNR=10{\log}_{10}\left(\frac{M^2}{MSE}\right)$$
(7)

where M is the maximum pixel value in the area of interest. PSNR is measured in decibels (dB). A higher PSNR value indicates higher output image quality (↑). The RMSE decreases as the output image becomes more similar to the ground truth; therefore, a lower RMSE indicates improved output quality (↓).

Furthermore, we have also assessed our experiments using the Learned Perceptual Image Patch Similarity (LPIPS) metric, which closely matches human perception [61]. LPIPS computes the distance between the activations of two image patches in a predefined network; a low LPIPS score indicates high perceived similarity between the patches (↓).
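For reference, a minimal NumPy version of the metrics of Eqs. (5)-(7) is given below; this is a hedged sketch, not the original evaluation code of [21]. The mask argument selects the area of interest (shadow, non-shadow, or all areas).

```python
import numpy as np

def rmse(gt, output, mask):
    diff = gt[mask].astype(float) - output[mask].astype(float)
    return np.sqrt(np.mean(diff ** 2))              # Eq. (5)

def psnr(gt, output, mask):
    diff = gt[mask].astype(float) - output[mask].astype(float)
    mse = np.mean(diff ** 2)                        # Eq. (6)
    M = gt[mask].max()                              # max value in the area of interest
    return 10 * np.log10(M ** 2 / mse)              # Eq. (7)

# LPIPS is available through the `lpips` package (inputs as torch tensors
# scaled to [-1, 1]), e.g. lpips.LPIPS(net='alex')(img0, img1).
```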

5 Results

In this section, quantitative and qualitative results of the proposed methodology are presented. Different values of K were tested to find the most appropriate superpixel segmentation level; specifically, K = 70, 80, 90, 100, 400 and 700 superpixels. Tables 1, 2 and 3 summarize the results obtained by SUShe on the ISTD, AISTD and SRD datasets, respectively (the best results are indicated in bold). In Table 1 it can be noted that by setting K = 90 (PSNR = 24.82, RMSE = 8.14, LPIPS = 0.079), SUShe achieves the best shadow removal results in ISTD. The second best results were obtained for K = 70 (PSNR = 24.84, RMSE = 8.15, LPIPS = 0.078). For K = 100, K = 400 and K = 700 (mean values approximately equal to PSNR = 24.73, RMSE = 8.17 and LPIPS = 0.084) the results are comparable. For K = 80 the lowest performance is obtained. In the case of AISTD (Table 2), the best results of SUShe are also obtained for the lowest values of K in the range tested, i.e., K = 70 (PSNR = 30.09, RMSE = 4.12, LPIPS = 0.076) and K = 90 (PSNR = 30.06, RMSE = 4.11, LPIPS = 0.076). Again, for larger values of K, i.e., K = 100, K = 400 and K = 700, the results are comparable to each other (mean values approximately equal to PSNR = 29.60, RMSE = 4.24, LPIPS = 0.083). Overall, K = 70 yields the best results across the datasets used for evaluation. In the case of SRD, the optimal results, obtained for K = 70, lead to the lowest RMSE score (PSNR = 22.05, RMSE = 8.68, LPIPS = 0.167). The values K = 80 and 90 lead to slightly inferior accuracy. It can be observed that higher values of K are not linked with higher quality results.

Table 1 Quantitative results of the proposed methodology for different superpixel values in ISTD
Table 2 Quantitative results of the proposed methodology for different superpixel values in AISTD
Table 3 Quantitative results of the proposed methodology for different superpixel values in SRD

Figure 4 indicates that the proposed methodology is relatively insensitive to K; especially in terms of LPIPS, the results are comparable across different values of K. Still, performance is slightly better for the lowest values of K in the range tested.

Fig. 4 LPIPS sensitivity for different K values (a) in ISTD, (b) in AISTD and (c) in SRD

Tables 4, 5 and 6 present experimental comparisons between SUShe (indicated in bold) and state-of-the-art shadow removal algorithms; the results reported for the latter are derived from the literature. RMSE is computed in three ways: inside the shadowed region (Shadow), outside the shadowed region (Non-shadow), and over the entire image (All regions). Figures 5, 6 and 7 illustrate qualitative results of SUShe and other state-of-the-art algorithms, including all the unsupervised methods and those supervised methods whose implementations have been made publicly available by their authors.

Table 4 Quantitative results in comparison with other state-of-the-art methodologies for ISTD
Table 5 Quantitative results in comparison with other state-of-the-art methodologies for AISTD
Table 6 Quantitative results in comparison with other state-of-the-art methodologies for SRD
Fig. 5 Comparative results on the ISTD (from Table 4), in terms of RMSE, PSNR, and LPIPS metrics. The horizontal lines represent the mean scores obtained from all the previously proposed methods per metric

Fig. 6 Comparative results on the AISTD (from Table 5), in terms of RMSE, PSNR, and LPIPS metrics. The horizontal lines represent the mean scores obtained from all the previously proposed methods per metric

Fig. 7 Comparative results on the SRD (from Table 6), in terms of RMSE, PSNR, and LPIPS metrics. The horizontal lines represent the mean scores obtained from all the previously proposed methods per metric

Table 4 presents comparisons on ISTD. SUShe outperforms all non-neural network-based methods ([21, 23, 58]), as it achieves the best results among them in terms of all metrics (PSNR = 24.82, RMSE = 8.14, LPIPS = 0.079). The methods of Guo et al. [23] and Gong et al. [21] involve supervision in the context of shadow detection and removal: Guo et al. use pairwise classification for shadow removal, whereas the method of Gong et al. requires indication of the shadowed and lit areas through a GUI tool to perform shadow detection. In addition, SUShe outperforms several state-of-the-art neural network-based methods: ARGAN [14], Cycle-GAN [64], the method proposed by Nagae et al. [43], as well as the well-known SP + M Net [33] and DHAN [11] (Table 4). The remaining neural network-based methods achieve RMSE values that are lower than the RMSE of SUShe. Still, SUShe obtains LPIPS = 0.079 on ISTD, which is the lowest value with the exception of ST-CGAN. Approaches such as [20, 26, 27, 30, 33, 37, 46] lead to results comparable to SUShe (with a difference in RMSE not exceeding 2.0); however, these approaches require training. Figure 8 illustrates indicative shadow removal results of SUShe and the compared state-of-the-art methods. As can be observed in Fig. 8b or e, some methods completely alter the image, while others fail to completely remove the shadow (Fig. 8c, d, f, j). In addition, in Fig. 8h, the pairing of shadowed/non-shadowed regions is erroneous, leading to relighting from an erroneous non-shadow area and, eventually, to incorrect brightness restoration. SUShe is the only completely unsupervised methodology with a satisfactory performance across the entire ISTD. The comparative results on ISTD are graphically represented in Fig. 5.

Fig. 8 Indicative results of SUShe and other state-of-the-art methods in ISTD. a Shadow Image, b Yang et al. (2012) [58], c Guo et al. (2012) [23], d Gong et al. (2016) [21], e ST-CGAN (2018) [53], f DSC (2020) [26], g SP + M Net (2019) [33], h DC-ShadowNet (2021) [30], i Fu et al. (2021) [20], j LG-ShadowNet (2021) [37], k DHAN (2020) [11], l Ground Truth Image, m SUShe

Table 5 presents comparisons between SUShe and state-of-the-art shadow removal methods for AISTD. In this case, SUShe ranks second with respect to PSNR and LPIPS, after SP + M Net and SG-ShadowNet, and third with respect to RMSE. The difference between the results of SUShe and those of these methods is notably small, given the higher computational complexity of the latter. Figure 9 illustrates comparative results on images from AISTD. Once again, the methods proposed by Guo et al. and Gong et al., as well as DC-ShadowNet and LG-ShadowNet (Fig. 9c-e, h), fail to remove the shadow in the second and third images (center and right columns), whereas the method of Yang et al. (Fig. 9b) alters the image both inside and outside the shadow regions. The best results are obtained by SUShe, whereas comparable results are obtained by SG-ShadowNet, DHAN, and the method of Fu et al. (Fig. 9f-g, i). The comparative results on AISTD are also graphically represented in Fig. 6.

Fig. 9 Indicative results of SUShe and other state-of-the-art methods in AISTD. a Shadow Image, b Yang et al. (2012) [58], c Guo et al. (2012) [23], d Gong et al. (2016) [21], e DC-ShadowNet (2021) [30], f Fu et al. (2021) [20], g SG-ShadowNet (2022) [51], h LG-ShadowNet (2021) [37], i DHAN (2020) [11], j Ground Truth Image, k SUShe

Table 6 presents comparisons on SRD. In this case, SUShe outperforms all unsupervised methods, and notably outperforms Cycle-GAN and the method of Nagae et al. as well. Furthermore, the performance of SUShe is comparable to the performances of DHAN (RMSE = 8.68) and DeShadowNet [46] in terms of LPIPS. Figure 10 illustrates indicative qualitative results of SUShe and other state-of-the-art algorithms, including supervised methods, on images from SRD. In the case of the first image of Fig. 10 (left), the output of SUShe is obviously closer to the ground truth than those of the rest of the methods. In the case of the second image of Fig. 10 (right), the result of SUShe is comparable to those of DeShadowNet, ARGAN, DSC, DC-ShadowNet, Fu et al., and SG-ShadowNet. It is also worth noting that SUShe preserves the details of the original image, unlike ST-CGAN, which introduces blur artifacts. Overall, SUShe achieves shadow removal results comparable to those of several supervised methods. The comparative results on SRD are also graphically represented in Fig. 7.

Fig. 10 Indicative results of SUShe and other state-of-the-art methods in SRD. a Shadow Image, b DeShadowNet (2017) [46], c ST-CGAN (2018) [53], d ARGAN (2019) [14], e DSC (2020) [26], f DC-ShadowNet (2021) [30], g Fu et al. (2021) [20], h SG-ShadowNet (2022) [51], i DHAN (2020) [11], j Ground Truth Image, k SUShe

6 Computational complexity

The computational complexity of SUShe is estimated in order to quantitatively assess its efficiency. More specifically:

a) SLIC superpixels are used for the segmentation of both lit and shadowed regions, employing color and spatial coordinates. SLIC avoids tens of thousands of redundant point-to-point distance calculations by localizing the search in the clustering process. SLIC is O(N), where N is the number of pixels in the image [2].

b) For each of the c = 3 RGB channels, the following operations are performed:

  • Histogram calculation of an image is O(N), since N = width × height.

  • Calculation of the gravity centers of the lit superpixels is O(K), where K is the number of lit superpixels.

  • Scanning the K shadow superpixels is O(K). For each shadow superpixel, calculating the Euclidean distances to all lit superpixels and finding the minimum is O(K), so the matching step amounts to O(K²) overall.

  • The calculation of the cdfs is approximately O(N).

Histogram matching amounts to O(G²), where G is the number of grey levels (e.g., G = 256).
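The quadratic cost in G comes from constructing the matching look-up table: for each of the G source grey levels, the closest reference cdf value is searched among up to G levels. A minimal sketch of this construction (a hypothetical helper, not the paper's code) is:

```python
import numpy as np

def build_matching_lut(cdf_src, cdf_ref):
    """cdf_src, cdf_ref: arrays of length G with values in [0, 1]."""
    G = len(cdf_src)                    # number of grey levels, e.g. 256
    lut = np.zeros(G, dtype=np.int64)
    for s in range(G):                  # O(G) outer loop
        # O(G) inner search -> O(G^2) in total
        lut[s] = np.argmin(np.abs(cdf_ref - cdf_src[s]))
    return lut                          # maps source levels to matched levels
```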

A quantitative comparison of the number of floating-point operations (FLOPs) per image between SUShe and the available deep learning-based methods is presented in Table 7 and Fig. 11. It can be noticed that SUShe has a significantly lower FLOPs count (indicated in bold).

Table 7 The number of FLOPs per image for different shadow removal methods
Fig. 11 The number of FLOPs per image (from Table 7) for different shadow removal methods

7 Discussion and conclusions

This work investigated a simple, efficient and effective solution to the complex problem of shadow removal, which can affect object detection and recognition algorithms and deteriorate their performance. The experimental results showed that by combining simple segmentation and color enhancement algorithms, the original brightness of shadowed regions can be restored. This was validated by quantitative and qualitative comparisons with both unsupervised and supervised state-of-the-art methods. All experiments were performed on three widely adopted, publicly available benchmark datasets. From Tables 4, 5 and 6 and Figs. 5, 6 and 7, it is evident that SUShe outperforms all the state-of-the-art unsupervised methods compared, as well as some supervised ones (Cycle-GAN, ARGAN, Nagae et al., DHAN, LG-ShadowNet, DC-ShadowNet). As for those that SUShe does not outperform, such as SG-ShadowNet, the results of SUShe are comparable both quantitatively and qualitatively (Figs. 8, 9 and 10). To the best of our knowledge, SUShe is the only unsupervised algorithm that provides shadow removal performance similar to most of the state-of-the-art supervised methods compared, on the full, widely used benchmark datasets considered in this study. Furthermore, the comparisons with state-of-the-art deep learning-based methods in terms of computational complexity showed that SUShe is much more efficient (Table 7 and Fig. 11).

Overall, the following conclusions can be derived:

  • SUShe is very simple to implement and of low computational complexity.

  • Its computational complexity is generally lower than that of the state-of-the-art algorithms for shadow removal.

  • The results obtained indicate that SUShe can remove shadows better than any of the compared state-of-the-art unsupervised shadow removal methods.

  • In comparison with the supervised state-of-the-art shadow removal methods, its performance is comparable or better.

  • Solving the shadow removal problem does not necessarily require complex deep learning-based solutions.

Shadow removal is instrumental in various domains, such as remote sensing image processing, traffic monitoring, and object recognition. Future work will involve the evaluation of SUShe in applications where rapid system response is required, such as assistive navigation systems [13]. Furthermore, SUShe can also be applied in the medical domain, to investigate how shadow removal can improve medical imaging results in the shadowed regions of internal body organs, e.g., the shadowed regions in images obtained from gastrointestinal capsules, diagnostic ultrasound, etc.