The last decade has witnessed tremendous progress in many sub-fields of computer vision, due to the increasing availability of large annotated datasets such as PASCAL VOC (Everingham et al. 2010), ImageNet (Russakovsky et al. 2015), KITTI (Geiger et al. 2012), MS-COCO (Lin et al. 2014) and Cityscapes (Cordts et al. 2016). While steady progress is still being made, performance is mainly benchmarked under good weather and favorable lighting conditions. Even the best performing algorithms on those benchmarks can become untrustworthy in a new domain or under adverse conditions. As is widely known, adverse weather and illumination conditions (e.g. fog, rain, snow, low-light, nighttime, glare and shadows) create visibility problems for both people and the sensors that power automated systems (Narasimhan and Nayar 2002; Garg and Nayar 2007; Tan 2008; Zhang and Patel 2018a; Sakaridis et al. 2018, 2020). Many real-world applications such as autonomous cars, rescue robots, and security systems can hardly escape from ‘bad’ weather and challenging illumination conditions. For example, an automated car still needs to detect and localize other traffic agents in the presence of fog, in sun glare or at nighttime. Therefore, vision algorithms that are robust to adverse weather and lighting conditions are strongly needed for real-world applications.
In view of this, promising methods have been proposed in the last few years, and they can be categorized into three groups. The first stream of work aims to increase the robustness of deep learning approaches to common data corruptions (Hendrycks and Dietterich 2019; Hendrycks et al. 2020; Wong and Kolter 2021; Kamann and Rother 2020; Rusak et al. 2020). The second line of work aims to improve the performance of computer vision methods under adverse weather and lighting conditions. Notable examples include scene understanding methods for foggy (Sakaridis et al. 2018; Dai et al. 2020), rainy (Li et al. 2018; Halder et al. 2019), and nighttime conditions (Sakaridis et al. 2019; Larsson et al. 2019), continuous domain adaptation methods (Wulfmeier et al. 2018; Dai et al. 2020; Sakaridis et al. 2019), and large-scale benchmark construction (Maddern et al. 2017; Wenzel et al. 2020). The third group of work focuses on visibility enhancement. This line of work has a long history (Narasimhan and Nayar 2002; Garg and Nayar 2007; Tan 2008) and has gained more popularity in recent years due to the development of deep neural networks (Pei et al. 2018; Zhang and Patel 2018b; Chen et al. 2018; Li et al. 2020; Zheng et al. 2020).
The aim of this special issue is to provide an overview of this burgeoning topic, “Computer Vision for All Seasons: Adverse Weather and Lighting Conditions”, to let the progress take center stage and to testify to the quality of the work. Following an open call for papers, the special issue attracted forty-six submissions in total, of which sixteen were desk rejected. The remaining thirty submissions went through a rigorous reviewing procedure, and ten of them were accepted. The ten articles that comprise this special issue share a common thread of increasing the visibility of images or increasing the robustness of computer vision methods under adverse weather/lighting conditions. The articles cover various areas within this theme, ranging from methods for image defogging, deraining, low-light enhancement and image translation, to methods for domain adaptation, object detection, semantic segmentation and depth estimation under adverse weather/lighting conditions, and to benchmark construction for high-level tasks under the targeted adverse conditions.
We now provide a brief summary of each paper:
In “Rain rendering for evaluating and improving robustness to bad weather”, Tremblay, Halder, Charette, and Lalonde present a rain rendering pipeline to enable the systematic evaluation of common computer vision algorithms under controlled amounts of rain. The rain effects rendered by their method are more realistic than those of previous methods. The authors have conducted a thorough evaluation of object detection, semantic segmentation, and depth estimation algorithms on their generated rain-augmented KITTI, Cityscapes, and nuScenes datasets.
In “Benchmarking Low-Light Image Enhancement and Beyond”, Liu, Xu, Yang, Fan, and Huang present a systematic review and evaluation of existing single-image low-light enhancement algorithms, point out their limitations, and suggest promising future directions. The authors further measure and analyze the performance of face detection in low-light conditions to link low-level enhancement methods with high-level recognition methods. To this end, a large-scale low-light image dataset serving both low-level and high-level vision tasks has been proposed.
In “CDTD: A Large-Scale Cross-Domain Benchmark for Instance-Level Image-to-Image Translation and Domain Adaptive Object Detection”, Shen, Huang, Shi, Liu, Maheshwari, Zheng, Xue, Savvides, and Huang introduce a large-scale cross-domain benchmark for the new instance-level translation and object detection tasks. In addition, the authors provide comprehensive baseline results on the benchmark for both tasks, and develop a novel instance-level image-to-image translation approach and a gradient detach method for domain adaptive object detection.
In “Selective Wavelet Attention Learning for Single Image Deraining”, Huang, Yu, Chai, He, and Tan propose a selective wavelet attention learning method by learning a series of wavelet attention maps to guide the separation of rain and background information in both spatial and frequency domains. The key idea is to use wavelet transform to learn the content and structure of rainy features because high-frequency features are more sensitive to rain degradation, whereas low-frequency features preserve more of the background content.
In “LAMP-HQ: A Large-Scale Multi-pose High-Quality Database and Benchmark for NIR-VIS Face Recognition”, Yu, Wu, Huang, Lei, and He propose a new Large-Scale Multi-Pose High-Quality NIR-VIS database ‘LAMP-HQ’ containing 56,788 NIR and 16,828 VIS images of 573 subjects with large diversities in pose, illumination, attribute, scene and accessory. The authors further propose a novel exemplar-based variational spectral attention network to produce high-fidelity VIS images from NIR data, in order to improve the performance of NIR-VIS heterogeneous face recognition.
In “Scale-Aware Domain Adaptive Faster R-CNN”, Chen, Wang, Li, Sakaridis, Dai, and Van Gool develop a novel method that improves the cross-domain robustness of object detection. In particular, the authors improve the widely used Faster R-CNN model by tackling the domain shift problem on two levels: (1) the image-level shift, such as image style and illumination, and (2) the instance-level shift, such as object appearance and size. Adversarial learning is employed to align the feature distributions. The method is further improved by incorporating the scales of the objects into adversarial training. The method yields state-of-the-art performance in multiple domain adaptation scenarios.
In “Context-enhanced Representation Learning for Single Image Deraining”, Wang, Sun, and Sowmya propose a context-enhanced representation learning and deraining network with a novel two-branch encoder design. The method aims to reduce the over- or under-deraining artefacts caused by the highly imbalanced distribution between rain effects and varied background scenes. Experimental results show that the proposed method produces significantly better results than other state-of-the-art models for removing rain streaks and raindrops from both synthetic and real images.
In “You Only Look Yourself: Unsupervised and Untrained Single Image Dehazing Neural Network”, Li, Gou, Gu, Liu, Zhou, and Peng propose a novel unsupervised and untrained neural network for image dehazing. The method bypasses the requirement of conventional deep learning methods for ground-truth clean images and an image collection. It thus avoids the labor-intensive data collection step and the domain shift issue.
In “Successive Graph Convolutional Network for Image De-raining”, Fu, Qi, Zha, Ding, Wu, and Paisley propose a graph convolutional network (GCN)-based model for the single image deraining task. The method is able to explore comprehensive feature representations from three aspects, i.e., local spatial patterns, global spatial coherence and channel correlation, and it achieves state-of-the-art results on both synthetic and real-world datasets.
In “Attention Guided Low-light Image Enhancement with a Large Scale Low-light Simulation Dataset”, Lv, Li, and Lu propose a novel end-to-end attention-guided method based on a multi-branch convolutional neural network for low-light image enhancement. The authors construct a synthetic dataset with carefully designed low-light simulation strategies to train the method. The method outperforms the current state-of-the-art methods both quantitatively and visually.
Collectively, these ten papers provide a detailed compilation of the diverse range of issues currently being investigated in the field of robust computer vision methods for all conditions.
Chen, C., Chen, Q., Xu, J., & Koltun, V. (2018). Learning to see in the dark. In CVPR.
Cordts, M., Omran, M., Ramos, S., Rehfeld, T., Enzweiler, M., Benenson, R., Franke, U., Roth, S., & Schiele, B. (2016). The Cityscapes dataset for semantic urban scene understanding. In CVPR.
Dai, D., Sakaridis, C., Hecker, S., & Van Gool, L. (2020). Curriculum model adaptation with synthetic and real data for semantic foggy scene understanding. International Journal of Computer Vision, 128, 1182–1204. https://doi.org/10.1007/s11263-019-01182-4.
Everingham, M., Van Gool, L., Williams, C. K., Winn, J., & Zisserman, A. (2010). The PASCAL visual object classes (VOC) challenge. IJCV, 88(2), 303–338.
Garg, K., & Nayar, S. K. (2007). Vision and rain. IJCV, 75(1), 3–27.
Geiger, A., Lenz, P., & Urtasun, R. (2012). Are we ready for autonomous driving? The KITTI vision benchmark suite. In IEEE conference on computer vision and pattern recognition (CVPR).
Halder, S. S., Lalonde, J.-F., & Charette, R. D. (2019). Physics-based rendering for improving robustness to rain. In Proceedings of the IEEE/CVF international conference on computer vision (ICCV).
Hendrycks, D., & Dietterich, T. (2019). Benchmarking neural network robustness to common corruptions and perturbations. In International conference on learning representations.
Hendrycks, D., Mu, N., Cubuk, E. D., Zoph, B., Gilmer, J., & Lakshminarayanan, B. (2020). AugMix: A simple method to improve robustness and uncertainty under data shift. In International conference on learning representations.
Kamann, C., & Rother, C. (2020). Benchmarking the robustness of semantic segmentation models. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (CVPR).
Larsson, M., Stenborg, E., Hammarstrand, L., Pollefeys, M., Sattler, T., & Kahl, F. (2019). A cross-season correspondence dataset for robust semantic segmentation. In The IEEE conference on computer vision and pattern recognition (CVPR).
Li, R., Tan, R. T., & Cheong, L.-F. (2018). Robust optical flow in rainy scenes. In V. Ferrari, M. Hebert, C. Sminchisescu, & Y. Weiss (Eds.), ECCV.
Li, R., Tan, R. T., & Cheong, L.-F. (2020). All in one bad weather removal using architectural search. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (CVPR).
Lin, T., Maire, M., Belongie, S. J., Hays, J., Perona, P., Ramanan, D., Dollár, P., & Zitnick, C. L. (2014). Microsoft COCO: Common objects in context. In ECCV.
Maddern, W., Pascoe, G., Linegar, C., & Newman, P. (2017). 1 year, 1000 km: The Oxford RobotCar dataset. The International Journal of Robotics Research, 36(1), 3–15.
Narasimhan, S. G., & Nayar, S. K. (2002). Vision and the atmosphere. IJCV, 48, 233–254.
Pei, Y., Huang, Y., Zou, Q., Lu, Y., & Wang, S. (2018). Does haze removal help CNN-based image classification? In European conference on computer vision (ECCV).
Rusak, E., Schott, L., Zimmermann, R. S., Bitterwolf, J., Bringmann, O., Bethge, M., & Brendel, W. (2020). A simple way to make neural networks robust against diverse image corruptions. In European conference on computer vision (ECCV).
Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., et al. (2015). ImageNet large scale visual recognition challenge. International Journal of Computer Vision, 115(3), 211–252.
Sakaridis, C., Dai, D., & Van Gool, L. (2018). Semantic foggy scene understanding with synthetic data. International Journal of Computer Vision, 126, 973–992.
Sakaridis, C., Dai, D., & Van Gool, L. (2019). Guided curriculum model adaptation and uncertainty-aware evaluation for semantic nighttime image segmentation. In International conference on computer vision (ICCV).
Sakaridis, C., Dai, D., & Van Gool, L. (2020). Map-guided curriculum domain adaptation and uncertainty-aware evaluation for semantic nighttime image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence. https://doi.org/10.1109/TPAMI.2020.3045882.
Tan, R. T. L. (2008). Visibility in bad weather from a single image. In CVPR.
Wenzel, P., Wang, R., Yang, N., Cheng, Q., Khan, Q., von Stumberg, L., Zeller, N., & Cremers, D. (2020). 4Seasons: A cross-season dataset for multi-weather SLAM in autonomous driving. In German conference on pattern recognition (GCPR).
Wong, E., & Kolter, J. Z. (2021). Learning perturbation sets for robust machine learning. In International conference on learning representations.
Wulfmeier, M., Bewley, A., & Posner, I. (2018). Incremental adversarial domain adaptation for continually changing environments. In IEEE international conference on robotics and automation (ICRA).
Zhang, H., & Patel, V. M. (2018a). Densely connected pyramid dehazing network. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR).
Zhang, H., & Patel, V. M. (2018b). Density-aware single image de-raining using a multi-stream dense network. In CVPR.
Zheng, Z., Wu, Y., Han, X., & Shi, J. (2020). ForkGAN: Seeing into the rainy night. In European conference on computer vision (ECCV).
Dai, D., Tan, R.T., Patel, V. et al. Guest Editorial: Special Issue on “Computer Vision for All Seasons: Adverse Weather and Lighting Conditions”. Int J Comput Vis 129, 2031–2033 (2021). https://doi.org/10.1007/s11263-021-01464-w