Abstract
Hardware image signal processing (ISP) pipelines sit between the imaging sensor and image applications: they reconstruct an RGB image from the sensor signal and feed it into downstream tasks. The processing blocks in an ISP depend on a set of tunable hyperparameters that interact with the output in complex ways. Traditionally, imaging experts set these hyperparameters by hand, which is time-consuming and biased towards human perception. Recent work instead optimizes the ISP using feedback from downstream tasks, with a variety of optimization algorithms. Unfortunately, these methods keep the parameters fixed at inference time for arbitrary inputs, without considering that each image should have its own parameters based on its features. To this end, we propose an attention-aware learning method that integrates a parameter prediction network into ISP tuning and uses a multi-attention mechanism to generate an attentive mapping between the input RAW image and the parameter space. The proposed method integrates downstream tasks end-to-end and predicts specific parameters for each image. We validate the proposed method on object detection, image segmentation, and human viewing tasks.
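The idea of predicting parameters per image can be illustrated with a minimal NumPy sketch. It is not the architecture from the paper: a generic multi-head self-attention layer stands in for the multi-attention mechanism, the weights are random rather than trained, and every name (`init_weights`, `predict_isp_params`, the patch size, the number of parameters) is a hypothetical choice made for illustration. It only shows the shape of the mapping: one RAW frame goes in, one vector of normalized ISP hyperparameters comes out.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def init_weights(rng, patch=8, dim=16, num_params=5):
    # Hypothetical, untrained weights; a real predictor would learn these
    # from the downstream task loss.
    def mat(rows, cols):
        return rng.normal(size=(rows, cols)) / np.sqrt(rows)

    return {
        "embed": mat(patch * patch, dim),  # patch pixels -> token embedding
        "wq": mat(dim, dim),
        "wk": mat(dim, dim),
        "wv": mat(dim, dim),
        "head": mat(dim, num_params),      # pooled feature -> parameters
    }


def predict_isp_params(raw, weights, patch=8, num_heads=2):
    """Map one single-channel RAW frame (H, W) to parameters in [0, 1]."""
    h, w = raw.shape
    hp, wp = h // patch, w // patch
    # Cut the frame into non-overlapping patches; each patch is one token.
    tokens = (
        raw[: hp * patch, : wp * patch]
        .reshape(hp, patch, wp, patch)
        .transpose(0, 2, 1, 3)
        .reshape(hp * wp, patch * patch)
    )
    x = tokens @ weights["embed"]          # (N, dim)
    n, dim = x.shape
    dh = dim // num_heads

    def split_heads(m):
        # (N, dim) -> (heads, N, dh)
        return m.reshape(n, num_heads, dh).transpose(1, 0, 2)

    q = split_heads(x @ weights["wq"])
    k = split_heads(x @ weights["wk"])
    v = split_heads(x @ weights["wv"])
    # Scaled dot-product attention, computed independently per head.
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(dh), axis=-1)
    ctx = (attn @ v).transpose(1, 0, 2).reshape(n, dim)
    pooled = ctx.mean(axis=0)              # average over all patch tokens
    logits = pooled @ weights["head"]
    # Sigmoid keeps each predicted parameter in a normalized [0, 1] range.
    return 1.0 / (1.0 + np.exp(-logits))
```

In use, each frame would be passed through the predictor and the resulting vector rescaled to the valid range of each ISP block before the pipeline runs. Unlike a single fixed parameter set, two different frames generally yield two different vectors.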
Acknowledgements
This work was supported by the National Key Research and Development Program of China (Grant No. 2020AAA0105802), the Natural Science Foundation of China (Grant Nos. 62036011, 62192782, 61721004, 62122086, 61906192, U1936204), the Key Research Program of Frontier Sciences, CAS (Grant No. QYZDJ-SSW-JSC040), and the Beijing Natural Science Foundation (No. 4222003).
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Qin, H. et al. (2022). Attention-Aware Learning for Hyperparameter Prediction in Image Processing Pipelines. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13679. Springer, Cham. https://doi.org/10.1007/978-3-031-19800-7_16
DOI: https://doi.org/10.1007/978-3-031-19800-7_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-19799-4
Online ISBN: 978-3-031-19800-7
eBook Packages: Computer Science (R0)