
MicroISP: Processing 32MP Photos on Mobile Devices with Deep Learning

  • Conference paper
  • In: Computer Vision – ECCV 2022 Workshops (ECCV 2022)

Abstract

While neural network-based photo processing solutions can provide better image quality than traditional ISP systems, their application to mobile devices is still very limited due to their high computational complexity. In this paper, we present a novel MicroISP model designed specifically for edge devices, taking into account their computational and memory limitations. The proposed solution is capable of processing up to 32MP photos on recent smartphones using standard mobile ML libraries, requiring less than 1 s to perform the inference, while for Full HD images it achieves real-time performance. The architecture of the model is flexible, allowing its complexity to be adjusted to devices of different computational power. To evaluate the performance of the model, we collected a novel Fujifilm UltraISP dataset consisting of thousands of paired photos captured with a normal mobile camera sensor and a professional 102MP medium-format Fujifilm GFX100 camera. The experiments demonstrate that, despite its compact size, the MicroISP model is able to provide comparable or better visual results than traditional mobile ISP systems, while outperforming previously proposed efficient deep learning-based solutions. Finally, the model is also compatible with the latest mobile AI accelerators, achieving good runtime and low power consumption on smartphone NPUs and APUs. The code, dataset, and pre-trained models are available on the project website: https://people.ee.ethz.ch/~ihnatova/microisp.html.

A. Ignatov, R. Timofte, and L. Van Gool are the main contact authors.
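
A note on deployment: the abstract states that inference runs on smartphones through standard mobile ML libraries. Purely as an illustration, the sketch below shows how an exported TensorFlow Lite version of such a model could be invoked from Python; the file name microisp.tflite, the 4-channel packed RAW input layout, and the chosen resolution are assumptions made for this example, not details confirmed by the paper.

```python
import numpy as np
import tensorflow as tf

# Load a converted TFLite model (hypothetical file name).
interpreter = tf.lite.Interpreter(model_path="microisp.tflite")

# Fully convolutional ISP models typically accept arbitrary spatial sizes,
# so resize the input tensor before allocating buffers.
inp = interpreter.get_input_details()[0]
interpreter.resize_tensor_input(inp["index"], [1, 1088, 1920, 4])
interpreter.allocate_tensors()

inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

# Dummy RAW input: 4 packed Bayer channels, normalized to [0, 1].
raw = np.random.rand(1, 1088, 1920, 4).astype(np.float32)

interpreter.set_tensor(inp["index"], raw)
interpreter.invoke()

# The RGB output shape depends on how the model was exported; for a
# demosaicing-style network it is commonly twice the packed resolution.
rgb = interpreter.get_tensor(out["index"])
print(rgb.shape, rgb.dtype)
```

On a device, the same model would typically be loaded through the TensorFlow Lite runtime for Android with an NNAPI or GPU delegate in order to target the smartphone NPUs and APUs mentioned in the abstract.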



Author information

Corresponding author: Andrey Ignatov.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Ignatov, A. et al. (2023). MicroISP: Processing 32MP Photos on Mobile Devices with Deep Learning. In: Karlinsky, L., Michaeli, T., Nishino, K. (eds) Computer Vision – ECCV 2022 Workshops. ECCV 2022. Lecture Notes in Computer Science, vol 13802. Springer, Cham. https://doi.org/10.1007/978-3-031-25063-7_46

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-25063-7_46

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-25062-0

  • Online ISBN: 978-3-031-25063-7

  • eBook Packages: Computer Science, Computer Science (R0)
