Locally controllable neural style transfer on mobile devices

  • Max Reimann
  • Mandy Klingbeil
  • Sebastian Pasewaldt
  • Amir Semmo
  • Matthias Trapp
  • Jürgen Döllner
Original Article

Abstract

Mobile expressive rendering has gained increasing popularity among users seeking casual creativity through image stylization, and supports the development of mobile artists as a new user group. In particular, neural style transfer has advanced as a core technology for emulating the characteristics of a wide range of artistic styles. When it comes to creative expression, however, the technology still faces inherent limitations in providing low-level controls for localized image stylization. In this work, we first characterize interactive style transfer as a trade-off between visual quality, run-time performance, and user control. We then present MaeSTrO, a mobile app for the orchestration of neural style transfer techniques using iterative, multi-style generative, and adaptive neural networks that can be locally controlled by on-screen painting metaphors. To this end, we enhance state-of-the-art neural style transfer techniques with mask-based loss terms that can be interactively parameterized via a generalized user interface to facilitate a creative and localized editing process. We report on a usability study and an online survey that demonstrate the ability of our app to transfer styles with improved semantic plausibility.
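
The following is a minimal sketch of such a mask-based style loss, assuming a PyTorch setting and the Gram-matrix formulation commonly used in neural style transfer; the function names (gram_matrix, masked_style_loss) and the exact weighting scheme are illustrative choices, not the paper's implementation.

    # Hedged sketch: a mask-weighted Gram-matrix style loss (assumes PyTorch).
    import torch
    import torch.nn.functional as F

    def gram_matrix(feat: torch.Tensor) -> torch.Tensor:
        # Channel-wise Gram matrix of a feature map with shape (B, C, H, W),
        # normalized by the number of entries.
        b, c, h, w = feat.shape
        f = feat.view(b, c, h * w)
        return f @ f.transpose(1, 2) / (c * h * w)

    def masked_style_loss(stylized_feat: torch.Tensor,
                          style_feat: torch.Tensor,
                          mask: torch.Tensor) -> torch.Tensor:
        # stylized_feat, style_feat: feature maps (B, C, H, W) from a fixed
        # encoder such as VGG-19. mask: user-painted soft mask in [0, 1] of
        # shape (B, 1, H0, W0), resized here to the feature resolution.
        m = F.interpolate(mask, size=stylized_feat.shape[-2:],
                          mode="bilinear", align_corners=False)
        # Restrict the output's style statistics to the masked region and
        # match them against the target style's statistics.
        return F.mse_loss(gram_matrix(stylized_feat * m),
                          gram_matrix(style_feat))

With one such term per user-painted mask and style image, the total objective would sum these masked style losses alongside a content loss, so that each region is optimized toward its own style statistics.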

Keywords

Non-photorealistic rendering · Style transfer · Neural networks · Mobile devices · Interactive control · Expressive rendering

Acknowledgements

We would like to thank the anonymous reviewers for their valuable feedback. This work was funded by the Federal Ministry of Education and Research (BMBF), Germany, for the AVA project 01IS15041.

Supplementary material

Supplementary material 1 (PDF, 756 KB)

Supplementary material 2 (MP4, 129,663 KB)

Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  1. Digital Masterpieces GmbH / Hasso Plattner Institute, Potsdam, Germany
  2. Hasso Plattner Institute, University of Potsdam, Potsdam, Germany
