
Design of Image Generation System for DCGAN Based Picture Book Text

  • Conference paper
In: Advances in Computer Science and Ubiquitous Computing (CUTE 2018, CSA 2018)

Abstract

When a picture book is photographed with a smart device, the text is analyzed for its meaning and associated images are generated. The first step is to train a DCGAN on a class list and images; in this study, the DCGAN was trained on 11 classes with 1,688 bear images collected from ImageNet. The second step is to photograph a page containing the picture book's image and text with a smart device and convert the text portion of the captured image into machine-readable characters. A morpheme analyzer classifies the nouns and verbs in the text, and the Discriminator learns to associate the classified parts of speech with the latent vectors of the images. The third step is to generate an image associated with the text: for a passage of the picture book that has no accompanying image, the text is captured and its nouns and verbs are extracted. The extracted parts of speech and the learned latent vectors are then passed as Generator parameters to produce images associated with the text.
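The three-step pipeline described above can be sketched in outline as follows. This is a minimal stand-in, not the paper's implementation: the class list, the keyword-based part-of-speech matching, the per-class latent vectors, and the `generate_image` stub are all illustrative assumptions. A real system would use an OCR engine and a Korean morpheme analyzer for step 2, and a trained DCGAN Generator for step 3.

```python
import random

# Hypothetical class list (stand-in for the 11 bear classes trained on ImageNet data).
CLASSES = ["brown_bear", "polar_bear", "panda", "sloth_bear"]

LATENT_DIM = 100  # DCGAN latent vectors are commonly 100-dimensional

def latent_for(class_name):
    """Step 1 (assumed already done): each class maps to a learned latent vector.
    Here we fake the 'learned' vector deterministically per class name."""
    rng = random.Random(class_name)
    return [rng.gauss(0, 1) for _ in range(LATENT_DIM)]

def extract_classes(ocr_text):
    """Step 2: OCR'd text -> nouns/verbs -> matching classes.
    A real system would run a morpheme analyzer; this keyword match
    is only a placeholder."""
    tokens = ocr_text.lower().replace(".", " ").split()
    return [c for c in CLASSES if c.split("_")[0] in tokens]

def generate_image(latent):
    """Step 3: a trained DCGAN Generator would map the latent vector to
    pixels; here we just return a tag describing the conditioning."""
    return f"image<dim={len(latent)}>"

text = "The brown bear walks through the forest."
matched = extract_classes(text)
images = [generate_image(latent_for(c)) for c in matched]
```

Running this on the sample sentence matches only the `brown_bear` class and produces one conditioned "image" per matched class; the same class name always yields the same latent vector, mimicking a lookup of learned latent codes.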



Acknowledgments

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. NRF-2017R1A2B4008886).

Author information

Correspondence to Nammee Moon.



Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Cho, J., Moon, N. (2020). Design of Image Generation System for DCGAN Based Picture Book Text. In: Park, J., Park, D.S., Jeong, Y.S., Pan, Y. (eds.) Advances in Computer Science and Ubiquitous Computing (CUTE 2018, CSA 2018). Lecture Notes in Electrical Engineering, vol. 536. Springer, Singapore. https://doi.org/10.1007/978-981-13-9341-9_46


  • DOI: https://doi.org/10.1007/978-981-13-9341-9_46

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-9340-2

  • Online ISBN: 978-981-13-9341-9

  • eBook Packages: Engineering, Engineering (R0)
