Text to Complicated Image Synthesis with Segmentation Information Guidance

Zhang, Zhiqiang; Zhang, Yunye; Liu, Wenfa; Yu, Wenxin; He, Gang; Jiang, Ning; Yang, Zhuo

doi:10.1007/978-981-15-3867-4_32

Part of the book series: Smart Innovation, Systems and Technologies (SIST, volume 180)

Abstract

In this paper, we propose a novel method called Segmentation Information Guidance (SIG), in which additional segmentation information is used to guide the synthesis of complicated images from text. We demonstrate the effectiveness of the SIG model on the Microsoft Common Objects in Context (MSCOCO) dataset. The results show that images generated by directly using the segmentation image are more authentic and coherent than those generated without background.

Acknowledgements

This research was supported by grants 2018GZ0517, 2019YFS0146, and 2019YFS0155 from the Sichuan Provincial Science and Technology Department; grant 2018KF003 from the State Key Laboratory of ASIC & System; grant No. 61907009 from the National Natural Science Foundation of China; grant No. 2018A030313802 from the Natural Science Foundation of Guangdong Province; and grants No. 2017B010110007 and 2017B010110015 from the Science and Technology Planning Project of Guangdong Province.

Author information

Authors and Affiliations

Southwest University of Science and Technology, Mianyang, China
Zhiqiang Zhang, Yunye Zhang, Wenfa Liu, Wenxin Yu, Gang He & Ning Jiang
Guangdong University of Technology, Guangzhou, China
Zhuo Yang

Corresponding author

Correspondence to Wenxin Yu.

Editor information

Editors and Affiliations

Department of Radio Communications and Video Technologies, Technical University of Sofia, Sofia, Bulgaria
Roumen Kountchev
Department of Computer Science and Engineering, SOA University, Bhubaneswar, Odisha, India
Srikanta Patnaik
Department of Electro-Optical Engineering, Yunnan Normal University, Yunnan, China
Junsheng Shi
Informatics and Computer Techniques, Reshetnev Siberian State University of Science and Technology, Russian Federation, Krasnoyarsk Krai, Russia
Margarita N. Favorskaya

Cite this paper

Zhang, Z. et al. (2020). Text to Complicated Image Synthesis with Segmentation Information Guidance. In: Kountchev, R., Patnaik, S., Shi, J., Favorskaya, M. (eds) Advances in 3D Image and Graphics Representation, Analysis, Computing and Information Technology. Smart Innovation, Systems and Technologies, vol 180. Springer, Singapore. https://doi.org/10.1007/978-981-15-3867-4_32

DOI: https://doi.org/10.1007/978-981-15-3867-4_32
Published: 21 April 2020
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-3866-7
Online ISBN: 978-981-15-3867-4
eBook Packages: Intelligent Technologies and Robotics (R0)
