In computer graphics and multimedia, automatically synthesizing a new image sequence from a different set of image sequences, in order to create realistic video or animation of a human activity, remains a research challenge. Traditionally, such animation and similar visual media content are created manually, which is tedious. Recent advances in deep learning have shown promise for automating this kind of media creation. This work is motivated by the idea of synthesizing a temporally coherent sequence of images (e.g., a video) of a person performing an activity from a video or set of images of a different person performing a similar activity. To achieve this, our approach uses the cycle-consistent adversarial network (CycleGAN). We present a new approach for learning to transfer a human activity from a source domain to a target domain without any complicated pose detection or extraction method. Our objective is to learn a mapping between two consecutive sequences of images from two domains representing two different activities, and to use that mapping to transfer the activity from one domain to the other, synthesizing an entirely new consecutive sequence of images that can be combined into a video of a new human activity. We also present and analyze qualitative results generated by our method.
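The cycle-consistency idea behind CycleGAN can be illustrated with a minimal sketch. It is not the implementation used in this work: the generators here are toy brightness shifts standing in for trained networks, frames are flattened lists of pixel values, the adversarial losses are omitted, and all names (`g_toy`, `f_toy`, `translate_sequence`) are hypothetical.

```python
from typing import Callable, List, Sequence

Frame = List[float]  # assumption: a flattened image; real frames would be H x W x 3 arrays
Generator = Callable[[Frame], Frame]


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean absolute difference between two equally sized frames."""
    if len(a) != len(b):
        raise ValueError("frames must have the same size")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(
    frames_x: Sequence[Frame],
    frames_y: Sequence[Frame],
    g: Generator,  # maps domain X to domain Y
    f: Generator,  # maps domain Y to domain X
) -> float:
    """L1 cycle loss: f(g(x)) should recover x, and g(f(y)) should recover y."""
    forward = sum(l1_distance(f(g(x)), x) for x in frames_x) / len(frames_x)
    backward = sum(l1_distance(g(f(y)), y) for y in frames_y) / len(frames_y)
    return forward + backward


def translate_sequence(frames: Sequence[Frame], g: Generator) -> List[Frame]:
    """Apply the learned mapping frame by frame, keeping the frame order."""
    return [g(x) for x in frames]


# Toy stand-ins for trained generators (hypothetical): a shift and its inverse.
def g_toy(frame: Frame) -> Frame:
    return [p + 0.25 for p in frame]


def f_toy(frame: Frame) -> Frame:
    return [p - 0.25 for p in frame]


source_frames = [[0.0, 0.25, 0.5], [0.25, 0.5, 0.75]]  # consecutive frames, domain X
target_frames = [[0.5, 0.75, 1.0]]                     # frames from domain Y

loss = cycle_consistency_loss(source_frames, target_frames, g_toy, f_toy)
new_sequence = translate_sequence(source_frames, g_toy)
```

Because `f_toy` exactly inverts `g_toy`, the cycle loss is zero in this toy case. A real model would minimize this term together with adversarial losses, and the translated frames would then be combined in order to form the output video.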
- Image synthesis
- Generative adversarial networks
All authors contributed equally to this work.
© 2019 Springer Nature Switzerland AG
Cite this paper
Khan, F.H., de Silva, A., Yetukuri, J., Norouzi, N. (2019). Sequential Image Synthesis for Human Activity Video Generation. In: Karray, F., Campilho, A., Yu, A. (eds) Image Analysis and Recognition. ICIAR 2019. Lecture Notes in Computer Science, vol 11663. Springer, Cham. https://doi.org/10.1007/978-3-030-27272-2_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-27271-5
Online ISBN: 978-3-030-27272-2