Deep Photo Rally: Let’s Gather Conversational Pictures
Conference paper
Abstract
In this paper, we propose an anthropomorphic approach that uses recent deep neural network technology to generate speech sentences for a specific object according to its surrounding circumstances. In the proposed approach, the user can engage in pseudo-communication with an object by photographing it with a mobile terminal. We introduce example applications of the proposed approach to entertainment products, and show that it is an anthropomorphic approach capable of interacting with the environment.
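The pipeline implied by the abstract can be sketched as follows: a deep-network detector identifies the photographed object, a captioner describes the surrounding scene, and the caption is rewritten as a first-person utterance spoken by the object. The sketch below is illustrative only; the function name `anthropomorphize` and the string-rewriting logic are assumptions standing in for the paper's neural components, whose text outputs are taken as given.

```python
# Minimal sketch of the anthropomorphic speech pipeline, assuming the
# object detector and scene captioner (normally deep neural networks)
# have already produced an object label and a third-person caption.

def anthropomorphize(object_label: str, scene_caption: str) -> str:
    """Turn a third-person scene caption into a first-person utterance
    spoken by the detected object (hypothetical helper)."""
    # Replace the first mention of the object with a first-person pronoun.
    utterance = scene_caption.replace(f"a {object_label}", "I", 1)
    # Fix simple verb agreement after the substitution ("I sits" -> "I sit").
    utterance = utterance.replace("I sits", "I sit").replace("I is", "I am")
    return utterance.capitalize() + "!"

if __name__ == "__main__":
    # Example: the detector reports "bench"; the captioner describes the scene.
    print(anthropomorphize("bench", "a bench sits in a sunny park"))
    # -> I sit in a sunny park!
```

A deployed system would replace the string rewriting with a learned language model so that the utterance varies naturally with the scene, which is the role the abstract assigns to deep neural networks.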
Keywords
Augmented reality · Anthropomorphism · Deep Neural Networks
Copyright information
© IFIP International Federation for Information Processing 2017