Abstract
Visual learning helps people retain information for longer periods by processing it primarily through images. In the proposed system, a natural language text description is given as input, an image-classification model is built using a Convolutional Neural Network (CNN), and the text is tokenized and passed to a POS tagger. The tagged objects are then rendered as a 2D scene in the Blender application. We thus propose a text-to-2D-scene generation system that incorporates user interaction for refining the output of the generated scene, built on the storyDB dataset. The dataset consists of the abstract images required for the identification of children with autism. Our approach is an attempt to improve the child's memorization skill, so that the object is recalled and visualized the next time the word is heard.
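The preprocessing stage described above (tokenize the input sentence, POS-tag it, and keep the nouns as candidate scene objects) can be sketched as follows. This is an illustrative sketch only: the tiny hand-written lexicon stands in for a real POS tagger, and the noun filter is an assumption, not the authors' exact implementation.

```python
# Hypothetical stand-in lexicon; a real system would use a trained POS tagger.
LEXICON = {
    "boy": "NN", "ball": "NN", "tree": "NN", "dog": "NN",
    "plays": "VBZ", "runs": "VBZ",
    "the": "DT", "a": "DT", "with": "IN", "near": "IN",
}

def tokenize(sentence):
    # Simple whitespace tokenization for the sketch.
    return sentence.lower().split()

def pos_tag(tokens):
    # Unknown words default to NN, a common tagger fallback.
    return [(t, LEXICON.get(t, "NN")) for t in tokens]

def extract_scene_objects(sentence):
    # Nouns (NN* tags) become the objects later placed in the 2D scene.
    return [w for w, tag in pos_tag(tokenize(sentence)) if tag.startswith("NN")]

print(extract_scene_objects("The boy plays with a ball near the tree"))
# -> ['boy', 'ball', 'tree']
```

In the full pipeline, each extracted object name would then be matched to an abstract image from storyDB and placed into the Blender scene.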
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Yashaswini, S., Shylaja, S.S. (2023). Story Telling: Learning to Visualize Sentences Through Generated Scenes. In: Shukla, P.K., Singh, K.P., Tripathi, A.K., Engelbrecht, A. (eds) Computer Vision and Robotics. Algorithms for Intelligent Systems. Springer, Singapore. https://doi.org/10.1007/978-981-19-7892-0_1
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-7891-3
Online ISBN: 978-981-19-7892-0
eBook Packages: Intelligent Technologies and Robotics (R0)