Multimodal Semantic Simulations of Linguistically Underspecified Motion Events

  • Nikhil Krishnaswamy
  • James Pustejovsky
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10523)


This paper details the technical functionality of VoxSim, a system for generating three-dimensional visual simulations of natural language motion expressions. We use a rich formal model of events and their participants to generate simulations that satisfy the minimal constraints entailed by an utterance and its minimal model, relying on real-world semantic knowledge of physical objects and motion events. We outline the technical considerations involved in building such a system and discuss the implementation of these semantic models, as well as VoxSim’s suitability as a platform for examining linguistic and spatial reasoning questions.


Spatial cognition; Spatial reasoning; Spatial language; Event semantics; Simulation semantics; Spatial information representation; Spatial information processing; Underspecification



We would like to thank the reviewers for their perceptive and helpful comments. This work is supported by a contract with the US Defense Advanced Research Projects Agency (DARPA), Contract W911NF-15-C-0238. Approved for Public Release, Distribution Unlimited. The views expressed are those of the authors and do not reflect the official policy or position of the Department of Defense or the U.S. Government. We would like to thank Scott Friedman, David McDonald, Marc Verhagen, and Mark Burstein for their discussion and input on this topic. All errors and mistakes are, of course, the responsibility of the authors.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Department of Computer Science, Brandeis University, Waltham, USA
