Crowd Knowledge Enhanced Multimodal Conversational Assistant in Travel Domain

  • Lizi Liao
  • Lyndon Kennedy
  • Lynn Wilcox
  • Tat-Seng Chua
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11961)


We present a new approach to building a crowd-knowledge-enhanced multimodal conversational system for travel. It aims to assist users in completing travel-related tasks, such as searching for restaurants or things to do, through multimodal conversation involving both text and images. To achieve this goal, we ground the research in a combination of multimodal understanding and recommendation techniques, which explores the possibility of a more convenient information-seeking paradigm. Specifically, we build the system in a modular manner, where each module is enriched with crowd knowledge from social sites. To the best of our knowledge, this is the first work that attempts to build an intelligent multimodal conversational system for travel, and it takes an important step towards developing human-like assistants for completing daily life tasks. We also point out several current challenges as future directions.


Multimodal assistant, Conversational systems, Travel



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Lizi Liao (1, email author)
  • Lyndon Kennedy (2)
  • Lynn Wilcox (2)
  • Tat-Seng Chua (1)
  1. NGS, National University of Singapore, Singapore, Singapore
  2. FXPAL, Palo Alto, USA
