Construction of Multimodal Dialog System via Knowledge Graph in Travel Domain

  • Conference paper
Web and Big Data (APWeb-WAIM 2023)

Abstract

When traveling to a foreign city, we often need an intelligent agent that can give instant, informative responses to our queries. Such an agent should understand our queries and possess the knowledge to generate helpful responses. Furthermore, if the agent can comprehend image information, it can provide solutions from multiple perspectives. Knowledge graph-based multimodal dialog systems offer a promising approach to meeting these requirements. In this paper, we present a solution for efficiently constructing a multimodal dialog system in the travel domain without large-scale datasets. The system’s main objective is to assist users in completing travel-related tasks, specifically attraction recommendation and route planning, which users frequently request while traveling. We introduce the Multimodal Chinese Tourism Knowledge Graph (MCTKG) and integrate image processing and recommendation technology into a dialog system. Specifically, our approach uses a modular design to construct the dialog system and leverages the rich information in the knowledge graph to enhance the performance of each module. To the best of our knowledge, this is the first multimodal travel dialog system that provides users with personalized travel route recommendations. Our experiments indicate that the dialog system can effectively enhance the user’s travel experience.

Author information

Corresponding author

Correspondence to Lei Hou.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Wan, J. et al. (2024). Construction of Multimodal Dialog System via Knowledge Graph in Travel Domain. In: Song, X., Feng, R., Chen, Y., Li, J., Min, G. (eds) Web and Big Data. APWeb-WAIM 2023. Lecture Notes in Computer Science, vol 14334. Springer, Singapore. https://doi.org/10.1007/978-981-97-2421-5_28

  • DOI: https://doi.org/10.1007/978-981-97-2421-5_28

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-97-2420-8

  • Online ISBN: 978-981-97-2421-5

  • eBook Packages: Computer Science (R0)
