Clustering of conversational bandits with posterior sampling for user preference learning and elicitation

User Modeling and User-Adapted Interaction

Abstract

Conversational recommender systems elicit user preferences through conversational interactions. By introducing conversational key-terms, existing conversational recommenders can substantially reduce the extensive exploration that a traditional interactive recommender requires. However, existing approaches that elicit user preferences via key-terms still have limitations. First, the key-term data of the items must be carefully labeled, which requires considerable human effort. Second, the number of human-labeled key-terms is limited and their granularity is fixed, whereas the preferences elicited during a conversation usually progress from coarse-grained to fine-grained. In this paper, we propose a clustering of conversational bandits algorithm. To avoid human labeling and to automatically learn key-terms of the proper granularity, we cluster the items online and generate meaningful key-terms for them during the conversational interactions. Our algorithm is general and can also be applied to user clustering when feedback from multiple users is available, which leads to more accurate learning and generation of conversational key-terms. Moreover, to learn the user clustering structure more efficiently when that structure is complex, we propose a simple yet effective soft user clustering module that explores the user clustering by sampling posterior user representations. We analyze the regret bound of our learning algorithm. In the empirical evaluations, without using any human-labeled key-terms, our algorithm effectively generates meaningful coarse-to-fine-grained key-terms and performs as well as or better than the state-of-the-art baseline.
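
The idea of exploring the user clustering by sampling posterior user representations can be illustrated with a minimal sketch. It is not the article's algorithm: it assumes a linear reward model with a Gaussian posterior obtained from ridge regression (in the style of linear Thompson sampling), and the names `sample_user_vectors`, `soft_neighbours`, `scale` and `gap` are hypothetical choices made for this illustration only.

```python
import numpy as np

def sample_user_vectors(M, b, scale, rng):
    """Draw one sample per user from N(M^-1 b, scale^2 * M^-1).

    Assumption: each user keeps ridge-regression statistics (M_u, b_u).
    """
    samples = []
    for M_u, b_u in zip(M, b):
        cov = np.linalg.inv(M_u)
        cov = (cov + cov.T) / 2.0  # symmetrize against round-off
        mean = cov @ b_u
        samples.append(rng.multivariate_normal(mean, scale**2 * cov))
    return np.array(samples)

def soft_neighbours(samples, user, gap):
    """Indices of users whose sampled vector lies within `gap` of `user`'s sample."""
    dist = np.linalg.norm(samples - samples[user], axis=1)
    return np.flatnonzero(dist <= gap)

rng = np.random.default_rng(0)
d, n_users = 3, 4
# Synthetic ground truth (assumed): users 0-1 share one preference vector, users 2-3 another.
true_theta = np.array([[1.0, 0.0, 0.0]] * 2 + [[0.0, 1.0, 0.0]] * 2)

M = [np.eye(d) for _ in range(n_users)]    # regularized Gram matrices
b = [np.zeros(d) for _ in range(n_users)]  # reward-weighted feature sums
for u in range(n_users):
    for _ in range(200):
        x = rng.normal(size=d)
        x /= np.linalg.norm(x)                      # unit-norm item feature
        r = x @ true_theta[u] + 0.1 * rng.normal()  # noisy linear reward
        M[u] += np.outer(x, x)
        b[u] += r * x

samples = sample_user_vectors(M, b, scale=0.1, rng=rng)
neighbours = soft_neighbours(samples, user=0, gap=0.5)
print(neighbours)
```

Because the neighbourhood is computed from sampled rather than point-estimated vectors, users with little feedback (wide posteriors) fall in or out of a cluster at random from round to round, which is one simple way to obtain exploration over the clustering itself.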

Data availability

The data that support the findings are available in open datasets with URLs given in Sect. 7.3.1.

Notes

  1. \(T_0\) is merged into the \({\mathcal {O}}\) item.

  2. https://foursquare.com/

  3. http://delicious.com/.

  4. http://www.grouplens.org/.

  5. http://www.lastfm.com/.

  6. https://www.bibsonomy.org/.

  7. http://vi.sualize.us/.

  8. Not to be confused with the ACR defined in Sect. 7.1.2. Here, a user’s ACR is the averaged cumulative reward (ACR) of the recommendations that the agent makes to this user.

References

  • Abbasi-Yadkori, Y., Pál, D., Szepesvári, C.: Improved algorithms for linear stochastic bandits. Adv. Neural Inf. Process. Syst. 24, 2312–2320 (2011)

  • Abe, N., Long, P.M.: Associative reinforcement learning using linear probabilistic concepts. In: Proceedings of the Sixteenth International Conference on Machine Learning, ICML ’99, pp. 3–11. Morgan Kaufmann Publishers Inc., San Francisco (1999)

  • Aggarwal, C.C., et al.: Recommender Systems, vol. 1. Springer (2016)

  • Agrawal, S., Goyal, N.: Analysis of Thompson sampling for the multi-armed bandit problem. In: COLT 2012—The 25th Annual Conference on Learning Theory, June 25–27, 2012, Edinburgh, Scotland, JMLR Proceedings, vol 23. JMLR.org, pp. 39.1–39.26 (2012)

  • Agrawal, S., Goyal, N.: Thompson sampling for contextual bandits with linear payoffs. In: Proceedings of the 30th International Conference on Machine Learning, ICML 2013, Atlanta, GA, USA, 16–21 June 2013, JMLR Workshop and Conference Proceedings, vol 28. JMLR.org, pp. 127–135 (2013)

  • Ahmad, W.U., Bai, X., Lee, S., et al.: Select, extract and generate: neural keyphrase generation with layer-wise coverage attention. In: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, ACL/IJCNLP 2021, vol. 1: Long Papers, Virtual Event, August 1–6, pp. 1389–1404. Association for Computational Linguistics (2021)

  • Ba, L.J., Kiros, J.R., Hinton, G.E.: Layer normalization. CoRR abs/1607.06450 (2016)

  • Bahuleyan, H., Asri, L.E.: Diverse keyphrase generation with neural unlikelihood training. In: Proceedings of the 28th International Conference on Computational Linguistics, COLING 2020, Barcelona, Spain (Online), December 8–13, 2020. International Committee on Computational Linguistics, pp. 5271–5287 (2020)

  • Ban, Y., He, J.: Local clustering in contextual multi-armed bandits. Proc. Web Conf. 2021, 2335–2346 (2021)

  • Bi, Y., Song, L., Yao, M., et al.: Dcdir: a deep cross-domain recommendation system for cold start users in insurance domain. In: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 1661–1664 (2020)

  • Bogdanov, D., Won, M., Tovstogan, P., et al.: The mtg-jamendo dataset for automatic music tagging. In: Machine Learning for Music Discovery Workshop, International Conference on Machine Learning (ICML 2019). PMLR, Proceedings of Machine Learning Research (2019)

  • Chan, H.P., Chen, W., Wang, L., et al.: Neural keyphrase generation via reinforcement learning with adaptive rewards. In: Korhonen, A., Traum, D.R., Màrquez, L. (eds.) Proceedings of the 57th Conference of the Association for Computational Linguistics, ACL 2019, Florence, Italy, July 28–August 2, 2019, vol. 1: Long Papers, pp. 2163–2174. Association for Computational Linguistics (2019)

  • Chapelle, O., Li, L.: An empirical evaluation of Thompson sampling. In: Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011. Proceedings of a Meeting Held 12–14 December 2011, Granada, Spain, pp. 2249–2257 (2011)

  • Chen, H., Dai, X., Cai, H., et al.: Large-scale interactive recommendation with tree-structured policy gradient. In: The Thirty-Third AAAI Conference on Artificial Intelligence, AAAI 2019, The Thirty-First Innovative Applications of Artificial Intelligence Conference, IAAI 2019, The Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019, Honolulu, Hawaii, USA, January 27–February 1, pp. 3312–3320. AAAI Press (2019a)

  • Chen, W., Gao, Y., Zhang, J., et al.: Title-guided encoding for keyphrase generation. In: The Thirty-Third AAAI Conference on Artificial Intelligence, AAAI 2019, The Thirty-First Innovative Applications of Artificial Intelligence Conference, IAAI 2019, The Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019, Honolulu, Hawaii, USA, January 27–February 1, pp. 6268–6275. AAAI Press (2019c)

  • Chen, Q., Lin, J., Zhang, Y., et al.: Towards knowledge-based recommender dialog system. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp. 1803–1813 (2019b)

  • Christakopoulou, K., Beutel, A., Li, R., et al.: Q&R: a two-stage approach toward interactive recommendation. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 139–148 (2018)

  • Christakopoulou, K., Radlinski, F., Hofmann, K.: Towards conversational recommender systems. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 815–824 (2016)

  • Church, K.W., Hanks, P.: Word association norms, mutual information, and lexicography. Comput. Linguist. 16(1), 22–29 (1990)

  • Dani, V., Hayes, T.P., Kakade, S.M.: Stochastic linear optimization under bandit feedback. In: 21st Annual Conference on Learning Theory, COLT 2008, Helsinki, Finland, July 9–12, 2008. Omnipress, pp. 355–366 (2008)

  • Deerwester, S.C., Dumais, S.T., Landauer, T.K., et al.: Indexing by latent semantic analysis. J. Am. Soc. Inf. Sci. 41(6), 391–407 (1990)

  • Deng, Y., Li, Y., Sun, F., et al.: Unified conversational recommendation policy learning via graph-based reinforcement learning. In: SIGIR ’21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, Virtual Event, Canada, July 11–15, 2021, pp. 1431–1441. ACM (2021)

  • Devlin, J., Chang, M.W., Lee, K., et al.: Bert: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, vol. 1 (Long and Short Papers), pp. 4171–4186 (2019)

  • Forgy, E.W.: Cluster analysis of multivariate data: efficiency versus interpretability of classifications. Biometrics 21, 768–769 (1965)

  • Fu, Z., Xian, Y., Zhang, Y., et al.: Tutorial on conversational recommendation systems. In: Fourteenth ACM Conference on Recommender Systems, pp. 751–753 (2020a)

  • Fu, Z., Xian, Y., Zhu, Y., et al.: COOKIE: a dataset for conversational recommendation over knowledge graphs in e-commerce. CoRR abs/2008.09237 (2020b)

  • Gao, C., Lei, W., He, X., et al.: Advances and challenges in conversational recommender systems: a survey. AI Open 2, 100–126 (2021)

  • Gentile, C., Li, S., Kar, P., et al.: On context-dependent clustering of bandits. In: International Conference on Machine Learning, PMLR, pp. 1253–1262 (2017)

  • Gentile, C., Li, S., Zappella, G.: Online clustering of bandits. In: International Conference on Machine Learning, pp. 757–765 (2014)

  • Godin, F., Slavkovikj, V., Neve, W.D., et al.: Using topic models for twitter hashtag recommendation. In: 22nd International World Wide Web Conference, WWW ’13, Rio de Janeiro, Brazil, May 13–17, 2013, Companion Volume, pp. 593–596. International World Wide Web Conferences Steering Committee/ACM (2013)

  • Grineva, M.P., Grinev, M.N., Lizorkin, D.: Extracting key terms from noisy and multitheme documents. In: Proceedings of the 18th International Conference on World Wide Web, WWW 2009, Madrid, Spain, April 20–24, 2009, pp. 661–670. ACM (2009)

  • Guo, D., Xu, J., Zhang, J., et al.: User relationship strength modeling for friend recommendation on instagram. Neurocomputing 239, 9–18 (2017)

  • Hai, Z., Cong, G., Chang, K., et al.: Coarse-to-fine review selection via supervised joint aspect and sentiment model. In: The 37th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR ’14, Gold Coast, QLD, Australia–July 06–11, 2014, pp. 617–626. ACM (2014)

  • Hasan, K.S., Ng, V.: Automatic keyphrase extraction: a survey of the state of the art. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, ACL 2014, June 22–27, 2014, Baltimore, MD, USA, vol. 1: Long Papers, pp. 1262–1273. The Association for Computer Linguistics (2014)

  • Heitmann, B., Hayes, C.: Semstim: exploiting knowledge graphs for cross-domain recommendation. In: 2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW), pp. 999–1006 (2016)

  • Hu, Y., Da, Q., Zeng, A., et al.: Reinforcement learning to rank in e-commerce search engine: formalization, analysis, and application. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD 2018, London, UK, August 19–23, 2018, pp. 368–377. ACM (2018)

  • Jannach, D., Jugovac, M.: Measuring the business value of recommender systems. ACM Trans. Manag. Inf. Syst. 10(4), 1–23 (2019)

  • Kim, S.N., Baldwin, T., Kan, M.: Extracting domain-specific words—a statistical approach. In: Proceedings of the Australasian Language Technology Association Workshop, ALTA 2009, Sydney, Australia, December 3–4, 2009, pp. 94–98. ACL (2009)

  • Lai, T.L., Robbins, H.: Asymptotically efficient adaptive allocation rules. Adv. Appl. Math. 6(1), 4–22 (1985)

  • Lattimore, T., Szepesvári, C.: Bandit Algorithms. Cambridge University Press (2020)

  • Lei, W., He, X., de Rijke, M., et al.: Conversational recommendation: formulation, methods, and evaluation. In: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2020, Virtual Event, China, July 25–30, 2020, pp. 2425–2428. ACM (2020a)

  • Lei, W., Zhang, G., He, X., et al.: Interactive path reasoning on graph for conversational recommendation. In: KDD ’20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Virtual Event, CA, USA, August 23–27, 2020, pp. 2073–2083. ACM (2020b)

  • Lei, W., Zhang, G., He, X., et al.: Interactive path reasoning on graph for conversational recommendation. In: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 2073–2083 (2020c)

  • Li, S., Chen, W., Li, S., et al.: Improved algorithm on online clustering of bandits. In: Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, IJCAI-19. International Joint Conferences on Artificial Intelligence Organization, pp. 2923–2929 (2019)

  • Li, L., Chu, W., Langford, J., et al.: A contextual-bandit approach to personalized news article recommendation. In: Proceedings of the 19th International Conference on World Wide Web, pp. 661–670 (2010)

  • Li, R., Kahou, S.E., Schulz, H., et al.: Towards deep conversational recommendations. In: Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, NeurIPS 2018, December 3–8, 2018, Montréal, Canada, pp. 9748–9758 (2018)

  • Li, S., Karatzoglou, A., Gentile, C.: Collaborative filtering bandits. In: Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 539–548 (2016)

  • Li, S., Zhang, S.: Online clustering of contextual cascading bandits. In: Proceedings of the AAAI Conference on Artificial Intelligence (2018)

  • Li, S., Lei, W., Wu, Q., et al.: Seamlessly unifying attributes and items: conversational recommendation for cold-start users. ACM Trans. Inf. Syst. 39, 4 (2021)

  • Liu, Z., Huang, W., Zheng, Y., et al.: Automatic keyphrase extraction via topic decomposition. In: Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, EMNLP 2010, 9–11 October 2010, MIT Stata Center, Massachusetts, USA, A meeting of SIGDAT, a Special Interest Group of the ACL, pp. 366–376. ACL (2010)

  • Liu, Z., Li, P., Zheng, Y., et al.: Clustering to find exemplar terms for keyphrase extraction. In: Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, EMNLP 2009, 6–7 August 2009, Singapore, A meeting of SIGDAT, a Special Interest Group of the ACL, pp. 257–266. ACL (2009)

  • Liu, Z., Winata, G.I., Xu, P., et al.: Coach: A coarse-to-fine approach for cross-domain slot filling. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020, Online, July 5–10, 2020, pp. 19–25. Association for Computational Linguistics (2020)

  • Louvan, S., Magnini, B.: Recent neural methods on slot filling and intent classification for task-oriented dialogue systems: A survey. In: Proceedings of the 28th International Conference on Computational Linguistics, COLING 2020, Barcelona, Spain (Online), December 8–13, 2020, pp. 480–496. International Committee on Computational Linguistics (2020)

  • Luan, Y., Ostendorf, M., Hajishirzi, H.: Scientific information extraction with semi-supervised neural tagging. In: Palmer, M., Hwa, R., Riedel, S. (eds.) Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, EMNLP 2017, Copenhagen, Denmark, September 9–11, 2017, pp. 2641–2651. Association for Computational Linguistics (2017)

  • Ma, J., Feng, C., Shi, G., et al.: Temporal enhanced sentence-level attention model for hashtag recommendation. CAAI Trans. Intell. Technol. 3(2), 95–100 (2018)

  • Maas, A.L., Daly, R.E., Pham, P.T., et al.: Learning word vectors for sentiment analysis. In: The 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference, 19–24 June, 2011, Portland, Oregon, USA, pp. 142–150. The Association for Computer Linguistics (2011)

  • MacQueen, J., et al.: Some methods for classification and analysis of multivariate observations. In: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Oakland, CA, USA, pp. 281–297 (1967)

  • Mahadik, K., Wu, Q., Li, S., et al.: Fast distributed bandits for online recommendation systems. In: Proceedings of the 34th ACM International Conference on Supercomputing, pp. 1–13 (2020)

  • Manning, C.D., Schütze, H.: Foundations of Statistical Natural Language Processing. MIT Press (2001)

  • Meng, R., Zhao, S., Han, S., et al.: Deep keyphrase generation. In: Barzilay, R., Kan, M. (eds.) Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017, Vancouver, Canada, July 30–August 4, Volume 1: Long Papers, pp. 582–592. Association for Computational Linguistics (2017)

  • Merrouni, Z.A., Frikh, B., Ouhbi, B.: Automatic keyphrase extraction: an overview of the state of the art. In: 4th IEEE International Colloquium on Information Science and Technology, CiSt 2016, Tangier, Morocco, October 24–26, 2016, pp. 306–313. IEEE (2016)

  • Merrouni, Z.A., Frikh, B., Ouhbi, B.: Automatic keyphrase extraction: a survey and trends. J. Intell. Inf. Syst. 54(2), 391–424 (2020)

  • Mihalcea, R., Tarau, P.: Textrank: Bringing order into text. In: Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, EMNLP 2004, A meeting of SIGDAT, a Special Interest Group of the ACL, Held in Conjunction with ACL 2004, 25–26 July 2004, Barcelona, Spain, pp. 404–411. ACL (2004)

  • Mou, L., Song, Y., Yan, R., et al.: Sequence to backward and forward sequences: a content-introducing approach to generative short-text conversation. In: COLING 2016, 26th International Conference on Computational Linguistics, Proceedings of the Conference: Technical Papers, December 11–16, 2016, Osaka, Japan, pp. 3349–3358. ACL (2016)

  • Nguyen, T.T., Lauw, H.W.: Dynamic clustering of contextual multi-armed bandits. In: Proceedings of the 23rd ACM International Conference on Information and Knowledge Management, pp. 1959–1962 (2014)

  • Niu, Y., Xie, R., Liu, Z., et al.: Improved word representation learning with sememes. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017, Vancouver, Canada, July 30–August 4, Volume 1: Long Papers, pp. 2049–2058. Association for Computational Linguistics (2017)

  • Norouzi, M., Mikolov, T., Bengio, S., et al.: Zero-shot learning by convex combination of semantic embeddings. In: 2nd International Conference on Learning Representations, ICLR 2014, Banff, AB, Canada, April 14–16, 2014, Conference Track Proceedings (2014)

  • Paulus, R., Xiong, C., Socher, R.: A deep reinforced model for abstractive summarization. In: 6th International Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada, April 30–May 3, 2018, Conference Track Proceedings. OpenReview.net (2018)

  • Priyogi, B.: Preference elicitation strategy for conversational recommender system. In: Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining, pp. 824–825 (2019)

  • Ricci, F., Rokach, L., Shapira, B., et al.: Recommender Systems Handbook (2015)

  • Romano, S., Vinh, N.X., Bailey, J., et al.: Adjusting for chance clustering comparison measures. J. Mach. Learn. Res. 17(1), 4635–4666 (2016)

  • Salton, G.: The SMART Retrieval System-Experiments in Automatic Document Processing. Prentice-Hall Inc, New York (1971)

  • See, A., Liu, P.J., Manning, C.D.: Get to the point: summarization with pointer-generator networks. In: Barzilay, R., Kan, M. (eds.) Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017, Vancouver, Canada, July 30–August 4, Volume 1: Long Papers, pp. 1073–1083. Association for Computational Linguistics (2017)

  • Shi, W., Zheng, W., Yu, J.X., et al.: Keyphrase extraction using knowledge graphs. Data Sci. Eng. 2(4), 275–288 (2017)

  • Song, K., Huang, Q., Zhang, F., et al.: Coarse-to-fine: a dual-view attention network for click-through rate prediction. Knowl. Based Syst. 216, 106767 (2021)

  • Subramanian, S., Wang, T., Yuan, X., et al.: Neural models for key phrase extraction and question generation. In: Choi, E., Seo, M., Chen, D., et al. (eds.) Proceedings of the Workshop on Machine Reading for Question Answering@ACL 2018, Melbourne, Australia, July 19, 2018, pp. 78–88. Association for Computational Linguistics (2018)

  • Sun, Y., Zhang, Y.: Conversational recommender system. In: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, SIGIR 2018, Ann Arbor, MI, USA, July 08–12, 2018, pp. 235–244. ACM (2018)

  • Swaminathan, A., Zhang, H., Mahata, D., et al.: A preliminary exploration of gans for keyphrase generation. In: Webber, B., Cohn, T., He, Y., et al. (eds.) Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, EMNLP 2020, Online, November 16–20, 2020, pp. 8021–8030. Association for Computational Linguistics (2020)

  • Thompson, W.R.: On the likelihood that one unknown probability exceeds another in view of the evidence of two samples. Biometrika 25(3–4), 285–294 (1933)

  • Tosi, M.D.L., dos Reis, J.C.: Keyphrase extraction from single textual documents based on semantically defined background knowledge and co-occurrence graphs. Int. J. Metadata Semant. Ontol. 15(2), 121–132 (2021)

  • Vinh, N.X., Epps, J., Bailey, J.: Information theoretic measures for clusterings comparison: variants, properties, normalization and correction for chance. J. Mach. Learn. Res. 11, 2837–2854 (2010)

  • Wang, X., He, X., Cao, Y., et al.: KGAT: Knowledge graph attention network for recommendation. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD 2019, Anchorage, AK, USA, August 4–8, 2019, pp. 950–958. ACM (2019b)

  • Wang, H., Wu, Q., Wang, H.: Factorization bandits for interactive recommendation. In: Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, February 4–9, 2017, San Francisco, California, USA, pp. 2695–2702. AAAI Press (2017)

  • Wang, C., Zhou, T., Chen, C., et al.: CAMO: A collaborative ranking method for content based recommendation. In: The Thirty-Third AAAI Conference on Artificial Intelligence, AAAI 2019, The Thirty-First Innovative Applications of Artificial Intelligence Conference, IAAI 2019, The Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019, Honolulu, Hawaii, USA, January 27–February 1, 2019, pp. 5224–5231. AAAI Press (2019a)

  • Weld, H., Huang, X., Long, S., et al.: A survey of joint intent detection and slot filling models in natural language understanding. ACM Comput. Surv. (2021)

  • Wu, Q., Wang, H., Gu, Q., et al.: Contextual bandits in a collaborative environment. In: Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 529–538 (2016)

  • Wu, J., Zhao, C., Yu, T., et al.: Clustering of conversational bandits for user preference learning and elicitation. In: Proceedings of the 30th ACM International Conference on Information & Knowledge Management, pp. 2129–2139 (2021)

  • Xian, Y., Fu, Z., Zhao, H., et al.: Cafe: coarse-to-fine neural symbolic reasoning for explainable recommendation. In: Proceedings of the 29th ACM International Conference on Information & Knowledge Management, pp. 1645–1654 (2020)

  • Xie, Z., Yu, T., Zhao, C., et al.: Comparison-based conversational recommender system with relative bandit feedback. In: Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR ’21, pp. 1400–1409. Association for Computing Machinery, New York (2021)

  • Xu, K., Yang, J., Xu, J., et al.: Adapting user preference to online feedback in multi-round conversational recommendation. In: Proceedings of the 14th ACM International Conference on Web Search and Data Mining, pp. 364–372 (2021)

  • Yang, D., Xiao, Y., Song, Y., et al.: Tag propagation based recommendation across diverse social media. In: 23rd International World Wide Web Conference, WWW ’14, Seoul, Republic of Korea, April 7–11, 2014, Companion Volume, pp. 407–408. ACM (2014)

  • Ye, H., Wang, L.: Semi-supervised learning for neural keyphrase generation. In: Riloff, E., Chiang, D., Hockenmaier, J., et al. (eds.) Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31–November 4, 2018, pp. 4142–4153. Association for Computational Linguistics (2018)

  • Yisong, M.: Advanced method towards conversational recommendation. PhD thesis, National University of Singapore (2020)

  • Yue, Y., Hong, S.A., Guestrin, C.: Hierarchical exploration for accelerating contextual bandits. In: Proceedings of the 29th International Conference on Machine Learning, pp. 979–986 (2012)

  • Zhang, Y., Fang, Y., Xiao, W.: Deep keyphrase generation with a convolutional sequence to sequence model. In: 4th International Conference on Systems and Informatics, ICSAI 2017, Hangzhou, China, November 11–13, 2017, pp. 1477–1485. IEEE (2017)

  • Zhang, Q., Wang, Y., Gong, Y., et al.: Keyphrase extraction using deep recurrent neural networks on twitter. In: Su, J., Carreras, X., Duh, K. (eds.) Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, EMNLP 2016, Austin, Texas, USA, November 1–4, 2016, pp. 836–845. The Association for Computational Linguistics (2016)

  • Zhang, X., Xie, H., Li, H., et al.: Conversational contextual bandit: algorithm and application. In: Proceedings of The Web Conference 2020, pp. 662–672 (2020)

  • Zhao, X., Xia, L., Zhang, L., et al.: Deep reinforcement learning for page-wise recommendations. In: Proceedings of the 12th ACM Conference on Recommender Systems, RecSys 2018, Vancouver, BC, Canada, October 2–7, 2018, pp. 95–103. ACM (2018)

  • Zhao, C., Yu, T., Xie, Z., et al.: Knowledge-aware conversational preference elicitation with bandit feedback. In: WWW ’22: The ACM Web Conference 2022, Virtual Event, Lyon, France, April 25–29, 2022, pp. 483–492. ACM (2022)

  • Zhou, S., Dai, X., Chen, H., et al.: Interactive recommender system via knowledge graph-enhanced reinforcement learning. In: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2020, Virtual Event, China, July 25–30, 2020, pp. 179–188. ACM (2020b)

  • Zhou, C., Jin, Y., Wang, X., et al.: Conversational music recommendation based on bandits. In: 2020 IEEE International Conference on Knowledge Graph (ICKG), pp. 41–48, IEEE (2020a)

  • Zhu, H., Chang, D., Xu, Z., et al.: Joint optimization of tree-based index and deep model for recommender systems. In: Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, December 8–14, 2019, Vancouver, BC, Canada, pp. 3973–3982 (2019)

  • Zhu, H., Li, X., Zhang, P., et al.: Learning tree-based deep model for recommender systems. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1079–1088 (2018)

  • Zimmert, J., Luo, H., Wei, C.: Beating stochastic and adversarial semi-bandits optimally and simultaneously. In: Proceedings of the 36th International Conference on Machine Learning, ICML 2019, 9–15 June 2019, Long Beach, California, USA, Proceedings of Machine Learning Research, vol. 97. PMLR, pp. 7683–7692 (2019)

Acknowledgements

The corresponding author, Shuai Li, is supported by the National Natural Science Foundation of China (62006151, 62076161) and the Shanghai Sailing Program.

Author information

Contributions

SL and TY conceived and designed the algorithms. QL, CZ and JW performed the experiments under the supervision of SL and TY. All authors jointly wrote and reviewed the manuscript.

Corresponding author

Correspondence to Shuai Li.

Ethics declarations

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This paper is an extended version of our earlier work (Wu et al. 2021) which appeared in the proceedings of CIKM 2021.

Appendices

Values of hyper-parameters on synthetic datasets

Table 2 Values of hyper-parameters on the synthetic dataset generated with \(\sigma _k=1.00\)
Table 3 Values of hyper-parameters on the synthetic dataset generated with \(\sigma _k=10.00\)
Table 4 Values of hyper-parameters on the synthetic dataset generated with \(\sigma _u=0.01\)
Table 5 Values of hyper-parameters on the synthetic dataset generated with \(\sigma _u=0.1\)

Values of hyper-parameters on the real-world datasets

Table 6 Values of hyper-parameters on the FourSquare dataset
Table 7 Values of hyper-parameters on the Delicious dataset
Table 8 Values of hyper-parameters on the MovieLens 25M dataset
Table 9 Values of hyper-parameters on the LastFM dataset
Table 10 Values of hyper-parameters on the BibSonomy dataset
Table 11 Values of hyper-parameters on the visualizeUs dataset

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Li, Q., Zhao, C., Yu, T. et al. Clustering of conversational bandits with posterior sampling for user preference learning and elicitation. User Model User-Adap Inter 33, 1065–1112 (2023). https://doi.org/10.1007/s11257-023-09358-x

  • DOI: https://doi.org/10.1007/s11257-023-09358-x
