Diversifying recommendations on sequences of sets


Diversifying recommendations on a sequence of sets (or sessions) of items captures a variety of applications. Notable examples include recommending online music playlists, where a session is a channel and multiple channels are listened to in sequence, and recommending tasks in crowdsourcing, where a session is a set of tasks and multiple task sessions are completed in sequence. Item diversity can be defined in more than one way, e.g., as genre diversity for music, or as a function of reward in crowdsourcing. A user who engages in multiple sessions may want diversity within sessions, across sessions, or both. Intra-session diversity is set-based, whereas Inter-session diversity is naturally sequence-based. This novel formulation gives rise to four bi-objective problems, each of which minimizes or maximizes Inter and Intra diversity. We prove hardness and develop efficient algorithms with theoretical guarantees. Our experiments with human subjects on two real datasets show that our diversity formulations serve different user needs and yield high user satisfaction. Our large-scale experiments on real and synthetic data empirically demonstrate that our solutions satisfy our theoretical bounds and are highly scalable compared to baselines.
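To make the set-based versus sequence-based distinction concrete, the following is a minimal sketch and not the definitions used in the article. It assumes, purely for illustration, that Intra diversity is the sum of pairwise distances inside a session and that Inter diversity is the average cross-session distance between consecutive sessions. The names `intra_diversity`, `inter_diversity`, `score`, `GENRE`, and `genre_distance` are hypothetical.

```python
from itertools import combinations
from typing import Callable, Dict, List, Tuple

Item = str
Session = List[Item]
Distance = Callable[[Item, Item], float]

# Hypothetical toy catalogue: each song is labelled with a single genre.
GENRE: Dict[Item, str] = {"a": "rock", "b": "rock", "c": "jazz", "d": "jazz"}


def genre_distance(x: Item, y: Item) -> float:
    """Assumed item distance: 0 if two songs share a genre, 1 otherwise."""
    return 0.0 if GENRE[x] == GENRE[y] else 1.0


def intra_diversity(session: Session, dist: Distance) -> float:
    """Set-based (assumed form): sum of pairwise distances within one session."""
    return float(sum(dist(a, b) for a, b in combinations(session, 2)))


def inter_diversity(sessions: List[Session], dist: Distance) -> float:
    """Sequence-based (assumed form): for each pair of consecutive sessions,
    add the average distance between an item of one and an item of the next."""
    total = 0.0
    for prev, nxt in zip(sessions, sessions[1:]):
        cross = sum(dist(a, b) for a in prev for b in nxt)
        total += cross / (len(prev) * len(nxt))
    return total


def score(sessions: List[Session], dist: Distance) -> Tuple[float, float]:
    """Return (total Intra, total Inter) for a sequence of sessions."""
    intra = sum(intra_diversity(s, dist) for s in sessions)
    return float(intra), inter_diversity(sessions, dist)


# The same four songs arranged in two different ways.
grouped = [["a", "b"], ["c", "d"]]  # one genre per session
mixed = [["a", "c"], ["b", "d"]]    # both genres in every session
```

Under these assumptions, `score(grouped, genre_distance)` gives `(0.0, 1.0)` (low Intra, high Inter), while `score(mixed, genre_distance)` gives `(2.0, 0.5)` (high Intra, lower Inter). The two objectives can therefore pull in different directions, and choosing to minimize or maximize each one yields the four bi-objective combinations mentioned above.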


  1. A preliminary version of this work was accepted at The Web Conference 2021 [20].



  1. (2019) Figure eight—data for everyone.

  2. Abbar, S., Amer-Yahia, S., Indyk, P., Mahabadi, S.: Real-time recommendation of diverse related articles. In: 22nd International World Wide Web Conference, WWW’13, Rio de Janeiro, Brazil, May 13–17, 2013, pp. 1–12 (2013)

  3. Aipe, A., Gadiraju, U.: Similarhits: revealing the role of task similarity in microtask crowdsourcing. In: HT, pp. 115–122 (2018)

  4. Alsayasneh, M., Amer-Yahia, S., Gaussier, E., Leroy, V., Pilourdault, J., Borromeo, R.M., Toyama, M., Renders, J.M.: Personalized and diverse task composition in crowdsourcing. IEEE Trans. Knowl. Data Eng. 30(1), 128–141 (2017)

  5. Amer-Yahia, S., Gaussier, E., Leroy, V., Pilourdault, J., Borromeo, R.M., Toyama, M.: Task composition in crowdsourcing. In: 2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA), pp. 194–203. IEEE (2016)

  6. Anagnostopoulos, A., Broder, A.Z., Carmel, D.: Sampling search-engine results. World Wide Web 9(4), 397–429 (2006)

  7. Andreev, K., Racke, H.: Balanced graph partitioning. Theory Comput. Syst. 39(6), 929–939 (2006)

  8. Angel, A., Koudas, N.: Efficient diversity-aware search. In: Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data, pp. 781–792 (2011)

  9. Bertin-Mahieux, T., Ellis, D.P., Whitman, B., Lamere, P.: The million song dataset. In: Proceedings of the 12th International Conference on Music Information Retrieval (ISMIR 2011) (2011)

  10. Bradley, P.S., Bennett, K.P., Demiriz, A.: Constrained k-means clustering. Microsoft Res. 20, 66 (2000)

  11. Carbonell, J.G., Goldstein, J.: The use of mmr, diversity-based reranking for reordering documents and producing summaries. SIGIR 98, 335–336 (1998)

  12. Chandler, D., Kapelner, A.: Breaking monotony with meaning: motivation in crowdsourcing markets (2012). CoRR arXiv:1210.0962

  13. Chen, Z., Li, T.: Addressing diverse user preferences in sql-query-result navigation. In: Proceedings of the ACM SIGMOD International Conference on Management of Data, Beijing, China, June 12–14, 2007, pp. 641–652 (2007)

  14. Cieliebak, M., Eidenbenz, S., Pagourtzis, A., Schlude, K.: On the complexity of variations of equal sum subsets. Nord J. Comput. 14(3), 151–172 (2008)

  15. Cressie, N., Whitford, H.: How to use the two sample t-test. Biometr. J. 28(2), 131–148 (1986)

  16. Dai, P., Rzeszotarski, J.M., Paritosh, P., Chi, E.H.: And now for something completely different: improving crowdsourcing workflows with micro-diversions. In: ACM CSCW, pp. 628–638 (2015)

  17. Difallah, D., Filatova, E., Ipeirotis, P.: Demographics and dynamics of mechanical turk workers. In: Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, pp. 135–143. ACM (2018)

  18. Difallah, D.E., Catasta, M., Demartini, G., Cudré-Mauroux, P.: Scaling-up the crowd: micro-task pricing schemes for worker retention and latency improvement. In: Second AAAI Conference on Human Computation and Crowdsourcing (2014)

  19. El-Arini, K., Veda, G., Shahaf, D., Guestrin, C.: Turning down the noise in the blogosphere. In: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Paris, France, June 28–July 1, 2009, pp. 289–298 (2009)

  20. Esfandiari, M., Borromeo, R.M., Nikookar, S., Sakharkar, P., Amer-Yahia, S., Basu Roy, S.: Multi-session diversity to improve user satisfaction in web applications. Proc. Web Conf. 2021, 1928–1936 (2021)

  21. Fan, J., Lu, M., Ooi, B.C., Tan, W.C., Zhang, M.: A hybrid machine-crowdsourcing system for matching web tables. In: 2014 IEEE 30th International Conference on Data Engineering, pp. 976–987. IEEE (2014)

  22. Fan, J., Li, G., Ooi, B.C., Tan, K.L., Feng, J.: iCrowd: an adaptive crowdsourcing framework. In: SIGMOD, pp. 1015–1030 (2015)

  23. Garey, M.R., Johnson, D.S.: Computers and Intractability: A Guide to the Theory of NP-Completeness. W.H. Freeman (1979)

  24. Gonzalez, T.F.: Clustering to minimize the maximum intercluster distance. Theor. Comput. Sci. 38, 293–306 (1985)

  25. Han, L., Roitero, K., Gadiraju, U., Sarasua, C., Checco, A., Maddalena, E., Demartini, G.: All those wasted hours: on task abandonment in crowdsourcing. In: Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining, WSDM 2019, Melbourne, VIC, Australia, February 11–15, 2019, pp. 321–329 (2019)

  26. Hariri, N., Mobasher, B., Burke, R.: Context-aware music recommendation based on latent topic sequential patterns. In: Proceedings of the Sixth ACM Conference on Recommender Systems, pp. 131–138 (2012)

  27. Hata, K., Krishna, R., Li, F., Bernstein, M.S.: A glimpse far into the future: understanding long-term crowd worker quality. In: Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing (CSCW 2017), Portland, OR, USA, February 25–March 1, 2017, pp. 889–901 (2017)

  28. Ho, C., Vaughan, J.W.: Online task assignment in crowdsourcing markets. In: AAAI (2012)

  29. Ho, C., Jabbari, S., Vaughan, J.W.: Adaptive task assignment for crowdsourced classification. In: ICML, pp. 534–542 (2013)

  30. Jain, A., Sarda, P., Haritsa, J.R.: Providing diversity in k-nearest neighbor query results. In: Pacific-Asia Conference on Knowledge Discovery and Data Mining, pp. 404–413. Springer (2004)

  31. Kyriakidi, M., Stefanidis, K., Ioannidis, Y.: On achieving diversity in recommender systems. In: Proceedings of the ExploreDB’17, pp. 1–6 (2017)

  32. Leiserson, C.E., Rivest, R.L., Cormen, T.H., Stein, C.: Introduction to Algorithms, vol. 6. MIT Press, Cambridge (2001)

  33. Michiels, W., Korst, J., Aarts, E., Van Leeuwen, J.: Performance ratios for the differencing method applied to the balanced number partitioning problem. In: Annual Symposium on Theoretical Aspects of Computer Science, pp. 583–595. Springer (2003)

  34. Nemhauser, G.L., Wolsey, L.A., Fisher, M.L.: An analysis of approximations for maximizing submodular set functions-i. Math. Program. 14(1), 265–294 (1978)

  35. Pilourdault, J., Amer-Yahia, S., Lee, D., Roy, S.: Motivation-aware task assignment in crowdsourcing. In: EDBT (2017)

  36. Punnen, A., Margot, F., Kabadi, S.: Tsp heuristics: domination analysis and complexity. Algorithmica 35(2), 111–127 (2003)

  37. Puthiya Parambath, S.A., Usunier, N., Grandvalet, Y.: A coverage-based approach to recommendation diversity on similarity graph. In: Proceedings of the 10th ACM Conference on Recommender Systems, pp. 15–22 (2016)

  38. Qin, L., Zhu, X.: Promoting diversity in recommendation by entropy regularizer. In: Twenty-Third International Joint Conference on Artificial Intelligence (2013)

  39. Rahman, H., Roy, S.B., Thirumuruganathan, S., Amer-Yahia, S., Das, G.: Optimized group formation for solving collaborative tasks. VLDB J. 28(1), 1–23 (2019)

  40. Rosenkrantz, D.J., Tayi, G.K., Ravi, S.: Facility dispersion problems under capacity and cost constraints. J. Combin. Optim. 4(1), 7–33 (2000)

  41. Rzeszotarski, J.M., Chi, E., Paritosh, P., Dai, P.: Inserting micro-breaks into crowdsourcing workflows. In: First AAAI Conference on Human Computation and Crowdsourcing (2013)

  42. Stoline, M.R.: The status of multiple comparisons: simultaneous estimation of all pairwise comparisons in one-way Anova designs. Am. Stat. 35(3), 134–141 (1981)

  43. Stratigi, M., Nummenmaa, J., Pitoura, E., Stefanidis, K.: Fair sequential group recommendations. In: Proceedings of the 35th Annual ACM Symposium on Applied Computing, pp. 1443–1452 (2020)

  44. SurveyMonkey: Calculating the number of respondents you need (1999).

  45. Vargas, S., Baltrunas, L., Karatzoglou, A., Castells, P.: Coverage, redundancy and size-awareness in genre diversity for recommender systems. In: Proceedings of the 8th ACM Conference on Recommender Systems, pp. 209–216 (2014)

  46. Volkovs, M., Rai, H., Cheng, Z., Wu, G., Lu, Y., Sanner, S.: Two-stage model for automatic playlist continuation at scale. Proc. ACM Recomm. Syst. Chall. 2018, 1–6 (2018)

  47. Wang, D., Deng, S., Xu, G.: Sequence-based context-aware music recommendation. Inf. Retr. J. 21(2–3), 230–252 (2018)

  48. Yu, C., Lakshmanan, L., Amer-Yahia, S.: It takes variety to make a world: diversification in recommender systems. In: Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology, pp. 368–378 (2009)

  49. Zhang, M., Hurley, N.: Avoiding monotony: improving the diversity of recommendation lists. In: Proceedings of the 2008 ACM Conference on Recommender Systems, pp. 123–130 (2008)

  50. Zheng, Y., Wang, J., Li, G., Cheng, R., Feng, J.: QASCA: a quality-aware task assignment system for crowdsourcing applications. In: SIGMOD, pp. 1031–1046 (2015)

  51. Ziegler, C., McNee, S.M., Konstan, J.A., Lausen, G.: Improving recommendation lists through topic diversification. In: Proceedings of the 14th International Conference on World Wide Web, WWW 2005, Chiba, Japan, May 10–14, 2005, pp. 22–32 (2005)


The work of Sepideh Nikookar, Paras Sakharkar, and Senjuti Basu Roy is supported by the NSF CAREER Award #1942913, IIS #2007935, IIS #1814595, PPoSS: Planning #2118458, and by the Office of Naval Research Grants Nos. N000141812838 and N000142112966.

Corresponding author

Correspondence to Senjuti Basu Roy.

About this article

Cite this article

Nikookar, S., Esfandiari, M., Borromeo, R.M. et al. Diversifying recommendations on sequences of sets. The VLDB Journal 32, 283–304 (2023).
