Dynamic Assortment Optimization: Beyond MNL Model

The Elements of Joint Learning and Optimization in Operations Management

Part of the book series: Springer Series in Supply Chain Management (SSSCM, volume 18)


Abstract

Dynamic assortment optimization with demand learning is a fundamental task in data-driven revenue management that combines techniques from operations research, optimization, and machine learning. In this chapter, we give an overview of research on data-driven dynamic assortment optimization when the underlying demand is governed by probabilistic choice models beyond the classical multinomial logit (MNL) choice model, which overcome several of the MNL model's limitations. We also highlight open questions for future research.
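
As background (the notation here is standard and not taken from the chapter itself), the classical MNL model posits that a customer offered an assortment S chooses product i ∈ S with probability

\[ \Pr(i \mid S) \;=\; \frac{\exp(v_i)}{1 + \sum_{j \in S} \exp(v_j)}, \]

where v_i is the mean utility of product i and the constant 1 in the denominator corresponds to the no-purchase option. This functional form implies the independence-of-irrelevant-alternatives property; richer choice models (such as nested logit or mixed logit) relax this restriction, typically at the cost of harder estimation and assortment optimization.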


Notes

  1. In terms of either the worst-case regret or the Bayes regret. We adopt the worst-case regret formulation here, as it is the more widely used of the two.
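
    For concreteness, under one common formalization (the notation below is assumed for illustration rather than quoted from the chapter), let θ denote the unknown choice-model parameter, S_t the assortment offered in period t, R(S, θ) the expected single-period revenue of assortment S, and S*(θ) an optimal assortment. The worst-case regret of a policy over a horizon of T periods is then

    \[ \mathrm{Regret}(T) \;=\; \sup_{\theta} \, \mathbb{E}\left[ \sum_{t=1}^{T} \Big( R\big(S^{*}(\theta), \theta\big) - R\big(S_t, \theta\big) \Big) \right], \]

    while the Bayes regret replaces the supremum over θ with an expectation under a prior distribution on θ.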


Acknowledgements

We would like to thank the editors for their invitation and for their helpful guidance on the writing of this chapter. We would also like to thank Sentao Miao for his suggestions, which greatly helped the writing of Sect. 10.4.

Author information


Corresponding author

Correspondence to Yining Wang.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter


Cite this chapter

Wang, Y., Zhou, Y. (2022). Dynamic Assortment Optimization: Beyond MNL Model. In: Chen, X., Jasin, S., Shi, C. (eds) The Elements of Joint Learning and Optimization in Operations Management. Springer Series in Supply Chain Management, vol 18. Springer, Cham. https://doi.org/10.1007/978-3-031-01926-5_10

