Abstract
Dynamic assortment optimization with demand learning is a fundamental task in data-driven revenue management that requires a combination of techniques from operations research, optimization, and machine learning. In this chapter, we give an overview of research on data-driven dynamic assortment optimization when the underlying demand is governed by probabilistic choice models beyond the classical multinomial logit (MNL) choice model, thereby overcoming several limitations and drawbacks of the MNL model. We conclude with open questions for future research.
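To make the setting concrete, the classical MNL model mentioned above assigns each product a preference weight and normalizes over the offered assortment plus a no-purchase option. The following is a minimal sketch of the standard MNL choice probabilities and the resulting expected per-customer revenue; the weights `v` and prices `r` below are illustrative values, not data from the chapter.

```python
def mnl_choice_probabilities(assortment, v):
    """Choice probabilities under the multinomial logit (MNL) model.

    v[i] is the preference weight of product i (the exponential of its
    mean utility); the no-purchase option has weight 1 by convention.
    """
    denom = 1.0 + sum(v[i] for i in assortment)
    probs = {i: v[i] / denom for i in assortment}
    probs[None] = 1.0 / denom  # no-purchase probability
    return probs

def expected_revenue(assortment, v, r):
    """Expected per-customer revenue of offering `assortment`."""
    probs = mnl_choice_probabilities(assortment, v)
    return sum(r[i] * probs[i] for i in assortment)

# Illustrative instance: three products with weights v and prices r.
v = {1: 1.0, 2: 0.5, 3: 0.25}
r = {1: 4.0, 2: 6.0, 3: 8.0}
print(expected_revenue({1, 2}, v, r))  # 4*(1/2.5) + 6*(0.5/2.5) = 2.8
```

The dynamic problem studied in this literature repeatedly offers assortments, observes the realized choices, and updates estimates of the unknown weights; the models surveyed in the chapter replace this MNL structure with richer choice models.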
Notes
1. In terms of either the worst-case regret or the Bayes regret. We adopt the worst-case regret formulation here, as it is the more popular of the two.
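For reference, the two regret notions mentioned in the note above are standardly defined as follows; this is a sketch in generic notation (not the chapter's own), where \(\theta\) is the unknown demand parameter, \(S_t\) is the assortment offered at time \(t\), \(R(S,\theta)\) is its expected revenue, and \(R^\ast(\theta)=\max_S R(S,\theta)\).

```latex
% Worst-case (minimax) regret of a policy \pi over horizon T:
\mathrm{Regret}^{\mathrm{wc}}_T(\pi) \;=\; \sup_{\theta \in \Theta}\;
  \mathbb{E}_\theta\!\left[\sum_{t=1}^{T} \bigl(R^\ast(\theta) - R(S_t,\theta)\bigr)\right].

% Bayes regret additionally averages over a prior Q on \theta:
\mathrm{Regret}^{\mathrm{Bayes}}_T(\pi) \;=\;
  \mathbb{E}_{\theta \sim Q}\,\mathbb{E}_\theta\!\left[\sum_{t=1}^{T} \bigl(R^\ast(\theta) - R(S_t,\theta)\bigr)\right].
```

The worst-case formulation guards against the least favorable parameter, while the Bayes formulation is natural for posterior-sampling (Thompson sampling) analyses.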
References
Abbasi-Yadkori, Y., Pál, D., & Szepesvári, C. (2011). Improved algorithms for linear stochastic bandits. In Proceedings of the 25th Conference on Advances in Neural Information Processing Systems (NeurIPS) (pp. 2312–2320).
Agrawal, S., Avadhanula, V., Goyal, V., & Zeevi, A. (2017). Thompson sampling for the MNL-bandit. In Proceedings of the 30th Conference on Learning Theory (COLT) (pp. 76–78). PMLR.
Agrawal, S., Avadhanula, V., Goyal, V., & Zeevi, A. (2019). MNL-bandit: A dynamic learning approach to assortment selection. Operations Research, 67(5), 1453–1485.
Andrieu, C., De Freitas, N., Doucet, A., & Jordan, M. I. (2003). An introduction to MCMC for machine learning. Machine Learning, 50(1), 5–43.
Auer, P. (2002). Using confidence bounds for exploitation-exploration trade-offs. Journal of Machine Learning Research, 3(Nov), 397–422.
Auer, P., Cesa-Bianchi, N., Freund, Y., & Schapire, R. E. (1995). Gambling in a rigged casino: The adversarial multi-armed bandit problem. In Proceedings of IEEE 36th Annual Foundations of Computer Science (FOCS) (pp. 322–331). New York: IEEE.
Bubeck, S., & Cesa-Bianchi, N. (2012). Regret analysis of stochastic and nonstochastic multi-armed bandit problems. Foundations and Trends in Machine Learning, 5(1), 1–122.
Chen, W., Wang, Y., & Yuan, Y. (2013). Combinatorial multi-armed bandit: General framework and applications. In Proceedings of the 30th International Conference on Machine Learning (ICML) (pp. 151–159).
Chen, W., Hu, W., Li, F., Li, J., Liu, Y., & Lu, P. (2016). Combinatorial multi-armed bandit with general reward functions. In Proceedings of the 30th Conference on Advances in Neural Information Processing Systems (NeurIPS).
Chen, X., & Wang, Y. (2018). A note on a tight lower bound for capacitated MNL-bandit assortment selection models. Operations Research Letters, 46(5), 534–537.
Chen, X., Wang, Y., & Zhou, Y. (2018). An optimal policy for dynamic assortment planning under uncapacitated multinomial logit models. Mathematics of Operations Research (in press). arXiv preprint arXiv:1805.04785.
Chen, X., Wang, Y., & Zhou, Y. (2020). Dynamic assortment optimization with changing contextual information. Journal of Machine Learning Research, 21(216), 1–44.
Chen, X., Shi, C., Wang, Y., & Zhou, Y. (2021). Dynamic assortment planning under nested logit models. Production and Operations Management, 30(1), 85–102.
Cheung, W. C., & Simchi-Levi, D. (2017). Thompson sampling for online personalized assortment optimization problems with multinomial logit choice models. Available at SSRN 3075658.
Chu, W., Li, L., Reyzin, L., & Schapire, R. (2011). Contextual bandits with linear payoff functions. In Proceedings of the 14th International Conference on Artificial Intelligence and Statistics (AISTATS) (pp. 208–214). JMLR Workshop and Conference Proceedings.
Daganzo, C. (2014). Multinomial probit: The theory and its application to demand forecasting. Amsterdam: Elsevier.
Davis, J., Gallego, G., & Topaloglu, H. (2013). Assortment planning under the multinomial logit model with totally unimodular constraint structures. Work in Progress.
Davis, J. M., Gallego, G., & Topaloglu, H. (2014). Assortment optimization under variants of the nested logit model. Operations Research, 62(2), 250–273.
Feldman, J. B., & Topaloglu, H. (2017). Revenue management under the Markov chain choice model. Operations Research, 65(5), 1322–1342.
Filippi, S., Cappe, O., Garivier, A., & Szepesvári, C. (2010). Parametric bandits: The generalized linear case. In Proceedings of the 24th Conference on Advances in Neural Information Processing Systems (NeurIPS) (pp. 586–594).
Jagabathula, S., Mitrofanov, D., & Vulcano, G. (2020a). Personalized retail promotions through a DAG-based representation of customer preferences. Available at SSRN 3258700.
Jagabathula, S., Subramanian, L., & Venkataraman, A. (2020b). A conditional gradient approach for nonparametric estimation of mixing distributions. Management Science, 66(8), 3635–3656.
Lai, T. L., & Robbins, H. (1985). Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics, 6(1), 4–22.
Li, G., Rusmevichientong, P., & Topaloglu, H. (2015). The d-level nested logit model: Assortment and price optimization problems. Operations Research, 63(2), 325–342.
Li, L., Lu, Y., & Zhou, D. (2017). Provably optimal algorithms for generalized linear contextual bandits. In Proceedings of the 34th International Conference on Machine Learning (ICML) (pp. 2071–2080). PMLR.
McFadden, D. (1973). Conditional logit analysis of qualitative choice behavior. In Frontiers in Econometrics (pp. 105–142).
McFadden, D., & Train, K. (2000). Mixed MNL models for discrete response. Journal of Applied Econometrics, 15(5), 447–470.
Megiddo, N. (1978). Combinatorial optimization with rational objective functions. In Proceedings of the Annual ACM Symposium on Theory of Computing (STOC).
Miao, S., & Chao, X. (2019). Fast algorithms for online personalized assortment optimization in a big data regime. Available at SSRN 3432574.
Oh, M.-h., & Iyengar, G. (2019). Thompson sampling for multinomial logit contextual bandits. In Proceedings of the 33rd Conference on Advances in Neural Information Processing Systems (NeurIPS) (pp. 3145–3155).
Rusmevichientong, P., & Tsitsiklis, J. N. (2010). Linearly parameterized bandits. Mathematics of Operations Research, 35(2), 395–411.
Rusmevichientong, P., Shen, Z. J. M., & Shmoys, D. B. (2010). Dynamic assortment optimization with a multinomial logit choice model and capacity constraint. Operations Research, 58(6), 1666–1680.
Russo, D., & Van Roy, B. (2014). Learning to optimize via posterior sampling. Mathematics of Operations Research, 39(4), 1221–1243.
Sauré, D., & Zeevi, A. (2013). Optimal dynamic assortment planning with demand learning. Manufacturing and Service Operations Management, 15(3), 387–404.
Thompson, W. R. (1933). On the likelihood that one unknown probability exceeds another in view of the evidence of two samples. Biometrika, 25(3/4), 285–294.
Train, K. E. (2008). EM algorithms for nonparametric estimation of mixing distributions. Journal of Choice Modelling, 1(1), 40–69.
Train, K. E. (2009). Discrete choice methods with simulation. Cambridge: Cambridge University Press.
Acknowledgements
We would like to thank the editors for their invitation and helpful guidelines on the writing of this chapter. We would also like to thank Sentao Miao for his suggestions that greatly helped the writing of Sect. 10.4.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Wang, Y., Zhou, Y. (2022). Dynamic Assortment Optimization: Beyond MNL Model. In: Chen, X., Jasin, S., Shi, C. (eds) The Elements of Joint Learning and Optimization in Operations Management. Springer Series in Supply Chain Management, vol 18. Springer, Cham. https://doi.org/10.1007/978-3-031-01926-5_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-01925-8
Online ISBN: 978-3-031-01926-5
eBook Packages: Business and Management (R0)