Abstract
Choice-based demand models serve as building blocks for making demand predictions, which are key inputs to critical operational decisions, such as what prices to charge for different products or which subset of products to offer to the customers. Traditionally, parametric choice models (such as the multinomial logit model) have been employed for tractability reasons, but with the increasing ability of firms to collect large volumes of sales transaction and product availability data, nonparametric choice models have been gaining in popularity. We review recent advances in nonparametric estimation of two versatile choice model families.
Notes
- 1.
However, they did not introduce this nomenclature.
- 2.
- 3.
This is true as long as \(\left\langle \nabla h(\boldsymbol{x}^{(k-1)}), \boldsymbol{v}^{(k)} - \boldsymbol{x}^{(k-1)} \right\rangle < 0\). If \(\left\langle \nabla h(\boldsymbol{x}^{(k-1)}), \boldsymbol{v}^{(k)} - \boldsymbol{x}^{(k-1)} \right\rangle \geq 0\), then the convexity of \(h(\cdot)\) implies that \(h(\boldsymbol{x}) \geq h(\boldsymbol{x}^{(k-1)})\) for all \(\boldsymbol{x} \in \mathcal{D}\) and, consequently, \(\boldsymbol{x}^{(k-1)}\) is an optimal solution.
- 4.
We abuse notation and denote \(\alpha_{f(\sigma)}\) as \(\alpha_{\sigma}\) for any \(\sigma \in \mathcal{P}\) in the remainder of this section.
- 5.
Mišić (2016) also proposed a similar formulation for estimating the rank-based choice model with an L1-norm loss function using a column generation approach.
- 6.
The remaining products in each ranking can be chosen arbitrarily.
- 7.
In this case, the feature vector for other products would typically include a constant feature 1 to allow for general no-purchase market shares.
- 8.
Our development here is closely related to that in JSV but with slight differences.
- 9.
Technically, the distribution is modeled over the parameter vector β as opposed to its “type” representation f(β).
- 10.
This is equivalent to minimizing the KL-divergence loss function and is the standard choice when estimating the mixed logit model.
- 11.
The rank-based model can allow for the number of products in a ranking to be strictly smaller than the size of the product universe, in which case the customer selects the no-purchase option if none of the products in the ranking is part of the offer set.
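The behavior described in note 11 can be illustrated with a minimal sketch. It assumes products are labeled by positive integers, that `0` stands for the no-purchase option, and that the model is given as a dictionary mapping each (possibly partial) ranking to its probability weight. The names `NO_PURCHASE`, `choose`, and `choice_probabilities` are hypothetical and are not taken from the chapter.

```python
from typing import Dict, FrozenSet, Sequence, Tuple

NO_PURCHASE = 0  # hypothetical label for the no-purchase option


def choose(ranking: Sequence[int], offer_set: FrozenSet[int]) -> int:
    """Return the most preferred offered product, or NO_PURCHASE if none is offered."""
    for product in ranking:  # products are listed from most to least preferred
        if product in offer_set:
            return product
    # The ranking may be shorter than the product universe; if none of its
    # products is offered, the customer leaves without purchasing.
    return NO_PURCHASE


def choice_probabilities(
    weights: Dict[Tuple[int, ...], float], offer_set: FrozenSet[int]
) -> Dict[int, float]:
    """Aggregate the ranking weights into choice probabilities for one offer set."""
    probs: Dict[int, float] = {}
    for ranking, weight in weights.items():
        chosen = choose(ranking, offer_set)
        probs[chosen] = probs.get(chosen, 0.0) + weight
    return probs


# Illustrative weights over three rankings; the second ranking is partial.
example_weights = {(1, 2): 0.5, (3,): 0.25, (2, 3, 1): 0.25}
example_probs = choice_probabilities(example_weights, frozenset({2}))
```

With only product 2 offered, the rankings `(1, 2)` and `(2, 3, 1)` both select product 2, while the partial ranking `(3,)` contains no offered product and therefore contributes its weight to the no-purchase option.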
References
Abdallah, T., & Vulcano, G. (2020). Demand estimation under the multinomial logit model from sales transaction data. Manufacturing & Service Operations Management, 23, 1005–1331.
Aouad, A., Elmachtoub, A. N., Ferreira, K. J., & McNellis, R. (2020a). Market segmentation trees. arXiv:1906.01174.
Aouad, A., Farias, V., & Levi, R. (2020b). Assortment optimization under consider-then-choose choice models. Management Science, 67, 3321–3984.
Barberá, S., & Pattanaik, P. K. (1986). Falmagne and the rationalizability of stochastic choices in terms of random orderings. Econometrica: Journal of the Econometric Society, 54, 707–715.
Ben-Akiva, M. E., & Lerman, S. R. (1985). Discrete choice analysis: Theory and application to travel demand (vol. 9). Cambridge: MIT Press.
Berbeglia, G. (2018). The generalized stochastic preference choice model. Available at SSRN 3136227.
Bertsimas, D., & Mišić, V. V. (2019). Exact first-choice product line optimization. Operations Research, 67(3), 651–670.
Bhat, C. R. (1997). An endogenous segmentation mode choice model with an application to intercity travel. Transportation Science, 31(1), 34–48.
Block, H. D., & Marschak, J. (1960). Random orderings and stochastic theories of responses. Contributions to Probability and Statistics, 2, 97–132.
Boxall, P. C., & Adamowicz, W. L. (2002). Understanding heterogeneous preferences in random utility models: A latent class approach. Environmental and Resource Economics, 23(4), 421–446.
Chen, N., Gallego, G., & Tang, Z. (2019). The use of binary choice forests to model and estimate discrete choices. Available at SSRN 3430886.
Chen, Y. C., & Mišić, V. (2019). Decision forest: A nonparametric approach to modeling irrational choice. Available at SSRN 3376273.
Clarkson, K. L. (2010). Coresets, sparse greedy approximation, and the Frank-Wolfe algorithm. ACM Transactions on Algorithms, 6(4), 63.
Désir, A., Goyal, V., Jagabathula, S., & Segev, D. (2021). Mallows-smoothed distribution over rankings approach for modeling choice. Operations Research, 69, 1015–1348.
Dwork, C., Kumar, R., Naor, M., & Sivakumar, D. (2001). Rank aggregation methods for the web. In Proceedings of the 10th International Conference on World Wide Web (pp. 613–622). New York: ACM.
Falmagne, J. C. (1978). A representation theorem for finite random scale systems. Journal of Mathematical Psychology, 18(1), 52–72.
Farias, V. F., Jagabathula, S., & Shah, D. (2013). A nonparametric approach to modeling choice with limited data. Management Science, 59(2), 305–322.
Fox, J. T., il Kim, K., Ryan, S. P., & Bajari, P. (2011). A simple estimator for the distribution of random coefficients. Quantitative Economics, 2(3), 381–418.
Frank, M., & Wolfe, P. (1956). An algorithm for quadratic programming. Naval Research Logistics Quarterly, 3(1–2), 95–110.
Gallego, G., Ratliff, R., & Shebalov, S. (2015). A general attraction model and sales-based linear program for network revenue management under customer choice. Operations Research, 63(1), 212–232.
Gallego, G., & Topaloglu, H. (2019). Introduction to choice modeling. In Revenue management and pricing analytics (pp. 109–128). Berlin: Springer.
Greene, W. H., & Hensher, D. A. (2003). A latent class model for discrete choice analysis: Contrasts with mixed logit. Transportation Research Part B: Methodological, 37(8), 681–698.
Guélat, J., & Marcotte, P. (1986). Some comments on Wolfe’s ‘away step’. Mathematical Programming, 35(1), 110–119.
Haensel, A., & Koole, G. (2011). Estimating unconstrained demand rate functions using customer choice sets. Journal of Revenue and Pricing Management, 10(5), 438–454.
Han, Y., Zegras, C., Pereira, F. C., & Ben-Akiva, M. (2020). A neural-embedded choice model: TasteNet-MNL modeling taste heterogeneity with flexibility and interpretability. arXiv:2002.00922.
Hauser, J. R. (2014). Consideration-set heuristics. Journal of Business Research, 67(8), 1688–1699.
Hensher, D. A., & Greene, W. H. (2003). The mixed logit model: The state of practice. Transportation, 30(2), 133–176.
Honhon, D., Jonnalagedda, S., & Pan, X. A. (2012). Optimal algorithms for assortment selection under ranking-based consumer choice models. Manufacturing & Service Operations Management, 14(2), 279–289.
Hoyer, W. D., & Ridgway, N. M. (1984). Variety seeking as an explanation for exploratory purchase behavior: A theoretical model. In T. C. Kinnear (Ed.), NA - Advances in consumer research (vol. 11, pp. 114–119). Provo: ACR North American Advances.
Hunter, D. R. (2004). MM algorithms for generalized Bradley-Terry models. Annals of Statistics, 32, 384–406.
Jagabathula, S., & Rusmevichientong, P. (2017). A nonparametric joint assortment and price choice model. Management Science, 63(9), 3128–3145.
Jagabathula, S., Mitrofanov, D., & Vulcano, G. (2020a). Personalized retail promotions through a DAG-based representation of customer preferences. Operations Research, 70, 641–1291.
Jagabathula, S., & Rusmevichientong, P. (2019). The limit of rationality in choice modeling: Formulation, computation, and implications. Management Science, 65(5), 2196–2215.
Jagabathula, S., Subramanian, L., & Venkataraman, A. (2020b). A conditional gradient approach for nonparametric estimation of mixing distributions. Management Science, 66(8), 3635–3656.
Jagabathula, S., & Venkataraman, A. (2020). An MM algorithm for estimating the MNL model with product features. Available at SSRN: https://ssrn.com/abstract=3733971
Jagabathula, S., & Vulcano, G. (2018). A partial-order-based model to estimate individual preferences using panel data. Management Science, 64(4), 1609–1628.
Jaggi, M. (2011). Sparse convex optimization methods for machine learning. Ph.D. Thesis, ETH Zürich.
Jaggi, M. (2013). Revisiting Frank-Wolfe: Projection-free sparse convex optimization. In Proceedings of the 30th International Conference on Machine Learning (ICML-13) (pp. 427–435).
Kahn, B. E., & Lehmann, D. R. (1991). Modeling choice among assortments. Journal of Retailing, 67(3), 274–300.
Krishnan, R. G., Lacoste-Julien, S., & Sontag, D. (2015). Barrier Frank-Wolfe for marginal inference. In Advances in Neural Information Processing Systems (vol. 28, pp. 532–540).
Li, G., Rusmevichientong, P., & Topaloglu, H. (2015). The d-level nested logit model: Assortment and price optimization problems. Operations Research, 63(2), 325–342.
Lindsay, B. G. (1983). The geometry of mixture likelihoods: A general theory. The Annals of Statistics, 11, 86–94.
Liu, L., Dzyabura, D., & Mizik, N. (2020). Visual listening in: Extracting brand image portrayed on social media. Marketing Science, 39(4), 669–686.
Liu, X., Lee, D., & Srinivasan, K. (2019). Large-scale cross-category analysis of consumer review content on sales conversion leveraging deep learning. Journal of Marketing Research, 56(6), 918–943.
Luce, R. D. (1959). Individual choice behavior: A theoretical analysis. New York: Wiley.
Mahajan, S., & Van Ryzin, G. (2001). Stocking retail assortments under dynamic consumer substitution. Operations Research, 49(3), 334–351.
Mallows, C. L. (1957). Non-null ranking models. I. Biometrika, 44(1–2), 114–130.
Manski, C. F. (1977). The structure of random utility models. Theory and Decision, 8(3), 229–254.
Mas-Colell, A., Whinston, M. D., & Green, J. R. (1995). Microeconomic theory (vol. 1). New York: Oxford University Press.
McFadden, D. (1981). Econometric models of probabilistic choice. In Structural analysis of discrete data with econometric applications (pp. 198–272). Cambridge: MIT Press.
McFadden, D., & Train, K. (2000). Mixed MNL models for discrete response. Journal of Applied Econometrics, 15, 447–470.
McFadden, D. L. (2005). Revealed stochastic preference: A synthesis. Economic Theory, 26(2), 245–264.
McLachlan, G., & Peel, D. (2004). Finite mixture models. Hoboken: Wiley.
Mišić, V. V. (2016). Data, models and decisions for large-scale stochastic optimization problems. Ph.D. Thesis, Massachusetts Institute of Technology, chapter 4: Data-driven assortment optimization.
Newman, J. P., Ferguson, M. E., Garrow, L. A., & Jacobs, T. L. (2014). Estimation of choice-based models using sales data from a single firm. Manufacturing & Service Operations Management, 16(2), 184–197.
Nocedal, J., & Wright, S. J. (2006). Numerical optimization (2nd edn.). Berlin: Springer.
Paul, A., Feldman, J., & Davis, J. M. (2018). Assortment optimization and pricing under a nonparametric tree choice model. Manufacturing & Service Operations Management, 20(3), 550–565.
Prechelt, L. (2012). Early stopping—but when? In Neural networks: Tricks of the trade (pp. 53–67). Berlin: Springer.
Rusmevichientong, P., Shmoys, D., Tong, C., & Topaloglu, H. (2014). Assortment optimization under the multinomial logit model with random choice parameters. Production and Operations Management, 23(11), 2023–2039.
Shalev-Shwartz, S., Srebro, N., & Zhang, T. (2010). Trading accuracy for sparsity in optimization problems with sparsity constraints. SIAM Journal on Optimization, 20(6), 2807–2832.
Sher, I., Fox, J. T., il Kim, K., & Bajari, P. (2011). Partial identification of heterogeneity in preference orderings over discrete choices. Tech. Rep., National Bureau of Economic Research.
Sifringer, B., Lurkin, V., & Alahi, A. (2020). Enhancing discrete choice models with representation learning. Transportation Research Part B: Methodological, 140, 236–261.
Strauss, A. K., Klein, R., & Steinhardt, C. (2018). A review of choice-based revenue management: Theory and methods. European Journal of Operational Research, 271(2), 375–387.
Train, K. E. (2008). EM algorithms for nonparametric estimation of mixing distributions. Journal of Choice Modelling, 1(1), 40–69.
Train, K. E. (2009). Discrete choice methods with simulation. Cambridge: Cambridge University Press.
van Ryzin, G., & Vulcano, G. (2015). A market discovery algorithm to estimate a general class of nonparametric choice models. Management Science, 61(2), 281–300.
van Ryzin, G., & Vulcano, G. (2017). An expectation-maximization method to estimate a rank-based choice model of demand. Operations Research, 65(2), 396–407.
Yao, Y., Rosasco, L., & Caponnetto, A. (2007). On early stopping in gradient descent learning. Constructive Approximation, 26(2), 289–315.
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Jagabathula, S., Venkataraman, A. (2022). Nonparametric Estimation of Choice Models. In: Chen, X., Jasin, S., Shi, C. (eds) The Elements of Joint Learning and Optimization in Operations Management. Springer Series in Supply Chain Management, vol 18. Springer, Cham. https://doi.org/10.1007/978-3-031-01926-5_8
DOI: https://doi.org/10.1007/978-3-031-01926-5_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-01925-8
Online ISBN: 978-3-031-01926-5
eBook Packages: Business and Management (R0)