A Bayesian framework for large-scale geo-demand estimation in on-line retailing
Estimating the time-specific geo-demand distribution of each product in the catalog is the fundamental analytic that guides inventory allocation in any major online retailer's supply chain operations. Although geography-specific historical sales data is available for learning the geo-demand distributions, it is extremely sparse when viewed as a product \(\times \) demand zone \(\times \) time data cube (tensor). As a result, we have to estimate the entries of a large-scale tensor from a limited amount of training data, and the sheer scale of the problem makes it challenging to solve within a limited time frame. We formulate the problem in the spirit of text theme classification and view the geo-demand distributions as underlying probability distributions that govern the historical sales observations. We develop a Bayesian framework based on a mixture of multinomials for estimating the time-dependent geo-demand distributions in a collaborative manner. As a by-product, the solution provides guidance on grouping the products by their geo-demand patterns. We also provide practical solutions to counter various scalability issues. Benchmark results are provided in comparison with basic methods of the same class and a state-of-the-art R package.
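To make the mixture-of-multinomials idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single time slice, a dense product-by-zone count matrix, a plain EM loop, and additive (Dirichlet-style) smoothing with a pseudo-count `alpha`. The names `fit_multinomial_mixture`, `product_geo_demand`, `n_components`, and `alpha` are hypothetical and chosen only for illustration.

```python
import numpy as np


def _responsibilities(counts, weights, zone_probs):
    """Posterior probability of each mixture component for each product."""
    # Log-likelihood up to a per-product constant (the multinomial coefficient).
    log_resp = np.log(weights) + counts @ np.log(zone_probs).T  # (P, K)
    log_resp -= log_resp.max(axis=1, keepdims=True)  # numerical stability
    resp = np.exp(log_resp)
    return resp / resp.sum(axis=1, keepdims=True)


def fit_multinomial_mixture(counts, n_components, alpha=1.0, n_iter=50, seed=0):
    """EM for a mixture of multinomials with additive smoothing.

    counts: (n_products, n_zones) array of sales counts for one time slice.
    alpha:  pseudo-count (assumed > 0) that keeps all probabilities positive.
    Returns (weights, zone_probs, resp).
    """
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts, dtype=float)
    n_products, n_zones = counts.shape

    weights = np.full(n_components, 1.0 / n_components)
    zone_probs = rng.dirichlet(np.ones(n_zones), size=n_components)  # (K, Z)

    for _ in range(n_iter):
        # E-step: soft assignment of products to components.
        resp = _responsibilities(counts, weights, zone_probs)
        # M-step: smoothed re-estimation of mixing weights and zone distributions.
        weights = (resp.sum(axis=0) + alpha) / (n_products + n_components * alpha)
        zone_probs = resp.T @ counts + alpha
        zone_probs /= zone_probs.sum(axis=1, keepdims=True)

    # Final E-step so the returned responsibilities match the returned parameters.
    resp = _responsibilities(counts, weights, zone_probs)
    return weights, zone_probs, resp


def product_geo_demand(resp, zone_probs):
    """Per-product geo-demand estimate: responsibility-weighted component mix."""
    return resp @ zone_probs  # (P, Z), each row sums to 1


if __name__ == "__main__":
    sales = np.array([[40, 2, 0], [35, 1, 1], [0, 3, 45], [1, 0, 30], [2, 0, 0]])
    weights, zone_probs, resp = fit_multinomial_mixture(sales, n_components=2)
    print(product_geo_demand(resp, zone_probs).round(3))
    print(resp.argmax(axis=1))  # hard grouping of products by demand pattern
```

In this sketch, products with very few sales borrow strength from the component distributions, which is one way to read "collaborative" estimation, and the responsibilities give a soft grouping of products by geo-demand pattern. Handling the time dimension and the scale of a real catalog would need additional machinery that is not shown here.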
Keywords: Bayesian estimation, Geo-demand, Mixture of multinomials, Tensor completion