We study the problem of a buyer who gains stochastic rewards by procuring through an auction, multiple units of a service or item from a pool of heterogeneous agents who are strategic on two dimensions, namely cost and capacity. The reward obtained for a single unit from an allocated agent depends on the inherent quality of the agent; the agent’s quality is fixed but unknown. Each agent can only supply a limited number of units (capacity of the agent). The cost incurred per unit and capacity (maximum number of units that can be supplied) are private information of each agent. The auctioneer is required to elicit from the agents their costs as well as capacities (making the mechanism design bidimensional) and further, learn the qualities of the agents as well, with a view to maximize her utility. Motivated by this, we design a bidimensional multi-armed bandit procurement auction that seeks to maximize the expected utility of the auctioneer subject to incentive compatibility and individual rationality, while simultaneously learning the unknown qualities of the agents. We first work with the assumption that the qualities are known, and propose an optimal, truthful mechanism 2D-OPT for the auctioneer to elicit costs and capacities. Next, in order to learn the qualities of the agents as well, we provide sufficient conditions for a learning algorithm to be Bayesian incentive compatible and individually rational. We finally design a novel learning mechanism, 2D-UCB that is stochastic Bayesian incentive compatible and individually rational.
This is a preview of subscription content, log in to check access.
Buy single article
Instant access to the full article PDF.
Price includes VAT for USA
Subscribe to journal
Immediate online access to all issues from 2019. Subscription will auto renew annually.
This is the net price. Taxes to be calculated in checkout.
Abraham, I., Alonso, O., Kandylas, V., Slivkins, A.: Adaptive crowdsourcing algorithms for the bandit survey problem. In: Conference on learning theory (COLT’13), vol. 30, pp. 882–910 (2013)
Agrawal, S., Devanur, N.R.: Bandits with concave rewards and convex knapsacks. In: Proceedings of the Fifteenth ACM Conference on Economics and Computation (EC’14), pp. 989–1006 (2014)
Auer, P., Cesa-Bianchi, N., Fischer, P.: Finite-time analysis of the multiarmed bandit problem. J. Mach. Learn. 47(2-3), 235–256 (2002)
Babaioff, M., Kleinberg, R., Slivkins, A.: Multi-parameter mechanisms with implicit payment computation. In: Proceedings of the 14th ACM conference on electronic commerce (EC’13), pp. 35–52. ACM (2013)
Babaioff, M., Kleinberg, R.D., A. Slivkins.: Truthful mechanisms with implicit payment computation. In: Proceedings of the 11th ACM conference on electronic commerce (EC’10), pp. 43–52. ACM (2010)
Babaioff, M., Sharma, Y., Slivkins, A.: Characterizing truthful multi-armed bandit mechanisms: extended abstract. In: Proceedings of the 10th ACM conference on electronic commerce (EC’09), pp. 79–88. ACM (2009)
Badanidiyuru, A., Kleinberg, R., Singer, Y.: Learning on a budget: posted price mechanisms for online procurement. In: Proceedings of 13th ACM conference on electronic commerce, EC-2012, pp. 128–145. ACM, Valencia (2012)
Bubeck, S., Cesa-Bianchi, N.: Regret analysis of stochastic and nonstochastic multi-armed bandit problems. Foundations Trends Mach. Learn. 5(1), 1–122 (2012)
Buhrmester, M., Kwang, T., Gosling, S.D.: Amazon’s mechanical turk a new source of inexpensive, yet high-quality, data? Perspect. Psychol. Sci. 6(1), 3–5 (2011)
Devanur, N.R., Kakade, S.M.: The price of truthfulness for pay-per-click auctions. In: Proceedings of the 10th ACM conference on electronic commerce (EC’09), pp. 99–106. ACM (2009)
Gatti, N., Lazaric, A., Trovò, F.: A truthful learning mechanism for contextual multi-slot sponsored search auctions with externalities. In: Proceedings of the 13th ACM conference on electronic commerce (EC’12), pp. 605–622. ACM (2012)
Gujar, S., Narahari, Y.: Optimal multi-unit combinatorial auctions. Oper. Res. 13(1), 27–46 (2013)
Hartline, J.D.: Bayesian mechanism design. Foundations and Trends in Theoretical Computer Science 8(3), 143–263 (2013)
Ho, C., Slivkins, A., Vaughan, J.W.: Adaptive contract design for crowdsourcing markets: bandit algorithms for repeated principal-agent problems. In: Proceedings of the 13th ACM conference on economics and computation (EC ’14), pp. 359–376. ACM (2014)
Ho, C., Vaughan, J.: Online task assignment in crowdsourcing markets. In: Proceedings of 26th AAAI conference on artificial intelligence (AAAI’12), pp. 45–51. AAAI Press (2012)
Iyengar, G., Kumar, A.: Optimal procurement mechanisms for divisible goods with capacitated suppliers. Rev. Econ. Des. 12(2), 129–154 (2008)
Jain, S., Gujar, S., Zoeter, O., Narahari, Y.: A quality assuring multi-armed bandit crowdsourcing mechanism with incentive compatible learning. In: Proceedings of the 13th international conference on autonomous agents and multiagent systems (AAMAS’14), pp. 1609–1610. International Foundation for Autonomous Agents and Multiagent Systems (2014)
ju Ho, C., Jabbari, S., Vaughan, J.W.: Adaptive task assignment for crowdsourced classification. In: Proceedings of the 13th international conference on machine learning (ICML’13), vol. 28, pp. 534–542 (2013)
Krishna, V.: Auction theory. Academic Press, Cambridge (2009)
Lai, T., Robbins, H.: Asymptotically efficient adaptive allocation rules. Adv. Appl. Math. 6(1), 4–22 (1985)
Mandal, D., Narahari, Y.: A novel ex-post truthful mechanism for multi-slot sponsored search auctions. In: Proceedings of the 13th international conference on autonomous agents and multi-agent systems (AAMAS’14), pp. 1555–1556. International Foundation for Autonomous Agents and Multiagent Systems (2014)
Mishra, D.: Multidimensional mechanism design: key results and research issues. Curr. Sci. 103(9), 1043–1050 (2012)
Myerson, R.B.: Optimal auction design. Math. Oper. Res. 6(1), 58–73 (1981)
Rothkopf, M.: Thirteen reasons why the Vickrey-Clarke-Groves process is not practical. Oper. Res. 55(2), 191–197 (2007)
Royden, H.L., Fitzpatrick, P., Hall, P.: Real analysis, vol. 32. Macmillan, New York (1988)
Sharma, A.D., Gujar, S., Narahari, Y.: Truthful multi-armed bandit mechanisms for multi-slot sponsored search auctions. Curr. Sci. 103(9), 1064–1077 (2012)
Singer, Y., Mittal, M.: Pricing mechanisms for crowdsourcing markets. In: Proceedings of the 20 second international world wide web conference (WWW’13), pp. 1157–1166 (2013)
Singla, A., Krause, A.: Truthful incentives in crowdsourcing tasks using regret minimization mechanisms. In: Proceedings of the 20 second international world wide web conference (WWW’13), pp. 1167–1178 (2013)
Tran-Thanh, L., Chapman, A.C., Rogers, A., Jennings, N.R.: Knapsack based optimal policies for budget-limited multi-armed bandits. In: Proceedings of the 26th AAAI conference on artificial intelligence (AAAI’12), pp. 1134–1140. AAAI Press (2012)
Tran-Thanh, L., Stein, S., Rogers, A., Jennings, N.R.: Efficient crowdsourcing of unknown experts using bounded multi-armed bandits. Artif. Intell. 214(0), 89–111 (2014)
Tran-Thanh, L., Venanzi, M., Rogers, A., Jennings, N.R.: Efficient budget allocation with accuracy guarantees for crowdsourcing classification tasks (2013)
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Bhat, S., Jain, S., Gujar, S. et al. An optimal bidimensional multi-armed bandit auction for multi-unit procurement. Ann Math Artif Intell 85, 1–19 (2019). https://doi.org/10.1007/s10472-018-9611-0
- Multi-armed bandit
- Mechanism design
Mathematics Subject Classification (2010)