Mathematical Methods of Operations Research, Volume 75, Issue 3, pp 287–304

Continuous learning methods in two-buyer pricing problem

  • Kimmo Berg
  • Harri Ehtamo
Original Article


This paper presents continuous learning methods for a monopoly pricing problem in which the firm is uncertain about the buyers’ preferences. The firm designs a menu of quality-price bundles and adjusts it using only local information about those preferences. The learning methods define different paths, and we compare how much profit the firm makes along each path, how long it takes to learn the optimal tariff, and how the buyers’ utilities change during the learning period. We also present a way to compute, with dynamic programming and under complete information, the path that is optimal in terms of discounted profit. Numerical examples show that the optimal path may involve jumps in which buyer types switch from one bundle to another, a property that is difficult to include in the learning methods. The learning methods have the advantage, however, that they can be generalized to pricing problems with many buyer types and qualities.


Pricing, Learning, Limited information, Buyer-seller game, Mechanism design





Copyright information

© Springer-Verlag 2012

Authors and Affiliations

  1. Systems Analysis Laboratory, Aalto University School of Science, Aalto, Finland
