Combining intelligent heuristics with simulators in hotel revenue management


Revenue Management uses data-driven modelling and optimization methods to decide what to sell, when to sell, to whom to sell, and at which price, in order to increase revenue and profit. Hotel Revenue Management is a very complex context characterized by nonlinearities, many parameters and constraints, and stochasticity, in particular in customer demand. It suffers from the curse of dimensionality (Bellman 2015): when the number of variables increases (number of rooms, number of possible prices and capacities, number of reservation rules and constraints), exact solutions by dynamic programming or by alternative global optimization techniques cannot be used and one has to resort to intelligent heuristics, i.e., methods which can improve current solutions but without formal guarantees of optimality. Effective heuristics can incorporate “learning” (“reactive” schemes) that updates strategies based on the past history of the process, the reservations received up to a certain time and the previous steps in the iterative optimization process. Different approaches can be classified according to the specific model considered (stochastic demand and hotel rules), the control mechanism (the pricing policy) and the optimization technique used to determine improving or optimal solutions. In some cases, model definition, control mechanism and solution technique are strongly interrelated: this is the case of dynamic programming, which demands suitably simplified problem formulations. We design a flexible discrete-event simulator for the hotel reservation process and experiment with different approaches through measurements of the expected effect on profit (obtained by carefully separating a “training” phase from the final “validation” phase, which uses different simulations).
The experimental results show the effectiveness of intelligent heuristics with respect to exact optimization methods like dynamic programming, in particular for more constrained situations (cases when demand tends to saturate hotel room availability), when the simplifying assumptions needed to make the problem analytically tractable do not hold.


  1. Battiti, R., Brunato, M.: Reactive search optimization: learning while optimizing. Handbook of Metaheuristics 146, 543–571 (2010)

  2. Battiti, R., Brunato, M.: The LION Way. Machine Learning plus Intelligent Optimization. LIONlab, University of Trento, Italy (2018)

  3. Battiti, R., Tecchiolli, G.: Learning with first, second, and no derivatives: a case study in high energy physics. Neurocomputing 6, 181–206 (1994)

  4. Bayoumi, A.E.M., Saleh, M., Atiya, A.F., Aziz, H.A.: Dynamic pricing for hotel revenue management using price multipliers. Journal of Revenue and Pricing Management 12(3), 271–285 (2013)

  5. Bellman, R.E.: Adaptive Control Processes: a Guided Tour, vol. 2045. Princeton University Press, Princeton (2015)

  6. Belobaba, P.P.: OR practice—application of a probabilistic decision model to airline seat inventory control. Oper. Res. 37(2), 183–197 (1989)

  7. Bertsimas, D., De Boer, S.: Simulation-based booking limits for airline revenue management. Oper. Res. 53(1), 90–106 (2005)

  8. den Boer, A.V.: Dynamic pricing and learning: historical origins, current research, and new directions. Surveys in Operations Research and Management Science 20(1), 1–18 (2015)

  9. den Boer, A.V., Zwart, B.: Simultaneously learning and optimizing using controlled variance pricing. Manag. Sci. 60(3), 770–783 (2013)

  10. Brunato, M., Battiti, R.: RASH: a self-adaptive random search method. In: Cotta, C., Sevaux, M., Sörensen, K. (eds.) Adaptive and Multilevel Metaheuristics, Studies in Computational Intelligence, vol. 136. Springer, Berlin (2008)

  11. Brunelli, R., Tecchiolli, G.: On random minimization of functions. Biol. Cybern. 65, 501–506 (1991)

  12. Brunelli, R., Tecchiolli, G.: Stochastic minimization with adaptive memory. J. Comput. Appl. Math. 57, 329–343 (1995)

  13. Chaneton, J.M., Vulcano, G.: Computing bid prices for revenue management under customer choice behavior. Manuf. Serv. Oper. Manag. 13(4), 452–470 (2011)

  14. Gallego, G., Van Ryzin, G.: Optimal dynamic pricing of inventories with stochastic demand over finite horizons. Manag. Sci. 40(8), 999–1020 (1994)

  15. Glover, F., Glover, R., Lorenzo, J., McMillan, C.: The passenger-mix problem in the scheduled airlines. Interfaces 12(3), 73–80 (1982)

  16. Hansen, N., Ostermeier, A.: Adapting arbitrary normal mutation distributions in evolution strategies: The covariance matrix adaptation. In: Proceedings of IEEE International Conference on Evolutionary Computation, 1996, pp. 312–317. IEEE (1996)

  17. Klein, R.: Network capacity control using self-adjusting bid-prices. OR Spectr. 29(1), 39–60 (2007)

  18. Papadimitriou, C.H.: Computational Complexity. Wiley, Hoboken (2003)

  19. Powell, W.B.: Approximate Dynamic Programming: Solving the curses of dimensionality, Wiley Series in Probability and Statistics, vol. 703. Wiley, Hoboken (2007)

  20. Powell, W.B., Ryzhov, I.O.: Optimal Learning, Wiley Series in Probability and Statistics, vol. 841. Wiley, Hoboken (2012)

  21. Robinson, L.W.: Optimal and approximate control policies for airline booking with sequential nonmonotonic fare classes. Oper. Res. 43(2), 252–263 (1995)

  22. van Ryzin, G., Vulcano, G.: Computing virtual nesting controls for network revenue management under customer choice behavior. Manuf. Serv. Oper. Manag. 10(3), 448–467 (2008)

  23. Talluri, K.T., Van Ryzin, G.J.: The Theory and Practice of Revenue Management, vol. 68. Springer Science & Business Media, Berlin (2006)

  24. Van Ryzin, G., Vulcano, G.: Simulation-based optimization of virtual nesting controls for network revenue management. Oper. Res. 56(4), 865–880 (2008)

  25. Williamson, E.L.: Airline network seat inventory control: Methodologies and revenue impacts. Ph.D. thesis, Massachusetts Institute of Technology (1992)

Author information


Corresponding author

Correspondence to Mauro Brunato.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A: Optimal prices for sigmoidal acceptance

This appendix describes the steps to obtain the optimal prices described in the paper.

A.1 Non-saturated equilibrium price

Here is the derivation of the optimal price value (16) described in Section 4.1.2, under the hypothesis that an unlimited number of rooms is available.

We want to find the stationary point of \(u\,p_{a}(u)\), where \(u\) is the price and \(p_{a}(u)\) is the corresponding acceptance probability (15). Its derivative is

$$ \frac d{du}\bigl(u\cdot p_{a}(u)\bigr) = 1 - \sigma\left( \frac{u-\mu}\eta\right) - \frac1\eta u\sigma^{\prime}\left( \frac{u-\mu}\eta\right); $$

by setting it to zero, applying the substitutions

$$ s = \frac{u-\mu}\eta,\quad \beta=\frac\mu\eta, $$

and recalling the identity \(\sigma ^{\prime }(s)=\sigma (s)\bigl (1-\sigma (s)\bigr )\), we obtain

$$ (\beta+s)\sigma(s)^{2} - (\beta+s+1)\sigma(s) + 1 = 0. $$

After substituting the definition of the sigmoid function, multiplying by \((1+e^{s})^{2}\) and simplifying, we are left with

$$ (s+\beta-1)e^{s} = 1 $$

and, multiplying by \(e^{\beta-1}\),

$$ (s+\beta-1)e^{s+\beta-1} = e^{\beta-1}. $$

Let us observe that this equation has the form \(xe^{x} = a\), which for \(a \ge -1/e\) has the real solution \(x = W_{0}(a)\), where \(W_{0}(\cdot)\) is the principal branch of the Lambert W function; here \(a = e^{\beta-1} > 0\), so the condition always holds. We get

$$ s+\beta-1 = W_{0}(e^{\beta-1}). $$

By substituting back the original variables, we finally obtain (16).
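
The closed form can be checked numerically. The following is a minimal Python sketch, not the authors' code. It assumes that (15) is the logistic acceptance model \(p_{a}(u) = 1 - \sigma\bigl((u-\mu)/\eta\bigr)\) with \(\sigma(s) = 1/(1+e^{-s})\), which is the form implied by the derivative above, and it back-substitutes \(u = \mu + \eta s\) into \(s+\beta-1 = W_{0}(e^{\beta-1})\), giving \(u = \eta\,\bigl(1 + W_{0}(e^{\mu/\eta-1})\bigr)\). The function names and the values of `mu` and `eta` are illustrative only.

```python
import math


def lambert_w0(a: float, tol: float = 1e-12, max_iter: int = 100) -> float:
    """Principal branch W0(a) for a >= 0, via Newton iteration on w*e^w = a."""
    if a < 0.0:
        raise ValueError("this sketch only handles a >= 0")
    w = math.log1p(a)  # starting point at or above the root for a >= 0
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - a) / (ew * (w + 1.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w


def sigmoid(s: float) -> float:
    return 1.0 / (1.0 + math.exp(-s))


def acceptance(u: float, mu: float, eta: float) -> float:
    """Assumed form of (15): acceptance probability decreasing in the price u."""
    return 1.0 - sigmoid((u - mu) / eta)


def expected_revenue(u: float, mu: float, eta: float) -> float:
    """Expected revenue per request, u * p_a(u), with unlimited rooms."""
    return u * acceptance(u, mu, eta)


def optimal_price(mu: float, eta: float) -> float:
    """Stationary point of u * p_a(u) from s + beta - 1 = W0(e^(beta - 1))."""
    beta = mu / eta
    s = 1.0 - beta + lambert_w0(math.exp(beta - 1.0))
    return mu + eta * s  # equals eta * (1 + W0(e^(mu/eta - 1)))


if __name__ == "__main__":
    mu, eta = 100.0, 20.0  # illustrative parameters, not taken from the paper
    u_star = optimal_price(mu, eta)
    # Compare with a coarse grid search over candidate prices.
    grid_best = max((0.01 * k for k in range(1, 30001)),
                    key=lambda u: expected_revenue(u, mu, eta))
    print(f"closed form: {u_star:.4f}  grid search: {grid_best:.2f}")
```

The Newton helper keeps the sketch dependency-free; a library routine for the Lambert W function could be used instead. For very large \(\beta\), `math.exp(beta - 1.0)` overflows, so a more careful version would work in logarithms.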

A.2 Dynamic programming optimal price

The derivation of the optimal price policy (23) described in Section 4.1.3 for the dynamic programming technique follows the same steps outlined above.

After substituting (15) into (22), we perform the following variable substitutions:

$$ s = \frac{u-\mu}\eta,\quad \beta=\frac{V_{1}-V_{2}+\mu}\eta, $$

after which we obtain (25), whose solution is, again, (26).

By substituting back the original variables from (27), we obtain (23).
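
As a hedged illustration of this second derivation, the Python sketch below evaluates the price obtained from the substitution \(\beta = (V_{1}-V_{2}+\mu)/\eta\) together with \(s+\beta-1 = W_{0}(e^{\beta-1})\) and \(u = \mu + \eta s\). Equations (15), (22) and (23) are not reproduced here, so two assumptions are made: the acceptance model is \(p_{a}(u) = 1 - \sigma\bigl((u-\mu)/\eta\bigr)\), and the price-dependent part of (22) is \((u + V_{1} - V_{2})\,p_{a}(u)\), which is what the substitution for \(\beta\) implies. \(V_{1}\) and \(V_{2}\) are treated as plain numbers; all function names and values are illustrative.

```python
import math


def lambert_w0(a: float, tol: float = 1e-12, max_iter: int = 100) -> float:
    """Principal branch W0(a) for a >= 0, via Newton iteration on w*e^w = a."""
    if a < 0.0:
        raise ValueError("this sketch only handles a >= 0")
    w = math.log1p(a)  # starting point at or above the root for a >= 0
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - a) / (ew * (w + 1.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w


def acceptance(u: float, mu: float, eta: float) -> float:
    """Assumed form of (15): 1 - sigmoid((u - mu) / eta)."""
    return 1.0 - 1.0 / (1.0 + math.exp(-(u - mu) / eta))


def dp_objective(u: float, mu: float, eta: float, v1: float, v2: float) -> float:
    """Assumed price-dependent part of (22): (u + V1 - V2) * p_a(u)."""
    return (u + v1 - v2) * acceptance(u, mu, eta)


def dp_optimal_price(mu: float, eta: float, v1: float, v2: float) -> float:
    """Price from beta = (V1 - V2 + mu) / eta and s + beta - 1 = W0(e^(beta - 1))."""
    beta = (v1 - v2 + mu) / eta
    s = 1.0 - beta + lambert_w0(math.exp(beta - 1.0))
    return mu + eta * s


if __name__ == "__main__":
    mu, eta = 100.0, 20.0   # illustrative parameters
    v1, v2 = 30.0, 90.0     # illustrative value-function terms
    u_star = dp_optimal_price(mu, eta, v1, v2)
    h = 1e-5
    slope = (dp_objective(u_star + h, mu, eta, v1, v2)
             - dp_objective(u_star - h, mu, eta, v1, v2)) / (2 * h)
    print(f"price: {u_star:.4f}  objective slope at that price: {slope:.2e}")
```

With \(V_{1} = V_{2}\) the function reduces to the non-saturated case of Section A.1, which gives a quick consistency check between the two derivations.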

About this article

Cite this article

Brunato, M., Battiti, R. Combining intelligent heuristics with simulators in hotel revenue management. Ann Math Artif Intell 88, 71–90 (2020).


Keywords

  • Machine learning and intelligent optimization
  • Hotel revenue management
  • Simulation-based optimization
  • Optimization heuristics

Mathematics Subject Classification (2010)

  • 90-04
  • 90B50