Telecommunication Systems, Volume 52, Issue 2, pp 611–622

Inter-carrier SLA negotiation using Q-learning

  • Hélia Pouyllau
  • Giovanna Carofiglio


Inter-domain high-performance services (e.g. telepresence) are not sustainable over the current Internet architecture. The Quality of Service (QoS) guarantees they demand require providers (a.k.a. carriers) to settle on end-to-end Service Level Agreements (SLAs) across different networks. This process is critical, since it must maximize benefits while accommodating operators’ heterogeneous business interests and confidentiality constraints. In this paper, we propose, within a cooperative organizational model called a federation, a composition technique for inter-carrier SLAs that respects end users’ QoS requirements while maximizing network operators’ long-term benefits. We formulate the dynamic optimization problem as a Markov Decision Process (MDP). This formulation yields an iterative, near-optimal solution through reinforcement learning (more precisely, Q-learning). The SLA composition is thus performed taking into account both customers’ and network providers’ utilities. We also propose a version that includes several negotiation rounds and observe how it affects the results.
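To make the MDP and Q-learning framing concrete, the following is a minimal tabular Q-learning sketch on a toy problem. It is not the model from the paper: the state (remaining capacity), the three offers in `OFFERS` with their prices and acceptance probabilities, and the names `q_learning` and `sla_step` are all hypothetical, chosen only to illustrate the standard Q-learning update with epsilon-greedy exploration.

```python
import random

# Hypothetical offers: (price, probability that the customer accepts).
# Index 0 means "make no offer". These numbers are illustrative only.
OFFERS = [(0.0, 0.0), (1.0, 0.9), (3.0, 0.4)]
CAPACITY = 5  # hypothetical number of SLAs the provider can still serve


def sla_step(capacity, action, rng):
    """Toy transition: an accepted offer earns its price and uses one unit."""
    price, p_accept = OFFERS[action]
    if capacity > 0 and rng.random() < p_accept:
        return price, capacity - 1
    return 0.0, capacity


def q_learning(n_states, n_actions, step, start_state, episodes=2000,
               horizon=20, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning; `step(state, action, rng)` returns (reward, next_state)."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        state = start_state
        for _ in range(horizon):
            # Epsilon-greedy: explore with probability epsilon, else act greedily.
            if rng.random() < epsilon:
                action = rng.randrange(n_actions)
            else:
                action = max(range(n_actions), key=lambda a: q[state][a])
            reward, next_state = step(state, action, rng)
            # Standard Q-learning update toward the bootstrapped target.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q = q_learning(CAPACITY + 1, len(OFFERS), sla_step, start_state=CAPACITY)
# Greedy policy: the offer index with the highest learned value per state.
policy = [max(range(len(OFFERS)), key=lambda a: q[s][a])
          for s in range(CAPACITY + 1)]
```

Reading `policy[s]` gives the offer the learned table prefers when `s` units of capacity remain. A real inter-carrier setting would need a much richer state and reward definition than this sketch assumes.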


Inter-carrier SLA · Negotiation · QoS · Reinforcement learning · Q-learning





Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  1. Alcatel-Lucent Bell Labs France, Centre de Villarceaux, Nozay, France
