Abstract
In the framework of the Future Internet, the aim of the Quality of Experience (QoE) Control functionalities is to track the personalized desired QoE level of the applications. This paper proposes to perform this task by dynamically selecting the most appropriate Classes of Service among those supported by the network, with the selection driven by a novel heuristic Multi-Agent Reinforcement Learning (MARL) algorithm. The paper shows that this approach helps to cope with some practical implementation problems: in particular, it makes it possible to address the so-called “curse of dimensionality” of MARL algorithms, thus achieving satisfactory performance even in the presence of several hundred Agents.
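The selection loop summarized in the abstract can be illustrated with a minimal sketch. This is not the paper's heuristic MARL algorithm, which is not reproduced here; it is a stateless, bandit-style simplification in which each Agent independently learns which Class of Service brings its measured QoE closest to its own target. The function names, the `qoe_of_class` mapping, the noise model, and all parameter values are assumptions made for illustration only. Keeping one small value table per Agent, rather than one table over joint actions, is a common way to keep memory linear in the number of Agents.

```python
import random


def train_agents(targets, qoe_of_class, episodes=2000, alpha=0.1,
                 epsilon=0.1, seed=0):
    """Independent epsilon-greedy learners, one value table per Agent.

    targets:      hypothetical target QoE level of each Agent
    qoe_of_class: hypothetical mean QoE delivered by each Class of Service
    """
    rng = random.Random(seed)
    n_classes = len(qoe_of_class)
    # One row per Agent, one column per Class of Service: the table grows
    # linearly with the number of Agents (no joint-action table).
    q = [[0.0] * n_classes for _ in targets]
    for _ in range(episodes):
        for i, target in enumerate(targets):
            if rng.random() < epsilon:
                # Explore: try a random Class of Service.
                action = rng.randrange(n_classes)
            else:
                # Exploit: pick the class with the highest learned value.
                action = max(range(n_classes), key=lambda c: q[i][c])
            # Assumed noisy QoE measurement for the chosen class.
            measured = qoe_of_class[action] + rng.gauss(0.0, 0.05)
            # Reward is the negative tracking error with respect to the target.
            reward = -abs(measured - target)
            # Incremental update of the value estimate for this class.
            q[i][action] += alpha * (reward - q[i][action])
    return q


def greedy_classes(q):
    """Return the Class of Service each Agent currently prefers."""
    return [max(range(len(row)), key=lambda c: row[c]) for row in q]


if __name__ == "__main__":
    table = train_agents(targets=[1.0, 3.0, 4.0],
                         qoe_of_class=[1.0, 2.0, 3.0, 4.0])
    print(greedy_classes(table))
```

In this toy setting each Agent settles on the class whose assumed mean QoE matches its target. A realistic controller would also have to account for network state and for the coupling between Agents that share resources, which this sketch deliberately ignores.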
Additional information
Recommended by Associate Editor Hongbo Li under the direction of Editor Fuchun Sun. This work was supported by the Italian Ministry of Education, Research and University, namely by the PLATINO PON project (www.progettoplatino.it), under Grant Agreement no. PON01_01007. The authors wish to thank Prof. A. Isidori, C. Gori Giorgi, S. Battilotti, F. Facchinei, and L. Palagi for their continuous support and valuable contributions to the work within the PLATINO project. The authors also wish to thank Ing. J. Capolicchio for the fruitful discussions.
Francesco Delli Priscoli was born in Rome, Italy, in 1962. He received the degree in Electronics Engineering (summa cum laude) and the Ph.D. degree in Systems Engineering from the University of Rome La Sapienza in 1986 and 1991, respectively. Since 1991, he has been working at the University of Rome La Sapienza, where he is currently Full Professor of Automatic Control, Control of Autonomous Multi-Agent Systems, and Control of Communication and Energy Networks. His academic research has mainly addressed resource/service/content management procedures and cognitive techniques for telecommunication and energy networks, largely adopting control-based methodologies. He is the author of about 180 papers published in major international journals (about 60), in books (about 10), and in conference proceedings (about 110). He also holds four patents. He is an associate editor of Control Engineering Practice and a member of the IFAC Technical Committee on Networked Systems. He has been the scientific lead, at the University of Rome La Sapienza, for 31 projects financed by the European Union (Fourth, Fifth, Sixth, Seventh and Eighth Framework Programmes) and by the European Space Agency (ESA), as well as for many national projects and collaborations with major industries. His present research interests concern closed-loop multi-agent learning techniques for Quality of Experience evaluation and control in advanced communication and energy networks, as well as all the related networking algorithms.
Alessandro Di Giorgio was born in Rome, Italy, in 1980. He received the degree (cum laude) in Physics in 2005 and the Ph.D. degree in Systems Engineering from the University of Rome La Sapienza in 2010. He is currently a Research Fellow in Automatic Control, working on original applications of control systems theory to the resource management problem in power systems and telecommunications networks. He is the author of about 40 papers and book chapters on these topics, mainly produced in the context of national and European research projects.
Federico Lisi was born in Rome, Italy, in 1986. He received the M.Sc. degree in Artificial Intelligence and Robotics with 110/110 in 2015 from the University of Rome “La Sapienza.” He has been working in the MIUR project PLATINO and in the FP7 project SWIPE. His main research interests concern reinforcement learning for multi-agent systems, path planning for autonomous robots, neural networks and data mining.
Salvatore Monaco was born in Udine, Italy, in 1951. He has been a Full Professor of Systems Theory at the University of Rome La Sapienza since 1986. He was a member of the ASI (Italian Space Agency) Scientific Committee from 1989 to 1995, of the Executive Council of the EUCA (European Union Control Association) from 1990 (foundation year) to 1997, and of the ASI Working Group on Evaluation from 1999 to 2001. He has also been a member of the ASI Technological Committee since 1997. He has promoted technological transfer in the area of Automation. In 1995, he served as scientific advisor for the Director of the Joint Research Center of the European Union. Since 2001, he has been president of the council for the degree of Systems and Control Engineering at the University of Rome La Sapienza and also president of the Scientific Committee of the Université Franco-Italienne, an inter-governmental institution for coordinating research and didactics. His research activity is in the field of Systems and Control Theory and applied research in spacecraft control, mobile robot control and control of communication networks.
Antonio Pietrabissa is an Assistant Professor at the Department of Computer, Control, and Management Engineering Antonio Ruberti (DIAG) of the University of Rome La Sapienza, where he received his degree in Electronics Engineering and his Ph.D. degree in Systems Engineering in 2000 and 2004, respectively. Since 2000, he has worked with the Network Control Laboratory at DIAG, in the framework of national and European projects related to ICT. Since 2007, he has been a member of the Scientific and Technical Committee of the Consortium for the Research in Automation and Telecommunication (CRAT). Since 2000, he has participated in 15 research projects funded by the European Union (EU), 2 projects funded by the European Space Agency (ESA), 2 projects funded by the Italian Space Agency (ASI), and 3 projects funded by the Italian Ministry of Education, Universities and Research (MIUR). His main research focus is on the application of systems and control theory methodologies to the analysis and control of networks. He is the author of more than 30 journal papers and over 60 conference papers and book chapters on these topics.
Lorenzo Ricciardi Celsi was born in Rome, Italy, in 1990. He received the B.Sc. degree in Electronics Engineering in 2011 and the M.Sc. degree in Control Engineering in 2014, both summa cum laude from the University of Rome La Sapienza. He is currently a PhD Candidate in Automatica, Bioengineering and Operations Research at the same university. He has been working on reinforcement learning algorithms within the framework of the FP7 project T-NOVA and the MIUR project PLATINO. He is also working on the development of the intelligent multi-modal transport system foreseen by the H2020 project BONVOYAGE. His main research interests are: nonlinear systems and control theory with application to communication networks as well as to aircraft and spacecraft control, advanced control methodologies for multi-agent systems and machine learning algorithms and methods.
Vincenzo Suraci was born in Rome, Italy, in 1978. He graduated in Computer Engineering summa cum laude in 2004 at the University of Rome La Sapienza. In 2008, he received his Ph.D. degree in Systems Engineering at the Department of Computer, Control, and Management Engineering Antonio Ruberti (DIAG) of the same university. Currently, he is a Researcher at eCampus University and Project Manager at CRAT. His main research interest is to develop and adapt advanced control and operations research methodologies (such as reinforcement learning, column generation, hybrid automata, and discrete event systems) for the solution of challenging and emerging engineering problems, e.g., connection admission control, access technologies selection, QoE/QoS cognitive control, resource management over heterogeneous technologies, and convergence of heterogeneous networks. He has gained extensive experience in the field of applied research and project management. Since 2011, he has been managing the EU-funded Future Internet Core Platform research project FI-WARE. In 2012, he also applied for an EU patent on DVB as a result of his research in the framework of EU research projects.
About this article
Cite this article
Delli Priscoli, F., Di Giorgio, A., Lisi, F. et al. Multi-agent quality of experience control. Int. J. Control Autom. Syst. 15, 892–904 (2017). https://doi.org/10.1007/s12555-015-0465-5
DOI: https://doi.org/10.1007/s12555-015-0465-5