Machine-Learning Deﬁned Networking: Towards Intelligent Networks

With the advent of 5G technology and an ever-increasing traffic demand, today Communication Service Providers (CSPs) experience a progressive congestion of their networks. The operational complexity, the use of manual configuration, the static nature of current technologies together with fast-changing traffic profiles lead to: inefficient network utilization, over-provisioning of resources and very high Capital Expenditures (CapEx) and Operational Expenses (OpEx). This situation is forcing the CSPs to change their underlying network technologies, and have started to look at new technological solutions that increase the level of programmability, control, and flexibility of configuration, while reducing the overall costs related to network operations. Software Define Networking (SDN), Network Function Virtualization (NFV) and Machine Learning (ML) are accepted as effective solutions to reduce CapEx and OpEx and to boost network innovation. This chapter summarizes the content of my Ph.D. thesis, by presenting new ML-based approaches in order to efficiently optimize resources in 5G metro-core SDN/NFV networks. The main goal is to provide the modern CSP with intelligent and dynamic network optimization tools in order to address the requirements of increasing traffic demand and 5G technology.

field. ML is a sub-field of computer science that originated from the study of pattern recognition and computational learning theory in artificial intelligence. By leveraging complex mathematical and statistical tools, computers can learn by input data how to solve specific problems that have been traditionally solved by human beings [8]. ML explores algorithms that can "learn" from and make predictions out of input data. The learning phase is also referred to as "training" an algorithm, by iterative and feedback-based mechanisms, to solve specific problems. Such algorithms operate by building a model in order to make data-driven predictions or decisions, rather than following static program instructions. ML algorithms are typically classified into three different categories: • Supervised Learning (SL) is used in a variety of applications, such as speech recognition, spam detection and object recognition. The goal is to predict the value of one or more output variables given the value of a vector of input variables. The output variable can be a continuous variable (regression problem) or a discrete variable (classification problem). A training data set comprises a set of samples of the input variables and the corresponding output values. • Unsupervised Learning (UL) consists of training an algorithm using only input vector data with the aim to find hidden patterns and similarities. Social network analysis, genes clustering and market research are among the most successful applications of unsupervised learning methods. It can address different tasks, such as clustering or cluster analysis. Clustering is the process of grouping data so that the intra-cluster similarity among the input data samples is high, while the inter-cluster similarity is low. • Reinforcement Learning (RL) is an approach to ML that trains algorithms using a system of positive and negative rewards. Contrary to what happens with the supervised and unsupervised learning, in RL we do not have training, validation and test set of data defined "a priori". An RL algorithm, or agent, learns how to perform a task by interacting with its environment. The interaction is defined in terms of specific actions, observations and rewards.
This chapter presents the main results of my PhD thesis [3] by introducing novel ML-based algorithms to solve various network optimization problems. The thesis provides contributions to all the ML branches by implementing supervised, unsupervised and reinforcement learning algorithms together with their implementations in real and emulated SDN/NFV testbed scenarios.

Network Traffic Prediction
Mobile, metro and core networks often suffer from resource inefficiency due to overprovisioning. It's a common practice to perform static resource allocation based on the peak-hour demand, because the current operational processes used by the network operators are too slow to dynamically allocate the resources following the daily demand variations. Over-provisioning leads to poor energy efficiency and high OpEx, as the resources are sub-utilized outside of the peak hour. Moreover, as the peakhour to average demand ratio continue to increase [2], the static resource allocation lead to higher and unnecessary OpEx and CapEx. As such, understanding how to correctly predict network behaviour plays a vital role in the management and provisioning of mobile and fixed network services. Network traffic prediction has become very important for network providers. Having an accurate traffic prediction tool is essential for most network management tasks, such as resource allocation, shorttime traffic scheduling or re-routing, long-term capacity planning, network design and network anomaly detection. A proactive prediction-based approach allows network providers to optimize network resource allocation, potentially improving the QoS. With this aim, we introduce a matheuristic (i.e. interoperation of heuristic and mathematical programming) for dynamic network optimization. In Fig. 1.1 we show the proposed model. Traffic is predicted on different spatial locations developing different machine-learning algorithms, such as Artificial Neural Network (ANN) [9], Recurrent Neural Network (RNN) [10] and Diffusion Convolutional RNN (DCRNN) [11]. The predicted traffic-demand is then used to optimize the network at various hours during the day, so to adapt resource occupation to the actual traffic volume. The optimization problem solvers are based on Integer Linear Programming (ILP) and heuristic algorithms; their goal is to minimize the power and energy consumption of the network elements, providing weights to assign to network links every hour. After that, a weighted graph is generated to route the actual traffic demands by a minimum-cost path algorithm. The time required by optimization is not an issue, since prediction allows to start performing optimization computation in advance (Offline). Such feature makes the solution reported suitable to be implemented as an SDN application for 5G scenarios. We have chosen the metro infrastructure of a mobile operator as use case; in particular, the objective of the resource optimization is the optical metro network used as the backbone for the mobile service. The tests demonstrate the effectiveness of our methodology, with results that match almost perfectly th!htbe behaviour of a network that performs optical routing reconfiguration with a perfect, oracle-like traffic prediction. The methodology and the detailed results can be found in Chaps. 2 and 3 of my Ph.D. thesis and in the following papers: [9-12].

Network Traffic Pattern Identification
Due to the highly predictable daily movements of citizens in urban areas, mobile traffic shows repetitive patterns with spatio-temporal variations. This phenomenon is known as tidal effect analogy to the rise and fall of the sea levels. Recognizing and defining traffic load patterns at the base station thus plays a vital role in traffic engineering, network design and load balancing since it represents an important solution for the Internet Service Providers (ISPs) that face network congestion problems or over-provisioning of the link capacity. State of the art works have dealt with the classification and identification of patterns through the use of techniques, which inspect the flow of data of a particular application. In the previous section we have shown dynamic network optimization techniques based on supervised learning algorithms, or prediction algorithms, while in this section, we use unsupervised learning to "learn" periodic trends (or patterns) from traffic data over time. We wanted to answer to the following questions: Do different areas of the city with the same social function display the same typical traffic pattern? Can they be used to optimize the metro-core optical network? In order to answer to these questions we analysed datasets concerning the city of Milan [13,14]. They show information regarding the internet traffic of Telecom Italia (TIM) service provider and the point of interests located throughout the city, such as schools, restaurants, social services, etc., that reveal the social function of a particular area. We developed a novel model for basic pattern identification based on matrix factorization methods by integrating the aforementioned datasets in order to assess the similarity of traffic patterns related to areas with the same type of point of interests. In the field of pattern recognition, the Non-negative Matrix Factorization (NMF) is one of the most used method thanks to the ability to detect basic flows inside large matrices. First, we apply the classical NMF method to a real-world dataset that collects the data traffic of mobile users at the base station level in the city of Milan. Afterwords, we propose an integration of multi-domain data on the basis of NMF and denote it as Collective-NMF based model. The final results show the presence of very similar patterns located in distant areas of the city that share the same point of interests, such as schools, public parks, commuting areas, clubs, commercial areas (see Fig. 1.2). The experiments confirm the presence of regular and periodic traffic due to the tidal effect experienced by the network. In fact, it depends not only on user habits but also by the land usage and POIs scattered in the city area. Taking advantage of the mobile traffic patterns, we propose two mathematical models with the goal of predicting how and when to optimize the resource allocation in the underlying physical network. The methodology and the detailed results can be found in Chaps. 4 and 5 of the Ph.D thesis and in the following papers: [15][16][17].

Reinforcement Learning for Adaptive Network Resource Allocation
While in the previous sections we investigated machine learning algorithms for network optimization based on supervised and unsupervised learning, in this section, we investigate the application of RL for performing dynamic Service Function Chain (SFC) resources allocation in SDN/NFV enabled metro-core optical networks. RL allows to build a self-learning system able to solve highly complex problems by employing RL agents to learn policies from an evolving network environment. Specifically, we build an RL system able to optimize the resources allocation of SFCs in a multi-layer network (packet over flexi-grid optical layer). An SFC is composed of an ordered sequence of Virtual Network Functions (VNFs). Therefore, given a set of SFC provisioning requests and their QoS requirements, the proposed method finds the optimal positioning of the VNFs in the service layer and allocates the network resources of the IP/MPLS and optical layer, respectively. Furthermore, in order to meet the SFC traffic variation over time, it dynamically predicts if and when to reconfigure the SFCs that no longer meet the required QoS criteria. Each reconfiguration implies an availability penalty on the service as it requires a temporary service interruption to allow for VNF migration and/or network resource reallocation. Thus, the mechanism presented here can be seen as a way of spending reconfiguration tokens over the span of a day. The proposed RL-based algorithm aims to spend these tokens in order to minimize the blocking probability of traffic requests finding the best action in the trade-off between availability and performance. Reinforcement learning is an approach to machine learning that trains algorithms using a system of positive and negative rewards. An RL algorithm, or agent, learns how to perform a task by interacting with its environment. The interaction is defined in terms of specific actions, observations and rewards. As we can see from Fig. 1.3, an agent performs an action at time t based on the reward and state (or observations) obtained by the environment. The action performed by the Fig. 1.3 Schema of the proposed RL system [18] agent will produce another state of the environment and a reward at a time t + 1 and so on. The environment, namely network environment, consists of a multi-layer model formulation based on Mixed ILP (MILP) that, given a set of SFC requests, finds the optimal VNF placement and Routing and Wavelength Assignment (RWA). The objective function of the MILP aims to maximize the number of successfully routed SFCs, minimizing: reconfiguration penalty, blocking probability and power consumption of network elements. At each time step t, i.e. hour, the network environment exposes its state S(t) made by 3 observations: O 1 (t): total number of reconfigurations of each SFC, at time t; O 2 (t): total number of blocked requests of each SFC, at time t; O 3 (t): traffic volume request in Gbps of each SFC, at time t. As shown in Fig. 1.3, the agent, namely Network Agent, is a Dense Neural Network (DNN) that takes as input the state of the environment S(t) and the result of the reward function R(t). As output, the agent decides if one or more SFCs need to be reconfigured. This decision is performed by means of an action A(t) executed on the environment. The reward function represents the effect of the action performed by the agent. It depends on the number of blocked service-chain requests; R(t) is positive if the agent managed to reconfigure the network in order not to have blocked requests, otherwise it is negative. Results show that, initially the system assigns negative rewards that decreases hour by hour when the algorithm does not block the requests. After few days of training, the RL agent figured out how to maximize the reward function and keep the objective function of the MILP stable to a minimum value respect to the case in which we do not use RL. In other words, it learnt how and when to reconfigure the SFCs in order to route the traffic requests by decreasing the blocking probability. The methodology and the detailed results can be found in Chap. 6 of the Ph.D. thesis and in the following paper: [18].

Implementation of Machine Learning in Real SDN/NFV Testbeds
While in the previous sections we have proposed novel algorithms in simulated networks, in this section we introduce briefly three direct applications of ML-based algorithms on different emulated and real SDN/NFV testbed scenarios.
1. First, we developed the network planner module under the Metro-Haul European project 2 in which ML-based algorithms interact with the control plane of the proposed SDN/NFV infrastructure. We introduced the software components inside the planning tool, which compose a framework that enables intelligent optimization algorithms based on ML to assist the control plane in taking strategic decisions. We presented a demonstration of a ML-assisted routing algorithm in an emulated network scenario. The proposed framework aims at guaranteeing a fair behavior towards past, current and future requests as network resource allocation decisions are assisted with ML approaches. The detailed results can be found in Chap. 7 of the Ph.D. thesis and in the following papers: [19][20][21][22] 2) The second is a field deployment of such solution in a real network located in an Italian city. These testbeds target the monitoring and the traffic engineering features based on RL of an SD-WAN solution and explores different approaches to understand their advantages and limitations. Our implementations provide an overlay WAN with controlled performance in terms of delay, losses and jitter over low-cost public Internet connectivity. Detailed results can be found in Chap. 9 of the Ph.D. thesis and in the following paper: [26].

Concluding Remarks
Machine learning is one of the game-changing technologies which only recently has become mature. With the growing availability of data, computing power and platforms, ML is now applicable within the academic world and by large companies. It is an extremely powerful technique which is able to perform intelligent tasks which were up to now usually done by humans. The networking community welcomed ML as the main mathematical tool for solving complicated optimization problems. ML can recognize, predict, advice, optimize and classify, and therefore it can support the automation of various network operations enhancing the telecom industry. This PhD research work introduces the concept of Machine-Learning defined Networking. It is an approach to network management, based on SDN and NFV, that enables self-learning algorithms to improve network performance and monitoring. We have demonstrated the effectiveness of this approach by developing different algorithms able to predict and classify network traffic in order to optimize 5G metrocore networks. ML represents an essential asset for building management tools for ever increasing intelligent networks. Research on this topic is growing at a dizzying pace and is opening up research directions for many areas in telecommunications.
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made. The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.