1 Introduction

Wi-Fi networks are experiencing two paradigm shifts, in size and in applications. The size of Wi-Fi networks has increased from small home and office networks to large-scale ultra-dense networks with Overlapping Basic Service Sets (OBSS). At the same time, a growing number of novel applications and services use Wi-Fi as their communication technology, such as content distribution [1], Internet of Things (IoT) [2], Intelligent Transportation Systems (ITS) [3, 4], Device-to-Device (D2D) based cooperative networking [5, 6] and Unmanned Aerial Vehicles (UAVs) [7]. These two paradigm shifts pose several challenges for legacy Wi-Fi networks. Two fundamental challenges in ultra-dense Wi-Fi networks are (i) handover prediction and (ii) access point (AP) selection.

1.1 Handover Prediction Problem

Handover prediction refers to the problem of anticipating the connection state of a mobile device associated with an AP. Handover prediction can play a key role in providing seamless connectivity in next generation networks. It brings two potential benefits. Firstly, accurate prediction of the handover event allows the transfer of the connection to a new AP to be initiated in time, which reduces the handover delay. Secondly, it prevents unnecessary handovers (i.e. ping-pongs) and thereby avoids connection disruptions in highly dynamic networks.

Handover prediction can be challenging in some cases. Fig. 1 illustrates different scenarios of inter-BSS handovers in Wi-Fi networks. A Wi-Fi user travels from point A to point E (following the trajectory shown as a red dashed line). As the user passes through the region where the radio coverage of AP-1 and AP-2 overlaps, the received signal strength (RSS) drops below the threshold value and the device starts scanning for an alternate connection. When the user moves a bit farther, to point D, the device discovers AP-2 with a stronger signal. It disassociates from AP-1 and associates with AP-2 (1st handover). The user continues to move, following the trajectory from point E to G (dashed blue line), and again passes through an overlapping region, this time of AP-2 and AP-3. At point F, the user changes association to AP-3 (2nd handover) and then back to AP-2 after moving a little farther (3rd handover). The user moves on along the third trajectory from point G to H, and changes association to AP-1 when approaching point H (4th handover). At point H, the user cannot move farther towards AP-1 because of an obstacle, and the signal from AP-2 becomes stronger even with a slight movement in any direction (5th handover).

Fig. 1 Inter-BSS handover scenario

From the above discussion, it is evident that in some cases the handover should not take place even though the signal strength drops slightly below the threshold level, in order to avoid the ping-pong effect.

1.2 Access Point Selection Problem

When a Wi-Fi device is located in the transmission range of more than one AP, it can associate with any of them, as shown in Fig. 2. By default, a station associates with the AP from which it first receives a beacon or a probe response frame. In practice, however, this kind of automatic association can degrade performance when the connection to the selected AP is weak. The optimal selection of an access point in dense WLANs is crucial for network performance.

Fig. 2 User association in overlapping BSSs

The legacy methods for user association in WLANs are (i) Strongest Signal First (SSF) and (ii) Least Loaded First (LLF). Both have shortcomings. In the SSF scheme, a station associates with the AP from which it receives the strongest radio signal; however, if that AP is over-utilized, associating more stations can cause congestion in the BSS, which increases packet loss and end-to-end packet delay [8,9,10]. In the LLF scheme, on the other hand, selecting the least loaded AP balances the load across multiple APs, but it may force a station to associate with a distant AP, so that the station suffers from poor connection quality. To address these shortcomings of the SSF and LLF schemes, the authors in [11] propose a new metric for AP selection, named potential bandwidth, which is defined as “the MAC layer bandwidth that an end-host is likely to receive if it were to affiliate with a given access point”. The new metric takes into account the signal strength, the AP load and, additionally, the contention on the wireless medium. However, the technique in [11] may not achieve the desired performance if the APs use different beacon frequencies. It is therefore necessary to devise an AP selection strategy that improves the overall network performance while meeting the demands of new users.
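The two legacy policies can be sketched as follows. This is a minimal illustration rather than code from any of the cited works: the `ApInfo` record and its fields are hypothetical names, and the AP load is approximated here by the number of associated stations, which is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ApInfo:
    """Hypothetical per-AP view as seen by one station."""
    name: str
    rss_dbm: float      # received signal strength from this AP, in dBm
    num_clients: int    # assumed load proxy: number of associated stations


def select_ssf(aps: List[ApInfo]) -> ApInfo:
    """Strongest Signal First: pick the AP with the highest RSS."""
    return max(aps, key=lambda ap: ap.rss_dbm)


def select_llf(aps: List[ApInfo]) -> ApInfo:
    """Least Loaded First: pick the AP with the fewest associated stations."""
    return min(aps, key=lambda ap: ap.num_clients)


candidates = [
    ApInfo("AP-1", rss_dbm=-55.0, num_clients=20),
    ApInfo("AP-2", rss_dbm=-80.0, num_clients=3),
]
ssf_choice = select_ssf(candidates)   # strong signal, but a crowded BSS
llf_choice = select_llf(candidates)   # lightly loaded, but a weak signal
```

In this toy case the two policies disagree, which is exactly the situation in which each one exposes its shortcoming.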

Recently, new architectures based on the Software Defined Network (SDN) and Cognitive Networking (CN) paradigms have been proposed in the literature [12,13,14]. SDN [15,16,17] refers to networks in which the control and data forwarding functions are separated. In these architectures, network devices such as switches, routers and access points act as non-intelligent data forwarding devices, while intelligent functions such as data routing are implemented in a central controller, also called the SDN controller. Cognitive networking [14], on the other hand, refers to the paradigm in which networks automatically learn and respond to changes by actively taking decisions and planning network resources to achieve end-to-end performance goals. SDN offers the software adaptation needed to implement cognitive networks [18]. Cognitive networks can be realized using both distributed and centralized architectures. A novel approach to realizing cognitive networks is to adopt data-driven machine learning (ML) algorithms to address challenges in future ultra-dense and dynamic networks [19,20,21]. ML algorithms can be used for both network design [22,23,24,25] and network performance evaluation [26,27,28,29].

This paper proposes a centralized network architecture in which an SDN controller uses machine learning algorithms to solve the two aforementioned network problems. Firstly, it anticipates a handover event that is likely to occur and decides whether the handover is actually required. The proposed scheme reduces the likelihood of unnecessary handover events in the presence of Overlapping BSS (OBSS) in ultra-dense deployments. Secondly, it solves the AP selection problem by predicting the post-selection network throughput in order to choose the best AP. Network throughput is a significant metric of user experience. Prior knowledge of the future throughput can help to avoid network congestion, which can play a vital role in the AP selection decision. The proposed scheme can be used to develop large frameworks and testbeds for real-time monitoring and network diagnostics to boost the Quality of Service (QoS) in Wi-Fi networks.

1.3 Contribution

The paper presents efficient schemes to address two fundamental issues (i.e., handover prediction and AP selection) in cognitive Wi-Fi networks. The proposed schemes use data-driven machine learning algorithms to solve the two problems. The results are compared with traditional methods to validate the benefits of the proposed scheme. The paper also proposes useful datasets to train the machine learning algorithms for robust performance. The datasets used in this study are acquired from simulations performed in the ns-3 network simulator [30] and the Mininet emulator [31]. An interesting contribution is the use of the ns-3 building class in the simulation, which, to the best of our knowledge, has not been used in previous related works. The ns-3 building class allows both indoor and outdoor scenarios to be simulated by configuring buildings, floors, rooms and other real-world structures, which yields more realistic datasets.

The rest of the paper is organized as follows: Sect. 2 presents the state-of-the-art approaches to solve the handover prediction and AP selection problems. Section 3 describes the proposed scheme. Section 4 explains the methodology used to evaluate the performance of the proposed scheme. The evaluation results are reported in Sect. 5. Lastly, conclusions are drawn in Sect. 6.

2 Related Work

A cognitive network is a modern network architecture that is fully aware of the network state and can adapt to varying network conditions. Such a network learns from these adaptations to make future decisions that achieve the end-to-end performance goals. Several prototypes for realizing cognitive networks have been proposed in the last few years [12, 13, 41]. To realize efficient and scalable cognitive networks, machine learning techniques are used in [42, 43]. ML algorithms can be used for sophisticated learning and decision-making in large and complex wireless networks where analytical methods do not meet the performance requirements. Machine learning techniques have been applied to several problems in wireless networks, e.g. throughput estimation [34, 44], interference classification [27], delay analysis [35] and channel migration strategy [45].

In [46], the authors address the problem of throughput estimation for TCP flows in Wide Area Networks (WAN). They use Support Vector Regression (SVR) on a dataset obtained from a laboratory testbed. The prediction accuracy is evaluated using the “relative prediction error” metric. In [33], the authors address TCP throughput prediction in cellular (3G/HSPA) networks using seven prediction algorithms and compare the prediction accuracy of each using the root-mean-squared error (RMSE) metric. In [32], the authors propose a throughput estimation strategy for Long Term Evolution (LTE) cellular networks using several network parameters such as RSSI, Signal-to-Noise Ratio (SNR), Reference Signal Received Quality (RSRQ) and Reference Signal Received Power (RSRP). They use three machine learning algorithms, namely Generalized Linear Model (GLM), Artificial Neural Networks (ANN) and Random Forests (RF), to evaluate the predictor performance (Table 1).

Table 1 Summary of related works

Previous works on handover prediction in Wireless LANs [10, 36, 37] use timeseries forecasting methods. Classical timeseries forecasting methods include autoregression (AR), moving average (MA), Autoregressive Moving Average (ARMA), Autoregressive Integrated Moving Average (ARIMA), Simple Exponential Smoothing (SES) and other variants of these methods. For instance, the authors in [36] propose a method to trigger handover using RSS-based prediction in Wireless LANs. They argue that the RSS value remains constant during a short time interval (0.5 s in the proposed model), and predict the future RSS values using an autoregressive process of order 1, i.e. AR(1). The proposed scheme is evaluated using a dataset collected from ns-3 simulations. In [37], the handover event is predicted using the position information of the mobile device. In [10], the authors propose a handoff scheme based on continuous monitoring of wireless links using the short-term and long-term trends in the signal strength of beacon frames. The proposed scheme claims a 50% reduction in the handover delay as well as an improvement in the overall performance. In [38], the authors use two RSS-based methods for handover prediction, i.e. ARMA for stationary signals and ARIMA for non-stationary signals.
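To make the AR(1) idea concrete, the sketch below fits a single autoregressive coefficient to a short RSS series by least squares and produces a one-step-ahead forecast. It is only an illustration of a generic AR(1) predictor, not the exact procedure of [36]; the function name and the choice of a mean-centred least-squares fit are assumptions.

```python
from typing import Sequence


def ar1_forecast(rss: Sequence[float]) -> float:
    """One-step-ahead AR(1) forecast of an RSS series (values in dBm)."""
    if len(rss) < 3:
        raise ValueError("need at least three samples")
    mean = sum(rss) / len(rss)
    centred = [x - mean for x in rss]
    # Least-squares estimate of phi in x[t] = phi * x[t-1] + noise.
    num = sum(centred[t] * centred[t - 1] for t in range(1, len(centred)))
    den = sum(c * c for c in centred[:-1])
    phi = num / den if den > 0 else 0.0
    # Forecast the next sample from the most recent one.
    return mean + phi * centred[-1]
```

Note that such a predictor is mean-reverting: for a steadily falling RSS it forecasts a value between the last sample and the series mean, which is one reason why classical methods can lag behind fast changes.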

In [39], the authors propose a handover decision model based on traveling distance prediction. They use the RSS values to calculate the distance between the AP and the mobile terminal (MT) using the following formula.

$$\begin{aligned} RSS_P \; = \; E_t \times l_{OP}^{-\beta } \times 10^{\epsilon /10} \end{aligned}$$
(1)

where \(E_t\), \(\beta \) and \(\epsilon \) represent the transmit power (in mW) of the AP, the path loss exponent and a zero mean Gaussian distributed random variable, respectively, and \(l_{OP}\) denotes the distance between the AP and the MT. The algorithm assumes that the MT travels at a constant speed. Results are compared with Mohanty’s [47] and Varma’s methods [48]. The classical methods used in the aforementioned works perform poorly on noisy data and in multi-step forecasting [49]. Hence, there is an opportunity to use novel and efficient methods to solve the handover prediction problem. For instance, the authors in [50] propose handover prediction using recurrent neural networks (RNN) in vehicular networks. Although RNNs are a relatively more sophisticated method for time series prediction, we did not use them in our proposed scheme because of their complexity. Instead, we designed our dataset in a way that captures the time dependency. Furthermore, the work in [50] does not consider the AP load when selecting the new AP after handover.
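As a worked illustration of Eq. (1), the sketch below evaluates the model and inverts it for the distance when the shadowing term is neglected (\(\epsilon = 0\)), giving \(l_{OP} = (E_t / RSS_P)^{1/\beta }\). The function names and the numeric values are hypothetical; neglecting \(\epsilon \) is a simplifying assumption, not something stated in [39].

```python
def rss_from_distance(e_t_mw: float, distance: float, beta: float,
                      eps_db: float = 0.0) -> float:
    """Eq. (1): received power in mW at the given AP-to-MT distance."""
    return e_t_mw * distance ** (-beta) * 10 ** (eps_db / 10)


def distance_from_rss(rss_mw: float, e_t_mw: float, beta: float) -> float:
    """Invert Eq. (1) for the distance, assuming eps = 0 (no shadowing)."""
    return (e_t_mw / rss_mw) ** (1.0 / beta)


# Hypothetical values: 100 mW transmit power, path loss exponent 3.
rss_at_25 = rss_from_distance(e_t_mw=100.0, distance=25.0, beta=3.0)
recovered = distance_from_rss(rss_at_25, e_t_mw=100.0, beta=3.0)
```

With a non-zero \(\epsilon \) the recovered distance becomes a noisy estimate, which is consistent with the remark that such methods are sensitive to noisy data.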

To solve the AP selection problem in dense networks, the authors in [40] propose a decentralized algorithm. The proposed Optimal AP Selection Algorithm (OPASA) uses the estimated downlink SINR, which captures inter-BSS interference from overlapping APs. The authors show that OPASA outperforms the SSF algorithm by achieving up to 99% aggregate throughput gain. In [11], the authors propose potential bandwidth as a metric for AP selection. Potential bandwidth is calculated from the beacon timing of APs. The authors in [9] solve the AP selection problem using an approximation of the max-min fair bandwidth allocation algorithm. They compare the results and show that the proposed approximation algorithm outperforms both the SSF and LLF algorithms. The authors in [51] propose supervised ML methods (i.e., naive Bayes, decision trees, and random forest) to solve the AP selection problem, with the highest throughput improvement reported for the random forest method.

3 Proposed Scheme

3.1 Architecture

The proposed scheme consists of four components: SDN controller, feature extraction module, datasets, and machine learning module. Fig. 3 illustrates the functional architecture of the proposed scheme.

Fig. 3 Framework for cognitive network

The SDN controller constantly collects network data consisting of several parameters of interest, such as the device’s capabilities, supported rates, battery status, the user’s position and speed, the Wi-Fi channel being used, packet arrival rates, average throughput and frame retransmission ratios. These network attributes constitute raw data, which is then processed to extract useful features. In the feature extraction module, some attributes are used directly as features, whereas other features are derived from the raw data. For instance, the number of clients associated with an AP is used directly as a feature, whereas the packet inter-arrival time is a feature computed from the arrival times of two consecutive packets.

The features are then combined to form ML-ready datasets, which are used by ML algorithms to implement end-to-end learning. Two types of datasets are created, namely design datasets and evaluation datasets. The design datasets are used to predict a design parameter, e.g. the AP for association. Other examples include the maximum number of nodes served by the AP, the transmit power of the AP and the optimum channel to be used. The evaluation datasets are used for evaluating the network performance under the current conditions, e.g. transmission throughput. Other examples include average end-to-end packet delay, packet inter-arrival rates, network congestion and channel access delay.

3.2 Functional Overview

The SDN controller continuously monitors changes in the network (called triggers). The controller uses three types of triggers: (i) topology change, (ii) performance degradation, and (iii) periodic triggers. A new user sending an association request to an AP corresponds to the first type of trigger. Network throughput falling below, or end-to-end packet delay rising above, a pre-defined threshold level corresponds to the second type of trigger. Periodic triggers are activated at regular intervals regardless of any change in the network state. The activation of any of these triggers automatically runs the appropriate ML model. At fixed intervals, the ML model imports the required ML-ready dataset from the database in order to retrain. When triggered, the ML model can thus generate accurate output. The output of the ML model is used by the SDN controller to implement a control action. The proposed schemes for handover prediction and AP selection are explained as follows.
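The three trigger types can be sketched as a simple classification step in the controller loop. This is a hedged illustration only: the function name, its parameters, the default threshold values and the priority order in which the triggers are checked are all assumptions, since the text does not specify them.

```python
from enum import Enum
from typing import Optional


class Trigger(Enum):
    TOPOLOGY_CHANGE = "topology change"
    PERFORMANCE_DEGRADATION = "performance degradation"
    PERIODIC = "periodic"


def detect_trigger(new_association_request: bool,
                   throughput_mbps: float,
                   delay_ms: float,
                   seconds_since_periodic: float,
                   min_throughput_mbps: float = 10.0,   # hypothetical threshold
                   max_delay_ms: float = 100.0,         # hypothetical threshold
                   period_s: float = 60.0) -> Optional[Trigger]:
    """Return the trigger raised by the current network state, if any."""
    if new_association_request:
        # A new user asking to associate changes the topology.
        return Trigger.TOPOLOGY_CHANGE
    if throughput_mbps < min_throughput_mbps or delay_ms > max_delay_ms:
        return Trigger.PERFORMANCE_DEGRADATION
    if seconds_since_periodic >= period_s:
        # Fires at regular intervals regardless of the network state.
        return Trigger.PERIODIC
    return None
```

A controller would call such a function on each monitoring cycle and run the corresponding ML model whenever the result is not `None`.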

3.3 Handover Prediction Scheme

Handover prediction is solved as a binary classification problem using supervised learning techniques. The raw data for handover prediction consists of a timeseries of RSS values of beacon frames received from APs. To be used in supervised learning, the timeseries is transformed into a dataset as illustrated in Fig. 4 and further explained in this section.

Fig. 4 Handover prediction framework

Fig. 5 illustrates the proposed handover prediction scheme. Each device constantly monitors the RSS and records the RSS values of beacon frames in an RSS_REGISTER. The RSS_REGISTER is then shared with the controller every second (beacon frequency). The controller copies the values from the RSS_REGISTER into a database of raw data. Each time an RSS_REGISTER is received, it is appended to the previous data. The raw data is then accessed by the feature extraction module, which transforms it into an ML-ready dataset. The ML-ready dataset consists of several features, as depicted in Table 2. Each sample in the dataset consists of 13 features. Columns 1 to 10 contain per-second average RSS values for 10 s. Columns 11 to 13 contain statistics calculated from the first 10 columns, i.e. mean, minimum and maximum. Each row in the dataset is obtained by shifting the 10 s window of the previous row by one unit (1 s).
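The sliding-window transformation described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name is hypothetical, the input is taken to be a list of per-second average RSS values, and the class label column of the dataset is omitted because its exact encoding is given only in Table 2.

```python
from typing import List, Sequence

WINDOW = 10  # seconds of per-second average RSS values per sample


def build_samples(rss_per_second: Sequence[float],
                  window: int = WINDOW) -> List[List[float]]:
    """Turn an RSS time series into rows of 10 RSS values + mean, min, max."""
    samples = []
    # Each new row starts one second later than the previous one.
    for start in range(len(rss_per_second) - window + 1):
        values = list(rss_per_second[start:start + window])
        stats = [sum(values) / window, min(values), max(values)]
        samples.append(values + stats)   # 13 features per sample
    return samples
```

A series of 12 one-second readings therefore yields three samples, and consecutive samples share nine of their ten RSS values.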

The controller constantly monitors the current association of the device. The method defines two RSS thresholds, denoted \(T_1\) and \(T_2\). \(T_1\) is an RSS level that is already low but at which an ongoing connection is still supported, even if the RSS drops somewhat below it. \(T_2\), in contrast, is the minimum RSS level required to support a connection: if the RSS drops even slightly below \(T_2\), the connection is terminated. The controller sends the first trigger when the RSS of the device drops below the first threshold \(T_1\). The first trigger indicates that a handover may occur within the next few seconds and that a proactive measure is therefore necessary. The trigger activates the machine learning module, which runs the algorithm at each time step to predict the probability of handover in the next time step. Note that the first trigger matters because it avoids the unnecessary processing that would result from running the ML algorithms continuously while the device is in good coverage. Once the trigger is generated, the ML module runs the trained model to predict whether a handover should be initiated. The ML module periodically imports the most recent feature vector from the dataset for inference and runs the model to predict the handover. The dataset is updated by appending the prediction decision for the given feature vector, in order to improve future learning and prediction accuracy.

Fig. 5 Handover prediction scheme

Table 2 Dataset for handover prediction

When a handover is predicted for a given feature vector, the handover process is initiated. After the handover is completed, and once the RSS from the new AP rises above \(T_1\), the controller sends another trigger to the machine learning module to stop the prediction process. If at any time the RSS drops to the second threshold \(T_2\), a handover is initiated without running the ML model (the manual handover decision is not illustrated in Fig. 5) and the dataset is updated by appending the handover decision to the given feature vector.
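The interplay of the two thresholds and the ML model can be summarized in a small decision function. This is a sketch under assumptions: the function name, the returned labels and the threshold values used in the usage note are hypothetical, and `predict_handover` stands in for whatever trained classifier the controller uses.

```python
from typing import Callable, Sequence


def handover_decision(rss_dbm: float,
                      t1_dbm: float,
                      t2_dbm: float,
                      features: Sequence[float],
                      predict_handover: Callable[[Sequence[float]], bool]) -> str:
    """Decide what the controller does for the current RSS reading."""
    if t2_dbm >= t1_dbm:
        raise ValueError("T2 must be lower than T1")
    if rss_dbm <= t2_dbm:
        # Connection can no longer be supported: hand over without the model.
        return "forced_handover"
    if rss_dbm < t1_dbm:
        # Between T2 and T1: the ML model decides at each time step.
        return "predicted_handover" if predict_handover(features) else "stay"
    # At or above T1 the device is in good coverage; prediction is not run.
    return "idle"
```

For example, with hypothetical thresholds of -75 dBm for \(T_1\) and -85 dBm for \(T_2\), a reading of -80 dBm consults the model, whereas a reading of -90 dBm triggers the handover directly.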

3.4 Access Point Selection Scheme

The AP selection problem is addressed by the proposed scheme using a multi-criteria online learning technique, as illustrated in Fig. 6.

Fig. 6 AP selection flow diagram

When an AP receives an association request from a Wi-Fi station (STA), it forwards this request to the SDN controller. The SDN controller checks whether the dataset (Table 4) is available, so that a machine learning algorithm can be used to choose the best AP for the new user. When the network is first deployed, the dataset is not yet available. Hence, the controller uses the default algorithm (i.e. SSF or LLF) to select the AP. The controller computes the per-BSS throughput for the given network parameters. Once the dataset is populated with sufficient data points, any new association request is handled by the machine learning model. The proposed scheme predicts, for each AP in the overlapping BSS, the throughput that would result if the new STA were associated with that AP, and returns these estimates to the controller. The controller then selects the AP with the highest estimated throughput for the requesting client.
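The selection flow described above, including the fallback used before a dataset exists, can be sketched as follows. The function name and its arguments are hypothetical, SSF is chosen here as the default algorithm purely for illustration, and `predict_throughput` is a placeholder for the trained throughput estimator.

```python
from typing import Callable, Dict, Optional


def select_ap(candidate_rss_dbm: Dict[str, float],
              predict_throughput: Optional[Callable[[str], float]]) -> str:
    """Pick the AP for a requesting station among the overlapping candidates."""
    if predict_throughput is None:
        # No trained model yet: fall back to Strongest Signal First.
        return max(candidate_rss_dbm, key=candidate_rss_dbm.get)
    # Otherwise choose the AP with the highest estimated throughput
    # if the new station were associated with it.
    return max(candidate_rss_dbm, key=predict_throughput)
```

In this sketch the two branches can disagree: the strongest AP is chosen at deployment time, while the model may later prefer a weaker but less congested AP.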

To create the dataset for throughput estimation, the controller constantly records information such as the number of associated clients and packet information (e.g. timestamps, arrival time, packet size and signal-to-noise ratio). A new feature, Inter-Arrival Time (IAT), is calculated from the timestamp and arrival time of each packet. The two features, IAT and the number of clients connected to the access point, are the primary features used for throughput estimation. Furthermore, new features are derived from the IAT values using statistics such as minimum, maximum, mean, variance, skew and kurtosis. The features are collected over a time window of fixed duration for the whole network. The structure of the ML-ready dataset for throughput estimation is given in Table 3.
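The IAT-based features can be computed with the standard library alone, as in the sketch below. Several choices here are assumptions rather than details from the paper: the IAT is taken as the difference between consecutive packet arrival times, population (not sample) moments are used, kurtosis is reported as excess kurtosis, and the dictionary keys are illustrative names.

```python
import statistics
from typing import Dict, Sequence


def iat_features(arrival_times_s: Sequence[float],
                 num_clients: int) -> Dict[str, float]:
    """Summarize packet inter-arrival times over one time window."""
    iats = [b - a for a, b in zip(arrival_times_s, arrival_times_s[1:])]
    if len(iats) < 2:
        raise ValueError("need at least three packets in the window")
    mean = statistics.fmean(iats)
    var = statistics.pvariance(iats, mu=mean)
    if var > 0:
        std = var ** 0.5
        skew = sum(((x - mean) / std) ** 3 for x in iats) / len(iats)
        kurt = sum(((x - mean) / std) ** 4 for x in iats) / len(iats) - 3.0
    else:
        skew = kurt = 0.0   # all IATs identical: higher moments undefined
    return {
        "num_clients": float(num_clients),
        "iat_min": min(iats), "iat_max": max(iats), "iat_mean": mean,
        "iat_var": var, "iat_skew": skew, "iat_kurtosis": kurt,
    }
```

Each call produces one feature row for one window; the measured throughput for the same window would be attached as the regression target.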

Table 3 Dataset for throughput estimation

For AP selection, the controller simultaneously collects other parameters, from which it computes the features of the dataset. The structure of the dataset used for AP selection is listed in Table 4.

Table 4 Dataset for AP selection

4 Evaluation

The proposed scheme is implemented using the ns-3 simulator [30] and the Linux-based Mininet network emulator [31]. Mininet provides sufficient flexibility and control over the network to implement new configurations dynamically. Additionally, it allows interactive simulation: users can add traffic and applications on devices and apply topological changes at runtime, which enables more dynamic scenarios. ns-3, on the other hand, is a de facto standard for simulating wireless networks. It provides accurate models of the wireless channel. Recent versions of ns-3 also support indoor models, in which users can model buildings, floors, rooms and other real-world parameters.

To implement the proposed scheme for handover prediction, we performed extensive simulations in ns-3 to acquire raw network data. Both indoor and outdoor devices are deployed in the simulation. The simulation uses the design parameters defined in Table 5.

Table 5 Simulation parameters for building topology

The acquired raw data is transformed into a dataset as given in Table 2. The datasets are then used in the Mininet-based simulation to predict handovers using the Random Forest (RF) algorithm. Random Forest [52] is a supervised learning algorithm employed in classification problems. It builds several decision trees on randomly selected features and then combines their outputs, by majority vote in the classification case. It is a relatively simple algorithm and requires little time to train an ML model.

To implement the proposed scheme for AP selection, the controller is configured to simulate the two user association algorithms, i.e. SSF and LLF, in Mininet. The simulations include 3 APs and 50 STAs, which move randomly in the network and change association under the control of these algorithms. The network traces are collected and the dataset is created according to Table 4. Lastly, the controller uses an ML model to perform AP selection. The previously collected datasets are used to train the ML model to estimate network throughput. The STA-AP association that gives the highest estimated aggregate throughput is then selected.

The AP selection dataset involves the use of estimated throughput, and hence it is necessary to evaluate the accuracy of the algorithms that estimate the throughput. For this evaluation we used two algorithms, i.e. Multi-layer Perceptron (MLP) and SVR, because of their capability to predict such metrics well [43]. The raw traces from the simulated network are collected and transformed into useful features, as listed in Table 3, to create the ML-ready dataset. The dataset is divided into training and validation splits (70% and 30%). The two algorithms are trained on the training data and then tested on the unseen validation data. To further validate the models and to guard against over-fitting, 10-fold cross validation is used.
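The two evaluation protocols mentioned above, a 70/30 hold-out split and 10-fold cross validation, can be sketched at the level of sample indices. The function names, the fixed random seed and the interleaved assignment of samples to folds are assumptions made for illustration; the paper does not state how the splits were drawn.

```python
import random
from typing import List, Tuple


def train_validation_split(n_samples: int, train_fraction: float = 0.7,
                           seed: int = 0) -> Tuple[List[int], List[int]]:
    """Shuffle sample indices and cut them into training and validation sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    cut = int(round(train_fraction * n_samples))
    return indices[:cut], indices[cut:]


def k_fold_indices(n_samples: int,
                   k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Return k (training, held-out) index pairs; each sample is held out once."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    splits = []
    for i in range(k):
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((training, folds[i]))
    return splits
```

A model would be trained once per pair returned by `k_fold_indices` and its error averaged over the ten held-out folds.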

5 Results and Discussion

The performance of the proposed handover prediction scheme primarily depends on the accuracy of the machine learning model. Firstly, the prediction accuracy of the RF algorithm used for handover prediction is evaluated using a confusion matrix. A confusion matrix reports the percentage of correct and wrong predictions on the data points of each class in the dataset. The confusion matrix for the RF algorithm is shown in Table 6.
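For a binary handover/no-handover classifier, the confusion matrix and the accuracy derived from it can be computed as in the sketch below. The function names and the label encoding (1 for handover, 0 for no handover) are assumptions, and the example labels in the usage note are made up for illustration, not results from the paper.

```python
from typing import Dict, Sequence


def confusion_matrix(actual: Sequence[int],
                     predicted: Sequence[int]) -> Dict[str, int]:
    """Count true/false positives and negatives (1 = handover, 0 = none)."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            counts["tp"] += 1
        elif a == 0 and p == 1:
            counts["fp"] += 1
        elif a == 1 and p == 0:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


def accuracy(counts: Dict[str, int]) -> float:
    """Fraction of all predictions that were correct."""
    total = sum(counts.values())
    return (counts["tp"] + counts["tn"]) / total
```

In this setting a false positive corresponds to a handover that was predicted but not needed, which is the kind of error that leads to unnecessary handovers.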

Table 6 Confusion matrix

It can be seen that the RF algorithm predicts the handover events with high accuracy. In the next step, the proposed handover prediction scheme is compared with the other methods stated earlier to assess its overall performance. Figure 7 shows the performance of the proposed scheme versus two other handover prediction methods, based on the RSS forecasting method [36] and the traveling distance method [39]. The figure shows the cumulative number of unnecessary handovers over time for the three methods. It can be seen that the proposed scheme outperforms the two methods by reducing the overall number of unnecessary handovers. The analysis of the results shows that the proposed scheme reduces the number of unnecessary handovers by approximately 60% and 50% as compared to the RSS method and the traveling distance method, respectively.

Fig. 7 Unnecessary handovers using the proposed scheme versus previous approaches

The proposed scheme for AP selection problem is then evaluated. As the accuracy of the AP selection scheme directly relies upon the accuracy of the throughput estimation, we first evaluate the accuracy of the throughput estimation using two ML algorithms i.e. MLP and SVR.

The predicted throughput versus the actual throughput is plotted for both algorithms in Figs. 8 and 9. It can be observed that the MLP model provides better accuracy (i.e. the predicted values are much closer to the actual values) than the SVR model. To further quantify the performance of both models, three performance metrics, i.e. training time, Mean Squared Error (MSE) and R-squared, are computed and the results are listed in Table 7. The MLP-based model requires a longer training time (1.59 s) than the SVR model (0.211 s); however, it provides better accuracy (i.e. a lower MSE of 0.067 for MLP as compared to 0.211 for SVR) and better generalization to future predictions (i.e. a higher R-squared of 0.974 for MLP as compared to 0.916 for SVR). The better learning capability of the MLP comes at the cost of a longer training time, owing to its complex design (hundreds of neurons arranged in several layers).
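The two accuracy metrics used here follow their standard definitions: MSE is the mean of the squared differences between actual and predicted values, and R-squared is one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below computes both; the function names and the toy values in the usage note are illustrative only and do not reproduce the results in Table 7.

```python
from typing import Sequence


def mse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean squared error between actual and predicted throughput."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot
```

For instance, predicting `[2, 2, 3, 3]` for actual values `[1, 2, 3, 4]` gives an MSE of 0.5 and an R-squared of 0.6, whereas a perfect prediction gives 0 and 1 respectively.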

Fig. 8 Throughput estimation (using MLP)

Fig. 9 Throughput estimation (using SVR)

Table 7 Performance and complexity analysis of throughput estimation models

The throughput estimated by the MLP algorithm is then used for AP selection. Two performance metrics, i.e. average BSS throughput and per-STA throughput, are used to compare the throughput gain of the proposed scheme with that of the standard AP selection schemes (i.e. SSF and LLF). The results are shown in Figs. 10 (average BSS throughput) and 11 (per-STA throughput). It can be observed that the proposed scheme improves the average BSS throughput as well as the per-STA throughput. The analysis of throughput gains reports an average improvement of 9.2% and 8% as compared to the SSF and LLF schemes, respectively. It is worth noting that the work in [9] also proposes an alternative AP association scheme for load balancing in Wi-Fi networks and compares it against the SSF and LLF schemes. We could not compare our work against [9] because of the complexity of implementing that scheme in the ns-3 environment. Additionally, it would not be fair to compare our results against [9], given the differences in network configuration, topology, and the assumptions typically involved in analytical models versus simulation environments.

Fig. 10 Comparison of the average BSS throughput

Fig. 11 Comparison of per-STA throughput

6 Conclusions and Future Work

The paper proposes machine learning techniques to solve two well-known problems in WLANs, i.e. the handover prediction problem and the AP selection problem. The handover prediction problem is formulated as a binary classification problem over a time series of RSS values and is solved in the proposed scheme using a supervised learning algorithm, i.e. random forest. The goal in this binary prediction problem is to achieve high prediction accuracy. AP selection, in contrast, is a design problem: finding the AP-STA associations that maximize the network throughput. The proposed scheme solves it by estimating the throughput of all possible configurations and selecting the one that provides the highest throughput gain. The performance of the proposed scheme is validated, and the results report a significant improvement in overall performance. The proposed scheme for handover prediction outperforms the RSS method and the traveling distance method by reducing the number of unnecessary handovers by 60% and 50%, respectively. In the AP selection problem, the proposed scheme outperforms the SSF and LLF algorithms by achieving throughput gains of up to 9.2% and 8%, respectively. More broadly, it is expected that a large set of problems and challenges in future Wi-Fi networks can be solved using similar approaches. Although running ML applications on resource-constrained mobile devices can be challenging, the new edge computing paradigm [53] can be a promising approach to meet the computation requirements of future networks.