Empirical Statistical Model for LTE Downlink Channel Occupancy

This paper develops an empirical statistical channel occupancy model for downlink long-term evolution (LTE) cellular systems. The model is based on mixtures of statistical distributions for the channel holding times. The statistical distribution of the time during which the channels are free is also considered. The data are obtained through an extensive measurement campaign performed in Stockholm, Sweden. Two types of mixtures, of exponential and of log-normal distributions, are considered to fit the measurement findings. The log-likelihood of both mixtures is used as a quantitative measure of the goodness of fit. Moreover, the optimal number of linearly combined distributions is investigated using the Akaike information criterion (AIC). The results show that a good fit can be obtained with a mixture of either exponential or log-normal distributions. Even though the fitting is done for a representative case in time and space, the model is applicable in general to LTE and, in a wider sense, to other cellular systems.


Introduction
The need for different data rates in mobile broadband systems has been growing rapidly in recent years. In that regard, long term evolution (LTE) has been provided by the 3rd generation partnership project (3GPP) as a standard for packet-based adaptive data rate systems [1]. LTE has been further developed into LTE advanced (LTE-A) to provide higher data rates and better spectral efficiency [2]. For robust optimization of cellular systems in general, and LTE systems in particular, the traffic demand of cellular networks needs to be modelled.
Besides resource optimization, several other problems in cellular networks, such as performance evaluation and billing, require traffic modelling. Among the statistics used for traffic evaluation in cellular systems is the channel occupancy, defined as the time that a user occupies a channel in a cell while located in the serving area of that cell [3]. The channel usage of a cellular system is modelled as a two-state Markov chain [4]. The first state is the busy state, in which the channel is assigned to a user, whereas the second state is the idle state, in which the channel is free.
Many studies have been carried out to characterize the statistical distribution of cellular channel occupancy. In [5], it is shown that mobile telephony channel occupancy can be approximated by an exponential distribution. A great advantage of the exponential distribution is its tractability in finding analytical solutions to optimization problems. Therefore, the exponential distribution has been used intensively to model cellular channel occupancy; see [4] for an example. Nevertheless, many research findings report a poor match between the exponential distribution and empirical data [6]. One of the main disagreements is the heavy-tail behaviour of the empirical channel occupancy, which exponential distributions do not capture properly. Therefore, some heavy-tail distributions are used as alternatives to model the cellular channel occupancy, among which the log-normal distribution is found to fit the empirical data better [7,8].
Even though many studies have modelled cellular channel occupancy, none of them considers LTE. LTE channel occupancy modelling is therefore a topic that needs to be studied, and it is the main contribution of this paper. Furthermore, this paper explores fitting the empirical channel occupancy data to a linear mixture of either exponential or log-normal distributions, using LTE as an example of a cellular system.
Using a distribution mixture is motivated by retaining the ease of use of exponential and log-normal distributions. Hence, we can avoid modelling the cellular channel occupancy with more complicated distributions such as the Beta and Kumaraswamy distributions [9]. Moreover, distribution mixtures are more general than single distributions and can be used to fit the data under different conditions. Consequently, algorithms developed for exponential or log-normal channel occupancy can still be used with their mixtures, with small changes that account for the linear combination of several components.
The rest of this paper is structured as follows. Section 2 handles the theoretical aspects of the paper, including the channel usage model and the use of distribution mixtures to fit data. Section 3 presents the measurement setup and the fitting results. Finally, Sect. 4 concludes the paper.
The theoretical aspects of the paper are handled in this section. The section starts by presenting the Markov-based model for the channel occupancy. Following that, the mathematical framework for distribution mixture fitting is introduced. Finally, exponential and log-normal mixture fitting are studied in particular.

System Model
The LTE channel usage can be modelled as a two-state Markov process. The two states are the ON state, representing an occupied channel, and the OFF state, denoting an idle channel. The temporal lengths of the ON and OFF states are random variables (RVs), hereafter denoted x and y, respectively. Figure 1 exhibits the channel usage model. The problem tackled throughout this paper is how to find statistical distributions that fit x and y.
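As an illustration of how samples of x and y could be obtained from a measured trace, the following minimal sketch splits a binary busy/idle sequence into ON and OFF holding times. The function name `on_off_durations`, the 0/1 encoding, and the sampling interval `dt` are assumptions made for this example; the source does not describe how the spectrum analyser output is turned into occupancy flags.

```python
from itertools import groupby

def on_off_durations(samples, dt=1.0):
    """Split a binary occupancy trace into ON and OFF holding times.

    samples: iterable of 0/1 flags (1 = channel busy), one per sampling instant.
    dt: hypothetical sampling interval, in seconds.
    Returns (x, y): lists of ON and OFF durations in order of occurrence.
    """
    x, y = [], []
    for state, run in groupby(samples):
        # Each run of identical flags is one visit to the ON or OFF state.
        duration = sum(1 for _ in run) * dt
        (x if state else y).append(duration)
    return x, y

trace = [1, 1, 0, 0, 0, 1, 0, 1, 1, 1]
x, y = on_off_durations(trace, dt=0.5)
```

The first and last runs of a finite trace are truncated by the observation window, so a careful analysis would probably discard or otherwise treat them separately.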
The rest of this section provides the theoretical aspects of distribution mixture fitting in general, and of exponential and log-normal mixture fitting in particular.
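Without loss of generality, the RV x is considered in the remainder of this paper; the same findings apply to y. Denote the empirical probability density function (pdf) of x as g(x). Then g(x) can be fitted with a linear combination of k pdfs as

$$g(x) \approx \sum_{i=1}^{k} p_i\, f(x \mid \Theta_i), \tag{1}$$

where $p_i$ is the weight of pdf number i, $f(\cdot)$ denotes a single pdf, and $\Theta_i$ is the set of distinct distribution parameters of pdf number i. For the whole mixture model, $\Omega$ contains all the distinct mixture parameters and is defined as

$$\Omega = \begin{bmatrix} p_1 & p_2 & \cdots & p_k \\ \Theta_1 & \Theta_2 & \cdots & \Theta_k \end{bmatrix}^{T}, \tag{2}$$

with $(\cdot)^{T}$ denoting the transpose. Note that the formulation of $\Omega$ in (2) assumes that the mixture is composed of components of the same distribution type, which is the case considered in this paper. The goodness of fit is judged through the log-likelihood estimator, $L(x \mid \Omega)$, found as

$$L(x \mid \Omega) = \sum_{n=1}^{M} \ln\!\left( \sum_{i=1}^{k} p_i\, f(x_n \mid \Theta_i) \right), \tag{3}$$

where $x_n$, $n = 1, \dots, M$, are the observed samples of x.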
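The mixture log-likelihood can be evaluated numerically along the following lines. This is a sketch under the assumption that the log-likelihood is the usual sum, over the observed samples, of the logarithm of the mixture density; the function names and the toy data are illustrative and not taken from the measurements.

```python
import math

def exp_pdf(x, lam):
    """Exponential pdf with rate lam."""
    return lam * math.exp(-lam * x)

def lognormal_pdf(x, mu, sigma):
    """Log-normal pdf with log-mean mu and log-standard-deviation sigma."""
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))

def mixture_log_likelihood(data, weights, params, pdf):
    """Sum over the samples of log(sum_i p_i * f(x | Theta_i))."""
    total = 0.0
    for x in data:
        density = sum(p * pdf(x, *theta) for p, theta in zip(weights, params))
        total += math.log(density)
    return total

data = [0.2, 0.5, 1.0, 3.0]  # toy samples, for illustration only
# A single exponential with rate 1 ...
ll_single = mixture_log_likelihood(data, [1.0], [(1.0,)], exp_pdf)
# ... equals a two-component mixture whose components are identical.
ll_mix = mixture_log_likelihood(data, [0.5, 0.5], [(1.0,), (1.0,)], exp_pdf)
```

Passing `lognormal_pdf` with `(mu, sigma)` tuples in `params` evaluates a log-normal mixture with the same function.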

Exponential Distributions Mixture Fitting
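In [10], a linear combination of exponential pdfs is introduced to fit heavy-tailed data. For the exponential mixture, pdf number i has the form in (4a), while the collection of distinct parameters, $\Omega_{\exp}$, is expressed in (4b):

$$f(x \mid \Theta_i) = \lambda_i e^{-\lambda_i x}, \tag{4a}$$

$$\Omega_{\exp} = \begin{bmatrix} p_1 & p_2 & \cdots & p_k \\ \lambda_1 & \lambda_2 & \cdots & \lambda_k \end{bmatrix}^{T}. \tag{4b}$$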
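The rest of this subsection shows how to find $\Omega_{\exp}$, following the essence of [10]. The procedure is recursive: it starts by fitting the tail and moves backwards. Starting from the assumption that the part of the tail where $x > c_1$ can be fitted exclusively with the first exponential distribution, then

$$p_1 e^{-\lambda_1 c_1} = F^{c}(c_1), \tag{5a}$$

where $F^{c}(x)$ is the empirical complementary cumulative distribution function (CCDF) of x. Similarly,

$$p_1 e^{-\lambda_1 b c_1} = F^{c}(b c_1). \tag{5b}$$

Accordingly, the first pair $(\lambda_1, p_1)$ is found as

$$\lambda_1 = \frac{1}{(b-1)\,c_1} \ln\frac{F^{c}(c_1)}{F^{c}(b c_1)}, \qquad p_1 = F^{c}(c_1)\, e^{\lambda_1 c_1}. \tag{6}$$

Following the same idea, the pairs $(\lambda_i, p_i)$ for $2 \le i \le k-1$ are found as

$$\lambda_i = \frac{1}{(b-1)\,c_i} \ln\frac{F^{c}_{i}(c_i)}{F^{c}_{i}(b c_i)}, \qquad p_i = F^{c}_{i}(c_i)\, e^{\lambda_i c_i}, \tag{7}$$

where

$$F^{c}_{i}(c_i) = F^{c}(c_i) - \sum_{j=1}^{i-1} p_j e^{-\lambda_j c_i}, \qquad F^{c}_{i}(b c_i) = F^{c}(b c_i) - \sum_{j=1}^{i-1} p_j e^{-\lambda_j b c_i},$$

and $F^{c}_{1}(x) = F^{c}(x)$. Finally, the last pair $(\lambda_k, p_k)$ is found as

$$p_k = 1 - \sum_{j=1}^{k-1} p_j, \qquad \lambda_k = \frac{1}{c_k} \ln\frac{p_k}{F^{c}_{k}(c_k)}. \tag{8}$$

The values of $c_1$, b, and a are user defined, and the reader is referred to [10] for more details on how to set them.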
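The recursive tail-first procedure can be sketched in code as follows. This is a minimal illustration that assumes the procedure of [10] in its commonly stated form: each component is matched to the residual CCDF at two points, $c_i$ and $b c_i$, and the last component absorbs the remaining probability mass. The function name `fit_exp_mixture` and the choice to pass the whole list of matching points `c` explicitly are assumptions of this sketch; the source only states that $c_1$, b, and a are user defined.

```python
import math

def fit_exp_mixture(ccdf, c, b):
    """Tail-first recursive fit of a k-component exponential mixture.

    ccdf: callable returning the (empirical) CCDF F^c(x).
    c: matching points c_1 > c_2 > ... > c_k (hypothetical input format).
    b: matching ratio, assumed to satisfy 1 < b < c_i / c_{i+1}.
    Returns (lams, ps): rates and weights, tail component first.
    """
    lams, ps = [], []

    def residual(x):
        # CCDF minus the contribution of the components fitted so far.
        return ccdf(x) - sum(p * math.exp(-lam * x) for p, lam in zip(ps, lams))

    # Components 1 .. k-1: match the residual CCDF at c_i and b * c_i.
    for ci in c[:-1]:
        f_c, f_bc = residual(ci), residual(b * ci)
        lam = math.log(f_c / f_bc) / ((b - 1.0) * ci)
        lams.append(lam)
        ps.append(f_c * math.exp(lam * ci))

    # Last component: remaining weight, rate matched at c_k.
    p_last = 1.0 - sum(ps)
    lam_last = math.log(p_last / residual(c[-1])) / c[-1]
    lams.append(lam_last)
    ps.append(p_last)
    return lams, ps

# Sanity check on a known two-component mixture (synthetic, not measured data).
true_ccdf = lambda x: 0.3 * math.exp(-0.1 * x) + 0.7 * math.exp(-2.0 * x)
lams, ps = fit_exp_mixture(true_ccdf, c=[50.0, 1.0], b=2.0)
```

On this synthetic example the two time scales are well separated, so the recovered rates and weights should be close to the true ones. The logarithms fail if a residual becomes non-positive, which is one way the procedure can break down as more components are requested.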

Log-Normal Distributions Mixture Fitting
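In this paper, a mixture of log-normal distributions is used to improve the goodness of fit for cellular channel occupancy compared with a single log-normal distribution. In [11], a mixture of normal distributions is used to fit specific data. To deal with the monotonic behaviour of the measured cellular channel occupancy, a log-normal mixture can be used instead of a normal mixture. The pdf number i and the collection of distinct distribution parameters, $\Omega_{\mathrm{lgn}}$, of a log-normal mixture are shown in (9a) and (9b), respectively:

$$f(x \mid \Theta_i) = \frac{1}{x\,\sigma_i \sqrt{2\pi}} \exp\!\left( -\frac{(\ln x - \mu_i)^2}{2\sigma_i^2} \right), \tag{9a}$$

$$\Omega_{\mathrm{lgn}} = \begin{bmatrix} p_1 & p_2 & \cdots & p_k \\ \mu_1 & \mu_2 & \cdots & \mu_k \\ \sigma_1 & \sigma_2 & \cdots & \sigma_k \end{bmatrix}^{T}. \tag{9b}$$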
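$\Omega_{\mathrm{lgn}}$ can be found using the Newton-Raphson optimization method by solving the equation

$$\nabla L(x \mid \Omega_{\mathrm{lgn}}) = 0. \tag{10}$$

Starting from an initial guess $\Omega^{(1)}_{\mathrm{lgn}}$, $\Omega^{(i+1)}_{\mathrm{lgn}}$ is updated as

$$\Omega^{(i+1)}_{\mathrm{lgn}} = \Omega^{(i)}_{\mathrm{lgn}} - \left[ H\!\left( L(x \mid \Omega^{(i)}_{\mathrm{lgn}}) \right) \right]^{-1} \nabla L(x \mid \Omega^{(i)}_{\mathrm{lgn}}), \tag{11}$$

where $H(\cdot)$ denotes the Hessian matrix. As the Hessian matrix needs to be updated at every iteration, the stopping criterion is the convergence of H.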
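To make the Newton-Raphson update concrete, the sketch below applies it to the simplest case of a single log-normal component (k = 1, so the weight is fixed to 1), with an analytic gradient and 2x2 Hessian of the log-likelihood in (mu, sigma). This is a deliberate simplification rather than the full mixture fit, and it stops on the step size instead of on the convergence of the Hessian; the function name and the synthetic data are assumptions of the example.

```python
import math

def newton_lognormal(data, mu, sigma, tol=1e-10, max_iter=50):
    """Newton-Raphson MLE of (mu, sigma) for a single log-normal."""
    y = [math.log(v) for v in data]
    n = len(y)
    for _ in range(max_iter):
        r1 = sum(v - mu for v in y)
        r2 = sum((v - mu) ** 2 for v in y)
        # Gradient of the log-likelihood with respect to (mu, sigma).
        g_mu = r1 / sigma**2
        g_sg = -n / sigma + r2 / sigma**3
        # Hessian entries.
        h_mm = -n / sigma**2
        h_ms = -2.0 * r1 / sigma**3
        h_ss = n / sigma**2 - 3.0 * r2 / sigma**4
        det = h_mm * h_ss - h_ms * h_ms
        # Newton step: theta <- theta - H^{-1} * gradient (explicit 2x2 inverse).
        d_mu = (h_ss * g_mu - h_ms * g_sg) / det
        d_sg = (h_mm * g_sg - h_ms * g_mu) / det
        mu -= d_mu
        sigma -= d_sg
        if abs(d_mu) + abs(d_sg) < tol:
            break
    return mu, sigma

# Synthetic samples whose logarithms are -1, -0.5, 0, 0.5, 1.
data = [math.exp(v) for v in (-1.0, -0.5, 0.0, 0.5, 1.0)]
mu_hat, sigma_hat = newton_lognormal(data, mu=0.05, sigma=0.6)
```

For a single component the maximum-likelihood solution is known in closed form (mean and standard deviation of the log-samples), which gives a simple check on the iteration. Newton-Raphson is sensitive to the starting point, so a full mixture fit would likely need a reasonable initial guess for every component.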

Optimizing the Number of Distributions
To optimize the value of k, the Akaike information criterion (AIC) [12] is used. AIC is a statistical model identification criterion used to optimize the model order [12]. It is calculated as the log-likelihood penalized by the number of independent model parameters:

$$\mathrm{AIC}(x, \Omega, N) = -2 L(x \mid \Omega) + 2N, \tag{12}$$

where N is the model order, defined as the number of independent model parameters. The optimal model order is found by minimizing the value of AIC in (12). For the exponential mixture, each pair i with $1 \le i \le (k-1)$ represents a single independent parameter, while the last pair $(\lambda_k, p_k)$ is fully dependent on the other pairs. Hence, the exponential mixture has $N = k - 1$ independent parameters. Accordingly, the optimal model order for the exponential mixture, $k^{\exp}_{\mathrm{AIC}}$, is found as

$$k^{\exp}_{\mathrm{AIC}} = \arg\min_{k} \left\{ -2 L(x \mid \Omega_{\exp}) + 2(k-1) \right\}. \tag{13}$$

For the log-normal mixture fitted with the Newton-Raphson method, there are $N = 3k$ independent parameters, as all the components of $\Omega_{\mathrm{lgn}}$ are independent. Therefore, the optimal model order for the log-normal mixture, $k^{\mathrm{lgn}}_{\mathrm{AIC}}$, is determined as

$$k^{\mathrm{lgn}}_{\mathrm{AIC}} = \arg\min_{k} \left\{ -2 L(x \mid \Omega_{\mathrm{lgn}}) + 6k \right\}. \tag{14}$$

Measurements
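The model order selection can be illustrated with a short sketch. It assumes the standard AIC form, minus twice the log-likelihood plus twice the number of independent parameters, with N = k - 1 for the exponential mixture and N = 3k for the log-normal mixture as stated above. The log-likelihood values are made up for illustration and are not measurement results.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: -2 L + 2 N."""
    return -2.0 * log_likelihood + 2.0 * n_params

def best_order(log_likelihoods, n_params_of_k):
    """Return the k minimizing AIC, together with all AIC scores.

    log_likelihoods: dict mapping k to the fitted log-likelihood.
    n_params_of_k: function mapping k to the number of independent parameters.
    """
    scores = {k: aic(ll, n_params_of_k(k)) for k, ll in log_likelihoods.items()}
    return min(scores, key=scores.get), scores

# Hypothetical log-likelihoods that improve with k and then flatten out.
ll_by_k = {1: -120.0, 2: -100.0, 3: -98.5, 4: -98.4}
k_exp, _ = best_order(ll_by_k, lambda k: k - 1)  # exponential mixture, N = k - 1
k_lgn, _ = best_order(ll_by_k, lambda k: 3 * k)  # log-normal mixture, N = 3k
```

With identical log-likelihood curves, the heavier penalty of the log-normal mixture selects a smaller order here (k = 2 versus k = 3), which mirrors the qualitative effect of the penalty term.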

Measurements Setup
The empirical downlink LTE traffic is obtained through a measurement campaign performed at an indoor location in Kista, Stockholm, Sweden. The measurement location has the GPS coordinates 59°24′19.13″ N, 17°56′56.12″ E. The measurement area is densely occupied by offices, with a shopping mall and residential buildings in the surroundings. A Google map of the measurement location is shown in Fig. 2.
For robust measurements, a real-time spectrum analyser (RTSA) is used to collect the data. The data is fed to the RTSA through a wideband tunable antenna. Figure 3 exhibits the measurement setup. Since different channels experience different loads at different times, the measurements are treated in time spans of 2 h. Hereafter, the findings for an LTE downlink traffic channel are discussed as a representative case. The results for the other channels and systems are similar, with different parameters. The presented results are for the measurements carried out on a 1.4 MHz channel lying between 2650.6 and 2652.0 MHz during the period Wednesday, 2013/10/02, 09:00 am to 11:00 am.

Fitting Results
Before diving into the fitting results, it is important to note that the LTE load in the measurement area changes with time. These changes are depicted by the duty cycle values obtained over a week of measurements, shown in Fig. 4. Even though different loads are experienced at different times, the fitting procedure is the same and the findings are similar, with different values. Hereafter, the results for the period Wednesday, 2013/10/02, 09:00 am to 11:00 am are shown as an example.
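As a small illustration of the load metric, the sketch below computes a duty cycle from lists of ON and OFF durations. It assumes that duty cycle means the fraction of the observation time spent in the ON state; the source does not give an explicit definition, and the function name and numbers are illustrative.

```python
def duty_cycle(on_durations, off_durations):
    """Fraction of the total observed time during which the channel is busy."""
    busy = sum(on_durations)
    total = busy + sum(off_durations)
    return busy / total

# Toy ON and OFF holding times in seconds, for illustration only.
dc = duty_cycle([1.0, 0.5, 1.5], [1.5, 0.5])
```

Evaluating this per 2 h span would give one load value per span, which is one plausible way to obtain a weekly profile.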
Figures 5 and 6 show the empirical distribution together with the fitted exponential and log-normal mixtures, respectively. Both figures illustrate how the fitted mixtures approach the empirical distribution as k changes. A quantitative evaluation is obtained by means of the log-likelihood, which is provided in Fig. 7.
As shown in Fig. 5, lower values of k make the exponential mixture fit the tail well but fit the lower values of x poorly. In contrast, increasing k improves the fit in the lower region of x. This is explained as follows. Since the first pair $(\lambda_1, p_1)$ always characterizes the tail beyond $c_1$, all values greater than $c_1$ are guaranteed to be well fitted. Depending on the obtained values of $(\lambda_1, p_1)$ and the value of k, the remaining pairs $(\lambda_i, p_i)$ are obtained, and the last pair $(\lambda_k, p_k)$ is fully dependent on the previously obtained pairs. Therefore, when k increases, the part characterized by $(\lambda_k, p_k)$ decreases. However, for very large values of k, a point is reached where the property expressed in (5) no longer holds, which makes the recursive fitting procedure inapplicable for the remaining pairs. Therefore, there is a crossover point at which the log-likelihood starts to degrade with increasing k, as shown in Fig. 7. For the log-normal mixture, the higher the k, the better the fit, as the log-likelihood curve in Fig. 7 exhibits.

Fig. 2 The measurements location

Fig. 3 The measurements setup
Figure 7 depicts the obtained AIC for both the exponential and log-normal mixtures as k changes. According to the figure, the optimal model order is 7 for the exponential mixture and 4 for the log-normal mixture. The difference in model order between the two mixtures is explained by the influence of the parameter penalty term. As shown in (13) and (14), the log-normal mixture AIC is penalized more than the exponential mixture AIC. Moreover, for the same reason, in the case of the exponential mixture the AIC curve follows the log-likelihood curve. On the other hand, for the log-normal distributions mixture, the AIC and

The obtained distinct mixture parameter matrices for the optimal exponential and log-normal mixtures, $\Omega_{\exp}$ and $\Omega_{\mathrm{lgn}}$, are shown respectively below. As explained by (4b), the first column of $\Omega_{\exp}$ holds the probabilities of the different exponential distributions, with their corresponding values of $\lambda$ in the second column. Similarly, as in (9b), the probabilities of the log-normal distributions are placed in the first column of $\Omega_{\mathrm{lgn}}$, with the corresponding means and standard deviations in the second and third columns, respectively.

Conclusions
An empirical statistical model for the downlink LTE channel occupancy is introduced in this paper. The model is based on a linear mixture of exponential or log-normal distributions. Such mixtures characterize the downlink LTE channel occupancy better than a single exponential or log-normal distribution.

and $F^{c}_{1}(x) = F^{c}(x)$.

The Akaike information criterion is used to optimize the number of exponential or log-normal distributions composing the mixture. The log-normal mixture AIC is affected more by the model order than the exponential mixture AIC. The model is a general statistical model and can be used for other cellular systems.

Fig. 6 The empirical and fitted CDF for log-normal distributions mixture with different values of k

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Slimane Ben Slimane received his B.Sc. degree in electrical engineering from the University of Quebec at Trois-Rivieres, Canada, in 1985, and his M.Sc. and Ph.D. degrees, both from Concordia University, Montreal, Canada, in 1988 and 1993, respectively. During the period 1993-1995, he worked as a research associate and part-time instructor at Concordia University. He is currently an associate professor in the Radio Communication Systems group, department of Communication Systems (COS), the Royal Institute of Technology (KTH), Stockholm, Sweden. His research interest is in the area of wireless communications, with special emphasis on digital communication techniques for fading channels, error control coding, cooperative communications, spread spectrum communications, multicarrier transmission techniques, and cognitive radio.