A self-organizing base station sleeping and user association strategy for dense cellular networks

Due to rising concerns about energy consumption in wireless networks, base station (BS) sleeping strategies were introduced to save energy in low-traffic scenarios. In this paper we analyse a weighted trade-off between energy consumption and user-perceived performance in dense cellular networks. We present an optimization problem representing this trade-off and derive properties of its optimal solutions. Using these properties we design a self-organizing strategy that dynamically (online) makes load-aware user association and BS operation decisions. Our strategy is self-organizing in the sense that it does not need any information or optimization beforehand; it simply relies on real-time load measurements at the BSs and user-reported SINR values. We furthermore present extensive simulation results, demonstrating the effectiveness of our self-organizing strategy and the impact of increased energy consumption on the user-perceived performance.


Introduction
Wireless cellular networks have experienced immense growth in traffic loads over the last years as a consequence of the rapid proliferation of smartphones, tablets, and their bandwidth-hungry applications. A key option to further increase wireless network capacity is to deploy dense cellular networks (DCNs) since they allow for higher spectral reuse and efficiency (shorter communication range, and thus lower path loss).
The denser concentration of base stations (BSs) raises new and challenging issues compared with the traditional macro cellular networks (MCNs), especially with regard to cell planning and traffic engineering [5]. Physical constraints will typically make it even harder to arrange BSs in an ideal hexagonal pattern, which causes the coverage areas to significantly overlap, and the natural cell regions to be irregularly shaped. As a result, the nominal traffic loads will tend to exhibit not only more spatial variation but also stronger temporal fluctuations. This variability in traffic could potentially result in severe load imbalances and performance degradation under existing BS sleeping strategies and traditional user association schemes.
BS sleeping strategies were introduced as a result of rising concerns about the energy consumption of wireless networks, both in terms of environmental impact and economic cost. In MCNs, BSs are responsible for about 60-80% of the total energy consumption [28], where a single BS may consume up to 90% of its peak energy consumption in the absence of any traffic due to cooling and pilot signalling [24]. In terms of economic costs, Nokia Corporation recently estimated [19] that the global energy bill of radio access networks is currently over 72 billion Euros. These costs and the environmental impact caused by the massive energy consumption of cellular networks drive the need to improve their energy efficiency. A common approach to save energy is to switch BSs into low-power operational modes in the absence of traffic, e.g. sleep modes.
Although DCNs are expected to experience more variability in traffic, the high density of BSs also offers more flexibility than traditional MCNs to deal with this increased variability. With the overlapping cell areas in DCNs, switching BSs into sleep mode does not directly lead to coverage holes, as is often the case in MCNs, forcing the latter to be more conservative in switching off BSs. That means DCNs can potentially react to traffic dynamics on a much smaller timescale than MCNs. For example, DCNs may be able to react to locally appearing (and disappearing) hotspots of traffic demands on a timescale of (several) minutes, while MCNs can typically only react to day/night traffic patterns appearing on a timescale of hours to days due to the severe coverage degradation when switching off a macro BS. To fully harvest the potential energy savings and capacity gains in dense cell deployments, more refined load-aware BS sleeping strategies must be developed [2,22].
An important issue is that reducing energy consumption by switching BSs into sleep mode reduces the system capacity. This is at odds with the primary goal of dense cellular networks: increasing network capacity. The latter is most important for optimizing user-perceived performance, which is typically done by applying load balancing schemes. From an energy perspective we wish to have a minimum number of active BSs, while optimizing the user-perceived performance ideally activates as many BSs as possible. Hence, as has also been mentioned by Zhou et al. [35], we have two opposing objectives and a trade-off has to be made.
To complicate matters further, in DCNs, traffic conditions are typically strongly varying over time and hard to estimate, making manual traffic engineering highly impractical. This motivates the need for self-organizing strategies: measurement-based algorithms that realize excellent performance without requiring explicit prior knowledge of system parameters like traffic conditions. In this paper we present a self-organizing, load-aware strategy that makes a trade-off between energy consumption and user-perceived performance for DCNs, using a pre-specified trade-off parameter.

Discussion and related work
In the past few years "green cellular networks" has been an active research field providing many approaches to reduce energy consumption. In this section we give an overview of different approaches proposed in the literature and point out how our approach differs from existing solutions.
In principle energy can be saved in two ways: reducing the downlink transmission power (e.g. [3,33]) yet keeping the BSs active, and/or switching BSs into sleep mode in situations of low load conditions. We focus on the second option, i.e. switching BSs into sleep mode.
There are roughly three network modelling perspectives in the existing literature on BS sleeping strategies: models that focus on a single BS (e.g. [12,13,32]), models that focus on a single HetNet cell with a macro BS and several pico BSs (e.g. [10]), and models that consider the entire network (e.g. [1, 3, 9, 11, 14-17, 23, 25, 27, 29, 31, 34]). We will briefly discuss the first two approaches and then focus on network wide models since our approach belongs to the latter category.
First, when considering a single BS, energy can be saved by switching the BS into sleep mode when no more users are in service. Several papers exploit an M/G/1 queueing model to derive asymptotically (locally) optimal activation strategies [12,13,32] such as activation after a pre-optimized sleep period or when the number of users awaiting the activation hits a certain threshold. These queueing based methods are easy to implement but require a priori information to operate optimally and also place users in a queue when the BS is not active, potentially leading to unnecessary delays.
Secondly, in the case of a single HetNet cell the macro BS is typically always on [10,21]. In such a scenario the authors optimize the operational modes of the pico BSs in the macro cell. Their approaches have the advantage that users are always directly placed in service, but limit the potential for saving energy as they do not consider the traffic conditions in neighbouring (HetNet) cells, nor allow for the macro BS to be turned off.
For the remainder of this section we focus on the cellular network at system level. In this setting, user assignments can specifically focus on active BSs, which eliminates delays experienced by users assigned to a sleeping BS. Moreover, BS sleeping strategies with system-wide awareness may better recognize when to (de-)activate a BS as they can account for traffic offloading to neighbouring active BSs, potentially leading to increased energy efficiency compared to local strategies. There are many results on models that consider the entire network and we further discuss them from three different perspectives: the considered network topology (regular or arbitrary), the proposed decision type of the algorithms (randomized or not randomized), and the considered user population (fixed or dynamic). We briefly consider each perspective, using them to position our work in relation to existing literature.
In terms of the network topology, several papers focus on the traditional macro cellular hexagonal BS positioning, presenting specific case studies [2], dealing with both transmission power and operation modes [3], or using detailed user position information [31]. These results rely on regular BS positioning, which is no longer a fair assumption in DCNs. The works outlined below (including our work) do apply to arbitrary network topologies.
Several papers exploit stochastic geometry to find an asymptotically optimal number of active BSs, or probabilities that BSs are active [9,23,25,29]. These all result in randomized strategies that can make a different decision when presented with the same load conditions. Other approaches [1, 11, 14-17, 27, 34] and, as we will see later, our approach consistently make the same decision under the same conditions, making them more reliable in the sense that the algorithms do not suffer from "unlucky tosses".
Looking at the user population dynamics, we find many approaches that optimize BS operation modes for a specific (fixed) user population [1,11,14,27,34]. Jahid et al. [15,16] on the other hand focus on user association specifically and on the use of on-site renewable energy sources (e.g. solar panels), and simply switch any BS into sleep mode during off-peak hours. We particularly mention the work of Zheng et al. [34], which applies game theory to include the effect that switching a BS into sleep mode leads to load increases at other BSs. All of these approaches are optimized for a static user population, potentially requiring a new optimization every time the user population changes. Considering the fast flow-level dynamics of DCNs, strategies optimized for static user populations may need to change operation modes faster than the optimal operation modes can be determined.
Although the literature overview given above is by no means exhaustive, it does paint a broader picture: there are few, if any, known strategies in green cellular networks that cover arbitrary network topologies with dynamic user populations and that make consistent decisions. One of the few exceptions is the work of Klessig et al. [17], which proposes a BS activation strategy inspired by the activation of cytotoxic killer cells in the immune system of mammals. However, even though their method is self-organizing in the sense that it does not need manual intervention during operation, it does require detailed information on the BS coverage areas and an a priori chosen BS hierarchy.

Main contributions
In this paper we analyse a weighted trade-off between energy consumption through switching BSs on and off and user-perceived performance in dense cellular networks. Optimizing the user performance is realized by applying load balancing user association schemes, and we furthermore introduce a trade-off parameter. To the best of our knowledge we are the first to analyse such a tunable trade-off for dense cellular wireless networks: typically the user-perceived performance is taken as a hard constraint. We present an optimization problem representing the above-mentioned trade-off and derive properties of its optimal solutions. Using these properties we design a self-organizing strategy that dynamically (online) makes load-aware user association and BS operation decisions. Our strategy is self-organizing in the sense that it does not need any information or optimization beforehand; it simply relies on real-time load measurements at the BSs and user-reported signal-to-interference-plus-noise ratio (SINR) values. We furthermore present extensive simulation results, demonstrating the effectiveness of our self-organizing strategy and the role of the trade-off parameter.

Organization of this paper
The remainder of this paper is organized as follows. In Sect. 2 we give the model description and introduce some useful notation. In Sect. 3 we present our optimization problem and derive conditions and properties of optimal strategies. Then, in Sect. 4 we propose our dynamic approach, which is based on the properties derived in Sect. 3. In Sect. 5 we show results of simulations we performed to gain insight into the performance characteristics of our proposed approach. Finally, in Sect. 6 we make some concluding remarks and mention directions for future research.

Problem statement and model description
The challenge is to dynamically adapt the set of active BSs and the user association to changing traffic demands such that a specific desired trade-off is realized between energy consumption and user-perceived performance. In our proposed approach, we first study this trade-off in a stationary regime: a period in time where the traffic demands (i.e. file transfer initiation rate, mean download file size) are constant. For such a stationary regime we formulate an optimization problem that represents the desired trade-off using a trade-off parameter. We then analyse the structure of the optimization problem to obtain properties for optimal user association and optimal BS (de-)activation. These properties are then used to design an optimal dynamic user association algorithm and sufficient conditions for dynamic BS (de-)activation. The user association algorithm and BS (de-)activation rules no longer depend on the stationary scenario and can react to changing traffic demands.
We will now proceed as follows. In the remainder of this section, we introduce the system model for a stationary scenario and we specify the power consumption model. In Sect. 3 we formulate the optimization problem for the stationary scenario and derive properties of optimal user association and BS (de-)activation. In Sect. 4 we will describe how these properties are turned into a dynamic algorithm, the Green Shadow Price Assignment (GSPA) algorithm.

System model
We consider a system with $L$ BSs, and we focus on downlink communication only. Within the considered area, BSs provide service to a time-varying set of users. For convenience, we assume that there is a discrete set of $N$ user locations, which may be interpreted as a suitable discretization of the overall coverage area. A location represents a class of users that all have (approximately) equal physical transmission rate characteristics with respect to the BSs. At location $n$, users initiate file transfers (downloads) at a rate $\nu_n$. The sizes of the file transfers initiated by users in location $n$ are independent and have mean $\beta_n$ Mbit. As soon as a user has downloaded its file it leaves the system. We do not explicitly consider user mobility.
Remark 2.1 Together, the rates $\nu_n$ and the mean file sizes $\beta_n$ determine the traffic demands of the system and they may change over time. However, in the stationary regime that we currently consider they are constant (by definition).
The bit rate at which users are served depends on their experienced SINR values. We do not account for fast fading and consider only average SINR values. A BS $l$ transmits at a fixed power in the time slots in which it is serving users. A user $i$ does not receive the full transmission power of the BS, but a reduced portion due to path loss. For convenience we do not consider frequency-selective fading, which means that the path loss is independent of the frequency and only depends on the distance between the location and the BS. The Shannon formula implies that user $i$ at location $n$ can receive a maximum communication rate $R_{n,l}$ (in bits per second) from BS $l$ given by
$$R_{n,l} = w \log_2\bigl(1 + \mathrm{SINR}(n,l)\bigr), \tag{1}$$
where $w$ is the fixed bandwidth available to the BS in Hz, and where $\mathrm{SINR}(n,l)$ is the signal-to-interference-plus-noise ratio that user $i$ experiences at location $n$ when served by BS $l$. We assume that there is an interference-free spectrum allocation to the BSs such that each BS has a fixed bandwidth $w$. BSs apply a proportional fair scheduling policy, as is also common in 4G LTE networks, such that the service rate experienced by a user depends on the number of users in service at the BS as well as the SINR of that user. Let $x_{n,l}$ be the fraction of the users that initiate a file transfer at location $n$ that is allocated to BS $l$, and let $x$ be a vector representing all the individual values of $x_{n,l}$. Then the long-term load, or resource utilization, of BS $l$ can be expressed by (see e.g. [26])
$$\rho_l(x) = \sum_{n=1}^{N} \frac{\nu_n \beta_n}{R_{n,l}}\, x_{n,l}. \tag{2}$$
We assume that the network has a centralized control unit, which may be realized by using Radio-over-Fiber (RoF) technology [18]. That means that at the BS site there is only a simple remote radio head, and all BS intelligence is located at a centralized entity. This has the advantage that important state information (specifically load estimates and user-reported SINR values) is known for the entire system, and can be used in the dynamic operation of the network.
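The rate and load definitions above can be illustrated with a small numerical sketch. The function names `shannon_rate` and `bs_loads`, the choice of bits (rather than Mbit) as the size unit, and all numbers in the usage note are assumptions made for illustration; they are not part of the model description.

```python
import math


def shannon_rate(bandwidth_hz, sinr_linear):
    """Maximum rate in bit/s from the Shannon formula w * log2(1 + SINR).

    The SINR is a linear ratio (not in dB).
    """
    return bandwidth_hz * math.log2(1.0 + sinr_linear)


def bs_loads(arrival_rates, mean_sizes_bit, rates, x):
    """Long-term load of every BS for a given user association.

    arrival_rates[n]  : file transfer initiation rate at location n (1/s)
    mean_sizes_bit[n] : mean file size at location n (bit, an assumed unit)
    rates[n][l]       : rate of location n at BS l (bit/s)
    x[n][l]           : fraction of the traffic of location n sent to BS l
    """
    n_bs = len(rates[0])
    loads = [0.0] * n_bs
    for n, (rate_n, size_n) in enumerate(zip(arrival_rates, mean_sizes_bit)):
        for l in range(n_bs):
            if x[n][l] > 0:  # skip unused links, which may have zero rate
                loads[l] += rate_n * size_n / rates[n][l] * x[n][l]
    return loads
```

For instance, with a bandwidth of 10 MHz and a linear SINR of 3, `shannon_rate(10e6, 3.0)` gives 20 Mbit/s; a location generating 0.5 transfers per second of 8 Mbit each then contributes a load of 0.2 to a BS serving it at that rate.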
Remark 2.2 At this point we wish to note that, although in our model and analysis we use a discrete set of locations, we aim to design decision rules that do not rely on these locations since the underlying discretization is most probably not available in practice. For the same reason, we also aim to avoid the specific use of $\nu_n$ and $\beta_n$ in the final decision rules. Instead we use estimates of the resource utilization at BSs to obtain load proxies and use these in our decisions. As a result the decision rules become self-organizing in the sense that they dynamically react to changing load conditions at the BSs. In other words, when the system changes from one stationary scenario to another, the decision rules automatically adjust to the new situation.

BS modes and power consumption model
A BS can be in one of three operational modes: active, sleep and start-up mode. A BS can directly go from active to sleep mode, but when a BS is switched from sleep to active mode, there is a start-up delay of $T_{\mathrm{On}}$ seconds before the BS is operational and can start serving users. The start-up delay is typically small (e.g. 1 second [12]) compared with the time-scale at which BSs are activated or de-activated (minutes). When a BS is switched to sleep mode, the users in service at that BS will be handed over to other BSs. Similarly, an activated BS may take over users of other BSs. We adopt the widely-used (e.g. [10,11,25,30,32,34]) load-dependent power consumption model of Auer et al. [7], where the power consumption $P_l$ in Watt of a BS $l$ is given by
$$P_l = P_0 + P \cdot \rho_l(x). \tag{3}$$
Here, $P_0$ is the constant power consumption of an operational BS with no traffic, and $P \cdot \rho_l(x)$ is the load-dependent power consumption term for given load $\rho_l(x)$. When a BS is in sleep mode, we assume its power consumption $P_{\mathrm{Off}}$ satisfies $0 \le P_{\mathrm{Off}} < P_0$. Moreover, in start-up mode, a BS has a power consumption of $P_{\mathrm{ST}}$ (e.g. $P_{\mathrm{ST}} = 2P_0$ [12]).
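A minimal sketch of this power model for the active and sleep modes is given below. The function name `bs_power`, its parameter names, and the wattages in the usage note are hypothetical; the start-up mode is left out for brevity.

```python
def bs_power(active, load, p0, p_slope, p_off=0.0):
    """Power consumption (W) of one BS under the load-dependent model.

    active  : True for active mode, False for sleep mode
    load    : resource utilization of the BS, between 0 and 1
    p0      : constant power of an operational BS without traffic (W)
    p_slope : load-dependent power at full load (W), the factor that
              multiplies the load in the model
    p_off   : power in sleep mode (W), assumed to lie in [0, p0)
    """
    if not 0.0 <= p_off < p0:
        raise ValueError("sleep power must satisfy 0 <= p_off < p0")
    if active:
        return p0 + p_slope * load
    return p_off
```

With the illustrative values `p0=100.0` and `p_slope=50.0`, an active BS at half load consumes `bs_power(True, 0.5, 100.0, 50.0)`, i.e. 125 W, whereas the same BS in sleep mode consumes only `p_off`.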

Optimization problem and analysis
In this section we first formulate an optimization problem for a stationary regime of the system, where the stationary regime represents a period in which the rates $\nu_n$ and mean file sizes $\beta_n$ are constant. Thereafter we break down the optimization problem into two separate parts: user association and BS (de-)activation. In the first part we will show how to realize an optimal (in the sense of the trade-off objective) user association for a given set of active BSs. In the second part we derive sufficient conditions for activating a BS or putting a BS into sleep mode. The results presented in this section will serve as input for Sect. 4, where we will use the optimal user association and the sufficient conditions for (de-)activating BSs to design a dynamic control algorithm. Let $s_l = 1$ if BS $l$ is in active mode, and $s_l = 0$ if it is in sleep mode. The start-up mode is not considered as it is a very short temporary mode preceding activation, having little effect on the overall power consumption. The objective is a trade-off between minimizing the total power consumption and optimizing the user-perceived performance. For the latter we specifically choose load balancing, as is common in cellular networks [4,6,8]. Load balancing is realized by minimizing the highest BS load, and hence we can formulate the following minimization problem, where $\alpha$ is the desired trade-off factor between power minimization and load balancing, and $U$ represents the maximum BS load (further explanation of the constraints is given below).
$$\min_{s,\,x,\,U} \; \sum_{l=1}^{L} P_l(x,s) + \alpha U \tag{4a}$$
subject to
$$P_l(x,s) = s_l \bigl(P_0 + P \cdot \rho_l(x)\bigr) + (1 - s_l) P_{\mathrm{Off}}, \quad \forall l, \tag{4b}$$
$$\rho_l(x) = \sum_{n=1}^{N} \frac{\nu_n \beta_n}{R_{n,l}}\, x_{n,l} \le U, \quad \forall l, \tag{4c}$$
$$\sum_{l=1}^{L} s_l\, x_{n,l} = 1, \quad \forall n, \tag{4d}$$
$$s_l \in \{0, 1\}, \quad \forall l, \tag{4e}$$
$$x_{n,l} \ge 0, \quad \forall n, \forall l. \tag{4f}$$
The objective (4a) is minimized over the operation modes $s$ of the BSs, over the user association $x$, and over the maximum BS load $U$. The user association $x$ is included since users can only be assigned to active BSs. The user association in combination with constraint (4c) and the minimization of $U$ gives rise to a load balancing problem, which is then weighted by a factor $\alpha$ against the power consumption. Constraint (4d) makes sure that all traffic is assigned, and only to active BSs (all locations have coverage). The problem (4a)-(4f) is a non-convex [due to constraints (4d)], mixed-integer, quadratically constrained [also due to constraints (4d)] quadratic program (QCQP; the objective contains quadratic terms), and in particular the non-convexity makes it hard to find (provably) globally optimal solutions. However, we do not wish to find optimal solutions for this formulation directly since we aim for decision rules that can be applied dynamically and in particular without knowledge of the values of $\nu_n$ and $\beta_n$. We will now consider the optimization problem (4a)-(4f) in two separate parts: user association and the operation modes of the BSs.
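To make the trade-off concrete, the sketch below evaluates an objective value for given operation modes and BS loads. It assumes that the objective is the total power consumption plus the trade-off factor times the highest load among the active BSs, as the description of the problem suggests; the function name `tradeoff_objective` and all numbers are hypothetical.

```python
def tradeoff_objective(modes, loads, alpha, p0, p_slope, p_off=0.0):
    """Total power plus alpha times the maximum load of the active BSs.

    modes[l] : 1 if BS l is active, 0 if it sleeps
    loads[l] : load of BS l (ignored for sleeping BSs)
    This form of the objective is an assumption, not a quoted formula.
    """
    total_power = sum(
        (p0 + p_slope * load) if mode == 1 else p_off
        for mode, load in zip(modes, loads)
    )
    max_load = max(
        (load for mode, load in zip(modes, loads) if mode == 1),
        default=0.0,
    )
    return total_power + alpha * max_load


# Two active BSs and one sleeping BS, illustrative values only.
value = tradeoff_objective([1, 1, 0], [0.4, 0.6, 0.0],
                           alpha=200.0, p0=100.0, p_slope=50.0)
print(value)
```

A larger `alpha` makes the maximum load weigh more heavily, so keeping more BSs active becomes attractive; a smaller `alpha` favours putting BSs to sleep.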

Optimal user association
To gain insight into the optimal user association, let us fix the operation modes of the BSs and consider the sub-problem of load balancing for the active BSs $l \in \mathcal{L}(s) = \{l \in \mathcal{L} \mid s_l = 1\}$ for any given operational mode $s$:
$$\min_{x,\,U} \; \sum_{l \in \mathcal{L}(s)} P_l(x,s) + \alpha U \tag{5a}$$
subject to
$$\rho_l(x) = \sum_{n=1}^{N} \frac{\nu_n \beta_n}{R_{n,l}}\, x_{n,l} \le U, \quad \forall l, \tag{5b}$$
$$\sum_{l \in \mathcal{L}(s)} x_{n,l} = 1, \quad \forall n, \tag{5c}$$
$$x_{n,l} \ge 0, \quad \forall n, \forall l, \tag{5d}$$
where $|\mathcal{L}(s)| = L$. The problem (5a)-(5d) is a linear programming problem (LP) with continuous decision variables $x$ and $U$. This problem is a variation on the LP for the user association problem presented by Post and Borst [26], where the only difference is in the objective function, which is originally $\min_{x,U} U$. Post and Borst [26] presented a dynamic user association algorithm, the shadow price assignment (SPA) algorithm, that realizes an optimal user association for the original objective function. We can modify the SPA-algorithm to also find optimal user assignment fractions $x^*$ for (5a)-(5d).
The idea of the original SPA-algorithm was to assign users to BSs using shadow prices $y_l \le 1$ for the BSs. The shadow prices are adapted over time depending on load proxies observed at the BSs, eventually leading to optimal user associations. We briefly explain the two most important differences with the original SPA-algorithm if we wish to apply it to find optimal solutions for the LP (5a)-(5d).
Let $y_l$ be the shadow price of BS $l$, and let $y$ denote the vector containing all shadow prices (including those of the BSs in sleep mode). The first, and perhaps most important, difference appears in the optimality condition of user assignments. Using the Karush-Kuhn-Tucker optimality conditions, we can derive that an optimal user assignment $x^*$ satisfies
$$x^*_{n,l}(y) > 0 \;\Rightarrow\; l \in \arg\min_{\{l' \,:\, s_{l'} = 1\}} \frac{y_{l'} + P}{R_{n,l'}}, \tag{6}$$
breaking ties at random when the minimizer is not unique. Condition (6) may be interpreted as follows: given (optimal) shadow prices $y$, the optimal user assignment allocates users to BSs with either a low shadow price or a high service rate. If all shadow prices are equal, users will simply be assigned to the BS $l$ that provides them with the highest rate $R_{n,l}$. The only difference compared with the original SPA-algorithm is the added power consumption term $P$. Notice that condition (6) may be implemented without relying on locations by using user-dependent service rates $R_{i,l}$ for user $i$. This can be thought of as if each user has its own unique location. In practice, the service rates $R_{i,l}$ can be obtained by using user-reported SINR values.
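The per-user version of this assignment rule can be sketched as follows: the user is sent to an active BS that minimizes the shadow price plus the load-dependent power term, divided by the user's rate at that BS, with random tie-breaking. The function name `choose_bs` and its argument layout are hypothetical, and the shadow prices are simply taken as given.

```python
import random


def choose_bs(user_rates, shadow_prices, active, p_slope, rng=random):
    """Pick the serving BS for one user.

    user_rates[l]    : rate the user would get at BS l (bit/s)
    shadow_prices[l] : current shadow price of BS l
    active[l]        : True if BS l is in active mode
    p_slope          : load-dependent power term added to every price
    """
    best_cost = None
    best_bs = []
    for l, (rate, price) in enumerate(zip(user_rates, shadow_prices)):
        if not active[l] or rate <= 0:
            continue  # sleeping or unreachable BSs cannot serve the user
        cost = (price + p_slope) / rate
        if best_cost is None or cost < best_cost:
            best_cost, best_bs = cost, [l]
        elif cost == best_cost:
            best_bs.append(l)
    if not best_bs:
        raise ValueError("no active BS can serve this user")
    return rng.choice(best_bs)  # ties are broken at random
```

With equal shadow prices the rule reduces to picking the active BS with the highest rate, for example `choose_bs([10e6, 20e6, 40e6], [1.0, 1.0, 1.0], [True, True, False], 50.0)` selects BS 1, since BS 2 is asleep.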
The second important difference is that the optimal shadow prices $y^*$ satisfy $\sum_{l=1}^{L} y^*_l = \alpha$ (instead of summing to one as in the original SPA-algorithm). More details about the modification of the original SPA-algorithm, and specifically the modified update step for the shadow prices, can be found in the "Appendix". The practical implementation is summarized in Sect. 4.3.
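Both differences can be traced back to a short Lagrangian argument, sketched below. The sketch assumes that the LP minimizes the load-dependent power plus $\alpha U$ subject to the load and assignment constraints (constant power terms are dropped since they do not affect the minimizer); the multipliers $z_n$ and the symbol $\Lambda$ are introduced here for illustration only, and $\nu_n \beta_n > 0$ is assumed.

```latex
% Sketch of the KKT argument under the assumed LP form.
\begin{align*}
&\min_{x \ge 0,\, U} \; \sum_{l \in \mathcal{L}(s)} P \rho_l(x) + \alpha U
 \quad \text{s.t.} \quad \rho_l(x) \le U \;\; \forall l \in \mathcal{L}(s),
 \qquad \sum_{l \in \mathcal{L}(s)} x_{n,l} = 1 \;\; \forall n, \\
&\rho_l(x) = \sum_{n=1}^{N} \frac{\nu_n \beta_n}{R_{n,l}}\, x_{n,l}. \\[4pt]
&\Lambda(x, U, y, z) = \sum_{l} P \rho_l(x) + \alpha U
 + \sum_{l} y_l \bigl(\rho_l(x) - U\bigr)
 + \sum_{n} z_n \Bigl(1 - \sum_{l} x_{n,l}\Bigr), \qquad y_l \ge 0. \\[4pt]
&\frac{\partial \Lambda}{\partial U} = \alpha - \sum_{l} y_l = 0
 \;\;\Longrightarrow\;\; \sum_{l \in \mathcal{L}(s)} y_l = \alpha. \\[4pt]
&\frac{\partial \Lambda}{\partial x_{n,l}}
 = \frac{\nu_n \beta_n}{R_{n,l}}\,(y_l + P) - z_n \ge 0,
 \quad \text{with equality whenever } x_{n,l} > 0. \\[4pt]
&\text{Hence } \frac{z_n}{\nu_n \beta_n}
 = \min_{l' \in \mathcal{L}(s)} \frac{y_{l'} + P}{R_{n,l'}},
 \quad \text{and} \quad
 x_{n,l} > 0 \;\Rightarrow\;
 l \in \arg\min_{l' \in \mathcal{L}(s)} \frac{y_{l'} + P}{R_{n,l'}}.
\end{align*}
```

Stationarity in $U$ thus explains why the shadow prices sum to $\alpha$, and stationarity in $x_{n,l}$ explains why the power term $P$ is added to each shadow price in the assignment rule.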

Sufficient conditions for changing operation modes
We will now study the operational modes of BSs. For notational convenience we take $P_{\mathrm{Off}} = 0$. The analysis for $P_{\mathrm{Off}} > 0$ only leads to one added term $P_{\mathrm{Off}}$ in the conditions that we derive, and $P_{\mathrm{Off}}$ is often assumed negligible compared to $P_0$.
Considering that we are dealing with a non-convex, mixed-integer QCQP problem, we are not planning to find the optimal set of BSs for a given situation, but rather take a different approach. We consider the current BS loads and test if we can (expect to) improve the objective (4a) by activating BSs or by switching BSs into sleep mode. Then, after we have made a change we measure the new realized BS loads, at which point we start to repeat the process. Although we do not expect to always find optimal operation modes, the advantage of this approach is that it can dynamically react to changing load conditions and that the choices are clearly motivated and consistent: given the same situation, this approach will take the same decision.
Recall that we wish to avoid relying on the discretization into locations. In Sect. 3.1 we have seen a dynamic algorithm that realizes an optimal user association for a given set of active BSs without relying on the discretization. Let $\rho_l(s)$ denote the resulting load of BS $l$ under the optimal user association realized by the modified SPA-algorithm for the set of active BSs represented by $s$. This allows us to focus only on the operation modes $s$ of BSs and the corresponding optimal BS loads $\rho_l(s)$, without having to consider the user assignments explicitly.
Switching to sleep mode. Suppose the system is currently using the operation modes $s$. Then the objective value (4a) can be written as $h(\rho(s))$, a function of the BS loads under the optimal user association. We will now explain how to use the shadow prices of the modified SPA-algorithm to obtain estimates $\hat{\rho}(s)$ for the new load values that result once the operation mode of a single BS has been changed. Considering that we are applying the modified SPA-algorithm as described in Sect. 3.1, the load proxies used by this algorithm can represent the values $\rho_l(s)$. Then, to obtain values for $\hat{\rho}_l(s)$ we will use the instantaneous user population at the BS $\hat{l}$ under consideration. Let us focus on the active BS $\hat{l}$, which serves the set of users $I_{\hat{l}}$ and which experiences a load of $\rho_{\hat{l}}(s)$. Then we can attribute this load proportionally to the users in service at BS $\hat{l}$ as follows: user $i \in I_{\hat{l}}$ is responsible for a fraction $\rho_{i,\hat{l}}$ of the load, as given by Eq. (10). Suppose now that after BS $\hat{l}$ has been switched to sleep mode, user $i$ is handed over from BS $\hat{l}$ to its new serving BS $l \in \arg\min_{l' \neq \hat{l}} \{(\tilde{y}_{l'} + P)/R_{i,l'}\}$, where $\tilde{y}_{l'}$ are the shadow prices directly after BS $\hat{l}$ has been switched into sleep mode (or directly after activation, see Sect. 4.2). Then the load that user $i$ adds to BS $l$ is the load it induced at BS $\hat{l}$ multiplied by a factor $R_{i,\hat{l}}/R_{i,l}$. Hence, if we hand over user $i$ from BS $\hat{l}$ to BS $l$, the new load $\hat{\rho}_l(s)$ of BS $l$ can be estimated by Eq. (11). Following this line of reasoning, we obtain the values $\hat{\rho}_l(s)$ (where $l \neq \hat{l}$ and $s_l = 1$) given by Eq. (12).

Activation. Now suppose BS $\hat{l}$ is in sleep mode, and we consider activating it. Then, in a similar way as described above, the load of candidate BS $\hat{l}$ after activation can be approximated by Eq. (13). Furthermore, the loads $\rho_l(x)$ of the other (active) BSs after activation of BS $\hat{l}$ can be approximated by Eq. (14). The new load estimates given by (12) and (14) can be used to obtain sufficient conditions for switching to sleep mode and activation respectively. However, the sufficient condition for activation can be further improved.
Since the load estimates are based on instantaneous user populations, the following situation is very likely. Under the modified SPA-algorithm there are multiple BSs that attain the maximum load, i.e. $|\arg\max_l \{\rho_l(x(s))\}| > 1$. When we activate a currently sleeping BS $\hat{l}$ it may attract load from some of the BSs in $\arg\max_l \{\rho_l(x(s))\}$, but very likely not all. This means that the maximum load among the BSs is not decreased according to the load estimates given by (14). Even though we indeed do not expect the newly activated BS $\hat{l}$ to be able to relieve all maximally loaded BSs, we can expect a cascading effect: BS $\hat{l}$ takes over some load from BS $l \in \arg\max_l \{\rho_l(x(s))\}$, which in turn allows BS $l$ to take over some load from another BS $l' \in \arg\max_l \{\rho_l(x(s))\}$. This effect is eventually realized by the modified SPA-algorithm, but it is not captured by the load estimates given by Eq. (14).
To account for the cascading effect described above, we propose an extra step in determining load estimates for BS activation. First we determine the load estimates according to Eq. (14). If $\hat{\rho}_{\hat{l}}(s)$ is higher than the old maximum load, then we do not expect to gain in the objective, and hence we can assume that $\hat{\rho}_{\hat{l}}(s) < \max_l \{\rho_l(s)\}$. Next we consider the set of BSs $\mathcal{L}^{\oplus}$ that had a load equal to the maximum load and that did not offload any users to the newly activated BS $\hat{l}$. We will then pretend that BS $\hat{l}$ will take some load by averaging the loads of the BSs in $\mathcal{L}^{\oplus}$ with that of BS $\hat{l}$ in a weighted manner. First, the weights for BSs $l' \in \mathcal{L}^{\oplus}$ are given by $w_{l'} = \min_{i \in I_{l'}} \{R_{i,l'}/R_{i,\hat{l}}\}$, and $w_{\hat{l}} = 1$, such that the weights represent the best possible ratio in which load from BS $l'$ can be offloaded to BS $\hat{l}$. Let $W = w_{\hat{l}} + \sum_{l' \in \mathcal{L}^{\oplus}} w_{l'}$ be the total weight; then the improved load estimates $\tilde{\rho}_l(s)$ are given by Eq. (15).

In short, we use load estimates (12) to obtain a sufficient condition for switching a BS into sleep mode. We use the load estimates (15) to obtain a sufficient condition for activating a BS, resulting in a set of candidate BSs for which a change in operation mode improves the objective. The condition (9) considers changing the operation mode of a single BS at a time. Theoretically we can simply consider a set of BSs rather than a single BS, but when we change the operation modes of more than one BS it immediately becomes unclear how the new loads will behave. Therefore we have chosen to focus on one BS at a time. Nevertheless, condition (9) may still present several candidate BSs for which a change in operation mode realizes a better objective. In the next section we will discuss which BSs are selected.
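The handover reasoning behind the sleep-mode load estimates can be sketched as follows. The per-user load shares at the candidate BS and the shadow prices that hold directly after the switch are taken as given inputs, and the function name `loads_after_sleep` and its data layout are hypothetical. Each user is moved to the remaining active BS that minimizes the shadow price plus power term divided by the user's rate, and its load share is rescaled by the ratio of its old rate to its new rate.

```python
def loads_after_sleep(loads, user_loads, user_rates, prices, active,
                      sleeper, p_slope):
    """Estimate the BS loads after BS `sleeper` is switched to sleep mode.

    loads[l]         : current load of BS l
    user_loads[u]    : load that user u induces at the sleeping candidate
    user_rates[u][l] : rate of user u at BS l (bit/s)
    prices[l]        : shadow prices directly after the switch (given)
    active[l]        : True if BS l is currently active
    Raises ValueError if some user has no other active BS in range.
    """
    estimate = list(loads)
    estimate[sleeper] = 0.0  # the sleeping BS no longer carries load
    for user, share in user_loads.items():
        rates = user_rates[user]
        candidates = [l for l in range(len(loads))
                      if active[l] and l != sleeper and rates[l] > 0]
        target = min(candidates,
                     key=lambda l: (prices[l] + p_slope) / rates[l])
        # A lower rate at the new BS means the same traffic costs more load.
        estimate[target] += share * rates[sleeper] / rates[target]
    return estimate
```

As an example with made-up numbers, a user that induced a load of 0.1 at 20 Mbit/s and is handed over to a BS where it only reaches 10 Mbit/s adds a load of 0.2 to that BS.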

Dynamic control
In this section we propose a self-organizing strategy which dynamically makes load-aware user association and BS operation decisions. These decisions are based on the optimality conditions for the optimization problem (4a)-(4f) derived in the previous section and use load proxies observed at the active BSs.
For the user assignments we apply a modified version of the SPA-algorithm as described in Sect. 3.1, which requires frequent updates of the shadow prices associated with the BSs. For the operation modes we use periodic decision moments, at each of which we change the operation modes of at most two BSs: at most one activation and at most one switch into sleep mode. The proposed strategy will therefore perform two types of updates, each on a different timescale: shadow price updates and operation mode updates. The shadow price updates are fully determined by the modified SPA-algorithm as described in Sect. 3.1, but we still need to specify how we update the operation modes. In addition, we will also specify what happens to the shadow prices $y^{(j)}$ when the set of active BSs changes. These two issues will be covered in the next two subsections. In Sect. 4.3 we will give a precise description of our proposed strategy.

Operation mode updates
At operation update moments we treat the sets of active and sleeping BSs separately. For the set of active BSs we check which ones are candidates for de-activation (switching into sleep mode). To do this we use the load estimates (12) to compute $\Delta_l(s^{(k)})$ for all active BSs, and from all active candidate BSs with $\Delta_l(s^{(k)}) > 0$ we switch the one with the largest value into sleep mode. Similarly, for all sleeping BSs we also compute $\Delta_l(s^{(k)})$, but now using the load estimates (15). Then, from all sleeping candidate BSs with $\Delta_l(s^{(k)}) > 0$ we activate BS $l^* = \arg\max_{l : s^{(k)}_l = 0} \{\Delta_l(s^{(k)})\}$.
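The selection step can be sketched as follows. The sketch assumes that the estimated objective gains are already available per BS and that the BS to be put to sleep is chosen by the same largest-gain rule as the BS to be activated; the function name and the two threshold parameters are hypothetical (a threshold of 0 corresponds to the plain rule, nonzero values could model switching costs).

```python
from typing import Dict, Hashable, Optional, Tuple


def select_mode_changes(
    delta: Dict[Hashable, float],
    active: Dict[Hashable, bool],
    sleep_threshold: float = 0.0,
    wake_threshold: float = 0.0,
) -> Tuple[Optional[Hashable], Optional[Hashable]]:
    """Return (BS to put to sleep, BS to activate); None means no change."""
    best_sleep: Optional[Hashable] = None
    best_wake: Optional[Hashable] = None
    for bs, gain in delta.items():
        if active[bs]:
            # Active BSs are candidates for sleep mode.
            if gain > sleep_threshold and (
                best_sleep is None or gain > delta[best_sleep]
            ):
                best_sleep = bs
        else:
            # Sleeping BSs are candidates for activation.
            if gain > wake_threshold and (
                best_wake is None or gain > delta[best_wake]
            ):
                best_wake = bs
    return best_sleep, best_wake
```

At most one BS of each kind is returned, so a decision moment changes at most two operation modes.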

Remark 4.1
We allow for a BS activation and another BS to switch into sleep mode simultaneously. In a small network we do not expect this to happen; in a large network, however, the two BSs may be separated by enough distance that they do not influence each other locally. Moreover, in large networks the local traffic demands may vary considerably: one area may experience a high load, whereas other areas are better off reducing their number of active BSs.

Remark 4.2
The activation and sleep rules presented above do not take into account any activation or deactivation costs. These costs may be implemented by setting different thresholds for $\Delta_l(s^{(k)})$ that allow activation or deactivation.

Adjusting shadow prices after operation mode changes
In Sect. 3.1 we showed that for an optimal user assignment, the shadow price iterates $y^{(j)}_l$ sum up to $\alpha$. If we activate a BS or put a BS into sleep mode, the number of active BSs changes and we either gain or lose a shadow price, respectively. Consequently we have to adjust the shadow prices such that the sum over the shadow prices of active BSs equals $\alpha$. The easiest way to do this is to reset all shadow prices to $\alpha/|L(s^{(k)})|$, where $L(s^{(k)})$ is the set of active BSs under operation mode vector $s^{(k)}$. However, this method loses the information that the modified SPA-algorithm has already learned on the shadow prices, and therefore we introduce updates for the shadow prices that maintain their mutual ratios.
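First we describe how the shadow prices $y^{(j)}$ are adapted when a BS is activated (update step (16)), and then how they are adapted when a BS is switched into sleep mode (update step (17)), where again the new shadow prices sum up to $\alpha$ and their mutual ratios are maintained.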
We can now fully specify how the shadow prices are adapted. At assignment update moments, the shadow prices are updated according to the SPA-algorithm, as described in Sect. 3.1. At operation update moments, the shadow prices $y^{(j)}(s^{(k)})$ are updated according to update step (16) in case of a BS activation, or according to update step (17) when a BS has been switched to sleep mode. When the update moment of the shadow prices and the update moment of the operation modes coincide, we first apply the shadow price updates given in Sect. 3.1 and then (16) or (17).
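A ratio-preserving adjustment of this kind can be sketched as follows. This is an illustration under stated assumptions, not a reproduction of update steps (16) and (17): for a BS going to sleep, the two requirements (sum equal to the trade-off parameter, ratios maintained) determine the rescaling, whereas for an activation the share given to the new BS is a free choice, represented here by the hypothetical parameter `new_share`. All function names are hypothetical.

```python
from typing import Dict, Hashable


def rescale_after_sleep(
    prices: Dict[Hashable, float], bs: Hashable, alpha: float
) -> Dict[Hashable, float]:
    """Drop the price of the sleeping BS; rescale the rest to sum to alpha."""
    remaining = {l: y for l, y in prices.items() if l != bs}
    if not remaining:
        return {}
    total = sum(remaining.values())
    if total <= 0.0:
        # No ratio information left: fall back to the uniform reset.
        return {l: alpha / len(remaining) for l in remaining}
    return {l: alpha * y / total for l, y in remaining.items()}


def rescale_after_activation(
    prices: Dict[Hashable, float], bs: Hashable, alpha: float, new_share: float
) -> Dict[Hashable, float]:
    """Give the new BS the fraction `new_share` of alpha (an assumption);
    shrink the existing prices proportionally so that ratios are kept."""
    if not 0.0 <= new_share < 1.0:
        raise ValueError("new_share must lie in [0, 1)")
    total = sum(prices.values())
    if total <= 0.0:
        scaled = {l: (1.0 - new_share) * alpha / len(prices) for l in prices}
    else:
        scaled = {l: (1.0 - new_share) * alpha * y / total for l, y in prices.items()}
    scaled[bs] = new_share * alpha
    return scaled
```

In both cases the returned prices sum to `alpha` and the ratio between any two surviving BSs is unchanged, as long as the old prices had a positive sum.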

Algorithm specification
We will now give a formal algorithm description for the GSPA-algorithm: the Green Shadow Price user Association algorithm. There are two types of decision epochs: $t^{(j)}_y$ for the shadow price updates, and $t^{(k)}_s$ for the operation mode updates. The time between two same-type decision epochs is deterministic, i.e. we always have the same number of decision epochs per time unit (second). However, as the SPA-algorithm needs time to find new user associations, the rate of shadow price updates is higher than the rate of operation mode updates. Also, operation modes should not be updated too often, as from an operational point of view it is undesirable to have a large number of BSs switching on and off on a fast time scale. However, operation modes should be updated often enough to follow statistical changes in the traffic demands.
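The two deterministic timescales can be illustrated with a small sketch. It assumes integer time ticks and hypothetical names; when both epochs coincide, the shadow price epoch is emitted first, in line with the order stated above for coinciding update moments.

```python
from typing import Iterator, Tuple


def decision_epochs(
    horizon: int, price_period: int, mode_period: int
) -> Iterator[Tuple[int, str]]:
    """Yield (time, kind) pairs for both types of decision epochs.

    Times are integer ticks in 1..horizon; kind is "price" or "mode".
    """
    for t in range(1, horizon + 1):
        if t % price_period == 0:
            yield t, "price"  # shadow price update epoch
        if t % mode_period == 0:
            yield t, "mode"  # operation mode update epoch


# Example: price updates every tick, mode updates every 10 ticks.
epochs = list(decision_epochs(horizon=20, price_period=1, mode_period=10))
```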
The modified SPA-algorithm uses load proxies $\sigma^{(j)}_l$ to update the shadow prices. The formal definition of the proxies $\sigma^{(j)}_l$, as necessary to obtain theoretical optimality results, is given in Sect. 1. In practice they can be defined as the fractional resource utilization of BS $l$ between two consecutive decision epochs. The loads are hence estimated by a moving average principle, where $\varepsilon_s$ determines the size of the updates, and thus how sensitive the load estimates are to the realized load proxies. We have now introduced all ingredients for our self-organizing scheme, the GSPA-algorithm, which is summarized in Algorithm 1.
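A moving average of this type can be sketched as below. The exponential form, in which the step size weighs the newest proxy against the previous estimate, is an assumption chosen for illustration; the function name is hypothetical.

```python
def update_load_estimate(rho: float, proxy: float, eps: float) -> float:
    """Exponential moving average of the load (assumed form).

    rho   : previous load estimate of a BS
    proxy : measured fractional resource utilization since the last epoch
    eps   : step size in [0, 1]; larger values react faster to new proxies
    """
    if not 0.0 <= eps <= 1.0:
        raise ValueError("eps must lie in [0, 1]")
    return (1.0 - eps) * rho + eps * proxy


# Example: the estimate drifts from 0.2 towards a constant proxy of 0.8.
estimate = 0.2
for _ in range(50):
    estimate = update_load_estimate(estimate, 0.8, eps=0.1)
```

A small step size smooths out short-lived fluctuations at the cost of a slower response to genuine changes in the traffic demand.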

Numerical results
In this section we present results of numerical experiments we conducted to gain insight into the performance of the GSPA-algorithm. We consider an area of 1000 m × 500 m with 10 BSs and use three different traffic scenarios:
1. Uniform. The times between two users initiating a file transfer are independent and exponentially distributed with mean 1/5 s. The positions of the users are independent and uniformly distributed over the considered area.
2. Moving Hotspot. Similar to the Uniform scenario, except that there are additional users initiating file transfers in the form of a non-stationary hotspot. The hotspot is a 200 m × 100 m area, and moves to a new position after every 1000 s. It starts with its south-west corner at (200, 100), then it moves to (400, 100), (600, 100), back to (400, 100) and finally returns to (200, 100), after which this pattern repeats. The hotspot has a relative file transfer initiation rate of 10 times the normal rate. This scenario is designed to test whether our algorithm can cope with (rather extreme) spatial and temporal variation in the traffic demands.
3. Rush Hour. This scenario represents rush hours, and switches between two Uniform scenarios with different file transfer initiation rates. During the first two hours the initiation rate is of high intensity, with $\nu = 50$ file transfer initiations per second, and during the next four hours it is of low intensity, with $\nu = 5$ file transfer initiations per second. This pattern repeats over time.
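Parts of these scenarios can be sketched in a few lines. The sketch below generates Uniform arrivals and the position schedule of the moving hotspot; all names are hypothetical, and the additional hotspot arrivals and the Rush Hour rate switching are left out.

```python
import random
from typing import List, Tuple

AREA = (1000.0, 500.0)  # width and height in metres
HOTSPOT_CORNERS = [(200.0, 100.0), (400.0, 100.0), (600.0, 100.0), (400.0, 100.0)]
HOTSPOT_DWELL = 1000.0  # seconds spent at each position


def hotspot_corner(t: float) -> Tuple[float, float]:
    """South-west corner of the hotspot at time t (pattern repeats)."""
    step = int(t // HOTSPOT_DWELL) % len(HOTSPOT_CORNERS)
    return HOTSPOT_CORNERS[step]


def uniform_arrivals(
    n: int, rate: float, rng: random.Random
) -> List[Tuple[float, float, float]]:
    """Return n arrivals as (time, x, y) tuples.

    Inter-arrival times are exponential with mean 1/rate; positions are
    uniform over AREA.
    """
    t = 0.0
    arrivals = []
    for _ in range(n):
        t += rng.expovariate(rate)
        arrivals.append((t, rng.uniform(0.0, AREA[0]), rng.uniform(0.0, AREA[1])))
    return arrivals


# Uniform scenario: mean inter-arrival time 1/5 s, i.e. rate 5 per second.
sample = uniform_arrivals(1000, rate=5.0, rng=random.Random(1))
```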
The file sizes of users are independent and exponentially distributed with a mean of $\beta = 5$ Mbit. The chosen initiation and file size distributions are not essential for the GSPA-algorithm to operate, but are primarily used for convenience in the simulations. The load proxies $\sigma^{(j)}_l$ and $\rho^{(k+1)}_l$ are obtained by measuring the fractional resource utilization of BSs, as we suggested for practical systems in Sect. 4.3. The shadow price update moments occur every second, and the operation mode update moments occur every ten seconds.
All simulations are based on 500,000 user arrivals. The number of users that can be in service at a BS simultaneously is limited to 100. If there are 100 users in service at a BS $l$ and a new user initiates a connection and is assigned to BS $l$, then that user is denied service and leaves the system immediately.
The values for $P_0$ and $P$ are derived from the work of Auer et al. [7], leading to $P_0 = 13.6$ W and $P = 1$ W for pico BSs. Furthermore, each BS transmits with equal power of 24 dBm over the spectrum it has available. The signal propagation and path loss follow the 3GPP urban micro model defined in 3GPP TR 36.814 v9.0.0, where the path loss (in dB) from BS $l$ to user $i$ is given by $PL(i,l) = 140.7 + 36.7 \log_{10}(d(i,l)/1000)$, and $d(i,l)$ is the distance in meters between user $i$ and BS $l$. Furthermore, each BS has available spectrum of bandwidth 5 MHz, and we assume a thermal noise of $-174$ dBm/Hz.
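As a worked illustration of these radio parameters, the following sketch evaluates the path loss formula, the thermal noise power over the 5 MHz band, and the resulting signal-to-noise ratio for a single link. Interference from other BSs is deliberately ignored, so the result is an SNR rather than the SINR used in the paper; the function names are hypothetical.

```python
import math


def path_loss_db(distance_m: float) -> float:
    """Path loss in dB for a distance in metres (distance must be positive)."""
    return 140.7 + 36.7 * math.log10(distance_m / 1000.0)


def noise_power_dbm(bandwidth_hz: float = 5e6, density_dbm_hz: float = -174.0) -> float:
    """Thermal noise power in dBm over the given bandwidth."""
    return density_dbm_hz + 10.0 * math.log10(bandwidth_hz)


def snr_db(distance_m: float, tx_power_dbm: float = 24.0) -> float:
    """Interference-free SNR in dB of one BS-user link."""
    return tx_power_dbm - path_loss_db(distance_m) - noise_power_dbm()


# At 100 m the path loss is 140.7 - 36.7 = 104.0 dB, the noise power is
# about -107.0 dBm, and the SNR is therefore about 27.0 dB.
print(round(path_loss_db(100.0), 1), round(noise_power_dbm(), 1), round(snr_db(100.0), 1))
```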
The BS positions are generated uniformly at random, and for the Rush Hour scenario they are shown in Fig. 1.

Benchmarks
We compare our GSPA-algorithm to three benchmarks. Two benchmarks are based on queues with vacation times and were proposed by Guo et al. [12]. In both these benchmarks, a BS directly goes into sleep mode when it has no more users to serve. The activation policies are different:
• SISL Single Sleep. The BS is activated after a deterministic time since it was switched into sleep mode, regardless of whether there are users to serve.
• NLIM N-limited. The BS is activated when there are N users in the queue at the BS.
In both the SISL and NLIM benchmark systems, users are always assigned to the BS that provides the strongest received reference signal, even if that BS is in sleep mode. The SISL and NLIM systems treat BSs on an individual basis, and do not take into account that other BSs can take over the users of a BS that was switched off. The advantage of the SISL and NLIM systems is that they are easy to operate, as they have very simple and intuitive activation and de-activation policies. However, the optimal number of users in the queue before a BS is activated in the NLIM system depends on the arrival rate that the BS is experiencing [12]. In practice this arrival rate may be unknown and time-varying. For the purpose of the simulations we have averaged the arrival rates at BSs over time (considering the Moving Hotspot and Rush Hour scenarios) to determine the optimal number of users waiting in the queue before a BS is activated. The arrival rates per BS are obtained by considering the cell sizes of BSs as shown in, e.g., Fig. 1.
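The sleep and activation rules of the two benchmarks can be sketched per BS as follows. The class and function names are hypothetical, and the queueing of users at a sleeping BS is reduced to a single counter.

```python
from dataclasses import dataclass


@dataclass
class BenchmarkBS:
    """Minimal state of one BS in the SISL or NLIM benchmark."""

    active: bool = True
    slept_at: float = 0.0  # time at which the BS last went to sleep

    def sleep_if_idle(self, now: float, users_in_service: int) -> None:
        # Both benchmarks: go to sleep as soon as no users remain.
        if self.active and users_in_service == 0:
            self.active = False
            self.slept_at = now


def sisl_should_wake(bs: BenchmarkBS, now: float, sleep_time: float) -> bool:
    """SISL: wake after a fixed sleep time, whether or not users wait."""
    return (not bs.active) and (now - bs.slept_at >= sleep_time)


def nlim_should_wake(bs: BenchmarkBS, queued_users: int, n: int) -> bool:
    """NLIM: wake once n users are waiting in the queue of the BS."""
    return (not bs.active) and queued_users >= n
```

Both rules look at a single BS in isolation, which is exactly the property that distinguishes them from a network-wide strategy.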
The third benchmark that we consider will be referred to as the OPT system. The OPT system assigns users to BSs according to the same rule as the GSPA-algorithm, but it uses predetermined optimal operation modes and shadow prices. These optimal values are obtained by discretizing the 1000 m × 500 m area into 5 m × 5 m squares, where each square represents a location. Then we use CPLEX to find optimal solutions to (5a)-(5d) for each state that the user arrival process can be in (Uniform has only one state, Moving Hotspot has 3 states, Rush Hour has 2 states). When the user arrival process changes state, the OPT system applies the corresponding optimal operation modes and shadow prices for that new state.
Fig. 1 The BS positions with their natural cell areas in the Rush Hour scenario

Remark 5.1
For each scenario we generated a sequence of users (file sizes, locations, and times between file transfer initiations), and each system was presented with the same sequence of users to obtain fair comparisons.

Remark 5.2
We chose the benchmarks SISL and NLIM because of their consistency: the evaluation of the systems is completely determined by the sequence of users as described in Remark 5.1 and does not include any probabilistic mechanism for BS activation or user association. It would be interesting to compare the GSPA-algorithm against the policies proposed by Zhen et al. [34] or Klessig et al. [17], as these approaches also take network-wide effects into account. However, it is difficult to make a fair comparison since Zhen et al. do not provide a mechanism for practically obtaining an interaction graph, and Klessig et al. do not provide a mechanism for choosing a good BS hierarchy. These details are crucial for the respective approaches.

Performance
As described in Sect. 1 we are primarily interested in the power consumption and user-perceived performance. For the latter we consider two performance metrics: the number (or fraction) of service denials and the user-perceived throughput, where we define user-perceived throughput as the file size of a user divided by its total time spent in the system (hence it includes the time a user may be waiting in the queue of the SISL or NLIM systems for its BS to activate). Under equal power consumption, a lower fraction of service denials is preferable. The SISL and NLIM benchmarks are also included in the plots of Figs. 2, 3 and 4 as single nodes. The SPA-nodes represent a system that applies the original SPA-algorithm and always has all BSs active, and can be thought of as the GSPA-system with $\alpha \to \infty$. Furthermore, Table 1 shows the realized service denials and we present plots of the user-perceived throughput in Figs. 5, 6, 7 and 8.
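The two user-perceived metrics can be computed from a simulation trace as in the following sketch. The function names and the choice of units (Mbit and seconds) are assumptions made for illustration.

```python
def user_throughput(
    file_size_mbit: float, arrival_time: float, departure_time: float
) -> float:
    """File size divided by the total time spent in the system (Mbit/s).

    The sojourn time includes any waiting for a sleeping BS to activate.
    """
    sojourn = departure_time - arrival_time
    if sojourn <= 0.0:
        raise ValueError("departure must be later than arrival")
    return file_size_mbit / sojourn


def denial_fraction(denied: int, arrivals: int) -> float:
    """Fraction of arriving users that were denied service."""
    if arrivals <= 0:
        raise ValueError("arrivals must be positive")
    return denied / arrivals
```

For instance, a 5 Mbit file that leaves the system 2.5 s after arrival has a user-perceived throughput of 2 Mbit/s, regardless of how that time was split between waiting and transmission.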
We can clearly see that as $\alpha$ increases, the power consumption of the GSPA and OPT systems increases, and simultaneously the user-perceived performance improves: the percentage of service denials decreases and the user-perceived throughput increases. For $\alpha = 100$, the realized power consumption of the GSPA and OPT systems seems extremely favourable, but it has to be weighed against the high number of service denials.
In the Rush Hour scenario with $\alpha = 10^4$, the GSPA-algorithm has service denials comparable to those of the NLIM and SISL systems, with a slightly improved power consumption and a significantly higher mean user-perceived throughput. In this case, the user-perceived throughput is worse in the low-throughput region, as users in the GSPA system may be offloaded to less favourable BSs.
The GSPA-algorithm outperforms the SISL and NLIM systems in the Uniform and Moving Hotspot scenarios when $\alpha = 10^3$ or $\alpha = 10^4$: it has a lower power consumption, fewer service denials and higher user-perceived throughputs. This suggests that optimal trade-offs may be expected for some $\alpha$ between $10^3$ and $10^4$, although Quality-of-Service constraints may require higher values for $\alpha$. Also, it shows that we can improve both user-perceived performance and power consumption by considering the system as a whole and accounting for traffic offloading, instead of looking at each BS individually.
Curiously, the OPT system does not outperform the GSPA system on all levels: for the Uniform and Moving Hotspot scenarios with $\alpha = 10^5$ and $\alpha = 10^6$ the GSPA-algorithm realizes a lower power consumption. Moreover, in Fig. 8 we see that the GSPA system has significantly fewer users with very low throughputs ($\le 1$ Mbit/s) for all investigated values of $\alpha$. This can be explained by the dynamic behaviour of the GSPA-algorithm. Although the OPT system applies optimal shadow prices and operation modes for each specific (statistically different) state of the user file transfer initiation process, it does not respond to inherent variations of this stochastic process. The GSPA-algorithm, on the other hand, may not directly have the optimal shadow prices, nor an optimal set of active BSs, but it does respond to variations in the user arrival process, and this clearly comes with some gains.
The plots in Figs. 2, 3 and 4 can act as a guide for operators to choose the best value for the trade-off parameter $\alpha$. For both the Uniform and Moving Hotspot scenarios, a trade-off value $\alpha = 10^4$ appears a very good choice: the increase in power consumption compared to $\alpha = 10^3$ comes with a significant improvement in mean throughput, but for $\alpha > 10^4$ the increase in power consumption only comes with a marginal improvement in the mean user-perceived throughput. We have also considered plots where the (arithmetic) mean user-perceived throughput is replaced by the geometric mean to put more weight on users with low experienced throughputs, and the conclusions drawn in this paper also apply to the geometric mean user-perceived throughputs.
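The difference between the two means can be seen in a short sketch using the Python standard library; the sample values are made up for illustration.

```python
from statistics import fmean, geometric_mean
from typing import Sequence, Tuple


def mean_throughputs(throughputs: Sequence[float]) -> Tuple[float, float]:
    """Return (arithmetic mean, geometric mean) of positive throughputs."""
    return fmean(throughputs), geometric_mean(throughputs)


# One user at 1 Mbit/s and one at 100 Mbit/s (illustrative values):
# the arithmetic mean is 50.5, the geometric mean only about 10.
arithmetic, geometric = mean_throughputs([1.0, 100.0])
```

The geometric mean is pulled down much more strongly by the slow user, which is why it puts more weight on users with low experienced throughputs.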
Finally, observe that the GSPA-algorithm operates without any a priori information, in contrast to the OPT, NLIM and SISL systems, and solely bases its decisions on load proxies determined at the BSs and SINR values reported by the users. In light of this property, the performance of the GSPA-algorithm is remarkably favourable compared to the considered benchmarks.

Conclusion
In this paper we presented a self-organizing green load balancing algorithm, the GSPA-algorithm, specifically designed to deal with the many overlapping cells and the cell load fluctuations appearing in dense cellular networks. We formulated an optimization problem for a trade-off between power consumption and user-perceived performance and derived sufficient conditions for activating BSs and switching BSs into sleep mode. Furthermore, we constructed a user assignment strategy that realizes an optimal user assignment in terms of the trade-off for a given set of active BSs. These results were then used to design the GSPA-algorithm. The GSPA-algorithm relies on load measurements at BSs and SINR measurements reported by users, to make a tunable trade-off between power consumption and user-experienced performance by activating BSs or putting BSs into sleep mode and also by adapting the user assignment.
Extensive simulations demonstrated the effectiveness of the GSPA-algorithm in dynamically reacting to changing load conditions without information other than load proxies at the BSs and SINR measurements from users. The GSPA-algorithm realized both a lower power consumption and better user-perceived performance (fewer service denials, higher perceived throughput) than two of the considered benchmarks. Moreover, by tuning the trade-off, the simulations clearly show a change from minimizing power consumption towards optimizing user-perceived performance.
To the best of our knowledge, this is the first self-organizing BS sleeping strategy designed for dense cellular networks. We wish to stress the fact that the GSPA-algorithm realizes good performance without the need of prior optimization. An interesting direction for future research is to improve the performance of the GSPA-algorithm for large systems by locally (geographically) clustering the BSs in smaller sub-systems and hence increasing the rate of local self-organization.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Appendix: Modified SPA-algorithm
In this section we provide further details about the modification of the SPA-algorithm [26] so that it can be applied to the optimization problem (5a)-(5d). In the setting of the original SPA-algorithm, all loads were equal in the optimal solution. In the current setting, however, this complete load balancing property is lost due to the "price" of extra power consumption when offloading users to a less favourable BS (with lower user-experienced SINR values). That means that in the optimal solution $(x^*(s), U^*(s))$ to (5a)-(5d) some BSs $l$ have an optimal load $\rho_l(x^*(s))$ strictly lower than the optimal maximum load $U^*(s)$. This influences the way in which we have to update the shadow prices. Further analysis of the Lagrangian dual problem to (5a)-(5d) gives us the complementary slackness condition: $y^*_l > 0$ implies $\rho_l(x^*) = U^*$. In other words, in the optimal solution, BSs with a positive shadow price have an optimal load equal to the maximum load, and BSs with an optimal load lower than the maximum load have a shadow price equal to 0. The update step for the modified SPA-algorithm has to take into account that shadow prices may become zero, and furthermore reflect that all BSs $l$ with optimal shadow price $y^*_l > 0$ should have equal loads, namely the maximum load. Hence, rather than looking at the system-wide average load, we use the system-wide average load conditioned on the shadow price being positive, that is, the average of the loads over the BSs in $L^+(y)$, where $L^+(y)$ is the set of BSs with positive shadow price $y_l$: $L^+(y) = \{l \in L : y_l > 0\}$.
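The conditional average can be sketched as a plain mean over the BSs with a positive shadow price. The function name and the dictionary representation are hypothetical, and the unweighted mean is an assumption about the exact form of the omitted formula.

```python
from typing import Dict, Hashable


def positive_price_average_load(
    loads: Dict[Hashable, float], prices: Dict[Hashable, float]
) -> float:
    """Average load over the BSs whose shadow price is strictly positive."""
    positive = [l for l, y in prices.items() if y > 0.0]
    if not positive:
        raise ValueError("no BS has a positive shadow price")
    return sum(loads[l] for l in positive) / len(positive)


# BS "c" has price 0 and is excluded, so the average is (0.9 + 0.7) / 2.
avg = positive_price_average_load(
    {"a": 0.9, "b": 0.7, "c": 0.1}, {"a": 3.0, "b": 1.0, "c": 0.0}
)
```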
We will now present the modified update step. The update step for the shadow prices only has to balance the loads of BSs with strictly positive shadow prices. The updated shadow prices are subsequently projected, where $P_{Y_\alpha}[y]$ is the projection of $y$ to the closest vector in $Y_\alpha$ in the Euclidean sense. The projection is needed because we cannot guarantee by the updates themselves that shadow prices remain non-negative. We even suspect that when $\alpha$ is small compared to $P$, many optimal shadow prices will be zero, reflecting that only a few BSs will have their loads equal to the maximum load. If we then do not actively correct the shadow prices, they turn negative when a BS has a load lower than the average load. With the above-described modifications to the original SPA-algorithm, we have obtained the modified SPA-algorithm for the updates of the shadow prices. The framework of Kushner and Yin [20, Thm. 8.2.5] can be used to conclude that the modified SPA-algorithm realizes optimal user assignments in the long run (a formal proof of this statement is not the main focus of this paper and is omitted due to space restrictions).
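A Euclidean projection of this kind can be sketched with the standard sort-based simplex projection. The sketch assumes that the feasible set consists of all non-negative price vectors whose entries sum to the trade-off parameter; this is an assumption about the set onto which the prices are projected, and the function name is hypothetical.

```python
from typing import List


def project_to_scaled_simplex(y: List[float], alpha: float) -> List[float]:
    """Closest point to y (Euclidean) with non-negative entries summing to alpha."""
    if alpha <= 0.0:
        raise ValueError("alpha must be positive")
    ordered = sorted(y, reverse=True)
    cumulative = 0.0
    shift = 0.0
    for j, value in enumerate(ordered, start=1):
        cumulative += value
        candidate = (cumulative - alpha) / j
        # The condition holds for a prefix of the sorted entries; the last
        # index where it holds determines the shift.
        if value - candidate > 0.0:
            shift = candidate
    # Subtract the common shift and clip negative entries to zero.
    return [max(v - shift, 0.0) for v in y]


# A price that turned negative is set to zero and the excess is
# removed evenly from the remaining prices.
projected = project_to_scaled_simplex([0.6, 0.5, -0.1], alpha=1.0)
```

A vector that already lies in the feasible set is returned unchanged, so the projection only intervenes when an update has pushed a price below zero or changed the sum.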