1 Introduction

Coordination between base stations (BSs) of a mobile radio network is currently under discussion for fourth as well as fifth generation systems [1, 2] as an important interference mitigation technique to improve network performance. Urban deployments are typically interference limited for two reasons: a high BS density and a frequency reuse factor of one. Dense deployments with small cell sizes are required to fulfil the growing capacity demand, while a frequency reuse factor of one enables all BSs to use the full system bandwidth. This, however, causes interference between a BS and all active neighbours, generating a need for efficient interference mitigation solutions. The objective of interference mitigation is to improve the signal to interference and noise ratio (SINR) of the mobile stations (MSs), thus improving the system throughput. Reducing interference is especially favourable where MSs suffer heavily from it, in the so-called cell-edge regions. There is a multitude of different coordination techniques [1, 3], starting with loose cooperation such as transmission point blanking. Here one BS can be muted to reduce the interference experienced by MSs at another BS. The other extreme is tight cooperation called joint transmission. In this case BSs at different locations jointly transmit to one MS. Coordinated scheduling and coordinated beamforming lie in between the two extremes. In the case of coordinated scheduling the BSs cooperate in the resource assignment. Coordinated beamforming means that the BSs coordinate the beams they create (normally by means of precoding) in such a way that they do not produce interference to an MS of a neighbouring BS. The coordination schemes differ especially in the amount of data that needs to be exchanged between the BSs [4]. The tighter the cooperation is, the higher the requirements in terms of latency and bandwidth are.

Another trend besides coordination is the development towards heterogeneous networks [5]. A heterogeneous network in this context is a network with BSs of different transmit power. A typical case is the densification of an existing network with the help of pico BSs (PBSs). Such BSs have a reduced transmit power (typically 10–20 dB less than traditional macro BSs). Due to the frequency reuse factor of one, each PBS can reuse the full system bandwidth. However, it also receives interference from all other BSs in the vicinity. The resulting heterogeneous network offers a strongly increased capacity [6]. Heterogeneous networks are also a suitable deployment for coordination [1]. The heterogeneous network considered in the following is depicted in Fig. 1. It consists of 21 macro BSs (MBSs), where in each of the coverage areas of an MBS (a sector) a PBS (red dot) is placed. Due to the low inter site distance of 500 m (a typical assumption for urban networks [7]), the MBSs provide full coverage of the area. For the downlink, which is considered here, the MSs served by the PBSs therefore experience strong interference.

Fig. 1 Heterogeneous network with 21 macro and 21 pico base stations

Recent work underlines the importance of correctly modelling the network topology when investigating the performance of coordinated systems [8]. The coordination takes place within a group of BSs, the so-called cooperation cluster. Suitable algorithms can mitigate interference within this cluster. They operate in the BSs of a cooperation cluster or in an overarching controller. However, there is always a level of interference from BSs outside the cluster (out of cluster interference, OOCI) which cannot be controlled. As shown in [8], this fact limits the performance. The performance limit caused by OOCI can also be seen from two different directions in the related work: when simplified networks (e.g. with two cells only) are considered, huge gains are possible [9, 10]. On the other hand, in realistic, large scale networks, gains are difficult or impossible to obtain [11, 12]. Coordinated beamforming techniques should therefore take OOCI into account and be studied under practical network conditions [13].

The work presented here contributes a detailed analysis of the performance of coordinated beamforming and coordinated scheduling in the large scale network depicted in Fig. 1. This paper studies the main factors that limit the potential gains. An adaptive algorithm for coordinated beamforming is proposed, which realizes close to optimal gains at a reduced computational complexity. This is achieved by exploiting the knowledge about the performance limiting factors.

In more detail, the target of coordinated beamforming (CBF) in the considered scenario is to reduce the interference from an MBS to MSs attached to a PBS (PMSs). This interference can be severe due to the so-called cell range expansion [14], which is used to attach more MSs to the PBS for balancing load between BSs. Here an MS connects to a PBS even if the received power from the PBS is lower than that from the MBS. In contrast, the interference from the PBS to MSs attached to the MBS (MMSs) is typically low: due to the cell range expansion, an MS is only connected to the MBS if the power received from the MBS is significantly higher than the power received from the PBS. The principle of focussing on reducing interference from MBSs to PMSs also underlies the 3GPP LTE approach of enhanced inter cell interference coordination (eICIC) [15]. The key role in this respect is allotted to the scheduler. It assigns the time/frequency resources to the MSs and calculates beams per MS and radio resource. In addition, it is responsible for maintaining fairness among users. The precise target of the work presented here is to improve the throughput of PMSs with the help of coordinated beamforming while satisfying a fairness criterion. Improving throughput is achieved by means of a coordinated scheduler that jointly assigns resources and applies coordinated beamforming for an MBS and the PBS placed within its coverage region. Maintaining fairness, especially such that MSs with low channel quality (high interference) are also served, is achieved by using proportional fair scheduling [16].

The remainder of this document is structured as follows: Section 2 describes the considered system model for a wireless multiple input multiple output (MIMO) link and then introduces how CBF can be used to improve such a link. Section 3 presents an adapted precoding technique called HetNet RZF which was used for this work. Section 4 proposes a measure for the computational complexity in case CBF is applied in large networks and discusses how complexity scales with the number of mobile and base stations. Section 5 provides a detailed analysis of the performance of CBF including the main influencing factors for performance gains. Based on these findings, Sect. 6 proposes a heuristic for applying CBF without requiring the full computational complexity. Section 7 then provides simulation results.

Notations: We make use of the following mathematical conventions: \(A^*\) indicates the complex conjugate transpose of matrix A, \(A^T\) the transpose of matrix A, ||a|| the Euclidean norm of vector a and |a| the magnitude of a complex value a.

2 System model and related work

This section introduces the MIMO system model used in the following and explains how it can be used for interference mitigation. It then describes how these principles are applied in the related work.

2.1 System model and interference mitigation through coordination

To introduce the principles of CBF, a generic MIMO system as depicted in Fig. 2 is used. Each MS is equipped with two antennas and is served by a single BS, which is also equipped with two antennas. Each BS sends one data stream towards its MS.

Fig. 2 MIMO system model

The received signal at \(MS_{i}\) is modelled according to Eq. 1.

$$\begin{aligned} y_{i} = \underbrace{\sqrt{P_i \alpha _{ii}} u_{i} H_{ii} v_{i} s_{i}}_{\text {Wanted\,Signal}} + \underbrace{\sum _{\begin{array}{c} 1\le j\le k\\ j\ne i \end{array}}\sqrt{P_j \alpha _{ij}} u_{i} H_{ij} v_{j} s_{j}}_{\text {Interference}} + \underbrace{n_{i}}_{\text {Noise}} \end{aligned}$$
(1)

The first part of the received signal (“Wanted Signal”) describes the intended data transmission from \(BS_i\) to \(MS_{i}\). In it, \(P_i\in \mathbb {R}\) represents the transmit power of \(BS_{i}\), \(\alpha _{ii}\in \mathbb {R}\) the pathloss between \(MS_{i}\) and \(BS_{i}\), \(u_i\in \mathbb {C}^{1\times 2}\) the receive combining vector at \(MS_{i}\), \(H_{ii}\in \mathbb {C}^{2\times 2}\) the channel transfer function between \(MS_{i}\) and \(BS_{i}\), \(v_i\in \mathbb {C}^{2\times 1}\) the precoder selected at \(BS_{i}\), \(s_i\in \mathbb {C}^{1\times 1}\) the unit-power symbol to be transmitted by \(BS_{i}\) and k the number of BSs. More details on the individual components, especially on the precoder, follow below. The second part of Eq. 1 describes the interference that \(MS_i\) experiences. As outlined in this section, the principle of CBF relies on reducing this term by selecting suitable precoders (\(v_j\)). The third part relates to the noise which is present in the receiver. It is assumed to be a fixed value.
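As an illustration of how the three terms of Eq. 1 translate into an SINR, the following minimal Python sketch evaluates the wanted and the interfering power at one MS. It is not part of the original work: the function names, the nested-list layout `H[i][j]` (channel between MS i and BS j) and the parameter `noise_power` (taken here as the mean power of \(n_i\)) are assumptions made for this example, and the symbols \(s_j\) are assumed to have unit power and to be uncorrelated.

```python
import math

def u_H_v(u, H, v):
    """Scalar u H v for a 1x2 row u, a 2x2 matrix H and a 2x1 column v."""
    Hv = [H[0][0] * v[0] + H[0][1] * v[1],
          H[1][0] * v[0] + H[1][1] * v[1]]
    return u[0] * Hv[0] + u[1] * Hv[1]

def sinr(i, P, alpha, u, H, v, noise_power):
    """SINR at MS i built from the terms of Eq. 1 (unit-power symbols assumed)."""
    def rx_power(j):
        # Received power at MS i caused by the transmission of BS j.
        amplitude = math.sqrt(P[j] * alpha[i][j]) * u_H_v(u[i], H[i][j], v[j])
        return abs(amplitude) ** 2

    wanted = rx_power(i)
    interference = sum(rx_power(j) for j in range(len(P)) if j != i)
    return wanted / (interference + noise_power)

# Illustrative values only: two BSs, identity channels, weaker cross links.
I2 = [[1, 0], [0, 1]]
P = [1.0, 1.0]
alpha = [[1.0, 0.25], [0.25, 1.0]]
H = [[I2, I2], [I2, I2]]
u = [[1, 0], [1, 0]]
print(sinr(0, P, alpha, u, H, [[1, 0], [1, 0]], 0.05))  # BS 2 interferes
print(sinr(0, P, alpha, u, H, [[1, 0], [0, 1]], 0.05))  # BS 2 precoder orthogonal to u_1 H_12
```

In the second call the precoder of the interfering BS is chosen such that its contribution vanishes at MS 1, which is the effect CBF aims for.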

Equation 1 describes the precoder \(v_i\) used by \(BS_i\) as linear factors that define how the transmitted signal \(s_i\) is mapped onto the two antennas. This process is therefore called linear precoding [17]. As the precoder has an influence on the direction of the signal, this process is also called beamforming.

In systems without coordination between BSs, the precoder is typically used to maximize the power at the receiver of the wireless transmission (Eq. 2). This can also be interpreted as the BS shaping the transmission into the direction of the MS. In the following this principle is denoted as maximum ratio transmission (MRT), referring to the same principle which is used in the well known maximum ratio combining receiver.

$$\begin{aligned} v_{i} = \hbox {arg max } \vert H_{ii} v_{i} \vert \end{aligned}$$
(2)

An increased signal power at the receiver (due to precoding) enables the usage of a higher modulation and coding scheme, resulting in an increased throughput. In uncoordinated systems, a BS does not have information about the effects its beams cause with respect to interference in other cells. It therefore cannot take this influence into account.
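The MRT precoder of Eq. 2 can be sketched for the 2×2 case without a numerical library, since the maximizer of \(\Vert H v\Vert\) under a unit-norm constraint is the dominant eigenvector of \(H^*H\), which has a closed form for a 2×2 Hermitian matrix. The following Python sketch is an illustration under that assumption; the function names `mrt_precoder` and `gain` are hypothetical, and the result is unique only up to a common phase factor.

```python
import math

def mrt_precoder(H):
    """Unit-norm v maximising ||H v|| for a 2x2 complex matrix H."""
    # Gram matrix G = H^* H = [[a, b], [conj(b), d]] (Hermitian).
    a = abs(H[0][0]) ** 2 + abs(H[1][0]) ** 2
    d = abs(H[0][1]) ** 2 + abs(H[1][1]) ** 2
    b = H[0][0].conjugate() * H[0][1] + H[1][0].conjugate() * H[1][1]
    if abs(b) == 0:
        # G is diagonal: pick the transmit antenna with the larger gain.
        return [1.0, 0.0] if a >= d else [0.0, 1.0]
    # Largest eigenvalue of G and the corresponding eigenvector [b, lam - a].
    lam = (a + d) / 2 + math.sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)
    v = [b, lam - a]
    norm = math.sqrt(abs(v[0]) ** 2 + abs(v[1]) ** 2)
    return [v[0] / norm, v[1] / norm]

def gain(H, v):
    """Euclidean norm of H v, i.e. the quantity maximised in Eq. 2."""
    x = [H[0][0] * v[0] + H[0][1] * v[1],
         H[1][0] * v[0] + H[1][1] * v[1]]
    return math.sqrt(abs(x[0]) ** 2 + abs(x[1]) ** 2)

# Illustrative channel values only.
H = [[1 + 1j, 0.5], [0.2j, 1 - 0.5j]]
v = mrt_precoder(H)
print(gain(H, v), gain(H, [1, 0]), gain(H, [0, 1]))
```

The first printed value is never smaller than the other two, because single-antenna transmission is just one of the candidate precoders over which Eq. 2 maximizes.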

In coordinated systems, wireless links can also be improved by reducing the interfering term in Eq. 1. With respect to CBF this is achieved by means of precoding. Similar to increasing the signal power, reducing the interference also enables the selection of higher modulation and coding schemes. This can lead to the case where a \(BS_j\) selects a precoder \(v_j\) such that Eq. 3 is fulfilled. Here the transmission from \(BS_j\) sums up to zero at the receiver of \(MS_i\) such that \(BS_j\) does not produce interference to \(MS_i\). As this principle forces the interference to be zero, it is referred to as zero forcing (ZF). However, the received signal at \(MS_j\) is typically lower in case \(BS_j\) uses ZF instead of MRT precoding.

$$\begin{aligned} u_{i} H_{ij} v_{j} = 0 \end{aligned}$$
(3)

2.2 Related work

The previous subsection introduced how coordination can be used to mitigate interference by means of precoding. The two precoding schemes that have been described up to now are the two extremes with respect to the effect they cause: MRT precoding maximizes the received power at the MS to be served without any consideration of the interference that is created. In ZF precoding, the constraint to fully remove interference means that, depending on the realization of the instantaneous channel, power reductions in the intended signal have to be accepted. In the following it is discussed how related work approaches this trade-off.

In [18] and [19] the trade-off between ZF and MRT is described. A BS can act “selfishly” meaning that it maximizes the utility (signal) of its MS (MRT precoding). The opposite is a fully “altruistic” behaviour, such that no interference to the MS of the cooperating BS is produced, irrespective of the disadvantage (reduced signal power compared to MRT) for its own MS.

There are several precoding schemes that target a compromise between MRT and ZF, such as relaxed zero forcing (RZF) [20, 21] and signal to leakage and noise ratio (SLNR) precoding [22]. They define precoders that reduce interference (but not null it out) and increase the intended signal compared to ZF.

SLNR precoding [22] is based on maximizing the ratio of the intended signal power to the sum of noise and generated interference power (“leakage”). With respect to the MIMO model presented above, the SLNR at \(BS_i\) that occurs when serving \(MS_i\) is defined by Eq. 4. The term \(\Vert H_{ii}v_i\Vert ^2\) represents the intended power towards \(MS_i\). It is divided by the sum of the noise power in the receiver of \(MS_i\) (\(n_i^2\)) and the powers that are transmitted towards the MSs that are served simultaneously by other BSs.

$$\begin{aligned} SLNR_i = \frac{\Vert H_{ii}v_i\Vert ^2}{n_i^2 + \sum _{\begin{array}{c} 1\le j\le k\\ j\ne i \end{array}}{\Vert H_{ji} v_{i}\Vert ^2}} \end{aligned}$$
(4)
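To make the roles of the terms in Eq. 4 concrete, the following minimal Python sketch evaluates the SLNR of a given precoder. It only evaluates the ratio and does not perform the maximization; the function names and the layout `H[j][i]` (channel between MS j and BS i) are assumptions of this example, and `noise_power` stands for \(n_i^2\).

```python
def norm_sq(H, v):
    """Squared Euclidean norm of H v for a 2x2 matrix H and a 2x1 vector v."""
    x = [H[0][0] * v[0] + H[0][1] * v[1],
         H[1][0] * v[0] + H[1][1] * v[1]]
    return abs(x[0]) ** 2 + abs(x[1]) ** 2

def slnr(i, H, v_i, noise_power):
    """Eq. 4: intended power divided by noise plus leakage towards other MSs."""
    signal = norm_sq(H[i][i], v_i)
    leakage = sum(norm_sq(H[j][i], v_i) for j in range(len(H)) if j != i)
    return signal / (noise_power + leakage)

# Illustrative values only: BS 1 leaks towards MS 2 via its first antenna.
I2 = [[1, 0], [0, 1]]
leak_first_antenna = [[1, 0], [0, 0]]
H = [[I2, I2], [leak_first_antenna, I2]]
print(slnr(0, H, [1, 0], 0.1))  # precoder uses the leaking antenna
print(slnr(0, H, [0, 1], 0.1))  # precoder avoids the leaking antenna
```

Both precoders deliver the same intended power in this toy setting, so the difference in SLNR stems from the leakage term alone.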

The SLNR is maximized by selecting \(v_i\) and normalizing it according to Eqs. 5, 6 and 7, with \(N_{rx,i}\) being the number of receive antennas at \(MS_i\) and I as an identity matrix.

$$\begin{aligned} v_i = {\text{max.}}\,{\text{eigenvector}}\left( \frac{H_{ii}^*H_{ii}}{N_{rx,i}n_i^2I +{\tilde{H}}_i^*{\tilde{H}}_i}\right) \end{aligned}$$
(5)
$$\begin{aligned} {\text {with}}\,\,{\tilde{H}}_i = [ H_{1i} \ldots H_{(i-1)i} \; H_{(i+1)i} \ldots H_{ki} ]^T \end{aligned}$$
(6)
$$\begin{aligned} v_i = \frac{v_i}{\Vert v_i\Vert } \end{aligned}$$
(7)

RZF [20, 21] relies on a combination of an MRT and a ZF precoder. For the multiple input single output (MISO) interference channel (i.e. with receivers that are equipped with only one antenna) the precoder is defined by Eqs. 8 and 9.

$$\begin{aligned} v_i = \alpha _1 v_{i,MRT} + \alpha _2 v_{i,ZF} \end{aligned}$$
(8)
$$\begin{aligned} {\text{under}}\,{\text{the}}\,{\text{condition}}:\Vert v_i \Vert = 1 \end{aligned}$$
(9)

As stated in the introduction, it is a main purpose of the work presented here to study the limiting factors for CBF. Therefore a precoding scheme is required which can provide a set of different precoders (from MRT to ZF) to be evaluated for their performance. RZF precoding in this respect is a suitable framework, as it can be parametrized to allow different levels of interference, whereas SLNR precoding practically selects one uncontrolled point in between ZF and MRT [21]. For this reason, RZF is used as the framework for precoding in the following. RZF precoding has been proposed for the MISO interference channel, i.e. for networks with MSs with one antenna [20]. An adaptation to the MIMO interference channel, i.e. to the case with multiple MS antennas which is considered here, is proposed in [21]. However, the approach from [21] was not used here for two reasons: First, it is an iterative approach that is not able to compute precoders in a single step, which can be unacceptably complex for realistic systems. Second, it is based on a single threshold for the overall network that indicates the acceptable level of interference at the MSs. This conflicts with the target mentioned above, namely a flexible precoding that can allow different levels of interference for each link between MS and BS.

2.3 Channel state information

Precoding algorithms such as RZF and SLNR require detailed information about the characteristics of the radio channels, the so-called channel state information (CSI) [22]. With respect to the scenario depicted in Fig. 2, this means that information about the complex channel transfer functions \(H_{11}\), \(H_{21}\), \(H_{12}\) and \(H_{22}\) is required, which in addition has to be shared between BSs [4]. To obtain full CSI in time division duplex (TDD) systems, channel reciprocity can be utilized, such that detailed CSI for the downlink can be obtained through uplink channel estimation [22]. In more detail, MS 1 can send out a known channel estimation sequence that is received at BS 1 and BS 2, such that the channels \(H_{11}\) and \(H_{12}\) can be measured. The same principle can be used for MS 2 in order to also measure \(H_{21}\) and \(H_{22}\). For frequency division duplex (FDD) systems, CSI can be obtained through feedback from the MSs, which might be limited, such that the full potential of CBF cannot be exploited. As the full potential of CBF can only be exploited with full CSI, full CSI is assumed in the following. Section 7.5 discusses how the results obtained for full CSI can be interpreted for systems with limited CSI.

3 Adapted relaxed zero forcing approach

In the following an adapted RZF approach is presented which is non-iterative and allows a set of interference levels for each link between MS and BS. It is based on the characteristics of the considered heterogeneous network (HetNet) scenario and is therefore called HetNet RZF. The goal of HetNet RZF is to reduce the interference from the MBS to an MS attached to the PBS, as described in the introduction.

As it is now necessary to distinguish between different types of MSs (PMSs and MMSs) and BSs (PBSs and MBSs), an adapted description of the MIMO system is required (Eq. 10). In it, \(y_p\) indicates the signal received at the PMS. It consists of the wanted signal coming from the PBS, with \(P_p\) being the transmit power of the PBS, \(\alpha _{pp}\) the pathloss between PMS and PBS, \(u_p\) the receive combining vector of the PMS, \(H_{pp}\) the channel transfer function between PMS and PBS, \(v_p\) the precoder at the PBS and \(s_p\) the data being sent by the PBS.

$$\begin{aligned} \begin{aligned} y_p&= \underbrace{\sqrt{P_p \alpha _{pp}} u_{p} H_{pp} v_p s_p}_{\text {Wanted\,Signal\,at\,PMS}} +\underbrace{\sqrt{P_m \alpha _{pm}} u_{p} H_{pm} v_m s_m}_{{\text {Interference}}\,{\text{MBS}}\rightarrow {\text {PMS}}} \\&+ \underbrace{\sum _{\begin{array}{c} 1\le j\le k\\ j\ne m, j \ne p \end{array}}\sqrt{P_j \alpha _{pj}} u_{p} H_{pj} v_{j} s_{j}}_{\text {Interference\,from\,other\,BSs}} + \underbrace{n_p}_{\text {Noise}} \end{aligned} \end{aligned}$$
(10)

The target of HetNet RZF is to define a set of precoders for the MBS which cause different levels of interference at the PMS. It is then a task of the scheduler to select one element out of this set. It is assumed here that the coordinated scheduler has the knowledge (e.g. about the radio channels) for both BSs. HetNet RZF relies on an estimation of the receive combining vector \(u_p\) used at the PMS. This can be obtained by means of signalling from the PMS to the coordinated scheduler. In case the PMS uses a maximum ratio combining (MRC) receiver (which is assumed here), the receive combining vector can directly be calculated from the channel matrix \(H_{pp}\) and the precoder \(v_p\) (Eq. 11).

$$\begin{aligned} u_p = \dfrac{(H_{pp} v_p)^*}{\Vert H_{pp} v_p\Vert } \end{aligned}$$
(11)

The precoder \(v_p\) at the PBS is used to maximize the received power at the PMS (as described before, there is no interference suppression from PBS to MMS). It can be obtained with the help of the singular value decomposition [23] (Eq. 12).

$$\begin{aligned} \begin{aligned} H_{pp} = U S V^* \\ v_p= \left( {\begin{array}{c}V_{11}\\ V_{21}\end{array}}\right) \end{aligned} \end{aligned}$$
(12)

In contrast to the PBS, for the MBS a set of different precoders is calculated. Using the information on \(v_p\) and \(u_p\), at first a ZF precoder \(v_{mZF}\) for the MBS can be calculated (Eq. 13). In case this precoder is used at the MBS, no interference occurs at the PMS.

$$\begin{aligned} \begin{aligned} u_p H_{pm} v_{mZF} =0\\ v_{mZF} = \ker ({u_p H_{pm}}) \end{aligned} \end{aligned}$$
(13)
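For the two-antenna case, Eqs. 11 and 13 can be sketched in a few lines of Python. The sketch below is an illustration only: the function names are hypothetical, and it uses the fact that the kernel of a 1×2 row vector \(a = u_p H_{pm}\) is spanned by \((a_2, -a_1)^T\).

```python
import math

def mrc_receiver(H_pp, v_p):
    """Eq. 11: conjugate transpose of H_pp v_p, normalised (1x2 row vector)."""
    x = [H_pp[0][0] * v_p[0] + H_pp[0][1] * v_p[1],
         H_pp[1][0] * v_p[0] + H_pp[1][1] * v_p[1]]
    norm = math.sqrt(abs(x[0]) ** 2 + abs(x[1]) ** 2)
    return [x[0].conjugate() / norm, x[1].conjugate() / norm]

def effective_row(u_p, H_pm):
    """Row vector a = u_p H_pm seen by the MBS precoder."""
    return [u_p[0] * H_pm[0][0] + u_p[1] * H_pm[1][0],
            u_p[0] * H_pm[0][1] + u_p[1] * H_pm[1][1]]

def zf_precoder(u_p, H_pm):
    """Eq. 13: unit-norm vector in the kernel of a = u_p H_pm."""
    a = effective_row(u_p, H_pm)
    norm = math.sqrt(abs(a[0]) ** 2 + abs(a[1]) ** 2)
    if norm == 0:
        return [1.0, 0.0]  # no interference path: any precoder is already ZF
    # a . [a1, -a0] = a0*a1 - a1*a0 = 0, so this vector lies in the kernel.
    return [a[1] / norm, -a[0] / norm]

def leakage(u_p, H_pm, v_m):
    """Magnitude of u_p H_pm v_m, i.e. the residual interference amplitude."""
    a = effective_row(u_p, H_pm)
    return abs(a[0] * v_m[0] + a[1] * v_m[1])

# Illustrative channel values only.
H_pp = [[1, 0.5j], [0.3, 1]]
H_pm = [[0.8, 0.2 - 0.1j], [0.4j, 1.1]]
u_p = mrc_receiver(H_pp, [1, 0])
v_mzf = zf_precoder(u_p, H_pm)
print(leakage(u_p, H_pm, v_mzf), leakage(u_p, H_pm, [1, 0]))
```

The first printed value is zero up to rounding, whereas an arbitrary precoder such as \((1, 0)^T\) generally leaves residual interference at the PMS.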

Following the same principle as for the PBS, an MRT precoder \(v_{mMRT}\) for the MBS can be calculated (Eq. 14).

$$\begin{aligned} \begin{aligned} H_{mm} = U S V^* \\ v_{mMRT}= \left( {\begin{array}{c}V_{11}\\ V_{21}\end{array}}\right) \end{aligned} \end{aligned}$$
(14)

The set of precoders which HetNet RZF provides is described by Eq. 15. In it, \(\lambda _{1}\) defines the level of interference suppression (Eq. 16). A selection of zero results in no interference from the MBS to the PMS (MRT is fully suppressed), whereas one means full interference. In case \(\lambda _{1} < 1\), the remaining power at the MBS can be allocated to a ZF transmission. In this case, \(\lambda _{2}\) is selected such that the total power constraint is met (Eq. 17).

$$\begin{aligned} v_{m} &= \lambda _{1} v_{mMRT} + \lambda _{2} v_{mZF} \end{aligned}$$
(15)
$$\begin{aligned} \lambda _{1} &= [0 \dots 1] \end{aligned}$$
(16)
$$\begin{aligned} \Vert v_{m}\Vert = 1 \end{aligned}$$
(17)
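One possible way to generate the precoder set of Eqs. 15 to 17 is sketched below in Python. The source does not state how \(\lambda _2\) is computed, so the sketch makes the following assumptions, all of which are choices of this example: both input precoders have unit norm, \(\lambda _1\) is sampled on a uniform grid, \(\lambda _2\) is real and non-negative, and the ZF precoder is first rotated by a phase factor (which keeps it inside the kernel of Eq. 13) so that its inner product with the MRT precoder is real and non-negative. Under these assumptions the norm constraint becomes a quadratic equation in \(\lambda _2\).

```python
import math

def hetnet_rzf_precoders(v_mrt, v_zf, steps):
    """Return (lambda_1, v_m) pairs with v_m = lambda_1 v_mrt + lambda_2 v_zf, ||v_m|| = 1.

    Assumes unit-norm inputs and steps >= 2 (hypothetical interface).
    """
    inner = v_mrt[0].conjugate() * v_zf[0] + v_mrt[1].conjugate() * v_zf[1]
    if abs(inner) > 0:
        # Phase rotation of v_zf so that v_mrt^* v_zf becomes real and >= 0.
        rot = inner.conjugate() / abs(inner)
        v_zf = [rot * v_zf[0], rot * v_zf[1]]
    c = abs(inner)
    candidates = []
    for n in range(steps):
        lam1 = n / (steps - 1)
        # Solve lam1^2 + 2*lam1*lam2*c + lam2^2 = 1 for the root lam2 >= 0.
        disc = max(0.0, lam1 ** 2 * c ** 2 + 1 - lam1 ** 2)
        lam2 = -lam1 * c + math.sqrt(disc)
        v_m = [lam1 * v_mrt[0] + lam2 * v_zf[0],
               lam1 * v_mrt[1] + lam2 * v_zf[1]]
        candidates.append((lam1, v_m))
    return candidates

# Illustrative precoders only (not derived from a channel).
for lam1, v_m in hetnet_rzf_precoders([1.0, 0.0], [0.6, 0.8], 5):
    print(lam1, v_m, math.sqrt(abs(v_m[0]) ** 2 + abs(v_m[1]) ** 2))
```

With this choice, \(\lambda _1 = 0\) returns the ZF precoder and \(\lambda _1 = 1\) returns the MRT precoder, matching the two extremes described for Eq. 16.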

4 Scheduling and its computational complexity

The previous section introduced the concept of CBF for a system model consisting of two BSs and two MSs operating on two interfering radio channels. To consider a full network, this model has to be extended in several dimensions: Tens of BSs are required for a realistic network size (e.g. 42 for the network depicted in Fig. 1). Each BS serves a number of MSs in parallel. To do so, the frequency band is divided into multiple sub-carriers which can individually be allocated in the time domain (time domain orthogonal frequency-division multiple access, TD-OFDMA). The result is a set of radio resources, wherein each radio resource covers a part of the system bandwidth and lasts for one transmission time interval (TTI). Each BS can allocate the radio resources to its MSs in each TTI, which is the main task of the scheduler. For downlink transmission, the scheduler of a BS distributes the data received from the core network onto the radio resources available in one of the next TTIs. In case of a coordinated scheduler, this process happens jointly for a group of BSs. A second task of the scheduler is the selection of a suitable precoder for each radio resource and MS. A coordinated scheduling decision in this context consists of the following decisions per radio resource:

  1. assignment of radio resources to the MSs
  2. selection of precoders

For one MBS and one PBS, the first decision turns into the selection of a pair of MSs per radio resource, one for each BS. For each pair and radio resource, precoders according to Eqs. 12 and 15 have to be calculated. Whereas this is a single calculation for the PBS (Eq. 12), for the MBS different realizations of \(\lambda _{1}\) are possible (Eq. 15).

Each scheduling decision realizes for each radio resource a different throughput at the MBS and at the PBS. Scheduling is computationally complex due to the large number of potential decisions. Equation 18 describes the number of options (\(N_{choices RR}\)) for one radio resource. It scales linearly with the number of PMSs, the number of MMSs and the number of realizations of \(\lambda _1\).

$$\begin{aligned} N_{choices RR} = N_{PMS} \cdot N_{MMS} \cdot N_{Steps\lambda _1} \end{aligned}$$
(18)

As a scheduling decision is required for each radio resource, the total number of options scales with the number of radio resources \(N_{RR}\) (Eq. 19).

$$\begin{aligned} N_{choices total} = N_{choices RR} \cdot N_{RR} \end{aligned}$$
(19)
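As a small worked example of Eqs. 18 and 19, the following Python sketch counts the candidate decisions. The function name and the numbers used (10 PMSs, 10 MMSs, 5 realizations of \(\lambda _1\), 50 radio resources) are purely illustrative and are not taken from the simulation setup of this work.

```python
def scheduling_choices(n_pms, n_mms, n_lambda_steps, n_rr):
    """Number of candidate decisions per radio resource (Eq. 18) and in total (Eq. 19)."""
    per_rr = n_pms * n_mms * n_lambda_steps  # Eq. 18
    total = per_rr * n_rr                    # Eq. 19
    return per_rr, total

per_rr, total = scheduling_choices(n_pms=10, n_mms=10, n_lambda_steps=5, n_rr=50)
print(per_rr, total)  # 500 25000
```

Doubling the number of PMSs and MMSs at the same time quadruples both counts, which is why the number of attached MSs dominates the effort.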

Each potential decision leads to an expected spectral efficiency for the two transmissions. Along with the principle of proportional fair scheduling, this can be converted into a metric expressing the utility of the corresponding transmission. The target is then for each radio resource to find the setting (in this case the pair and the configuration of \(\lambda _1\)) with the highest metric value.

There is a trade-off between the computational complexity of the scheduling process and the quality of the decision: finding the setting with the highest metric value requires that all \(N_{choices total}\) candidate transmission parameters are calculated and evaluated in terms of their metric. This is especially challenging due to the real-time requirements: a scheduling decision has to be taken periodically per transmission time interval (e.g. per millisecond in the case of LTE), meaning that the calculations for a decision have to be finalized before generating the next one. Reducing the complexity is possible by not evaluating every single scheduling decision. However, this implies the risk that the potential decision with the highest metric is not found and thus the performance of the network is degraded.
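The exhaustive search described above can be sketched for a single radio resource as follows. This is a minimal Python illustration, not the scheduler used in this work: `best_decision` and the callable `metric` are hypothetical names, and the toy metric below merely stands in for the proportional fair utility of a (PMS, MMS, \(\lambda _1\)) triple.

```python
from itertools import product

def best_decision(pms_ids, mms_ids, lambda_steps, metric):
    """Evaluate every (PMS, MMS, lambda_1) triple and return the one with the highest metric."""
    return max(product(pms_ids, mms_ids, lambda_steps),
               key=lambda choice: metric(*choice))

# Toy utility (assumption): lowering lambda_1 helps the PMS and costs the MMS.
pms_gain = {"p1": 6.0, "p2": 1.0}
mms_gain = {"m1": 1.0, "m2": 3.0}

def toy_metric(pms, mms, lam1):
    return pms_gain[pms] * (1 - 0.5 * lam1) + mms_gain[mms] * (0.5 + 0.5 * lam1)

print(best_decision(["p1", "p2"], ["m1", "m2"], [0.0, 0.5, 1.0], toy_metric))
```

The call to `metric` is executed once per candidate, so the running time of this loop grows exactly with the count given by Eq. 18; restricting any of the three input lists is what a reduced-complexity heuristic has to achieve.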

Besides the complexity of the scheduling itself, also the signalling of the in- and output from and to the coordinated scheduler is a challenging task. In more detail, the scheduling procedure requires access to the CSI as described in Sect. 2.3. In case the coordinated scheduler is located in the MBS, this can be achieved by signalling CSI from the PBS to the MBS. After generating the scheduling decision, the generated precoders and the selected modulation and coding schemes (per PMS) have to be signalled back to the PBS. While the data rate required for the exchange of CSI is limited [24], the requirement in terms of latency can be demanding (below 1 ms) [25].

5 Performance analysis

In the following an approach for reducing the computational complexity of the scheduling process is presented. It relies on the fact that certain requirements have to be fulfilled for advantageous effects of suppressing interference at the PMS. In case these requirements are currently not fulfilled, selected transmission parameters can be excluded from the considerations in the scheduling. As these transmission parameters would show lower or equal performance compared to others, their exclusion can theoretically happen without affecting the performance. The definition of requirements is based on a detailed study of the performance gains of coordinated beamforming under different parameters that will be introduced in the following subsections. Section 6 then describes the conclusions and how they are applied in the proposed approach.

5.1 The effect of out of cluster interference

An important factor that influences the performance of CBF is the so-called out of cluster interference (OOCI). The more uncoordinated interference an MS receives, the lower the achievable gains from CBF are. An investigation of the effect of OOCI was presented in [26]. A summary is provided in the following.

Equation 20 shows the total interference at a PMS i. In it, \(P_j\) indicates the transmit power of BS j, \(\alpha _{ij}\) the pathloss between PMS i and BS j, \(u_i\) the receive combining vector of PMS i, \(H_{ij}\) the channel transfer function between PMS i and BS j, \(v_j\) the precoder at the BS j and \(s_j\) the data being sent by BS j. The total interference can be decomposed into two parts: The interference coming from the cooperating MBS l and the interference from all other BSs. The second part is denoted OOCI as it represents the uncoordinated interference from outside the cooperation cluster.

$$\begin{aligned} \begin{aligned} I_{i}&= \sum _{\begin{array}{c} 1\le j\le k\\ i\ne j \end{array}} \sqrt{P_j \alpha _{ij}} u_{i} H_{ij} v_{j} s_{j} \\&= \underbrace{\sqrt{P_l \alpha _{il}} u_{i} H_{il} v_{l} s_{l}}_{\text {Interference\,from\,cooperating\,MBS l}} + \underbrace{\sum _{\begin{array}{c} 1\le j\le k\\ i\ne \{j,l\} \end{array}} \sqrt{P_j \alpha _{ij}} u_{i} H_{ij} v_{j} s_{j}}_{\text {Interference\,from\,all\,BSs\,except\,l}} \end{aligned} \end{aligned}$$
(20)

Equation 21 shows the OOCI ratio (OOCIR) for a PMS i. This expresses the ratio of uncoordinated versus coordinated interference. It is defined as the sum of interference from non-cooperating BSs versus the interference coming from the cooperating MBS l. An uncoordinated interferer j uses the precoder \(v_j\). While in general this precoder can be of any kind (e.g. MRT or ZF), it is assumed for the simulations in Sect. 7 that all cooperation clusters apply HetNet RZF, i.e. each MBS reduces interference for the PMS attached to the PBS within the coverage area of the MBS. For the cooperating MBS l, Eq. 21 assumes the selection of an MRT precoder in order to reflect the maximum level of interference from within the cooperation cluster.

$$\begin{aligned} \text {OOCIR}_{i,l} = \frac{\sum _{\begin{array}{c} 1\le j\le k\\ i\ne \{j,l\} \end{array}} \sqrt{P_j \alpha _{ij}} u_{i} H_{ij} v_{j} s_{j}}{\sqrt{P_l \alpha _{il}} u_{i} H_{il} v_{l(MRT)} s_{l}} \end{aligned}$$
(21)

Knowledge about the OOCI and the OOCIR of an MS can be obtained by feeding back a channel quality indication (e.g. an SINR estimate) from the MS to the BS. By using the CSI (see Sect. 2.3), especially the pathloss component it includes, the OOCI and the OOCIR can be extracted. In addition, measurements for handover between cells (mobility management), which contain the signal power received at the MS for different BSs, can be used to estimate OOCI and OOCIR.
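How such an estimate could be formed from per-BS received powers is sketched below in Python. The sketch is an assumption-laden simplification rather than the procedure used in this work: it works in the power domain, treats the interfering signals as uncorrelated so that their powers add, and ignores the effect of the precoders and of the receive combining vector that appear in Eq. 21. The function name and the dictionary layout are hypothetical.

```python
import math

def oocir_db(rx_power_dbm, serving_bs, cooperating_mbs):
    """Power-domain OOCIR estimate in dB from received powers per BS (in dBm)."""
    def to_mw(dbm):
        return 10 ** (dbm / 10)

    # Sum the power of all BSs that neither serve the MS nor cooperate.
    ooci = sum(to_mw(p) for bs, p in rx_power_dbm.items()
               if bs not in (serving_bs, cooperating_mbs))
    in_cluster = to_mw(rx_power_dbm[cooperating_mbs])
    if ooci == 0:
        return float("-inf")  # no interferer outside the cluster was measured
    return 10 * math.log10(ooci / in_cluster)

# Illustrative measurement report only.
report = {"pbs": -70.0, "mbs": -60.0, "bs3": -80.0, "bs4": -80.0}
print(oocir_db(report, serving_bs="pbs", cooperating_mbs="mbs"))
```

A negative result in dB indicates that the cooperating MBS dominates the interference, which is the regime in which the following findings predict noticeable gains from CBF.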

The key finding with respect to the OOCIR is as follows: if for a PMS i served by PBS i, OOCI dominates such that \(\text {OOCIR}_{i,l} \gg 1\), there is only a low influence of the precoder \(v_{l}\) on the performance of PMS i. In contrast, \(\text {OOCIR}_{i,l} \ll 1\) indicates a strong influence of \(v_{l}\) on the performance of PMS i.

Figure 3 details this by depicting the maximum achievable spectral efficiency gain at different levels of OOCIR. The results were calculated under the assumption of a signal to noise ratio of 30 dB. For different levels of interference (expressed by the signal to interference ratio, SIR) the achievable spectral efficiency gain can be calculated. The values were calculated based on the Shannon capacity. For practical systems there can be deviations due to discrete modulation and coding schemes. The gain is based on the assumption that a fraction of the interference (defined by the OOCIR) can be removed completely through ZF precoding at the interferer. The highest gains are achievable for very strong levels of interference (SIR = −10 dB). Here, without interference mitigation nearly no communication is possible. If it is assumed that a vast majority of this interference comes from inside the cooperation cluster (OOCIR = −20 dB) and thus can be removed, the spectral efficiency can be improved by a factor of 65. For lower levels of interference (e.g. SIR = 10 dB), lower gains are achievable due to an improved performance without coordination. With increasing interference from outside the cooperation cluster the gains reduce. At high levels of OOCIR no significant gains are possible. The fact that at low SIRs high gains are achievable also underlines the suitability of CBF in heterogeneous networks with cell range expansion as described in the introduction.

Fig. 3 Maximum achievable gains in terms of spectral efficiency for different levels of out of cluster interference
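The qualitative behaviour shown in Fig. 3 can be explored with a short Shannon-capacity calculation. The Python sketch below is an illustration under stated assumptions and is not claimed to reproduce the values of Fig. 3: the function name is hypothetical, the signal power is normalized to one, and the OOCIR in dB is interpreted as a ratio of powers. Since Eq. 21 is written in terms of amplitudes, the absolute numbers depend on how the dB value is converted.

```python
import math

def max_spectral_efficiency_gain(snr_db, sir_db, oocir_db):
    """Capacity ratio after/before removing all in-cluster interference (ideal ZF)."""
    signal = 1.0
    noise = signal / 10 ** (snr_db / 10)
    interference = signal / 10 ** (sir_db / 10)
    ratio = 10 ** (oocir_db / 10)  # OOCI / in-cluster interference (power ratio, assumption)
    ooci = interference * ratio / (1 + ratio)  # part that cannot be removed
    before = math.log2(1 + signal / (interference + noise))
    after = math.log2(1 + signal / (ooci + noise))
    return after / before

for oocir in (-20, 0, 20):
    print(oocir, max_spectral_efficiency_gain(30, -10, oocir),
          max_spectral_efficiency_gain(30, 10, oocir))
```

Under these assumptions the gain shrinks towards one as the OOCIR grows and is larger at SIR = −10 dB than at SIR = 10 dB, in line with the trends described for Fig. 3.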

5.2 Influence of the number of MSs per BS

A second factor that influences the performance of CBF is the number of MSs in the system. For a PMS i, served by PBS i, the coordinated scheduler selects an MMS l to be served on the same radio resource at the cooperating MBS l. Even in the case that the MBS uses MRT precoding only (\(\lambda _1 = 1\)), there is a potential for the coordinated scheduler to reduce interference at the PMS: each MMS is associated with a corresponding precoder \(v_l\) that would be used to serve it. As each precoder \(v_l\) causes a different level of interference at the PMS i, the selection of an MMS l also determines the interference at PMS i. The potential for interference suppression solely through the selection of a suitable MMS grows with the number of MMSs: the higher it is, the larger the variety of precoders from which the coordinated scheduler can select, and the higher the likelihood that this variety includes a precoder with significantly reduced interference at PMS i.
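The selection effect described above can be illustrated in isolation with a few lines of Python. The sketch assumes that the MRT precoder of every MMS is already available and picks the MMS whose precoder causes the smallest interference amplitude \(|u_p H_{pm} v_l|\) at the PMS. The function name and the data layout are hypothetical, and a real scheduler would additionally weigh the utility of the MMS itself (e.g. via the proportional fair metric).

```python
def least_interfering_mms(u_p, H_pm, mrt_precoders):
    """Return the id of the MMS whose MRT precoder leaks least towards the PMS.

    mrt_precoders maps an MMS id to its 2x1 MRT precoder v_l.
    """
    # Effective row vector a = u_p H_pm seen by any MBS precoder.
    a = [u_p[0] * H_pm[0][0] + u_p[1] * H_pm[1][0],
         u_p[0] * H_pm[0][1] + u_p[1] * H_pm[1][1]]

    def leak(v):
        return abs(a[0] * v[0] + a[1] * v[1])

    return min(mrt_precoders, key=lambda mms: leak(mrt_precoders[mms]))

# Illustrative values only: three MMSs with different MRT precoders.
precoders = {"m1": [1, 0], "m2": [0.6, 0.8], "m3": [0, 1]}
print(least_interfering_mms([1, 0], [[1, 0], [0, 1]], precoders))  # m3
```

With only one entry in `mrt_precoders` the function has no choice left, which corresponds to the situation in which additional precoders with \(\lambda _1 < 1\) become most valuable.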

With respect to calculating additional precoders with interference suppression at the MBS (\(\lambda _1 < 1\)), the situation is reversed. If for a PMS i there is an MMS l whose MRT precoder already causes significantly less interference at PMS i, there is only a low potential for improvement by calculating additional precoders. In contrast, if there is only one MMS attached to MBS l, the degrees of freedom collapse to zero, meaning that this MMS has to be served in order to make use of the corresponding radio resource. This happens regardless of how much interference occurs at PMS i. In this case there can be a high benefit from calculating additional precoders that suppress interference.

6 Reduced complexity scheduling heuristics

This section proposes a heuristic to effectively apply HetNet RZF in a coordinated scheduler. Section 4 showed the number of potential scheduling decisions. Investigating every potential decision is computationally complex but guarantees that the element with the highest utility is found. Restricting the search space implies the risk of leaving out the best element and thus generating a sub-optimal decision. However, for an implementation in real systems where computational resources are limited, a lower complexity is important, even if it does not achieve optimal performance. This is especially relevant as the number of potential decisions scales linearly with the number of MMSs and at the same time with the number of PMSs (Eq. 18). For a large number of MSs the problem therefore becomes highly complex. The proposed heuristic makes use of the previously described performance influencing factors in order to restrict the computational complexity of the scheduling process.

The scheduling applies the principle of proportional fair scheduling [16, 27], which assigns the access to the channel to the MS with the highest proportional fair metric (Eq. 22). In it, r(n) is the instantaneous (at the current time instance n) achievable rate of an MS for the full channel bandwidth. R(n) indicates the rate the MS achieved in the past, calculated according to Eq. 23. \(\beta\) (a value between zero and one) is the so called forgetting factor, which enables MSs that once gained access to the channel (and therefore have a high value of R) to re-gain it.

$$\begin{aligned} M_{PFS}(n) &= \frac{r(n)}{R(n)} \end{aligned}$$
(22)
$$\begin{aligned} R(n) &= \beta \cdot R(n-1) + (1-\beta ) \cdot r(n) \end{aligned}$$
(23)
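Equations 22 and 23 can be sketched in a few lines of Python. The function names and the example rates are illustrative assumptions. Treating the instantaneous rate of an MS that was not scheduled as zero in the update is a common convention, but it is an assumption here and not stated by Eq. 23.

```python
def pf_metric(inst_rate: float, avg_rate: float) -> float:
    """Proportional fair metric r(n) / R(n) as in Eq. 22."""
    return inst_rate / avg_rate


def update_average_rate(avg_rate_prev: float, inst_rate: float,
                        beta: float) -> float:
    """Exponentially averaged past rate R(n) as in Eq. 23."""
    return beta * avg_rate_prev + (1.0 - beta) * inst_rate


# Two hypothetical MSs: (instantaneous rate r, past average rate R), in Mbit/s.
users = {"A": (20.0, 10.0), "B": (12.0, 4.0)}
beta = 0.9  # example forgetting factor between zero and one

# The MS with the highest metric gains access to the channel.
scheduled = max(users, key=lambda ms: pf_metric(*users[ms]))

# Update the averages; unscheduled MSs contribute a rate of zero (assumption).
updated = {
    ms: update_average_rate(avg, inst if ms == scheduled else 0.0, beta)
    for ms, (inst, avg) in users.items()
}
print(scheduled, updated)
```

MS B is served although its instantaneous rate is lower, because its rate relative to what it achieved in the past is higher.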

Proportional fair scheduling was adapted to be frequency selective in an OFDMA system [28, 29]. Here the scheduling assigns access to subbands (e.g. one radio resource) instead of granting access to the full channel bandwidth. The proportional fair metric is therefore calculated based on the achievable rate per subband (\(r_{SB}\)), as given in Eq. 24.

$$\begin{aligned} M_{PFS}(n) = \frac{r_{SB}(n)}{R(n)} \end{aligned}$$
(24)

With respect to CBF, the target is to find, for each radio resource, the two MSs (PMS i and MMS l) that, in combination with the corresponding precoders, maximize the sum proportional fair metric \(M_{PFS}^{HetNet}\) (Eq. 25).

$$\begin{aligned} M_{PFS}^{HetNet} = \frac{r_{SB}^{PMS}(n)}{R^{PMS}(n)} + \frac{r_{SB}^{MMS}(n)}{R^{MMS}(n)} \end{aligned}$$
(25)

At the same time, the number of assessed potential scheduling decisions N should be low compared to the total number of options (Eq. 26).

$$\begin{aligned} N \ll N_{choices\,total} \end{aligned}$$
(26)

Section 5 revealed the following main trends:

  1. The lower the OOCIR, the higher the benefit of reducing the interference from MBS l to a PMS i. In cases of low OOCIR, calculating the full range of CBF precoders (\(\lambda _1 = [ 0 \dots 1 ]\)) should be considered.

  2. The higher the number of MSs at MBS l, the higher the diversity of precoders available when only MRT is used (\(\lambda _1 = 1\)). Calculating additional precoders with \(\lambda _1 < 1\) therefore offers less additional benefit.

These trends can be converted into two thresholds: Calculating more than the MRT precoders is especially beneficial, in case

  1. the OOCIR is below a threshold \(T_{OOCIR}\) and

  2. the number of MSs at \(BS_{l}\) is below a threshold \(T_{nMS}\).

The proposed approach is to restrict the calculation of interference suppressing precoders (\(\lambda _1 < 1\)) to the cases where both conditions are fulfilled, i.e. both values are below their thresholds. If one or both thresholds are reached or exceeded, only MRT precoders are calculated. From the reduced set of potential decisions, the coordinated scheduler then selects, for each radio resource, the pair and precoder with the highest proportional fair metric.

Fig. 4

Flowchart of the proposed heuristic

Figure 4 shows this process in more detail. As stated before, the heuristic is executed for a cooperation cluster of one MBS and one PBS. Its target is to assign each radio resource to one PMS and one MMS. This decision has to be made in line with the calculation of the corresponding precoders. The process starts with generating all possible pairs of one PMS and one MMS in a cooperation cluster. It then continues with finding the assignment for the first radio resource. To do so, it is checked pair by pair whether the aforementioned thresholds are met for this radio resource and this pair of two MSs. If yes (case 1), it is expected that the usage of interference suppressing precoders might be beneficial. Here a set of precoders is generated as described in Sect. 3. If one or both thresholds are reached or exceeded (case 2), it is considered that generating a single MRT precoder per MS is sufficient. This separation of the pairs into two classes is the key element of the proposed approach. It avoids computations for a part of the pairs. Each pair has now been associated with corresponding precoders. This can either be a set of precoders (case 1) or a single MRT precoder per MS (case 2). The throughput that each pair can achieve is estimated in the next step. This can again be a multitude of values (case 1) or a single value per MS (case 2). The throughput values are then converted into proportional fair metric values (Eq. 25). The radio resource is finally assigned based on finding the highest metric value. This is also directly coupled to the selection of the precoder: if the pair is associated with a single MRT precoder per MS (case 2), the corresponding precoders are used. If there are multiple precoders for one pair (case 1), each precoder achieves a different performance and is therefore coupled with a different metric value. In this case the highest metric value therefore points not only to the pair to select but also to the precoder to use. The process is then executed in the same manner for the remaining radio resources.
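The decision logic for a single radio resource can be sketched as follows. This is a minimal sketch under stated assumptions, not the simulator implementation: the `Pair` structure, the grid of \(\lambda _1\) values, and the rate estimator `toy_rates` are hypothetical placeholders, and the precoder calculation of Sect. 3 is abstracted into the rate estimator.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Pair:
    pms: str
    mms: str
    oocir_db: float      # OOCIR for this pair on this radio resource
    avg_rate_pms: float  # past average rate R^PMS(n)
    avg_rate_mms: float  # past average rate R^MMS(n)


def candidate_lambdas(pair: Pair, n_mms: int, t_oocir_db: float,
                      t_nms: float, lambda_grid: List[float]) -> List[float]:
    """Case 1: both values below their thresholds -> full precoder set.
    Case 2: otherwise -> MRT precoding only (lambda_1 = 1)."""
    if pair.oocir_db < t_oocir_db and n_mms < t_nms:
        return list(lambda_grid)
    return [1.0]


def schedule_resource(pairs: List[Pair], n_mms: int, t_oocir_db: float,
                      t_nms: float, lambda_grid: List[float],
                      estimate_rates: Callable[[Pair, float], Tuple[float, float]]):
    """Return ((metric, pair, lambda_1), number of evaluated decisions)."""
    best: Optional[Tuple[float, Pair, float]] = None
    evaluated = 0
    for pair in pairs:
        for lam in candidate_lambdas(pair, n_mms, t_oocir_db, t_nms, lambda_grid):
            r_pms, r_mms = estimate_rates(pair, lam)
            # Sum proportional fair metric as in Eq. 25.
            metric = r_pms / pair.avg_rate_pms + r_mms / pair.avg_rate_mms
            evaluated += 1
            if best is None or metric > best[0]:
                best = (metric, pair, lam)
    return best, evaluated


def toy_rates(pair: Pair, lam: float) -> Tuple[float, float]:
    """Purely illustrative rate model: a lower lambda_1 helps the PMS
    and costs the MMS some rate."""
    return 5.0 + 10.0 * (1.0 - lam), 4.0 + 8.0 * lam


pairs = [Pair("PMS1", "MMS1", -6.0, 5.0, 8.0), Pair("PMS2", "MMS1", 2.0, 5.0, 8.0)]
grid = [0.0, 0.25, 0.5, 0.75, 1.0]
(best_metric, best_pair, best_lam), n_eval = schedule_resource(
    pairs, n_mms=2, t_oocir_db=0.0, t_nms=6, lambda_grid=grid,
    estimate_rates=toy_rates)
print(best_pair.pms, best_pair.mms, best_lam, n_eval)
```

Setting both thresholds to \(-\infty\) reduces the sketch to MRT precoding only, while setting both to \(\infty\) corresponds to an exhaustive search over all pairs and all values in the grid.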

7 Simulation results

In this section simulation results that were obtained with a MATLAB-based 3GPP compliant LTE system level simulator are presented. The simulator was calibrated according to the procedures described in [30], Annex A.2.2. The network layout varies for the individual simulations and is therefore introduced in the individual subsections below.

As the characteristics of the radio channels have a strong impact on the performance of MIMO systems, they have to be modelled in detail. This was achieved by using the ITU-R Urban Micro and Urban Macro channel and propagation model [7], which however leads to very complex simulations. For example, a non-line-of-sight channel between a BS and an MS is modelled by 380 (Urban Micro) or 400 (Urban Macro) propagation rays. Taking into account a high number of BSs and MSs, this can lead to a high complexity for calculating all (serving and interfering) wireless links. The detailed simulation assumptions are listed in Table 1.

The results are structured as follows: Section 7.1 shows results that illustrate the effects described in Sects. 5.1 and 5.2. Section 7.2 then gives performance results for a set of large networks consisting of tens of BSs, such that especially the OOCI is modelled realistically. In Sect. 7.3 the obtained results are compared with results from the literature. Section 7.4 gives insights into the complexity considerations introduced in Sect. 4.

Table 1 System level simulation parameters

7.1 Influence of out of cluster interference and number of mobile stations

Simulations were carried out to quantify the effect of the OOCIR and the number of MSs on the performance of CBF. This is especially required to select suitable values for \(T_{OOCIR}\) and \(T_{nMS}\) later.

To investigate the effect of the number of MSs only, a network configuration without OOCI is required. This was implemented in the form of a single MBS with one PBS inside its coverage area (with 225 m distance between MBS and PBS). To also avoid OOCI between sectors (sector one of site one creates OOCI in sector two of site one), the MBS was configured with an omni-directional antenna without sectorization. The so called hotspot MS distribution (configuration 4b in [30]) was used, such that two-thirds of the MSs are located in the vicinity of the PBS. This reflects the fact that operators will tend to install PBSs at locations with a high density of MSs in order to fulfil the corresponding traffic demand in such areas. In a series of simulations, an increasing number of MSs was placed in this network to investigate the effect described in Sect. 5.2.

Figure 5 shows the throughput achieved by the MSs attached to the PBS for three MSs in the network. The threshold \(T_{OOCIR} = -\infty\) (red curve) is by definition reached or exceeded by any amount of OOCI. Thus the proposed approach assumes for all transmissions that calculating additional precoders (\(\lambda _1 < 1\)) is not beneficial, and only MRT precoding is used. For the blue curve, \(T_{OOCIR} = \infty\) means that \(T_{OOCIR}\) is never reached and thus a full set of precoders is calculated. In the case of three MSs in the network, two of them attach to one BS whereas the remaining MS attaches to the second BS (one BS being a PBS and the other an MBS). As a result, only two pairs of one PMS and one MMS can be formed. For MRT precoding only, this low degree of freedom results in no performance gain in comparison to the uncoordinated case. Calculating the full set of precoders results in high gains. The mean throughput of the PMSs increases from 12.7 Mbit/s (no coordination) to 19.9 Mbit/s (RZF with \(T_{OOCIR} = \infty\)), resulting in a gain of 57%. For RZF with MRT precoding only it remains at 12.7 Mbit/s. The high gains for RZF with \(T_{OOCIR} = \infty\) are expected in this scenario, because it includes ZF precoders that null out interference. As no OOCI is present, this can improve the SINR drastically.

Fig. 5

Throughput of MS associated to the pico BS in a network with two BSs and 3 MSs

Figure 6 shows results for the same setup, with the difference that now six MSs are placed in the system. With a growing number of MSs, coordinated scheduling with MRT precoding only is able to achieve significant gains over the uncoordinated case. The mean throughput of the PMSs increases from 10.5 to 12.9 Mbit/s (23% gain). Due to the absence of OOCI, calculating the full set of precoders is still highly beneficial. The mean throughput grows to 16.4 Mbit/s, which results in a gain of 27% over MRT precoding only and of 56% over no coordination.

For twelve MSs in the system (Fig. 7), the trend continues. Due to the increasing number of pairs, the performance for MRT precoding approaches the case where all precoders are calculated. The mean throughput grows from 8.4 Mbit/s (no coordination) to 11.4 Mbit/s (\(T_{OOCIR} = -\,\infty\)) and 13.4 Mbit/s (\(T_{OOCIR} = \infty\)). The resulting throughput gains now equal 36% (MRT precoding vs. no coordination), 18% (RZF with \(T_{OOCIR} = \infty\) vs. RZF with \(T_{OOCIR} = -\,\infty\)) and 60% (RZF with \(T_{OOCIR} = \infty\) vs. no coordination).
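The gains quoted for Figs. 5 to 7 follow directly from the reported mean PMS throughputs. The short Python sketch below recomputes them; the helper name `relative_gain_percent` is illustrative.

```python
def relative_gain_percent(baseline: float, value: float) -> float:
    """Relative gain of `value` over `baseline` in percent."""
    return (value / baseline - 1.0) * 100.0


# Mean PMS throughput in Mbit/s as reported in the text:
# (no coordination, MRT precoding only, full set of precoders)
mean_throughput = {
    3: (12.7, 12.7, 19.9),
    6: (10.5, 12.9, 16.4),
    12: (8.4, 11.4, 13.4),
}

gains = {}
for n_ms, (uncoord, mrt_only, full_set) in mean_throughput.items():
    gains[n_ms] = (
        relative_gain_percent(uncoord, mrt_only),   # MRT only vs. no coordination
        relative_gain_percent(mrt_only, full_set),  # full set vs. MRT only
        relative_gain_percent(uncoord, full_set),   # full set vs. no coordination
    )
    print(n_ms, [round(g) for g in gains[n_ms]])
```

The gain of MRT precoding only over the uncoordinated case grows with the number of MSs, while the additional gain from the full set of precoders shrinks.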

Fig. 6

Throughput of MS associated to the pico BS in a network with two BSs and 6 MSs

Fig. 7

Throughput of MS associated to the pico BS in a network with two BSs and 12 MSs

With respect to different levels of OOCI, Fig. 3 provided insights into the achievable performance gains. More detailed simulation results were presented in [26]. From Fig. 3 it can be seen that at levels of OOCIR > 0 dB gains are hard to achieve.

7.2 Performance results for large networks

For a realistic modelling of the OOCI, a large network consisting of tens of BSs is required. The network in Fig. 1, shown in the introduction, fulfils this requirement. In it, each PBS and the overlaying MBS form a cooperation cluster. In each cooperation cluster, the proposed heuristic for coordinated scheduling operates. The PBSs in Fig. 1 are located in the centre of the coverage areas of the cooperating MBSs. This causes strong interference from the cooperating MBS, and thus a relatively low OOCIR. A second network with a higher OOCIR is analysed afterwards.

In the simulation area a varying number of MSs is dropped in a random process. Again the so called hotspot MS distribution was used, such that two-thirds of the MSs are located in the vicinity of a PBS.

Figure 8 shows the simulation result for the case in which 42 MSs (one per BS) are placed inside this network. Similar to the results from Fig. 5, coordinated scheduling with MRT precoding only (\(T_{OOCIR} = T_{nMS} = -\infty\)) shows low gains compared to the case without coordination. The mean throughput of the PMSs in this case grows by 2%. Also similar to the results in Fig. 5, calculating additional precoders shows performance gains. For the proposed heuristic two different threshold settings were used (\(T_{OOCIR} = 0 \text { dB, } T_{nMS} = 6 \text { and } T_{OOCIR} = -\,3 \text { dB, } T_{nMS} = 4\)). The stricter threshold settings (\(T_{OOCIR} = -\,3 \text { dB, } T_{nMS} = 4\)) exclude more potential scheduling decisions and therefore show a slightly lower performance. Here the gain in terms of the mean throughput of the PMSs is 11%. In general, the proposed approach is relatively close to the results for an exhaustive search (\(T_{OOCIR} = T_{nMS} = \infty\)), which achieves 13% mean throughput gain.

Fig. 8

Throughput of MSs associated to the PBS in a large network with 42 MSs (network from Fig. 1)

Figure 9 shows results for the same network with 315 MSs. Significant performance gains are now obtained for coordinated scheduling with MRT precoding only (\(T_{OOCIR} = T_{nMS} = -\,\infty\)). The mean throughput of the PMSs in this case increases by 19%. Additional gains from calculating more precoders are only present for an exhaustive search (\(T_{OOCIR} = T_{nMS} = \infty\)), which achieves 29% mean throughput gain compared to the uncoordinated case.

Fig. 9

Throughput of MS associated to the pico BS in a large network with 315 MSs (network from Fig. 1)

Comparing Figs. 8 and 9 shows the adaptability of the proposed approach: under network conditions with a low number of MSs, substantial gains from interference suppressing precoders can be achieved. These gains are also realized to a large extent by the proposed approach. For a larger number of MSs, where only low gains from calculating additional precoders are possible, the heuristic tends towards applying MRT precoding only, which is desired for complexity reasons.

As already stated, the previously investigated network (depicted in Fig. 1) is characterised by a relatively low OOCI at the PMSs, enabling the performance gains from CBF, especially in the case of only 42 MSs. For comparison also the network depicted in Fig. 10 was simulated. Here the PBSs are located at the edges of the coverage areas of the MBSs, making it likely that PMSs receive interference from multiple MBSs. A higher OOCIR and thus an expected lower gain from coordination is the result.

Fig. 10

Heterogeneous network with higher out of cluster interference

Figure 11 again shows the result for 42 MSs. Performance gains from coordination can be seen, especially for the case when all precoders are calculated (\(T_{OOCIR} = T_{nMS} = \infty\)). Here the mean throughput of the PMSs increases by 7%. The influence of the OOCI can be seen from the fact that for the network of Fig. 1 this gain was 13%.

Fig. 11

Throughput of MSs associated to the PBS in a large network with 42 MSs (network from Fig. 10)

Figure 12 shows the results for the case of 315 MSs. As for the previous network, the large number of MSs enables gains for coordinated scheduling based on MRT precoding only (\(T_{OOCIR} = T_{nMS} = -\,\infty\)). The mean throughput of the PMSs increases by 3%. Creating additional precoders is not of value in this scenario. For the previous network, gains of 19% (\(T_{OOCIR} = T_{nMS} = -\,\infty\)) to 29% (\(T_{OOCIR} = T_{nMS} = \infty\)) were possible, which again highlights the influence of OOCI.

Fig. 12

Throughput of MS associated to the pico BS in a large network with 315 MSs (network from Fig. 10)

In summary, the results show that a high level of OOCI prohibits gains from CBF. This could limit the applicability to selected areas, e.g. to the centre of the MBS coverage areas. For lower levels of OOCI, substantial gains are possible. The source of the achievable gains differs depending on the number of MSs: for a low number of MSs, gains can be achieved when calculating interference suppressing precoders (black versus green curve in Fig. 8). For a high number of MSs, gains can be achieved by calculating MRT precoders only (black versus red curve in Fig. 9).

7.3 Comparison with results from literature

The analysis in Sect. 5 and the simulation results in this section show two main influencing factors for the performance of CBF. The detailed understanding and description of these factors is a main contribution of this work. It was also shown that they have the potential to heavily affect the achievable performance gains. For example, calculating interference suppressing precoders in the two-cell deployment with three MSs considered for Fig. 5 resulted in a mean throughput gain for the PMSs of 57%. The same principle leads to a gain of only 8% for the larger deployment considered for Fig. 9.

The influence of the performance limiting factors can be used to better interpret existing results from the literature. The work presented in [9] is based on ideal assumptions for gains from ZF: two BSs with two MSs are considered. In accordance with the results presented above, this can lead to high performance gains, especially at low OOCI. OOCI is not explicitly covered in [9], as BSs that are not part of the coordination are not present. However, there are results with different noise levels (e.g. Figure 2 in [9]). A high noise level, i.e. a low signal to noise ratio (SNR), causes similar effects as a high OOCI. Accordingly, Figure 2 in [9] shows high gains in spectral efficiency (which can be mapped to throughput gains) at high SNRs, e.g. more than 100% gain at 25 dB SNR. Reference [12] provides simulation results for a network with three and with 21 BSs, with one MS per BS. In the case of three BSs (no OOCI), significant performance gains are obtained: an increase in spectral efficiency from approximately 6.5 bit/second/Hertz (bps/Hz) to approximately 8 bps/Hz (Figure 2 in [12]). For a network with 21 BSs the gains deteriorate to almost zero (see also Figure 2 in [12]).

In summary, the simulation results from the literature are in line with the simulation results provided here. Besides the negative effect of OOCI, which has also been described in [8, 13], this work emphasizes the importance of the number of MSs. In addition, the following section shows how the detailed understanding of the performance influencing factors from Sect. 5 can be used to reduce the complexity of the scheduling.

7.4 Complexity

As introduced in Sect. 4, the complexity of the scheduling process is an important factor. Corresponding to the complexity definition provided there, the number of potential scheduling decisions that had to be evaluated to achieve the results in Figs. 8 and 9 was counted. Figure 13 shows the result for the case of 42 MSs. The proposed approach in this case reduces the number of calculations compared to the exhaustive search. However, it is still at a high level compared to MRT precoding only and the uncoordinated case. On the other hand, the increased complexity also achieves the performance gains depicted in Fig. 8.

Fig. 13

Number of transmission parameters calculated for the case of 42 MSs

Figure 14 shows the number of calculations for the case of 315 MSs. Here the proposed approach significantly reduces the complexity. This also corresponds to the performance results in Fig. 9: the proposed approach limits the complexity here because the thresholds (especially \(T_{nMS}\)) are often exceeded. It achieves a performance similar to MRT precoding only, because calculating additional precoders would only show low gains under these conditions.

Fig. 14

Number of transmission parameters calculated for the case of 315 MSs

7.5 Applicability of the results to other precoding schemes

The results shown in this section were generated using the HetNet RZF approach. However, the conclusions drawn here can also be interpreted in a broader way. The effect of OOCI affects all kinds of coordination schemes, as also emphasized in [8, 13]. Also the general considerations depicted in Fig. 3 are not bound to the usage of a certain precoding algorithm. Similarly, the impact of the number of MSs has a general background: The more MSs there are in a system, the more degrees of freedom the scheduler has in allocating the radio resources and setting corresponding precoders (even in case MRT only is applied).

For systems with limited channel state information (e.g. LTE FDD systems), a flexible feedback from the MSs as standardized in LTE-Advanced [31] can be applied. In it, BSs send out precoded pilot data and the MSs report on the effect these precoders cause. This might limit the performance of CBF (e.g. a ZF precoder is hard to find using this approach), but still enables a limited operation.

8 Conclusions

This paper analyzed the performance gains that coordinated beamforming can achieve in a heterogeneous network consisting of macro and pico base stations. As a framework for this, an applied version of relaxed zero forcing (HetNet RZF) was presented. A detailed performance analysis revealed two main factors that influence the performance of coordinated beamforming: the number of mobile stations in the system and the amount of out of cluster interference. A scheduling algorithm with reduced computational complexity, which is based on this finding, was presented. It estimates whether gains from coordinated beamforming can be expected under the current conditions. Only if this is the case is the complex calculation of additional precoders executed. As a result, a performance similar to that of an exhaustive search is realized with significantly lower computational complexity.