1 Introduction

The ever-growing demand for user capacity and higher data rates calls for effective utilization of the available frequency spectrum. Recent research has focused on the use of mmWave frequency bands for 5G cellular systems to achieve higher spectral utilization, as the wavelengths at mmWave bands are very short. On the other hand, mmWave frequencies suffer higher propagation losses, which limit the communication range. To overcome these losses while containing power consumption and hardware cost in 5G NR networks, large-scale antenna arrays are used in which each antenna element, or each group of antenna elements (subarray), is connected to a TR module; this technique is called “Hybrid Beamforming (HB)”. Together, the HB and mMIMO technologies combine analog pre/post-processing with digital processing at mmWave bands. This provides a tradeoff between energy and spectral efficiencies, which is a key design aspect in 5G wireless networks.

To allow the BS to serve multiple UEs with the same frequency-time resources, SDMA is used in Multi-User mMIMO (MU-mMIMO) wireless communication networks, where it maximizes multiplexing gain, spectral efficiency, and system throughput. Designing transmit vectors that distinguish UEs and reduce co-channel interference is a challenge in MU-mMIMO systems, especially in UE-dense areas where the number of MU pairs is higher. However, mMIMO designs enhance the spatial resolution through a larger number of narrow beams and offer the maximum degrees of freedom for MU pairing. An mMIMO HB system allows tens to hundreds of antennas at the BS and serves the UEs in a cell with the maximum number of data streams.

The higher propagation losses that occur at mmWave bands can be overcome using the larger antenna arrays deployed in 5G cellular systems. As the physical dimensions of antenna elements are relatively small at mmWave bands, a larger number of elements can be accommodated in a given physical area. However, providing one RF chain per antenna element is neither feasible nor cost effective. Therefore, HB transceivers, which combine analog (RF-domain) and digital (baseband-domain) beamforming, are designed so that the number of RF chains is much smaller than the number of antenna elements. Multiple-TR path designs in 5G NR address inter-carrier interference and improve spectral efficiency and reliability. They also increase data throughput by aggregating the resources of multiple TR paths.

Fully digital beamforming designs are not efficient in terms of power consumption and hardware cost, as each antenna requires a dedicated RF-to-baseband chain. To maximize throughput in a scattering-rich environment, where only non-LOS paths may exist between the Tx and Rx antennas, mMIMO HB uses precoding at the Tx side and combining at the Rx side to improve SNR and separate independent spatial channels. mMIMO HB systems, as shown in Fig. 1, minimize the required number of RF-to-baseband chains. With an appropriate selection of precoding and combining weights, HB gives much better performance compared to complete analog or digital beamforming. Modern wireless communication systems, including 5G cellular systems, use mMIMO HB technology extensively for SNR improvement, and SDMA to enhance data throughput in scattering-rich environments. Using HB, spectral efficiency can be improved significantly by supporting a larger number of data streams.

Fig. 1 mMIMO HB system’s RF architecture at the transmitter

2 Literature Review

HB can be classified by parameters such as the CSI required in the analog beamforming part (instantaneous or average), the carrier frequency, and the complexity (full, reduced, or switched). No single design or algorithm gives the best trade-off among these parameters [1]. The MU-mMIMO HB FDD system is very attractive in the FR2 frequency range (24.25 GHz to 52.6 GHz), where it facilitates wider bandwidths and supports extremely high data rates [2]. Finding a suitable configuration when designing an mMIMO HB system is essential for optimum performance in 5G wireless networks [3]. Considerable research is under way on designing mmWave mMIMO HB systems and realizing them with different precoding/combining techniques, numbers of BS antennas, and channel conditions [4]. An optimal design of the analog and digital precoder/combiner in a MU-mMIMO HB DL system reduces the number of RF chains, supports multiple data streams per UE, and achieves a maximum sum rate close to the channel capacity under perfect CSI [5, 6]. Such a system offers a tradeoff between the number of BS antennas and the number of data streams per user. A larger number of parallel data streams per user is recommended to achieve higher throughput [7]. MU-MIMO HB systems focus on the optimization of analog RF precoding/combining in fully connected and partially connected structures [8]. Partitioning the processing between the RF and digital baseband domains, with optimal precoding and combining, is an important aspect of MU-mMIMO HB systems. It provides a tradeoff between beamforming gain and hardware cost [9].

As fully digital beamforming is not feasible in mMIMO systems because of the high energy consumption of RF chains at mmWave bands, HB is introduced to minimize the number of RF chains. HB in mMIMO systems offers a tradeoff between system performance and implementation cost (in terms of both power consumption and hardware cost), and thereby reduces energy consumption and the cost of implementation. Even though mmWave communication with FD provides higher SE than half-duplex, the inherently higher path loss at mmWave bands and the self-interference in FD may degrade the overall performance of the system. HB at mmWave bands exploits a large bandwidth and also overcomes the large path loss of Rayleigh fading channels. It provides a trade-off between cost efficiency and the energy consumed by RF chains [10].

Optimizing the analog beamformers and digital combiners of HB for the DL of MU-mMIMO in FD mode reduces the CSI acquisition overhead [11]. The mMIMO DL achieves the best tradeoff between SE and EE when the number of BSs for the given SE is high and every cell is allocated optimal transmit power [12]. Interference (both inter-user and intra-user) can be cancelled in mmWave MU-mMIMO HB DL systems using techniques such as ZF and SIC. SIC-based HB algorithms, however, achieve higher SE at the cost of higher computational complexity [13]. The joint design of Tx and Rx in a mmWave MU-mMIMO HB system for the DL reduces the computational complexity: the “piecewise dual joint iterative approximation” method provides closed-form solutions for the analog beamforming, and “baseband piecewise successive approximation” is used to design the digital beamforming that maximizes the number of users served simultaneously [14, 15].

The optimal design of a MU-mMIMO HB FD DL system uses second-order channel statistics, so that the UE needs to feed back only the intra-group effective channel. The strongest eigenbeam of the Rx correlation matrix forms the optimal analog combiner. Joint optimization of the digital precoder/combiner with limited instantaneous CSI can enhance the sum-rate [16]. Formulating precoding/combining as a sparse reconstruction problem and designing algorithms based on the basis pursuit principle leads to low-cost RF hardware [17]. “Channel hardening” in an mMIMO system makes the effective channels almost deterministic in spite of the random channel response. CSI can be estimated efficiently in TDD, which exploits channel reciprocity so that only UL pilot signals are needed, with no feedback [18]. With an optimal design of transmit vectors (which mitigates the co-channel interference), SDMA provides maximum system throughput in MU-mMIMO DL systems. The “block-diagonalization” method maximizes the transmission rate, and the “successive optimization” method minimizes the power [19]. In a MIMO system, the fading correlations between Tx and Rx depend on the scatterers and the physical dimensions of the antenna elements. Characterizing these fading correlations is important, as they contribute to defining the channel capacity [20].

The “joint spatial division and multiplexing” method in a MU-mMIMO DL system can reduce the dimensions of the CSI at the Tx and allows FDD to enhance SE. Here, a DFT-based “pre-beamforming” matrix is constructed from the second-order channel properties when designing the precoder, ensuring no loss of optimality compared to the full-CSI case [21]. A “manifold optimization” based HB algorithm is proposed to achieve MMSE and enhance the transmission reliability. To minimize the computational complexity of HB design, eigenvalue decomposition and OMP methods are suitable for narrowband and broadband scenarios, respectively [22]. Fully digital BF is not suitable for mmWave mMIMO because of the large number of RF chains and the large channel bandwidths. Polarization further enhances the performance of mmWave mMIMO, particularly for LOS channels. For NLOS channels, which have greater path loss, it is essential to have HB designs with SDMA features. MU-mMIMO HB design for mmWave networks supports sending data to multiple users over low-rank channels while retaining multiplexing gain. Extensive channel measurements at 28 GHz and 73 GHz show that directional beams exist with low path loss and little multipath time dispersion. This means that more link power can be received with a beamforming design, without a tedious equalization process [23]. An mMIMO system with HB is envisioned to provide an excellent tradeoff between system performance and hardware complexity. Joint design of the digital and analog precoder/combiner in a single-cell MU-mMIMO DL system avoids loss of information at the individual stages and supports multiple streams per UE to achieve a maximum sum-rate that approaches the channel capacity under perfect CSI [5].

HB with selection reduces both the computational and the hardware costs of MU-mMIMO systems without affecting the beamforming gain. In this method, the antenna array of the transceiver is fed by an analog stage whose number of input ports is greater than the number of up/down-conversion chains, K̅, and the K̅ instantaneously best ports are selected as inputs to the up/down-conversion chains [24]. The higher path attenuation at THz bands can also be compensated by beamforming and aperture gains using mMIMO and IRS techniques. Joint and cooperative HB techniques reduce the channel estimation cost under perfect CSI and achieve performance close to that of a fully digital beamforming system [25]. In a given frequency band, FD radio can double the data rate through simultaneous transmission and reception. The BS in a single-cell mmWave mMIMO HB FD system suffers from limited-dynamic-range noise because of non-ideal hardware. Optimal power allocation schemes that account for self-interference at the UEs when designing HB reduce the number of RF chains and maximize the weighted sum-rate [26]. In mMIMO systems, RF-baseband precoding using “uniform circular arrays” through phase mode transformation suppresses the inter-user interference and maximizes the sum-rate with minimum CSI [27]. A spatial modulation scheme minimizes the required number of RF chains without compromising spectral efficiency in mmWave mMIMO HB systems. The gain of these systems can be enhanced by increasing the number of antennas in the Tx array with no increase in the number of RF chains [28]. The acquisition of CSI plays a major role in designing mMIMO systems: instantaneous CSI gives better SINR, whereas average (second-order) CSI incurs lower overhead. A joint design of the linear precoder and decoder using the unweighted minimum MSE criterion can reduce ISI, thereby reducing BER. In practical implementations, the choice depends on the trade-off between system complexity and the accuracy of the CSI [29].

In designing an optimal HB that minimizes the transmission power under individual SINR constraints in a MU-mMIMO system, two cases need to be considered. If the number of UEs is less than or equal to the number of RF chains, a globally optimal solution achieves the same optimal transmit power as fully digital beamforming. Otherwise, if the number of UEs is greater than the number of RF chains, a stationary point is obtained using a globally convergent alternating algorithm [30]. UE selection is an important function in a MU-mMIMO system, as the system is severely affected by inter-user interference. The computational complexity of UE selection increases with the size of the channel matrix and with user fairness, because it involves re-computation of the precoding and combining matrices. This can be overcome by selecting UEs based on the “proportional fairness” criterion and chordal distance [31]. HB designs in MU-mMIMO-OFDM systems maximize the SE when a “locally optimal alternating maximization” algorithm and a minimum-MSE-based iterative algorithm are used for the analog and digital beamformers, respectively [32]. Higher propagation losses at mmWave bands can be compensated by applying relays in MU-mMIMO systems. HB designs in such relay-assisted systems include a “transmit-receive coordinated beam alignment procedure” for analog beamforming and “non-linear precoding” for digital beamforming [33]. Most HB designs for MU-MIMO-OFDM systems consider frequency-flat channels; however, it is also essential to evaluate frequency-selective channels to maximize the received signal strength and reduce the leakage power [34].

In MU-MIMO HB systems, supporting the maximum number of UEs simultaneously in a given time–frequency slot requires high-dimensional channel information, which makes the feedback overhead very high. The channel feedback overhead can be reduced by exploiting the “common sparsity of channel” property and “nonlinear quantization”. The channel sparsity matrices are calculated using the minimum MSE-OMP algorithm and quantized using a nonlinear codebook generated by “conditional random vector quantization” [35]. ZF precoding in HB enhances the EE, as it uses a minimum number of RF chains with an increased number of BS antennas [36]. When “conjugate beamforming” and ZF methods are used in mMIMO systems, the spectral efficiency at 28 GHz is better than at 73 GHz, and it increases with the number of BS antennas. However, with a larger number of UEs, conjugate beamforming outperforms ZF in terms of complexity [37]. Designing sub-antenna arrays with a non-uniform and optimal number of antennas per RF chain at the Rx of an mMIMO HB system enhances the total achievable rate by 10% with a marginal increase in power consumption [38]. IQ imbalance at the Tx of the UEs in an mMIMO HB UL system leads to a finite ceiling on the achievable sum-rate even at high SNR. Introducing ZF techniques at the BS can compensate for the undesirable IQ imbalance under different channel conditions [39].

Conventional HB methods are highly complex and depend strongly on the quality of the CSI. An extreme-learning-machine-based HB can optimize the beamformers of the Tx and Rx, reducing the computation time and enhancing the sum-rate [40]. Using machine learning techniques to select the beam-user combination reduces the complexity of a MU-mMIMO HB DL system even under poor channel conditions. In the orthogonal HB design, the analog beamforming matrix is generated by Householder reflectors, and a feedforward neural network selects the beam-user combination. It gives an (SE, EE) of (41.35 b/s/Hz, 0.8 b/Hz/J) compared with (33.77 b/s/Hz, 0.63 b/Hz/J) for DFT-based state-of-the-art techniques [41]. Most existing HB approaches for mmWave mMIMO systems using FD links rely strongly on optimization processes, which are either dependent on the quality of the CSI or too complex. To overcome this, CNN and machine learning based algorithms are introduced, achieving 22.1% higher SE than the OMP algorithm [42]. A “federated learning” based ML framework for HB design overcomes the need to train a global model with a large data set. Instead, the model is trained at the BS with the gradients collected from the UEs, and it is shown to be tolerant to corruption and imperfections in the channel data [43]. Designing beamformers with a minimum number of RF chains and phase shifters under imperfect CSI is challenging in mmWave mMIMO systems. Deep-learning-based beamforming designs provide higher SE even under imperfect CSI [44]. mMIMO with HB enhances the computational accuracy of “over-the-air computation” in IoT networks by reducing the mean squared error [45]. Digital beamforming using “block diagonalization” mitigates intra-user and inter-user interference and also provides a higher sum rate than OMP-based HB if the number of RF chains is at least double the number of data streams [46].

An optimal HB design that reduces the total transmission power under individual SINR constraints in a MU-mMIMO system leads to a “non-convex optimization problem”. To solve this problem, two low-complexity algorithms are proposed, chosen according to the number of UEs and RF chains, that give a globally optimal solution [30].

A novel HB design for the mMIMO DL is proposed that combines RF and baseband precoding based on “regularized channel diagonalization” to obtain a low-complexity solution with maximum sum-rate [47]. Aiming to minimize the ratio of computation power to communication power in MU-mMIMO systems, a joint optimization problem is formulated with partially connected RF structures; the proposed algorithm enhances the EE and cost efficiency, with a maximum power saving of 76.59% [48]. The capacity of a MU-MISO DL system can approach that of a fully digital precoding system if an iterative hybrid precoding scheme is used. In this scheme, ZF digital precoding is used to obtain a large baseband gain, and the optimized RF precoder is then obtained by reducing the total power [49]. In mmWave mMIMO systems, the number of RF chains depends on the hybrid precoding design, and efficient precoding architectures are therefore essential to enhance the EE and SE of the system. A hybrid precoder designed from closed-form expressions can minimize the number of active RF chains and the computational complexity, and also enhance the EE and SE even at low SNR values [50]. If the HB system has a large number of antenna ports to support maximum bandwidth, sub-band-level precoding leads to higher feedback overhead. Transform-domain precoding based on channel sparsity can enhance SE by 2.38% to 21% and reduce the feedback overhead by 60% to 89% over existing techniques [51]. In summary, no single algorithm or structure gives the best tradeoff among the design parameters of HB.

3 Proposed Methodology

In mmWave MU-mMIMO HB systems, the precoding/combining and beamforming processes are performed partly in the analog RF domain and partly in the digital baseband domain. It is challenging to derive the four matrices (analog and digital precoders, analog and digital combiners) jointly. Therefore, we have designed the beamformers at the Tx and the Rx separately, using the MMSE technique with separate transmit precoding and receive combining to maximize the sum-rate. Using the spatial structure of mmWave channels, we have designed the RF and baseband filters jointly via OMP to approach the performance of fully digital beamformers.

The proposed MU-mMIMO HB system consists of four main modules: Tx, Rx, channel, and hybrid weight calculation, as shown in Fig. 2. Data streams are generated at the MIMO Tx, precoded, and then modulated. The modulated signal travels through a scattering MIMO channel and is decoded and demodulated at the Rx. The precoding weights are applied to the data streams at the Tx, and the streams are recovered at the Rx using the combining weights. There is always a trade-off between the computational load and the optimality of the weights calculated for the four matrices above. The communication link between the Tx and Rx is validated using a scattering-spatial channel model and a flat-static MIMO channel, which consider various antenna patterns and TR spatial locations. Periodic updating of the channel matrix mimics the time variation of the MIMO channel. The digital Rx reconstructs the transmitted data streams, and the RMS EVM is calculated and compared for the different channels at FR2 frequency bands. In the analog/RF domain, beamforming is obtained by applying a phase shift at each antenna element of the subarray. In the digital baseband domain, beamforming is achieved by using the channel matrix to derive precoding/combining weights, which help to transmit and recover multiple independent data streams over a single channel.
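The RMS EVM used as the link metric can be computed as in the following minimal sketch. It assumes the common convention in which the error power is normalized by the average power of the reference constellation; the function name `rms_evm_percent` and the QPSK test symbols are illustrative and are not taken from the simulation described here.

```python
import numpy as np

def rms_evm_percent(received, reference):
    """RMS EVM in percent, normalized to the mean reference power (assumed convention)."""
    received = np.asarray(received, dtype=complex)
    reference = np.asarray(reference, dtype=complex)
    error_power = np.mean(np.abs(received - reference) ** 2)
    reference_power = np.mean(np.abs(reference) ** 2)
    return 100.0 * np.sqrt(error_power / reference_power)

# Illustrative check: unit-power QPSK symbols with a constant offset of 0.1
ref = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)
rx = ref + 0.1
print(rms_evm_percent(rx, ref))
```

With unit-power reference symbols and an error of magnitude 0.1 on every symbol, the result is approximately 10%.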

Fig. 2 MU-mMIMO HB communication system

In Figs. 3 and 4, NT and NR are the numbers of Tx and Rx antennas, respectively; NS is the number of data streams; NTRF and NRRF are the numbers of transmit and receive RF chains, respectively; ‘H’ represents the MIMO scattering channel; FBB and FRF are the digital precoder of size NS \(\times\) NTRF and the analog precoder of size NTRF \(\times\) NT, respectively; WRF and WBB are the analog combiner of size NR \(\times\) NRRF and the digital combiner of size NRRF \(\times\) NS, respectively. In an HB system, NTRF < NT, which means that the number of TR modules is less than the number of antenna elements. Design flexibility is achieved by connecting each antenna element to one or more TR modules. The mathematical representation of the HB design is given in Eqs. 1 and 2. The FRF and WRF matrices represent the signal phase values.

$$ \mathbf{F} = \mathbf{F}_{\text{BB}} \, \mathbf{F}_{\text{RF}}, \quad \text{precoding weights matrix of size } N_{\text{S}} \times N_{\text{T}} $$
(1)
$$ \mathbf{W} = \mathbf{W}_{\text{RF}} \, \mathbf{W}_{\text{BB}}, \quad \text{combining weights matrix of size } N_{\text{R}} \times N_{\text{S}} $$
(2)
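The dimensions in Eqs. (1) and (2) can be checked with a short numerical sketch. The antenna, RF-chain, and stream counts below are assumed values chosen only for illustration, and the random weights stand in for weights that would normally come from a design algorithm. The analog matrices are given unit-modulus entries because they represent phase values only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed example sizes (hypothetical), with fewer RF chains than antennas
n_s, n_t_rf, n_t = 2, 4, 16
n_r_rf, n_r = 4, 8

def complex_normal(shape):
    """Unconstrained complex weights, used here as placeholders."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

# Analog stages: phase shifters only, so every entry has unit modulus
f_rf = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, size=(n_t_rf, n_t)))
w_rf = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, size=(n_r, n_r_rf)))

# Digital stages: arbitrary complex gains
f_bb = complex_normal((n_s, n_t_rf))
w_bb = complex_normal((n_r_rf, n_s))

f = f_bb @ f_rf   # Eq. (1): N_S x N_T
w = w_rf @ w_bb   # Eq. (2): N_R x N_S
print(f.shape, w.shape)
```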

Fig. 3 Precoding in a mmWave MU-mMIMO HB system at the Tx side

Fig. 4 Precoding in a mmWave MU-mMIMO HB system at the Rx side

We have used two HB algorithms, named “quantized sparse HB” and “quantized HB with peak search”. The OMP algorithm is used to compute the hybrid weights from the channel matrix, and the channel model considered is the “MIMO scattering channel”. The outputs are the analog precoding/combining weights, which are steering vectors corresponding to the dominant modes of the channel matrix. The hybrid weights are recomputed iteratively to replicate the channel variations. The data stream that uses the most dominant mode of the MIMO channel achieves the maximum SNR. The HB with peak search algorithm avoids the iterative search for dominant modes in the channel matrix: it computes all the digital weights and identifies their peaks to extract the corresponding analog beamforming weights. This algorithm is well suited to larger mMIMO systems.
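The idea of selecting steering vectors with OMP can be illustrated with a generic sketch. It is not the implementation used in this work: it assumes a half-wavelength uniform linear array, a hypothetical dictionary `a_t` of candidate steering vectors, and a column convention (the precoder has size \(N_T \times N_S\), the transpose of Eq. (1)). The function names are illustrative.

```python
import numpy as np

def ula_steering(n_ant, sin_theta):
    """Unit-norm steering vector of a half-wavelength ULA (assumed geometry)."""
    return np.exp(1j * np.pi * np.arange(n_ant) * sin_theta) / np.sqrt(n_ant)

def omp_hybrid_precoder(f_opt, a_t, n_rf):
    """Approximate f_opt (N_T x N_S) by f_rf @ f_bb using at most n_rf dictionary columns."""
    n_s = f_opt.shape[1]
    selected = []
    residual = f_opt.copy()
    for _ in range(n_rf):
        # Pick the steering vector most correlated with the current residual
        corr = a_t.conj().T @ residual
        selected.append(int(np.argmax(np.sum(np.abs(corr) ** 2, axis=1))))
        f_rf = a_t[:, selected]
        # Least-squares baseband weights for the selected analog beams
        f_bb = np.linalg.lstsq(f_rf, f_opt, rcond=None)[0]
        residual = f_opt - f_rf @ f_bb
        if np.linalg.norm(residual) < 1e-12:
            break
    # Normalize so that the hybrid precoder has Frobenius norm sqrt(N_S)
    f_bb = np.sqrt(n_s) * f_bb / np.linalg.norm(f_rf @ f_bb)
    return f_rf, f_bb

# Illustrative example: the target precoder lies in the span of two dictionary beams
n_t, n_rf = 16, 2
grid = -1.0 + 2.0 * np.arange(n_t) / n_t
a_t = np.stack([ula_steering(n_t, s) for s in grid], axis=1)
mix = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)
f_opt = a_t[:, [3, 7]] @ mix
f_rf, f_bb = omp_hybrid_precoder(f_opt, a_t, n_rf)
print(np.linalg.norm(f_opt - f_rf @ f_bb))
```

In this constructed example the target is exactly representable with two beams, so the approximation error is close to zero; for a measured or simulated channel, the residual would generally be nonzero and would shrink as more RF chains are allowed.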

3.1 Design of MU-mMIMO HB Systems

In MU-mMIMO HB systems, the channel rank is higher, and spatial separation creates a large number of effective channels for the users. The mMIMO channel can be modelled as

$$ {\mathbf{y}} = {\mathbf{Hs}} + {\mathbf{n}} $$
(3)

where \(\mathbf{s}\) is the transmitted vector, \(\mathbf{y}\) is the received vector, \(\mathbf{n}\) is the noise vector, and \(\mathbf{H}\) is the channel matrix given as

$$ \mathbf{H} = \left[ \begin{array}{cccc} h_{11} & h_{21} & \cdots & h_{M1} \\ h_{12} & h_{22} & \cdots & h_{M2} \\ \vdots & \vdots & \ddots & \vdots \\ h_{1M} & h_{2M} & \cdots & h_{MM} \end{array} \right] $$

Each element \(h_{ij}\) of the channel matrix is a complex Gaussian random variable that represents the fading gain between the ith Tx antenna and the jth Rx antenna.
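As a concrete instance of Eq. (3), the sketch below draws one channel realization and forms the received vector. The array size, the QPSK symbols, and the noise variance are assumptions made only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
m = 4             # assumed number of antennas on each side, so H is M x M
noise_var = 0.01  # assumed noise variance per receive antenna

# h[j, i]: fading gain from Tx antenna i to Rx antenna j, i.i.d. CN(0, 1)
h = (rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))) / np.sqrt(2)

# Unit-modulus QPSK symbols as the transmitted vector s
s = np.exp(1j * (np.pi / 4 + (np.pi / 2) * rng.integers(0, 4, size=m)))

# Circularly symmetric complex Gaussian noise vector n
n = np.sqrt(noise_var / 2) * (rng.standard_normal(m) + 1j * rng.standard_normal(m))

y = h @ s + n     # Eq. (3)
print(y.shape)
```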

High-dimensional channels must be estimated to design the precoder of a MU-mMIMO system at mmWave frequency bands. Analog precoders are less complex, but they provide limited performance (only one data stream is supported). Digital precoders, on the other hand, give high performance but at the cost of power consumption and hardware complexity (a larger number of ADCs and RF chains). Hybrid precoding combines analog and digital precoding techniques to minimize the required number of RF chains while maximizing the spatial multiplexing gain in MU-mMIMO HB systems, as shown in Fig. 5. Under known CSI conditions, the diagonalization of H provides the optimal precoding weights by extracting the first NTRF dominant modes. Therefore, when the CSI is known at the BS Tx, MU precoding can be performed so that the data streams of all users are transmitted in the same frequency and time slots, while each user can still recover its data streams with low complexity.
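Extracting the dominant modes of a known channel can be sketched with a singular value decomposition. This is a generic illustration under the convention y = Hx with H of size \(N_R \times N_T\); the function name `dominant_mode_weights` and the matrix sizes are hypothetical.

```python
import numpy as np

def dominant_mode_weights(h, n_modes):
    """Precoder, combiner, and gains of the n_modes strongest singular modes of h."""
    u, sv, vh = np.linalg.svd(h, full_matrices=False)
    precoder = vh.conj().T[:, :n_modes]   # right singular vectors, N_T x n_modes
    combiner = u[:, :n_modes]             # left singular vectors, N_R x n_modes
    return precoder, combiner, sv[:n_modes]

rng = np.random.default_rng(2)
n_r, n_t, n_modes = 8, 16, 2              # assumed sizes for illustration
h = (rng.standard_normal((n_r, n_t)) + 1j * rng.standard_normal((n_r, n_t))) / np.sqrt(2)

f, w, gains = dominant_mode_weights(h, n_modes)

# The combined channel is diagonal: each stream sees its own gain and no cross-talk
effective = w.conj().T @ h @ f
print(np.round(np.abs(effective), 6))
```

Because the singular values are returned in descending order, the first stream is carried by the strongest mode and therefore sees the largest gain.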

Fig. 5 Computation of precoding vectors at the BS in a MU-mMIMO HB system

The observation vector for user ‘i’ (for the DL signal) is

$$ \mathbf{y}_{\text{i}} = \mathbf{H}_{\text{i}} \mathbf{x}_{\text{i}} + \mathbf{H}_{\text{i}} \sum\limits_{\text{a} \ne \text{i}}^{K} \mathbf{x}_{\text{a}} + \mathbf{n}_{\text{i}} $$
(4)

where \(\mathbf{x}_{\text{i}}\) is the signal meant for user ‘i’, \(\mathbf{n}_{\text{i}}\) is the noise vector, and \(\mathbf{H}_{\text{i}}\) is the channel matrix from the BS to user ‘i’. The second term of Eq. (4) gives the signals intended for the remaining users.

To transmit multiple data streams, precoding is applied as \(\mathbf{x}_{\text{i}} = \mathbf{W}_{\text{i}} \mathbf{s}_{\text{i}}\).

Then the DL signal for user ‘i’ becomes

$$ \mathbf{y}_{\text{i}} = \mathbf{H}_{\text{i}} \mathbf{W}_{\text{i}} \mathbf{s}_{\text{i}} + \mathbf{H}_{\text{i}} \sum\limits_{\text{a} \ne \text{i}}^{K} \mathbf{W}_{\text{a}} \mathbf{s}_{\text{a}} + \mathbf{n}_{\text{i}} $$
(5)

where \(\mathbf{W}_{\text{i}}\) is the precoding matrix and \(\mathbf{s}_{\text{i}}\) is the transmitted QAM symbol vector for user ‘i’.

The second term of Eq. (5) represents the precoded signals of the remaining users.

The scalar observation for user ‘i’ is given as

$$ \text{y}_{\text{i}} = \mathbf{h}_{\text{i}}^{\text{T}} \mathbf{W}_{\text{i}} \mathbf{s}_{\text{i}} + \mathbf{h}_{\text{i}}^{\text{T}} \sum\limits_{\text{a} \ne \text{i}}^{K} \mathbf{W}_{\text{a}} \mathbf{s}_{\text{a}} + \text{n}_{\text{i}} $$
(6)

where \(\mathbf{h}_{\text{i}}^{\text{T}}\) is the DL channel for user ‘i’ and \(\text{n}_{\text{i}}\) is the scalar noise.

The first term of Eq. (6) is the desired signal received over the effective channel experienced by user ‘i’, and the second term represents the interference caused by the other users.
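The two terms of Eq. (6) translate directly into a per-user SINR. The sketch below assumes one unit-power symbol per user, so that each precoder is a single column \(\mathbf{w}_{\text{i}}\); the function name `per_user_sinr` and the small numerical example are illustrative only.

```python
import numpy as np

def per_user_sinr(h, w, noise_var):
    """SINR per user; h is K x M (rows h_i^T), w is M x K (columns w_i)."""
    gains = np.abs(h @ w) ** 2                   # gains[i, a] = |h_i^T w_a|^2
    desired = np.diag(gains)                     # first term of Eq. (6)
    interference = gains.sum(axis=1) - desired   # second term of Eq. (6)
    return desired / (interference + noise_var)

# Two users, two BS antennas: user 0 receives leakage from the precoder of user 1
h = np.array([[1.0, 0.0],
              [0.0, 1.0]], dtype=complex)
w = np.array([[1.0, 0.5],
              [0.0, 1.0]], dtype=complex)
sinr = per_user_sinr(h, w, noise_var=0.1)
print(sinr)
```

Here user 0 sees an interference power of 0.25 on top of the noise, while user 1 is limited by noise only.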

Therefore, the MU-mMIMO HB system is represented as,

$$ {\mathbf{y}} = {\mathbf{HWs}} + {\mathbf{n}} $$
(7)
$$ \mathbf{y} = \left[ \begin{array}{c} y_{1} \\ y_{2} \\ \vdots \\ y_{K} \end{array} \right] = \left[ \begin{array}{c} \mathbf{h}_{1}^{T} \\ \mathbf{h}_{2}^{T} \\ \vdots \\ \mathbf{h}_{K}^{T} \end{array} \right] \left[ \mathbf{w}_{1} \; \mathbf{w}_{2} \; \cdots \; \mathbf{w}_{K} \right] \left[ \begin{array}{c} s_{1} \\ s_{2} \\ \vdots \\ s_{K} \end{array} \right] + \left[ \begin{array}{c} n_{1} \\ n_{2} \\ \vdots \\ n_{K} \end{array} \right] $$

where \(\left[ y_{1} \; y_{2} \; \cdots \; y_{K} \right]^{T}\) collects the observations of all the users, the rows \(\mathbf{h}_{1}^{T}, \mathbf{h}_{2}^{T}, \ldots, \mathbf{h}_{K}^{T}\) are the user channels, \(\left[ \mathbf{w}_{1} \; \mathbf{w}_{2} \; \cdots \; \mathbf{w}_{K} \right]\) holds the precoding vectors of all users, and \(\left[ s_{1} \; s_{2} \; \cdots \; s_{K} \right]^{T}\) holds the QAM symbols transmitted to all users.

To achieve optimum performance of the MU-mMIMO HB system, the precoding matrix W in Eq. (7) can be designed at the BS using techniques such as MMSE and ZF.

In the UL, the signals of all users are summed at the Rx to form the observation vector

$$ \mathbf{y} = \sum\limits_{\text{i} = 1}^{K} \mathbf{H}_{\text{i}}^{\text{T}} \mathbf{x}_{\text{i}} + \mathbf{n} $$
(8)

From Eq. (8), the observation vector for the UL is given as

$$ \mathbf{y} = \left[ \mathbf{H}_{1}^{\text{T}} \; \mathbf{H}_{2}^{\text{T}} \; \cdots \; \mathbf{H}_{K}^{\text{T}} \right] \left[ \begin{array}{c} \mathbf{x}_{1} \\ \mathbf{x}_{2} \\ \vdots \\ \mathbf{x}_{K} \end{array} \right] + \mathbf{n} $$

where \(\left[ \mathbf{H}_{1}^{\text{T}} \; \mathbf{H}_{2}^{\text{T}} \; \cdots \; \mathbf{H}_{K}^{\text{T}} \right]\) is the channel matrix of size \(M_{\text{r}} \times (K M_{\text{t}})\), formed by concatenating all the UL channels. Its \(M_{\text{r}}\) rows correspond to the Rx antennas at the BS, and \(M_{\text{t}}\) is the number of Tx antennas per user. The stacked vector \(\left[ \mathbf{x}_{1}^{T} \; \mathbf{x}_{2}^{T} \; \cdots \; \mathbf{x}_{K}^{T} \right]^{T}\) contains the transmitted signals of all users, and \(\mathbf{n}\) is the noise vector.
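The equivalence between the sum in Eq. (8) and the concatenated form can be verified numerically. The sketch below uses assumed sizes and random data, and omits the noise term since it is identical in both forms.

```python
import numpy as np

rng = np.random.default_rng(3)
k, m_r, m_t = 3, 8, 2   # assumed: users, BS Rx antennas, Tx antennas per user

def complex_normal(*shape):
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

h_ul = [complex_normal(m_r, m_t) for _ in range(k)]  # H_i^T, each M_r x M_t
x = [complex_normal(m_t) for _ in range(k)]          # x_i, each of length M_t

# Eq. (8) without noise: sum of the per-user contributions
y_sum = sum(h_i @ x_i for h_i, x_i in zip(h_ul, x))

# Concatenated form: M_r x (K M_t) channel times the stacked signal vector
h_cat = np.hstack(h_ul)
x_cat = np.concatenate(x)
y_cat = h_cat @ x_cat

print(h_cat.shape, np.allclose(y_sum, y_cat))
```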

3.2 Precoding and Combining in MU-mMIMO HB Systems

Consider K users and a BS with M antennas; the DL channel matrix H then has size K \(\times\) M, where each row represents the DL channel of a particular user.

The ZF precoding matrix is

$$ \mathbf{W} = \mathbf{H}^{+} = \mathbf{H}^{\text{H}} \left( \mathbf{H}\mathbf{H}^{\text{H}} \right)^{-1} $$
(9)

where \(\mathbf{H}^{+}\) is the pseudo-inverse of the channel.
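A minimal numerical check of Eq. (9) is sketched below for an assumed random channel with K = 4 users and M = 16 BS antennas. It confirms that the ZF precoder removes the inter-user interference (HW is the identity) and that the closed form matches NumPy's pseudo-inverse when H has full row rank.

```python
import numpy as np

rng = np.random.default_rng(4)
k, m = 4, 16   # assumed numbers of users and BS antennas

# DL channel, K x M: each row is the channel of one user
h = (rng.standard_normal((k, m)) + 1j * rng.standard_normal((k, m))) / np.sqrt(2)

# Eq. (9): right pseudo-inverse of the channel
w = h.conj().T @ np.linalg.inv(h @ h.conj().T)

print(np.allclose(h @ w, np.eye(k)))        # no inter-user interference
print(np.allclose(w, np.linalg.pinv(h)))    # agrees with the Moore-Penrose inverse
```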

Under poor channel conditions, the entries of W in Eq. (9) become large, which leads to higher transmission power. This is controlled by enforcing the constraint \(P_{\text{Total}} = \left\| \mathbf{W} \right\|^{2}\), which gives the total power

$$ P_{\text{Total}} = \text{trace}\left( \mathbf{W}^{\text{H}} \mathbf{W} \right) = \sum\limits_{k = 1}^{K} p_{k} \left[ \left( \mathbf{H}^{+} \right)^{\text{H}} \mathbf{H}^{+} \right]_{k,k} $$
(10)

To satisfy Eq. (10), the precoding matrix is scaled down, with the per-user powers \(p_{k}\) chosen to ensure equal SNR at each user.
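One way to read Eq. (10) is that each column of the ZF precoder is scaled by \(\sqrt{p_{k}}\). The sketch below follows that reading, which is an interpretation rather than something stated explicitly in the text, and picks equal \(p_{k}\) so that the total power meets an assumed budget `p_total`.

```python
import numpy as np

rng = np.random.default_rng(5)
k, m = 4, 16        # assumed numbers of users and BS antennas
p_total = 1.0       # assumed total transmit power budget

h = (rng.standard_normal((k, m)) + 1j * rng.standard_normal((k, m))) / np.sqrt(2)
h_pinv = np.linalg.pinv(h)                                 # M x K, Eq. (9)
gram_diag = np.real(np.diag(h_pinv.conj().T @ h_pinv))     # [(H^+)^H H^+]_{k,k}

# Equal per-user power, scaled so that Eq. (10) sums to the budget
p = np.full(k, p_total / gram_diag.sum())

w = h_pinv @ np.diag(np.sqrt(p))                  # scaled ZF precoder
power_trace = np.real(np.trace(w.conj().T @ w))   # left-hand side of Eq. (10)
power_sum = np.sum(p * gram_diag)                 # right-hand side of Eq. (10)

print(power_trace, power_sum)
```

With this scaling, HW equals \(\sqrt{p_{k}}\) times the identity, so every user receives its symbol with the same amplitude and, for equal noise levels, the same SNR.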

In case 1, where the BS has a large number of Tx antennas, the channel matrix H has almost orthogonal rows; in case 2, where the UE has a large number of Rx antennas, H has almost orthogonal columns. The capacity of such an mMIMO channel is given as

$$ C = \begin{cases} B M_{\text{r}} \log_{2}\left( 1 + \rho \right), & \text{for case 1} \\ B M_{\text{t}} \log_{2}\left( 1 + \rho M_{\text{r}} / M_{\text{t}} \right), & \text{for case 2} \end{cases} $$
(11)

where \(B\) is the bandwidth and \(\rho\) is the SNR.
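Eq. (11) can be evaluated directly, as in the following sketch. The function names and the numerical inputs are illustrative, and the SNR is taken as a linear ratio rather than in dB.

```python
import math

def capacity_case1(bandwidth_hz, m_r, snr_linear):
    """Eq. (11), case 1: many Tx antennas at the BS."""
    return bandwidth_hz * m_r * math.log2(1.0 + snr_linear)

def capacity_case2(bandwidth_hz, m_t, m_r, snr_linear):
    """Eq. (11), case 2: many Rx antennas at the UE."""
    return bandwidth_hz * m_t * math.log2(1.0 + snr_linear * m_r / m_t)

# Illustrative values: unit bandwidth and an SNR of 0 dB (linear value 1)
print(capacity_case1(1.0, 4, 1.0))
print(capacity_case2(1.0, 2, 8, 1.0))
```

In case 1 the capacity grows linearly with the number of Rx antennas, whereas in case 2 the additional Rx antennas appear inside the logarithm as an array gain.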

In MU-mMIMO HB DL systems, the channels are nearly orthogonal and reciprocal (with TDD), and the corresponding channel matrices can be decomposed as shown in Eqs. (12) and (13).

For the UL channel, the channel matrix is

$$ \mathbf{H} = \mathbf{G}_{M \times K} \, \mathbf{D}^{1/2}_{K \times K} $$
(12)

For the DL channel,

$$ \mathbf{H}^{\text{T}} = \mathbf{D}^{1/2}_{K \times K} \, \mathbf{G}^{\text{T}}_{K \times M} $$
(13)

Here \(\mathbf{G}^{\text{T}} \mathbf{G}^{*} \approx M \mathbf{I}_{K}\), which expresses that the channels are nearly orthogonal, with \(\mathbf{h}_{\text{i}} = d_{\text{i}}^{1/2} \mathbf{g}_{\text{i}}\), where D is the diagonal matrix that represents the path loss per UE, G is the multi-path fading matrix, and \(\mathbf{h}_{\text{i}}\) is the ith column vector of the H matrix.
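The near-orthogonality condition can be observed numerically when the fading coefficients are i.i.d. complex Gaussian, which is an assumption of this sketch. The number of BS antennas, the number of users, and the path-loss values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(6)
m, k = 4096, 4                         # assumed: many BS antennas, few users
d = np.array([1.0, 0.5, 0.25, 0.1])    # assumed path-loss values per UE

# Multi-path fading matrix G (M x K) with i.i.d. CN(0, 1) entries
g = (rng.standard_normal((m, k)) + 1j * rng.standard_normal((m, k))) / np.sqrt(2)

# UL channel of Eq. (12): column i is sqrt(d_i) * g_i
h = g * np.sqrt(d)

# G^T G* / M should be close to the identity, and H^T H* / M close to D
gram_g = g.T @ g.conj() / m
gram_h = h.T @ h.conj() / m
dev_g = np.max(np.abs(gram_g - np.eye(k)))
dev_h = np.max(np.abs(gram_h - np.diag(d)))
print(dev_g, dev_h)
```

The deviations shrink roughly as \(1/\sqrt{M}\), so they become negligible only when the BS array is large compared with the number of users.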

The DL is more complex, as additional precoding is required at the BS to avoid inter-user interference; the observation model is given in Eq. (14) as

$$ \mathbf{y} = \mathbf{H}^{\text{T}} \mathbf{W} \mathbf{s} + \mathbf{n} $$
(14)

where \(\mathbf{H}^{\text{T}}\) is the DL channel matrix, W is the optimal precoding matrix, n is the noise vector at each user, and s is the data for each user.

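An optimal precoder is the pseudo-inverse of the channel matrix. Because the channels are nearly orthogonal, this pseudo-inverse is approximately the complex conjugate of the channel, up to a per-user scaling. Therefore, we select the precoder in the form

$$ \mathbf{W} = \frac{\mathbf{H}^{*} \sqrt{\mathbf{D}_{\text{p}}}}{\sqrt{M}} $$
(15)

where \(\mathbf{H}^{*}\) is the complex conjugate of the channel, \(\sqrt{M}\) is a scaling factor, and \(\sqrt{\mathbf{D}_{\text{p}}}\) is the diagonal matrix that holds the square roots of the powers allocated to each user.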

To meet the total power constraint, the power allocation \(\mathbf{D}_{p}\) in Eq. (15) must ensure \(\left\| \mathbf{W} \right\|^{2} = \text{trace}\left(\mathbf{W}^{H}\mathbf{W}\right) = P_{\text{Total}}\).

We find that \(\mathbf{W}^{H}\mathbf{W} = \sqrt{\mathbf{D}_{p}}\,\mathbf{H}^{T}\mathbf{H}^{*}\sqrt{\mathbf{D}_{p}}/M \approx \sqrt{\mathbf{D}_{p}}\,\mathbf{D}\,\sqrt{\mathbf{D}_{p}} = \mathbf{D}_{p}\mathbf{D}\),

where \(\mathbf{D}_{p}\) → diagonal matrix of powers and \(\mathbf{D}\) → diagonal matrix of path loss values.

The observation vector,

$$ \mathbf{y} = \sqrt{M}\,\sqrt{\mathbf{D}}\,\sqrt{\mathbf{D}_{p}}\,\mathbf{s} + \mathbf{n} $$
(16)

From Eq. (16), the observation at each user is \(y_{k} = \sqrt{M d_{k}}\,\sqrt{\left[\mathbf{D}_{p}\right]_{k,k}}\,s_{k} + n_{k}\), and the corresponding SNR is

$$ \text{SNR}_{k} = \frac{M d_{k}\left[\mathbf{D}_{p}\right]_{k,k} E_{s,k}}{N_{0}} $$
(17)

The DL channel for user i is \(\mathbf{h}_{i}^{T} = \sqrt{d_{i}}\,\mathbf{g}_{i}^{T}\), where \(\mathbf{h}_{i}^{T}\) → row vector of length M.

From Eqs. (16) and (17), the matched filter acts as an asymptotically optimal linear precoder for the MU-mMIMO HB DL channel. The conclusion is that MU-interference can be eliminated at all UEs with simple linear processing at the BS.
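
The interference suppression obtained with the precoder of Eq. (15) can be illustrated with a small Monte Carlo sketch. It models only the conjugate (matched-filter) precoder on an i.i.d. Rayleigh channel, not the analog/digital split of a hybrid architecture; the path loss values, the power allocation, and the function names are assumptions made for this example.

```python
import math
import random


def dl_effective_channel(m, path_loss, powers, rng):
    """Return the K x K matrix H^T W for W = H^* sqrt(D_p) / sqrt(M)."""
    k = len(path_loss)
    s = 0.5 ** 0.5
    # Column i of H is h_i = sqrt(d_i) * g_i with g_i ~ CN(0, I_M).
    h = [[(path_loss[i] ** 0.5) * complex(rng.gauss(0, s), rng.gauss(0, s))
          for i in range(k)] for _ in range(m)]
    # Matched-filter precoder of Eq. (15).
    w = [[h[r][j].conjugate() * (powers[j] ** 0.5) / (m ** 0.5)
          for j in range(k)] for r in range(m)]
    return [[sum(h[r][i] * w[r][j] for r in range(m)) for j in range(k)]
            for i in range(k)]


def signal_to_interference(eff):
    """Per-user ratio of desired power to summed inter-user power."""
    k = len(eff)
    ratios = []
    for i in range(k):
        desired = abs(eff[i][i]) ** 2
        leak = sum(abs(eff[i][j]) ** 2 for j in range(k) if j != i)
        ratios.append(desired / leak)
    return ratios


# Hypothetical path losses and equal powers for four users.
d = [1.0, 0.5, 0.25, 0.5]
p = [1.0, 1.0, 1.0, 1.0]
rng = random.Random(0)
for m in (64, 128, 256):
    sir = signal_to_interference(dl_effective_channel(m, d, p, rng))
    print(m, [round(10 * math.log10(x), 1) for x in sir])
```

The printed values are signal-to-interference ratios in dB for one random channel draw per array size; on average they grow with M, which is the sense in which inter-user interference vanishes asymptotically.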

For a MU-mMIMO HB UL system, the observation model is given in Eq. (18) as,

$$ \mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{n} = \mathbf{G}\mathbf{D}^{0.5}\mathbf{x} + \mathbf{n} $$
(18)

Applying the matched-filter combiner \(\mathbf{H}^{H}\) to the received vector gives,

$$ \mathbf{z} = \mathbf{H}^{H}\mathbf{y} = \mathbf{H}^{H}\mathbf{H}\mathbf{x} + \mathbf{H}^{H}\mathbf{n} \approx M\mathbf{D}\mathbf{x} + \mathbf{w} $$
(19)

In Eq. (19),

$$ z_{k} = M d_{k} x_{k} + w_{k}, \quad \mathbf{w} \sim CN\left(\mathbf{0}, N_{0} M \mathbf{D}\right), \quad \text{and} \quad w_{k} \sim CN\left(0, N_{0} M d_{k}\right) $$

The SNR of the kth user is \(\text{SNR}_{k} = M d_{k} E_{x}/N_{0}\), where \(N_{0} M d_{k}\) → noise variance after combining and \(N_{0}\) → noise power.

Data rate for kth user,

$$ R_{k} = B\log_{2}\left(1 + \frac{M d_{k} E_{x}}{N_{0}}\right) $$
(20)

Therefore, total sum-rate in a MU-mMIMO HB UL system is given as,

$$ R_{\text{sum}} = B\sum_{k=1}^{K}\log_{2}\left(1 + \frac{M d_{k} E_{x}}{N_{0}}\right) $$
(21)

The per-user rates in Eq. (20) are summed to obtain the total system capacity in Eq. (21). The conclusion is that simple linear combining at the BS, with an asymptotically increasing number of antennas, leads to optimal results even in the presence of MU-interference.
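
Equations (20) and (21) translate directly into a few lines of code. The sketch below is illustrative only: the function names and the numeric values for bandwidth, path loss, symbol energy, and noise power are assumptions, not parameters of the simulations reported in this paper.

```python
import math


def user_rate(bandwidth_hz, m, d_k, e_x, n0):
    """Eq. (20): rate of user k in bit/s after matched-filter combining."""
    return bandwidth_hz * math.log2(1.0 + m * d_k * e_x / n0)


def sum_rate(bandwidth_hz, m, path_loss, e_x, n0):
    """Eq. (21): sum of the per-user rates over all K users."""
    return sum(user_rate(bandwidth_hz, m, d, e_x, n0) for d in path_loss)


# Hypothetical link parameters for four users.
B = 100e6                       # bandwidth in Hz
d = [1e-3, 5e-4, 2.5e-4, 5e-4]  # path loss per user (linear scale)
E_x, N_0 = 1.0, 1e-2            # symbol energy and noise power
for m in (64, 128, 256):
    print(f"M = {m:3d}: R_sum = {sum_rate(B, m, d, E_x, N_0) / 1e9:.2f} Gbit/s")
```

Because M appears inside the logarithm, doubling the number of BS antennas adds at most about one bit/s/Hz per user at high SNR rather than doubling the rate.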

In MU-mMIMO HB systems, the orthogonality of users under LOS propagation depends on the antenna array geometry as well as on far-field conditions. If the antenna elements are spaced at d = 0.5λ and the incoming signal arrives at an angle θ, the path difference between adjacent elements is δ = d sin(θ). The passband and baseband received signals are then given in Eqs. (22) and (23), respectively.

$$ r_{0}(t) = \text{Re}\left\{ g\,s(t)\exp\left(j2\pi f_{c}t\right) \right\} \quad \text{and} \quad r_{1}(t) = \text{Re}\left\{ g\,s(t)\exp\left(j2\pi f_{c}t\right)\exp\left(-j2\pi f_{c}\delta/c\right) \right\} $$
(22)
$$ y_{m}(t) = g\,s(t)\exp\left(-j2\pi\delta m/\lambda\right), \quad m = 0, \ldots, M-1 $$
(23)

The condition for LOS orthogonality is given in Eq. (24).

$$ \left| \mathbf{h}_{k}^{H}\mathbf{h}_{l} \right|^{2} = \left| g_{k}^{*} g_{l} \right|^{2} \left| \sum_{m=0}^{M-1} \exp\left(j\pi m\left(\sin\theta_{l} - \sin\theta_{k}\right)\right) \right|^{2} \approx 0, \quad \Delta_{lk} > 2/M $$
(24)
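
The LOS condition of Eq. (24) can be explored with the array response of Eq. (23). The sketch below assumes a uniform linear array with half-wavelength spacing, sets \(g\,s(t) = 1\), and interprets \(\Delta_{lk}\) as \(|\sin\theta_{l} - \sin\theta_{k}|\); that interpretation and the function names are assumptions of this example.

```python
import cmath
import math


def ula_response(m, theta_rad):
    """Baseband response of Eq. (23) with g*s(t) = 1 and d = lambda/2."""
    return [cmath.exp(-1j * math.pi * i * math.sin(theta_rad)) for i in range(m)]


def normalized_correlation(m, theta_k, theta_l):
    """|h_k^H h_l|^2 / M^2, which equals 1 for identical angles."""
    a_k = ula_response(m, theta_k)
    a_l = ula_response(m, theta_l)
    inner = sum(x.conjugate() * y for x, y in zip(a_k, a_l))
    return abs(inner) ** 2 / m ** 2


# Two users separated by 2 degrees, seen by arrays of increasing size.
theta_k, theta_l = math.radians(10.0), math.radians(12.0)
for m in (64, 128, 256):
    delta = abs(math.sin(theta_l) - math.sin(theta_k))
    rho = normalized_correlation(m, theta_k, theta_l)
    print(f"M = {m:3d}: delta = {delta:.4f}, 2/M = {2 / m:.4f}, correlation = {rho:.5f}")
```

Normalizing by \(M^{2}\) makes the result comparable across array sizes: a value near 1 means the two users are indistinguishable, and a value near 0 means their LOS channels are close to orthogonal.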

Favorable propagation under NLOS conditions depends on the columns of the UL channel matrix being nearly orthogonal, \(\mathbf{H}^{H}\mathbf{H} \approx M\mathbf{I}_{K}\), on the scaling of the off-diagonal elements, and on Rayleigh fading entries distributed as CN(0, 1).

4 Result Analysis

In this paper, spatial and static flat-fading MIMO channels were used for the simulations, with the simulation parameters shown in Table 1. A “single-bounce ray tracing” scattering model is applied with random placement of scatterers. A common channel is used to model path loss under LOS and non-LOS conditions, the latter being closer to real scenarios. The same channel is used for data transmission and sounding. Simulations were conducted at 28 GHz, 39 GHz, and 66 GHz carrier frequencies for different MU-mMIMO configurations with isotropic antenna arrays having linear and rectangular geometry. To achieve maximum SE, each RF chain transmits a data stream at every UE and each UE is assigned independent channels. The path gains of the scattered paths are drawn from a circularly symmetric complex Gaussian distribution. The Rx of each UE is modeled to compensate for thermal noise and path loss. Four users are considered in the MU-mMIMO HB system, with 3, 2, 1, and 2 independent data streams per UE and 12, 8, 4, and 8 antennas at each UE, respectively. RMS EVM values (quantitative analysis) and receive constellations (qualitative analysis) are observed for different modulation schemes.
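
For readers unfamiliar with the metric, the sketch below shows one common way to compute an RMS EVM percentage from equalized symbols and their ideal constellation points. Normalizing by the average reference power is an assumption of this example; the text does not state which normalization the simulations use, and the function name is hypothetical.

```python
import math


def rms_evm_percent(received, reference):
    """RMS EVM in percent, normalized by the average reference power."""
    if len(received) != len(reference) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    err = sum(abs(r - s) ** 2 for r, s in zip(received, reference))
    ref = sum(abs(s) ** 2 for s in reference)
    return 100.0 * math.sqrt(err / ref)


# Toy example: four QPSK reference points with a small constant offset.
ideal = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
noisy = [s + 0.01 for s in ideal]
print(f"RMS EVM = {rms_evm_percent(noisy, ideal):.4f}%")
```

A lower value means the equalized symbols sit closer to the ideal constellation points, which is how the per-user EVM figures reported in this section should be read.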

Table 1 MU-mMIMO HB System Simulation Parameters

Figure 6 shows the EVM values at 28 GHz for various modulation schemes with 256 BS antennas. With 64 BS antennas, the lowest EVM, 0.37929%, is achieved for User 1 and the highest EVM, 2.1165%, is noted for User 3. As User 1 has three independent data streams and User 3 has only one, this suggests that EVM decreases with a higher number of independent data streams. With 128 BS antennas, moderate EVM values are achieved for Users 1 and 2 and, interestingly, the EVM values decrease for Users 3 and 4. If the number of BS antennas is increased to 256, there is a slight increase in EVM for all users. User 4 has the best performance of all users, as it has 8 UE antennas with two independent data streams. User 3 has the highest EVM, as it has only one data stream with 4 UE receiving antennas. For 64 BS antennas, EVM is minimum for the highest-order modulation scheme, 256-QAM. For 128 BS antennas, EVM increases very slightly as the modulation order increases from 4 to 8.

Fig. 6 RMS EVM values at 28 GHz carrier frequency with 256 base station antennas

The overall results show that the 128 × 8 mMIMO configuration gives the best performance at the 28 GHz carrier frequency. Figures 7, 8, and 9 show the radiation pattern of the signal in a scattering channel with 64, 128, and 256 BS antennas, respectively. They clearly show that increasing the number of antennas in the MU-mMIMO HB system leads to more focused radiation beams. This improves the SNR at the Rx and provides high reliability for data transmission. Figure 10 shows the equalized symbol constellation per stream for 256-QAM modulation with 256 BS antennas; for User 3, the received symbols deviate more from the constellation points, which leads to detection errors.

Fig. 7 Radiation pattern of the signal in a scattering channel with 64 BS antennas and 256-QAM modulation

Fig. 8 Radiation pattern of the signal with 128 base station antennas and 256-QAM modulation

Fig. 9 Radiation pattern of the signal with 256 base station antennas and 256-QAM modulation

Fig. 10 Equalized symbol constellation/stream of 256-QAM modulation with 256 base station antennas

Figures 11, 12, and 13 show the RMS EVM values at the 39 GHz carrier frequency for different numbers of BS antennas and various modulation schemes. In the simulated results at 39 GHz, the lowest EVM value, 0.55795%, is achieved by User 4 with 256 BS antennas using 64-QAM. The maximum EVM value, 2.2252%, is noted with 64 BS antennas using 64-QAM. For User 1, minimum EVM values are achieved with 64 BS antennas for all modulation schemes, and moderate EVM values are noted with 128 BS antennas; errors increase slightly when the number of BS antennas is increased to 256. For User 2, minimum EVM values are achieved with 256 BS antennas. For User 3, which has only one independent data stream, EVM values are minimum with 128 and 256 BS antennas. For User 4, EVM clearly decreases as the number of antennas increases for all modulation schemes.

Fig. 11 RMS EVM values at 39 GHz carrier frequency with 256 base station antennas for various modulation schemes

Fig. 12 RMS EVM values at 39 GHz carrier frequency with 64 base station antennas for various modulation schemes

Fig. 13 RMS EVM values at 39 GHz carrier frequency with 256-QAM for different numbers of base station antennas

The overall results at the 39 GHz carrier frequency indicate that 256 BS antennas and higher-order modulation schemes are preferred to obtain minimum EVM values for data transmission. Figures 14, 15, and 16 show the radiation pattern of a signal transmitted with 64, 128, and 256 BS antennas, respectively; the radiation beams become sharper and more focused as the number of BS antennas increases. This helps to improve the SNR and minimize errors at the Rx. Figure 17 shows the 16-QAM equalized symbol constellation per stream of a signal received at the Rx with 64 BS antennas. For User 3, the received constellation points are closer together, which leads to more errors during signal detection at the Rx.

Fig. 14 Radiation pattern in a scattering channel with 64 BS antennas and 256-QAM modulation

Fig. 15 Radiation pattern in a scattering channel with 128 BS antennas and 256-QAM modulation

Fig. 16 Radiation pattern in a scattering channel with 256 BS antennas and 256-QAM modulation

Fig. 17 Equalized symbol constellation/stream of a scattering channel 16-QAM modulation with 64 BS antennas

Figures 18, 19, and 20 show the RMS EVM values at the 66 GHz carrier frequency for different numbers of BS antennas and various modulation schemes. At 66 GHz, the minimum EVM is 0.4631% with 128 BS antennas, and the maximum EVM, 3.7507%, is noted with 64 BS antennas. For Users 1 and 2 with a given modulation scheme, EVM increases as the number of BS antennas increases; EVM also increases very slightly as the modulation order increases. For Users 3 and 4, the best performance is achieved with 128 BS antennas rather than with 64 or 256. For a given number of BS antennas, EVM decreases with increasing modulation order. Figures 21, 22, and 23 show the radiation patterns of a signal with 64, 128, and 256 BS antennas, respectively. The patterns show that the beams become sharper as the number of BS antennas increases, which helps to improve the SNR at the Rx and reduce error rates. Figure 24 shows the equalized symbol constellation per stream of 64-QAM modulation for a scattering channel with 64 BS antennas; it indicates that the symbol points deviate least when the user has a larger number of independent data streams.

Fig. 18 RMS EVM values at 66 GHz with 256 BS antennas and various modulation schemes

Fig. 19 RMS EVM values at 66 GHz with 64 BS antennas and various modulation schemes

Fig. 20 RMS EVM values at 66 GHz with 256-QAM modulation scheme and multiple BS antennas

Fig. 21 Radiation pattern of a signal with 64 BS antennas and 256-QAM modulation

Fig. 22 Radiation pattern of a signal with 128 BS antennas and 256-QAM modulation

Fig. 23 Radiation pattern of a signal with 256 BS antennas and 256-QAM modulation

Fig. 24 Equalized symbol constellation/stream of a scattering channel 64-QAM modulation with 64 BS antennas

From the overall results at different carrier frequencies shown in Figs. 25, 26, and 27, the following observations are made. For a given number of BS antennas, EVM increases slightly with the modulation order. For a given modulation scheme, EVM decreases as the number of BS antennas increases. For a given number of BS antennas, EVM increases as the carrier frequency increases. User 4 has the best performance: it has the lowest EVM, which decreases further at 66 GHz. User 3 has a higher EVM than all other users, as it has only four receiving antennas with one data stream at the UE.

Fig. 25 RMS EVM values for 256 base station antennas and 256-QAM modulation for different carrier frequencies

Fig. 26 RMS EVM values for 256 base station antennas and 64-QAM modulation for different carrier frequencies

Fig. 27 RMS EVM values for 64 base station antennas and 256-QAM modulation for different carrier frequencies

5 Conclusions and Future Scope

The future demands of next-generation wireless networks can be addressed using mMIMO systems and mmWave technology. This paper has focused on hybrid beamforming design in mMIMO systems at mmWave frequency bands, aiming to achieve data rates that approach those of a fully digital system with reduced hardware complexity. The performance of the MU-mMIMO HB system is evaluated at 28 GHz, 39 GHz, and 66 GHz using “MIMO” and “Scattering” channels to mimic channel behavior in real scenarios. The system parameters considered are the number of Tx/Rx antennas, the number of data streams per user, and the modulation scheme (modulation orders from 2 to 8). RMS EVM values and symbol constellations are measured for various combinations of BS and UE antennas, carrier frequencies, and modulation schemes to identify the best fit. The results show that, with optimal selection of the precoding and combining weights, HB performs better than fully analog or fully digital beamforming, and that HB with peak search is well suited for simulating a 128 × 8 MU-mMIMO system. The overall results show that using the HB technique in MU-mMIMO wireless communication systems improves capacity by transmitting multiple independent streams simultaneously. HB designs that minimize the EVM values, and fine tuning of the phase shifter resolution, are considered as future research directions.