Channel estimation using hybrid optimizer based recurrent neural network long short term memory for MIMO communications in 5G network

In fifth-generation (5G) networks, multiple-input multiple-output (MIMO) systems are further developed to enhance transmission reliability. However, channel estimation remains one of the major challenges for improved data transmission in MIMO. Although efficient estimation techniques have recently been proposed, estimation accuracy needs to be improved further. Hence, an optimized Recurrent Neural Network-Long Short-Term Memory (RNN-LSTM) network is presented in this paper for channel estimation. First, the history of channel responses of the pilot block is collected or estimated using the Least Square (LS) channel estimation method. Using these collected channel responses, the proposed RNN-LSTM is trained, with its weight parameters chosen optimally by a hybrid Particle Swarm Optimization (PSO)-Adam optimizer. Using the trained PSO-Adam optimizer based RNN-LSTM, the current channel response is predicted or estimated. The performance of the proposed channel estimation scheme is analysed by varying the pilot sequence length and the number of antennas and evaluating the Bit Error Rate (BER) and Mean Square Error (MSE). The complexity of the proposed scheme is compared with that of standard estimators such as LS and Minimum Mean-Square Error (MMSE). The main objective of this work is efficient channel estimation using a deep learning technique. Hybrid algorithms are used to optimize the performance of the deep learning technique. The proposed technique yields better MSE and BER.


Introduction
To meet the large data traffic demand from Internet-of-Things (IoT) and wireless video streaming services, 5G wireless communication has gained momentum. The use of millimeter wave (mmWave) in MIMO is one of the major technologies for 5G wireless communication frameworks [1][2][3]. Millimeter waves provide a large bandwidth that can translate into better data transfer rates, with speeds of about 10 gigabits per second.

Problem statement and contribution
Recently, deep learning (DL) has been broadly researched for communications and signal processing problems such as estimation and decoding [9,11]. A deep neural network (DNN) is a universal function approximator with superior learning capacity and convenient optimization ability, and can consequently be utilized without any precise mathematical model. However, existing schemes are not adequate to manage the precise and timely acquisition of massive CSI. So, to enhance the accuracy of channel estimation, the following contributions are presented in this paper.
• Initially, the channel responses of each pilot block are estimated using the LS channel estimator.
• These estimated previous channel responses are given as input to the RNN-LSTM to predict or estimate the current channel response. The RNN-LSTM is trained using the estimated channel responses and the pilot block.
• To optimize the LSTM performance, a hybrid PSO-Adam algorithm is presented. The weight parameters of the LSTM are chosen using this hybrid algorithm.
• Finally, the trained structure of the optimized RNN-LSTM is tested using the previous channel responses of the pilot block. At the output, the CSI is predicted by minimizing the MSE or loss function.
The remaining sections are organized as follows. Section 2 surveys the literature on channel estimation in MIMO. Section 3 proposes the hybrid PSO-Adam optimizer based RNN-LSTM for channel estimation in MIMO. Results of the proposed channel estimation method are described in Sect. 4. The conclusion of the work is discussed in Sect. 5.

Related works
Mehrtash Mehrabi et al. [9] used deep neural networks to design a decision-directed channel estimation scheme for MIMO space-time block coded (STBC) systems. With this scheme, the authors removed the need for Doppler rate estimation in fast-changing quasi-stationary channels. They trained two deep neural networks, which learned the real and imaginary parts of the MIMO fading channels. For STBC transmission, they used a maximum likelihood decoding algorithm. With the proposed scheme, the authors attained lower error propagation.
Haythem Bany Salameh et al. [10] aimed to decrease the pilot overhead in massive MIMO systems. They presented a new channel estimation scheme based on compressive sensing, which combined the compressive sampling matching pursuit and sparsity adaptive matching pursuit methods. With respect to spatial correlations the signals
The authors of [11] presented an effective online channel state information prediction method, abbreviated as OCEAN. Initially, they identified a few significant features influencing the CSI of a radio link and used data samples containing the CSI information together with these features. Then, they developed a learning system that combined a convolutional neural network and an LSTM. Besides, to enhance the prediction results, they designed an offline-online two-step training mechanism. With the proposed scheme, the authors obtained good prediction accuracy and average difference ratio.
Yang Liu et al. [12] proposed a reduced-dimension decomposition-based channel estimation method for millimetre wave massive MIMO systems. To enhance the estimation accuracy, the authors decomposed the estimation of the channel matrix into the estimation of angle and channel gain information. The received signals were decomposed using dimensionality reduction. An initial set of sparse supports was obtained using a sparse signal recovery method. The authors treated the off-grid error as an adjustment parameter and approximated the true discrete grid using Taylor's formula. Finally, the path gain was calculated using the LS estimation algorithm. With the proposed scheme, the authors obtained better normalized MSE and achievable spectral efficiency.
Imran Khan [13] aimed to decrease the channel estimation complexity in massive MIMO and presented a sparse channel estimation method based on the inherent sparsity of the channel. Using the discrete Fourier transform, the proposed scheme separated the channel taps from the noise space. The scheme therefore only calculated the channel tap part, which reduced the computational complexity. The results of the article showed that the proposed scheme reduced the minimum mean square error. Besides, it achieved better inter-cell interference and bit error rate.
To solve the problem of pilot contamination, Pasangi et al. [14] presented a Time Division Duplexing method. In this approach, downlink signals were pre-coded using the uplink estimated channel, and downlink channel estimation was converted into channel gain estimation. Besides, to solve the problem of low spectral efficiency, the authors proposed blind downlink channel gain estimation for massive MIMO systems based on Time Division Duplexing. In this approach, the channel gain was estimated by computing the average energy of the received signals. With the proposed scheme, they obtained better normalized MSE.
Jeya and Amutha [15] presented an enhanced Semi-Blind Sparse algorithm to address the complexities of channel estimation in MIMO-OFDM. Initially, the input signal was modulated using Quadrature Phase Shift Keying (QPSK). Then, inter-symbol interference was reduced using the Pulse Shaping Algorithm. Symbol mapping at every transmitter was done by performing the Inverse Fast Fourier Transform. At the receiver, channel estimation was performed using their proposed algorithm. Also, the cost function was decreased using Enhanced Differential Evolution. With the proposed scheme, the authors showed better symbol error rate and peak signal-to-noise ratio (PSNR).
The review of existing works motivates us to present a deep learning model for channel estimation in this work. Although the existing works give better results, they are not adequate to manage the precise and timely acquisition of massive CSI. Thus, an optimized deep learning model is presented in this work.

Overview
In a MIMO communication system, prediction or estimation of the CSI between transmitter and receiver is the main challenge. To enhance the transmission rate in a 5G network, the channel estimation problem is solved by predicting the current CSI or channel response ($H_t$) of the pilot block from the previous channel responses ($H_{t-1}$) of the pilot blocks. For this purpose, an optimized RNN-LSTM is presented. Initially, the channel responses of the pilot sequences are estimated using the LS estimator. These estimated channel responses are given as input to the proposed RNN-LSTM structure, where the LSTM weight parameters are chosen optimally using the hybrid PSO-Adam optimizer. Based on the knowledge of P and Y, the channel estimator can find the channel matrix H.

Channel model
In this work, a Rayleigh fading channel is modelled between the transmitter and receiver. Generally, this fading channel arises when there are multiple signal paths between the transmitters and receivers. Because of the RNN-LSTM, the proposed channel estimation scheme can be realized with only a priori knowledge of the Doppler rate range and not the exact value. Hence it does not require Doppler rate estimation and provides more reliable packet transmission in highly dynamic environments.
The time-varying fading channel between the t-th transmitter and r-th receiver at time index k is denoted $h_{tr}(k)$ (Eq. 1). The probability density function (PDF) of the received signal transmitted via the Rayleigh fading channel is defined as follows,
$p(r_0) = \frac{r_0}{\sigma^2} \exp\left(-\frac{r_0^2}{2\sigma^2}\right), \quad r_0 \ge 0$
where $r_0$ denotes the received signal's amplitude, $\sigma^2$ denotes the variance and $2\sigma^2$ denotes the average multipath signal power.
The PDF of the instantaneous SNR is defined as follows,
$p(\gamma_b) = \frac{1}{\bar{\gamma}_b} \exp\left(-\frac{\gamma_b}{\bar{\gamma}_b}\right), \quad \gamma_b \ge 0$
where $\gamma_b$ denotes the SNR of the received signal. $\bar{\gamma}_b$ denotes the average SNR per bit and can be defined as follows,
$\bar{\gamma}_b = \frac{E_b}{Z_0} E[\alpha^2]$
where $E_b/Z_0$ denotes the SNR of the received signal and $E[\alpha^2]$ denotes the average of $\alpha^2$ for the Rayleigh-distributed amplitude $\alpha$.
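As an illustration of the Rayleigh model, the following minimal sketch draws complex Gaussian channel taps and checks that the envelope has average power close to $2\sigma^2$. The helper name `rayleigh_channel`, the seed and the sample size are illustrative assumptions and are not taken from the simulation setup of this work; time correlation due to Doppler is ignored here.

```python
import numpy as np

def rayleigh_channel(n_rx, n_tx, n_samples, sigma=1.0, seed=0):
    """Draw i.i.d. Rayleigh-fading taps (hypothetical helper, no Doppler).

    Real and imaginary parts are zero-mean Gaussian with variance sigma^2,
    so the envelope |h| is Rayleigh distributed with E[|h|^2] = 2*sigma^2.
    """
    rng = np.random.default_rng(seed)
    shape = (n_samples, n_rx, n_tx)
    real = rng.standard_normal(shape)
    imag = rng.standard_normal(shape)
    return sigma * (real + 1j * imag)

h = rayleigh_channel(n_rx=2, n_tx=2, n_samples=50_000, sigma=1.0)
avg_power = np.mean(np.abs(h) ** 2)   # expected to be close to 2*sigma^2
mean_envelope = np.mean(np.abs(h))    # Rayleigh mean is sigma*sqrt(pi/2)
print(avg_power, mean_envelope)
```

The empirical average power approaches $2\sigma^2$ as the number of samples grows, which matches the interpretation of $2\sigma^2$ as the average multipath signal power.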

Channel response of pilot sequence
In this work, the history of channel responses of the pilot sequences is used to predict the current channel response of the pilot block. To estimate the channel responses of the previous pilot sequences, the LS channel estimator is used. The resulting LS estimate is denoted $H_{k-1}$, the channel response of the pilot sequences at time period k-1. These estimated channel responses are given as input to the proposed RNN-LSTM model for predicting the current channel response ($H_k$) of the pilot sequence. The following section describes the channel estimation using the optimized RNN-LSTM.
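The LS step can be sketched as follows, assuming the common pilot model $Y = HP + N$ with pilot matrix P and received matrix Y, for which the LS estimate is $Y P^H (P P^H)^{-1}$. The model, the QPSK pilots, the noise level and the function name `ls_channel_estimate` are assumptions made for illustration only.

```python
import numpy as np

def ls_channel_estimate(Y, P):
    """LS estimate Y P^H (P P^H)^{-1} for the assumed model Y = H P + N."""
    PH = P.conj().T
    # Solve X (P P^H) = Y P^H for X without forming an explicit inverse.
    return np.linalg.solve((P @ PH).T, (Y @ PH).T).T

rng = np.random.default_rng(1)
n_t, n_r, pilot_len = 2, 2, 128

# Unit-power QPSK pilots (assumed pilot design).
bits = rng.integers(0, 2, size=(2, n_t, pilot_len))
P = ((2 * bits[0] - 1) + 1j * (2 * bits[1] - 1)) / np.sqrt(2)

# One Rayleigh channel realization and additive complex Gaussian noise.
H = (rng.standard_normal((n_r, n_t)) + 1j * rng.standard_normal((n_r, n_t))) / np.sqrt(2)
noise = 0.05 * (rng.standard_normal((n_r, pilot_len))
                + 1j * rng.standard_normal((n_r, pilot_len)))
Y = H @ P + noise

H_ls = ls_channel_estimate(Y, P)
mse = np.mean(np.abs(H - H_ls) ** 2)
print(mse)
```

In the noise-free case the estimate recovers H exactly whenever $P P^H$ is invertible, so the residual error here is caused by the noise term only.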

RNN-LSTM
LSTM is a special type of RNN and is used in this work for channel estimation. The LSTM cell incorporates a specially designed structure known as a "gate", which selects the input features that pass through; each gate consists of a sigmoid neural network layer and a multiplication operation.
The cell contains three gates: the input, forget and output gates. The forget gate chooses what data to discard or retain in the memory. The input gate chooses the information that should be stored while updating the cell state. The output gate generates the appropriate information depending on the cell state. The structure of the LSTM is shown in Fig. 1. As depicted in the figure, depending on the output from the cell state at 't−1' and 't', the output of the cell at 't + 1' is obtained.
In this work, the LS-based estimated channel responses are given as input to the LSTM. In the formulation of the LSTM network, $F_t$, $I_t$, $V_t$ and $O_t$ represent the outputs of the forget gate, input gate, tanh layer and output gate respectively. $S_t$ denotes the final hidden state output. $w_F$, $w_I$, $w_V$ and $w_O$ denote the weight parameters of the forget gate, input gate, tanh layer and output gate respectively. $c_F$, $c_I$, $c_V$ and $c_O$ denote the bias values of the forget gate, input gate, tanh layer and output gate respectively. $\sigma$ denotes the sigmoid function. If the output value of $F_t$ is one, all information is maintained in the cell state; otherwise the information is discarded. Equation (9) handles the new information. $V_t$ denotes the current state of the cell. Equation (11) generates the appropriate information depending on the cell state. $S_t$ denotes the overall output of the cell. In this work, $S_t$ is taken as the estimated current channel response $H_{LSTM}(k)$. Finally, the loss function or MSE of the structure is determined with respect to $\hat{H}(k)$, which denotes the desired channel output.
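For reference, a standard LSTM formulation that is consistent with the symbols defined above can be written as follows. This is a sketch under assumptions: the input symbol $X_t$, the candidate value $\tilde{V}_t$, the concatenation $[S_{t-1}, X_t]$ and the number of samples $N$ are introduced here for illustration, and the equations are left unnumbered because they may not match the numbering of the original formulation.

```latex
\begin{align*}
F_t &= \sigma\left(w_F [S_{t-1}, X_t] + c_F\right) && \text{(forget gate)} \\
I_t &= \sigma\left(w_I [S_{t-1}, X_t] + c_I\right) && \text{(input gate)} \\
\tilde{V}_t &= \tanh\left(w_V [S_{t-1}, X_t] + c_V\right) && \text{(tanh layer, new information)} \\
V_t &= F_t \odot V_{t-1} + I_t \odot \tilde{V}_t && \text{(cell state)} \\
O_t &= \sigma\left(w_O [S_{t-1}, X_t] + c_O\right) && \text{(output gate)} \\
S_t &= O_t \odot \tanh(V_t) && \text{(hidden state output)} \\
\mathrm{MSE} &= \frac{1}{N} \sum_{k=1}^{N} \left| \hat{H}(k) - H_{LSTM}(k) \right|^2 && \text{(loss)}
\end{align*}
```

With $F_t = 1$ and $I_t = 0$ the cell state update reduces to $V_t = V_{t-1}$, which is the case in which all information is maintained in the cell state.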

Channel estimation using hybrid PSO-Adam optimizer based RNN-LSTM
Although the LSTM gives good results, its prediction accuracy can be enhanced further. So, to enhance the LSTM performance in terms of prediction accuracy, the weight parameters are chosen optimally using the hybrid PSO-Adam optimizer in this approach.
Basics of the PSO algorithm: As the convergence speed and adaptability of the PSO algorithm are better than those of other algorithms, it is chosen to optimize the weight parameters of the LSTM in this work. It is an evolutionary algorithm that attains the optimal solution by imitating the movement of bird flocks. In this algorithm, the position of each particle in the D-dimensional search space represents a solution. The position of the i-th particle is denoted $Y_i$.
The positions of the solutions are updated from the particle velocities, where $P_i(t)$ denotes the personal best position, $P_g(t)$ denotes the global best position, $c_1$ and $c_2$ denote the learning factors and rand() denotes a random number within the range [0, 1].
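The PSO update can be sketched as below, using the common form $V_i(t+1) = w V_i(t) + c_1\,\mathrm{rand}()\,(P_i(t) - Y_i(t)) + c_2\,\mathrm{rand}()\,(P_g(t) - Y_i(t))$ followed by $Y_i(t+1) = Y_i(t) + V_i(t+1)$. The inertia weight `w`, the default values of `c1` and `c2`, the function name `pso_step` and the sphere test function are assumptions for illustration; the text above does not state whether an inertia weight is used.

```python
import numpy as np

def pso_step(Y, V, pbest, gbest, c1=2.0, c2=2.0, w=1.0, rng=None):
    """One PSO velocity and position update for all particles (rows of Y)."""
    rng = np.random.default_rng() if rng is None else rng
    r1 = rng.random(Y.shape)  # rand() in [0, 1) for the personal-best term
    r2 = rng.random(Y.shape)  # rand() in [0, 1) for the global-best term
    V_new = w * V + c1 * r1 * (pbest - Y) + c2 * r2 * (gbest - Y)
    return Y + V_new, V_new

def sphere(Y):
    """Toy fitness (lower is better), a stand-in for the LSTM loss."""
    return np.sum(Y ** 2, axis=1)

rng = np.random.default_rng(0)
Y = rng.uniform(-5, 5, size=(20, 4))
V = np.zeros_like(Y)
pbest, pbest_fit = Y.copy(), sphere(Y)
initial_best = pbest_fit.min()

for _ in range(100):
    gbest = pbest[np.argmin(pbest_fit)].copy()
    Y, V = pso_step(Y, V, pbest, gbest, c1=1.5, c2=1.5, w=0.7, rng=rng)
    fit = sphere(Y)
    improved = fit < pbest_fit           # particles that beat their own best
    pbest[improved] = Y[improved]
    pbest_fit[improved] = fit[improved]

final_best = pbest_fit.min()
print(initial_best, final_best)
```

Because the personal bests are only replaced by better positions, the best fitness found never increases from one iteration to the next.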
Basics of the Adam optimizer: The Adam optimizer combines RMSprop [16] and Stochastic Gradient Descent with momentum [17]. Like RMSprop, Adam scales the learning rate using the squared gradients. Like momentum, it uses a moving average of the gradient. It is also known as an adaptive learning rate method. From estimates of the first and second moments of the gradient, the Adam algorithm computes individual adaptive learning rates for the different parameters. In this method, the magnitude of the parameter updates is invariant to rescaling of the gradient, as the step size is approximately bounded. Besides, Adam works with sparse gradients and naturally performs a form of step size annealing. In short, the Adam optimizer first computes the gradient (g) of the parameters. Second, it estimates the first moment E(g) and the second moment E(g²). The first moment damps disordered movement and helps avoid stopping at local optima. The second moment ensures an upper bound on the step size. Next, bias-corrected estimates of both moments are computed. Finally, the output parameter ($\theta_t$) is obtained.
PSO-Adam optimizer based RNN-LSTM: To improve the search ability of the population, PSO and the Adam optimizer are combined in this work. In this algorithm, the Adam optimizer uses the solution space attained by the PSO algorithm for further search, which leads to a new solution space.
In this work, the hybrid PSO-Adam optimizer is utilized for choosing the optimal weight parameters of the RNN-LSTM. The phases of the PSO-Adam optimizer for weight parameter selection are described as follows:
Initialization: Particles or candidate solutions are initialized. In this approach, the weight parameters $w_F$, $w_I$, $w_V$ and $w_O$ of the LSTM are considered as the solutions, and the candidate solutions $Y_{iD}$ are initialized accordingly.

Fitness calculation: The fitness of each initialized solution is calculated from Loss(i), the loss function defined in Eq. (16).
Update the solution: In this approach, each solution is first updated using PSO. Then, the obtained solution is updated using the Adam optimizer. The steps of updating the solution are described as follows:
PSO:
Step 1: For each solution, pbest is calculated, and gbest is calculated over all solutions.
Step 2: The position of each solution is updated using the velocity $V_{PSO}$ obtained from Eq. (18).
Step 3: After updating the position of the solution using Eq. (23), the new solution space is obtained. The position of the obtained solution is then updated using the Adam optimizer.

Adam optimizer:
Step 4: The running average coefficient ($\beta_{1,t}$) of the first moment is decayed, where $\beta_1$ and $\lambda$ denote the decay rates for the estimation of the moment. $\beta_1$ and $\lambda$ are considered within [0, 1].
Step 5: The gradient ($g_t$) is calculated based on the stochastic objective at t, where $f(\theta)$ denotes the stochastic objective function.
Step 6: The biased first moment estimate ($m_t$) is updated.
Step 7: The biased second moment estimate ($v_t$) is updated, where $\beta_2$ denotes the decay rate for the estimation of the moment and is considered within [0, 1].
Step 8: The bias-corrected first moment ($\hat{m}_t$) is estimated.
Step 9: The bias-corrected second moment ($\hat{v}_t$) is calculated.
Step 10: The parameter $\theta_t$ is updated, where $\alpha$ represents the step size and $\epsilon$ denotes the random variable within [0, 1].
Step 11: Steps 4 to 10 are repeated if $\theta_t$ has not converged. Otherwise, the resulting parameter $\theta_t$ is returned.
The position of each particle or solution is updated using Eq. (31).
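Steps 4 to 11 can be sketched with the widely used form of Adam shown below. This is an assumption-laden illustration rather than the exact update of this work: the decay of the first-moment coefficient in Step 4 is omitted (a constant `beta1` is used), `eps` is treated as a small constant, and the function name `adam_minimize`, the hyperparameter values and the quadratic test objective are hypothetical.

```python
import numpy as np

def adam_minimize(grad_fn, theta0, alpha=0.01, beta1=0.9, beta2=0.999,
                  eps=1e-8, n_steps=2000):
    """Minimize an objective with Adam, given a function returning its gradient."""
    theta = np.asarray(theta0, dtype=float).copy()
    m = np.zeros_like(theta)  # biased first moment estimate
    v = np.zeros_like(theta)  # biased second moment estimate
    for t in range(1, n_steps + 1):
        g = grad_fn(theta)                       # Step 5: gradient
        m = beta1 * m + (1 - beta1) * g          # Step 6: first moment
        v = beta2 * v + (1 - beta2) * g ** 2     # Step 7: second moment
        m_hat = m / (1 - beta1 ** t)             # Step 8: bias correction
        v_hat = v / (1 - beta2 ** t)             # Step 9: bias correction
        theta = theta - alpha * m_hat / (np.sqrt(v_hat) + eps)  # Step 10
    return theta

# Toy objective f(theta) = ||theta - target||^2 with gradient 2*(theta - target).
target = np.array([1.0, -2.0, 0.5])
theta_opt = adam_minimize(lambda th: 2.0 * (th - target), np.zeros(3))
print(theta_opt)
```

A fixed number of steps stands in for the convergence check of Step 11; in practice the loop would stop once the change in the parameter falls below a tolerance.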
Step 12: The fitness of Y' and Y'' is calculated. By comparing the fitness of Y' and Y'', the solution with the better fitness is selected as the current position.
Termination: The solutions are updated until the optimal solution is found, at which point the algorithm terminates.
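The overall update loop can be sketched as follows: each particle is moved by PSO (giving Y'), refined by a few Adam steps (giving Y''), and the better of the two is kept. This is a minimal sketch under stated assumptions, not the implementation used in this work: a toy least-squares loss replaces the LSTM loss, and the inertia weight, hyperparameters, iteration counts and all function names are hypothetical.

```python
import numpy as np

def loss(theta, A, b):
    """Toy stand-in for the LSTM loss: mean squared residual of A @ theta - b."""
    return np.mean((A @ theta - b) ** 2)

def grad(theta, A, b):
    """Gradient of the toy loss with respect to theta."""
    return 2.0 * A.T @ (A @ theta - b) / len(b)

def adam_refine(theta, A, b, alpha=0.05, beta1=0.9, beta2=0.999,
                eps=1e-8, n_steps=20):
    """Refine one particle with a few Adam steps (constant beta1 assumed)."""
    m, v = np.zeros_like(theta), np.zeros_like(theta)
    for t in range(1, n_steps + 1):
        g = grad(theta, A, b)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g ** 2
        m_hat, v_hat = m / (1 - beta1 ** t), v / (1 - beta2 ** t)
        theta = theta - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return theta

def pso_adam(A, b, n_particles=10, n_iters=30, c1=1.5, c2=1.5, w=0.7, seed=0):
    rng = np.random.default_rng(seed)
    Y = rng.uniform(-1, 1, (n_particles, A.shape[1]))   # initialization
    V = np.zeros_like(Y)
    fit = np.array([loss(y, A, b) for y in Y])           # fitness calculation
    pbest, pbest_fit = Y.copy(), fit.copy()
    history = [pbest_fit.min()]
    for _ in range(n_iters):
        gbest = pbest[np.argmin(pbest_fit)].copy()
        r1, r2 = rng.random(Y.shape), rng.random(Y.shape)
        V = w * V + c1 * r1 * (pbest - Y) + c2 * r2 * (gbest - Y)
        Y_pso = Y + V                                     # Y' from PSO
        for i in range(n_particles):
            y_adam = adam_refine(Y_pso[i], A, b)          # Y'' from Adam
            f_pso, f_adam = loss(Y_pso[i], A, b), loss(y_adam, A, b)
            # Keep whichever of Y' and Y'' has the better (lower) fitness.
            Y[i], fit[i] = (y_adam, f_adam) if f_adam < f_pso else (Y_pso[i], f_pso)
        better = fit < pbest_fit
        pbest[better], pbest_fit[better] = Y[better], fit[better]
        history.append(pbest_fit.min())
    return pbest[np.argmin(pbest_fit)], history

rng = np.random.default_rng(42)
A = rng.standard_normal((50, 4))
b = A @ np.array([0.5, -1.0, 2.0, 0.1])
best, history = pso_adam(A, b)
print(history[0], history[-1])
```

A fixed iteration budget is used here in place of the termination test; the recorded history of the best fitness is non-increasing by construction.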

Results and discussions
In the proposed scheme, the input signals are modulated using the QPSK modulation technique. The carrier frequency is 700 MHz, and a Rayleigh fading channel model is used in the simulation. For estimating the channel responses of the signals at 't−1', the LS channel estimation scheme is used. From the history of channel responses, 80% is used for training and 20% is used for testing to predict the current channel response at 't'. Pilot lengths of 128, 136, 152 and 160 are considered for the antenna configurations $N_t$ = 2, $N_r$ = 2; $N_t$ = 4, $N_r$ = 4; $N_t$ = 8, $N_r$ = 8; and $N_t$ = 10, $N_r$ = 10 respectively.

Performance analysis
The performance of the proposed channel estimation scheme is evaluated in terms of BER and MSE as a function of SNR. The evaluation metrics of the proposed PSO-Adam-LSTM are compared with those of Adam-LSTM, PSO-LSTM and the conventional LSTM. Besides, the performance of the different channel estimation schemes is evaluated for various pilot lengths and numbers of antennas.
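The two metrics can be computed as in the sketch below. The function names, the per-entry averaging of the MSE and the conversion to decibels are illustrative assumptions, since the exact normalization used in the simulations is not stated here.

```python
import numpy as np

def bit_error_rate(tx_bits, rx_bits):
    """Fraction of received bits that differ from the transmitted bits."""
    tx_bits, rx_bits = np.asarray(tx_bits), np.asarray(rx_bits)
    return np.mean(tx_bits != rx_bits)

def channel_mse(H_true, H_est):
    """Mean squared error between true and estimated channel entries."""
    diff = np.asarray(H_true) - np.asarray(H_est)
    return np.mean(np.abs(diff) ** 2)

def to_db(x):
    """Convert a power-like quantity to decibels."""
    return 10.0 * np.log10(x)

tx = [0, 1, 1, 0, 1, 0, 0, 1]
rx = [0, 1, 0, 0, 1, 0, 1, 1]          # two of the eight bits are flipped
ber = bit_error_rate(tx, rx)

H_true = np.array([[1 + 1j, 0], [0, 1 - 1j]])
H_est = H_true + 0.1                   # constant estimation error of 0.1 per entry
mse = channel_mse(H_true, H_est)
print(ber, mse, to_db(mse))
```

In this toy case two of eight bits are in error, and an error of 0.1 on every channel entry gives an MSE of 0.01, which corresponds to −20 dB.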

The performance analysis based on various pilot lengths
The evaluation metrics of the different channel estimation schemes are analysed under various pilot lengths (PL = 128, 136, 152, 160). Figure 3 shows the comparison of the BER and MSE of the different channel estimation schemes for PL = 128. As shown in Fig. 3a, the BER decreases as the SNR increases. As the weight parameters of the LSTM are selected optimally using PSO, the BER of PSO-LSTM is 42% lower than that of the conventional LSTM. Compared to PSO, the Adam optimizer is computationally efficient, so the BER of Adam-LSTM is 47% lower than that of PSO-LSTM. To attain better prediction output, both PSO and the Adam optimizer are used in this approach to optimize the weight parameters of the LSTM. Thus, the BER of PSO-Adam-LSTM is 79% and 88% lower than that of Adam-LSTM and PSO-LSTM respectively. As shown in Fig. 3b, when PL = 128, the MSE of PSO-Adam-LSTM is 74%, 87% and 91% lower than that of Adam-LSTM, PSO-LSTM and LSTM respectively.
The comparison of the BER and MSE of the different channel estimation schemes for PL = 136 is shown in Fig. 4. As depicted in Fig. 4a, compared to Adam-LSTM, PSO-LSTM and LSTM, the BER of PSO-Adam-LSTM is decreased by 76%, 83% and 91% respectively. Also, the MSE of PSO-Adam-LSTM is 73%, 82% and 86% lower than that of Adam-LSTM, PSO-LSTM and LSTM respectively, as depicted in Fig. 4b. Figure 5 shows the comparison of the BER and MSE of the different channel estimation schemes for PL = 156. As depicted in Fig. 5a and b, PSO-Adam-LSTM outperforms Adam-LSTM, PSO-LSTM and LSTM in terms of BER and MSE for varying SNR when PL is 156. As depicted in Fig. 6a, when PL = 160, the BER of PSO-Adam-LSTM is 43%, 69% and 79% lower than that of Adam-LSTM, PSO-LSTM and LSTM. Also, compared to Adam-LSTM, PSO-LSTM and LSTM, the MSE of PSO-Adam-LSTM is reduced by 53%, 73% and 78% respectively, as shown in Fig. 6b.

Performance analysis based on number of antennas
In this section, the evaluation metrics of the different channel estimation schemes are analysed by varying the number of antennas. Figures 7, 8, 9 and 10 show the results for the different antenna configurations. When $N_t$ = 2, $N_r$ = 2, the BER of PSO-Adam-LSTM is 51%, 65% and 75% lower, and the MSE of PSO-Adam-LSTM is 6%, 43% and 47% lower, than the BER and MSE of Adam-LSTM, PSO-LSTM and LSTM respectively, as depicted in Fig. 7a and b. Figure 8 shows the BER and MSE of the different channel estimation schemes when $N_t$ = 4 and $N_r$ = 4. As shown in Fig. 8a, at SNR = 25 dB, the BER of PSO-Adam-LSTM is 54%, 68% and 79% lower than that of Adam-LSTM, PSO-LSTM and LSTM respectively. Compared to Adam-LSTM, PSO-LSTM and LSTM, the MSE of the proposed channel estimation scheme is reduced by 7%, 50% and 60% respectively at SNR = 25 dB, as depicted in Fig. 8b. Figure 9 shows the corresponding results for $N_t$ = 8 and $N_r$ = 8. Figure 10 shows the comparison of the BER and MSE of the different channel estimation schemes for $N_t$ = 10 and $N_r$ = 10. As shown in Fig. 10a, at SNR = 25 dB, the BER of the proposed channel estimation scheme is 55%, 79% and 84% lower than that of Adam-LSTM, PSO-LSTM and LSTM respectively. Fig. 10b compares the MSE of the proposed channel estimation scheme with that of Adam-LSTM, PSO-LSTM and LSTM.
Figure 11 depicts the computational analysis of the different channel estimation techniques in terms of execution time. As depicted in the figure, compared to the deep learning-based channel estimation schemes, the conventional schemes consume high execution time because of their computational complexity. Nevertheless, the execution time of PSO-Adam-LSTM is high compared to Adam-LSTM, PSO-LSTM and LSTM, because combining Adam and PSO increases the computational complexity. Compared to the existing schemes in [10,14], the proposed approach attains the minimum MSE, i.e., −18 dB.

Conclusion
Channel estimation plays a predominant role in determining wireless system performance. Compared with the conventional pilot-aided channel estimation method, deep learning algorithms exhibit notable enhancement in channel predictability and a reduction in the computational complexity of 5G networks. Although the LS estimate is generally employed to capture channel estimates due to its low cost, it offers comparatively high estimation error because it has no prior knowledge of the channel. Although the previous works give better results, they are inadequate in managing the precise and timely acquisition of massive CSI for 5G communication scenarios. To estimate the accurate channel state information or channel response of the pilot data block, an optimized RNN-LSTM is used in this paper. The optimized deep learning-based estimator improves the estimate obtained by the LS approach. Firstly, the history of channel responses of the pilot sequences at time 't-1' has been estimated using the LS estimation scheme. From the history of channel responses, 80% is used for training and the remaining 20% for testing to predict the current channel response at 't'. These estimated channel responses have been given as input to the proposed hybrid PSO-Adam optimizer-based LSTM, in which the weight parameters are optimized using PSO and the Adam optimizer. Finally, the channel response at time 't' is estimated or predicted using the optimized LSTM. The performance of the proposed scheme has been analysed under various pilot lengths and numbers of antennas. Simulation results showed that PSO-Adam-LSTM outperforms the channel estimation schemes based on Adam-LSTM, PSO-LSTM and LSTM in terms of BER and MSE. A comparative analysis of the various channel estimation techniques is given in terms of execution time.
The results show that the proposed estimator demonstrates superior performance over the other channel estimation approaches considered, owing to better channel learning, resulting in decreased estimation errors. Future work should focus on a robust learning architecture that further reduces the bit error rate and can be implemented on massive connection platforms. Deep learning facilitated channel estimation would also require publicly available prototypical datasets in which several pilot types, channel conditions, antenna arrangements and scenarios are considered in a comprehensive manner. To handle the time- and location-varying nature of the wireless environment, transfer learning is needed to cope with situations that were not experienced at the time of training.
Author contributions All authors read and approved the final manuscript.
Funding The authors declare that they have no competing interests and funding.
Data availability Data sharing not applicable to this article as no datasets were generated or analysed during the current study.
Code availability Code is available.

Conflict of interest
On behalf of all authors, the corresponding author states that there are no relevant financial or non-financial interests to disclose.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.