1 Introduction

The rapid growth in data traffic from the Internet of Things (IoT) and wireless video streaming services has given momentum to 5G wireless communication. Using millimeter wave (mmWave) in MIMO is one of the major technologies for 5G wireless communication systems [1,2,3]. Millimeter waves provide a large bandwidth that can translate into higher data transfer rates, with speeds of about 10 gigabits per second. Millimeter wavelengths also allow narrower beams and therefore higher resolution. In addition, millimeter-wave devices are small, so the antennas can be made very compact for high-frequency applications. However, millimeter wave has some limitations, such as limited range, difficult channel estimation and the need for line of sight (LOS). Fog, rain and moisture further reduce the already limited transmission range of mmWave. Millimeter-wave communication requires LOS, and physical obstacles in practical deployments attenuate the signal and shorten the transmission range. In planning a MIMO wireless communication system for 5G, the major complexity encountered is channel estimation. Channel state information (CSI) is essential in wireless communications. It refers to the known channel characteristics of a radio link. CSI describes the combined effect of shadowing, path loss, fading and so forth when a signal travels from a transmitter to its corresponding receiver. Accordingly, it reveals the quality of the radio link. Obtaining accurate CSI is of central importance in ensuring the performance of radio links in wireless communication systems. More specifically, CSI largely determines the physical-layer parameters and the schemes used on the radio interface. CSI also affects radio resource allocation and interference management in wireless communication systems.
Hence, it is important to obtain accurate CSI in wireless communication systems, which is done by performing channel estimation [4, 5]. Researchers have proposed many channel estimation methods, most notably the minimum mean square error (MMSE) [6], least squares (LS) [7] and maximum-likelihood (ML) [8] methods. These algorithms rely on matrix operations such as singular value decomposition, matrix inversion, matrix multiplication and eigenvalue decomposition, which incur high computational complexity. Such conventional techniques are therefore practical only for small-scale channel estimation in wireless communication systems.

1.1 Problem statement and contribution

Recently, deep learning (DL) has been widely investigated for communications and signal processing problems such as estimation and decoding [9, 11]. A deep neural network (DNN) is a universal function approximator with superior algorithmic learning ability and convenient optimization, and can therefore be used without a precise mathematical model. However, existing approaches are not adequate for the precise and timely acquisition of massive CSI. To enhance the accuracy of channel estimation, this paper makes the following contributions.

  • Initially, the channel response of each pilot block is estimated using the LS channel estimator.

  • These previously estimated channel responses are given as input to the RNN-LSTM to predict or estimate the current channel response. The RNN-LSTM is trained using the estimated channel responses and the pilot blocks.

  • To optimize the LSTM performance, a hybrid PSO-Adam algorithm is presented. The weight parameters of the LSTM are chosen using this hybrid algorithm.

  • Finally, the trained, optimized RNN-LSTM is tested using the previous channel responses of the pilot blocks. At the output, the CSI is predicted by minimizing the MSE loss function.

The remaining sections are organized as follows. Section 2 surveys the literature on channel estimation in MIMO. Section 3 proposes the hybrid PSO-Adam optimizer based RNN-LSTM for channel estimation in MIMO. Results of the proposed channel estimation method are described in Sect. 4. The conclusion of the work is given in Sect. 5.

2 Related works

Mehrtash Mehrabi et al. [9] used deep neural networks to design a decision-directed channel estimation scheme for MIMO space-time block coded (STBC) systems. The scheme removes the need for Doppler rate estimation in fast-changing quasi-stationary channels. The authors trained two deep neural networks, one on the real parts and one on the imaginary parts of the MIMO fading channels. For STBC transmission, the scheme uses a maximum likelihood decoding algorithm. With the proposed scheme, the authors attained lower error propagation.

Haythem Bany Salameh et al. [10] aimed to decrease the pilot overhead in massive MIMO systems and presented a new channel estimation scheme based on compressive sensing. The scheme combines the compressive sampling matching pursuit and sparsity adaptive matching pursuit methods. With respect to spatial correlations, the signal sources were distributed sparsely, and the compressive sensing technique exploited this distribution pattern to solve the channel estimation problem in MIMO. The experimental results showed that the proposed scheme outperformed the conventional compressive sensing scheme in terms of signal-to-noise ratio (SNR) and normalized MSE.

Changqing Luo et al. [11] presented an effective online channel state information prediction method, abbreviated as OCEAN. The authors first identified a few significant features that influence the CSI of a radio link and used data samples containing both the CSI and these features. They then developed a learning system that combines a convolutional neural network with an LSTM. To enhance the prediction results, they also designed an offline-online two-step training mechanism. With the proposed scheme, the authors achieved high prediction accuracy and a good average difference ratio.

Yang Liu et al. [12] proposed a reduced-dimension, decomposition-based channel estimation method for millimeter wave massive MIMO systems. To enhance the estimation accuracy, the authors decomposed the estimation of the channel matrix into the estimation of angle and channel gain information. The received signals were decomposed using dimensionality reduction. An initial sparse support set was obtained using a sparse signal recovery method. The authors treated the off-grid error as the adjustment parameter and approximated the true discrete grid using Taylor's formula. Finally, the path gain was calculated using the LS estimation algorithm. With the proposed scheme, the authors obtained better normalized MSE and achievable spectral efficiency.

Imran Khan [13] aimed to decrease the channel estimation complexity in massive MIMO and presented a sparse channel estimation method based on inherent sparsity. Using the discrete Fourier transform, the scheme separates the channel taps from the noise space, so that only the channel tap part is calculated, which reduces the computational complexity. The results showed that the proposed scheme reduced the minimum mean square error and achieved better inter-cell interference and bit error rate performance.

To solve the problem of pilot contamination, Pasangi et al. [14] presented a Time Division Duplexing method. In this approach, downlink signals were pre-coded using the estimated uplink channel, and downlink channel estimation was converted into channel gain estimation. To address low spectral efficiency, the authors also proposed a blind downlink channel gain estimation method for massive MIMO systems based on Time Division Duplexing, in which the channel gain is estimated by computing the average energy of the received signals. With the proposed scheme, they obtained better normalized MSE.

Jeya and Amutha [15] presented an enhanced Semi-Blind Sparse algorithm to address the complexities of channel estimation in MIMO-OFDM. Initially, the input signal was modulated using Quadrature Phase Shift Keying (QPSK). Inter-symbol interference was then reduced using a pulse shaping algorithm. Symbol mapping at every transmitter was done using the Inverse Fast Fourier Transform. At the receiver, channel estimation was performed using their proposed algorithm, and the cost function was reduced using Enhanced Differential Evolution. With the proposed scheme, the authors showed better symbol error rate and peak signal-to-noise ratio (PSNR).

The review of existing works motivates the deep learning model for channel estimation presented in this work. Although the existing works give good results, they are not adequate for the precise and timely acquisition of massive CSI. Thus, an optimized deep learning model is presented in this work.

3 Channel estimation using hybrid PSO-Adam optimizer based RNN-LSTM for MIMO communications in 5G network

3.1 Overview

In a MIMO communication system, predicting or estimating the CSI between transmitter and receiver is the main challenge. To enhance the transmission rate in a 5G network, the channel estimation problem is solved by predicting the current CSI or channel response (Ht) of the pilot block from the previous channel response (Ht-1) of the pilot blocks. An optimized RNN-LSTM is presented for this prediction. Initially, the channel responses of the pilot sequences are estimated using the LS estimator. These estimated channel responses are given as input to the proposed RNN-LSTM structure, in which the LSTM weight parameters are optimized using a hybrid of PSO and the Adam optimizer to attain an accurate prediction output.

3.2 System model

Consider a MIMO communication system model with t transmitters and r receivers. To estimate the CSI, pilot sequences must be transmitted and feedback received. CSI estimation can then be performed from the record of transmitted and received signals. The transmitted pilot sequences are denoted as p1, p2, ..., pm, where pm denotes the mth complex pilot signal vector. The vector of the received signal at the receiver can be defined as follows,

$${y}_{{m}} { = Hp}_{{m}} { + z}_{{m}}$$
(1)

where ym denotes the vector of the received signal, H denotes the channel matrix and zm denotes the complex vector of zero-mean white noise. To estimate the CSI from the channel matrix H, the number of training pilot sequences N should be less than or equal to the number of transmitters, i.e. N ≤ t. The received signal can be defined as follows,

$${Y = HP + Z}$$
(2)

Based on the knowledge of P and Y, the channel estimator can find the channel matrix H.

3.3 Channel model

In this work, a Rayleigh fading channel is modelled between the transmitter and receiver. This type of fading arises when there are many different signal paths between the transmitters and receivers. Because of the RNN-LSTM, the proposed channel estimation scheme can be realized with only a priori knowledge of the Doppler rate range rather than its exact value. Hence it does not require Doppler rate estimation and provides more reliable packet transmission in highly dynamic environments.

The time-varying fading channel matrix at time index k is defined as follows,

$$H\left(k\right)=\left[\begin{array}{cccc}{h}_{11}\left(k\right)& {h}_{12}\left(k\right)& \cdots & {h}_{1r}\left(k\right)\\ {h}_{21}\left(k\right)& {h}_{22}\left(k\right)& \cdots & {h}_{2r}\left(k\right)\\ \vdots & \vdots & \ddots & \vdots \\ {h}_{t1}\left(k\right)& {h}_{t2}\left(k\right)& \cdots & {h}_{tr}\left(k\right)\end{array}\right]$$
(3)

where htr(k) denotes the channel between the tth transmitter and the rth receiver at time index k.

The probability density function (PDF) of the received signal which is transmitted via the Rayleigh fading channel is defined as follows,

$$p\left({r}_{0}\right)=\frac{{r}_{0}}{{\sigma }^{2}}\mathrm{exp}\left(-\frac{{r}_{0}^{2}}{2{\sigma }^{2}}\right)$$
(4)

where r0 denotes the received signal’s amplitude, \({\sigma }^{2}\) denotes the variance and 2 \({\sigma }^{2}\) denotes the average multipath signal power.
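
As a quick numerical illustration of Eq. (4), a Rayleigh-distributed amplitude can be drawn as the magnitude of a zero-mean complex Gaussian whose real and imaginary parts each have variance σ². The sketch below uses NumPy; the function name `rayleigh_amplitudes`, the seed, the value of σ and the sample count are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def rayleigh_amplitudes(sigma, n, rng):
    """Draw n Rayleigh amplitudes r0 = |x + jy| with x, y ~ N(0, sigma^2)."""
    x = rng.normal(0.0, sigma, n)
    y = rng.normal(0.0, sigma, n)
    return np.hypot(x, y)

rng = np.random.default_rng(0)      # fixed seed, for reproducibility only
sigma = 1.5                         # assumed value, not from the paper
r0 = rayleigh_amplitudes(sigma, 200_000, rng)

# The empirical mean power E[r0^2] should be close to 2 * sigma^2,
# the average multipath signal power.
mean_power = np.mean(r0 ** 2)
```

Comparing `mean_power` with `2 * sigma**2` gives a simple sanity check on a simulated fading channel before it is used to generate training data.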

The PDF of the instantaneous SNR is defined as follows,

$$p\left({\gamma }_{b}\right)=\frac{1}{\overline{{\gamma }_{b}}}\mathrm{exp}\left(-\frac{{\gamma }_{b}}{\overline{{\gamma }_{b}}}\right)$$
(5)

where \({\gamma }_{b}\) denotes the instantaneous SNR of the received signal and \(\overline{{\gamma }_{b}}\) denotes the average SNR per bit, which is defined as follows,

$$\overline{{\gamma }_{b}}= \frac{{E}_{b}}{{Z}_{0}} E({\alpha }^{2})$$
(6)

where \({E}_{b}/{Z}_{0}\) denotes the ratio of bit energy to noise power spectral density of the received signal and \(E\left({\alpha }^{2}\right)\) denotes the mean-square value of the Rayleigh-distributed fading amplitude \(\alpha\).

3.4 Channel response of pilot sequence

In this work, history of channel responses of pilot sequences is used to predict the current channel response of the pilot block. To estimate the channel response of previous pilot sequences, LS channel estimator is used. The channel response of previous pilot sequence is defined as follows,

$${H}_{k-1} = Y{P}^{-1}$$
(7)

where Hk-1 denotes the channel response of the pilot sequences at time period k-1. These estimated channel responses are given as input to the proposed RNN-LSTM model for predicting the current channel response (Hk) of the pilot sequence. The following section describes the channel estimation using optimized RNN-LSTM.
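
Under the model of Eq. (2), Y = HP + Z, the LS estimate is obtained by right-multiplying Y by the inverse of the pilot matrix P, or by its pseudo-inverse when P is not square. The following NumPy sketch illustrates this on synthetic data. The helper name `ls_channel_estimate`, the DFT pilot matrix, the antenna counts and the noise level are assumptions chosen for illustration; the paper does not specify them.

```python
import numpy as np

def ls_channel_estimate(Y, P):
    """LS estimate of H from Y = H P + Z via the pseudo-inverse of P."""
    return Y @ np.linalg.pinv(P)

rng = np.random.default_rng(1)
n_tx, n_rx = 4, 4                       # assumed antenna counts
P = np.fft.fft(np.eye(n_tx))            # assumed pilots: DFT matrix (invertible)

# Synthetic Rayleigh-fading channel and additive white Gaussian noise.
H = (rng.normal(size=(n_rx, n_tx)) + 1j * rng.normal(size=(n_rx, n_tx))) / np.sqrt(2)
Z = 0.01 * (rng.normal(size=(n_rx, n_tx)) + 1j * rng.normal(size=(n_rx, n_tx)))

Y = H @ P + Z                           # received pilot block, Eq. (2)
H_ls = ls_channel_estimate(Y, P)        # LS channel response
mse = np.mean(np.abs(H_ls - H) ** 2)    # estimation error against the true H
```

In the noise-free case the estimate recovers H exactly; with noise, the error is the noise filtered through the inverse of P, which is the residual the LSTM stage is meant to reduce.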

3.5 RNN-LSTM

LSTM is a special type of RNN and is used in this work for channel estimation. The LSTM cell incorporates a purpose-built structure known as a "gate", which selects which input features pass through; each gate consists of a sigmoid neural network layer and a pointwise multiplication operation. The cell contains input, forget and output gates. The forget gate decides what data to discard from or keep in the memory. The input gate decides which information should be stored when the cell state is updated. The output gate generates the output depending on the cell state. The structure of the LSTM is shown in Fig. 1. As depicted in the figure, the output of the cell at 't + 1' is obtained depending on the outputs of the cell state at 't−1' and 't'.

Fig. 1 The structure of LSTM

In this work, LS based estimated channel responses are given as input to the LSTM. The input LS channel responses are defined as follows,

$${X}_{t}={H}^{LS}\left(k-1\right)=\left[\begin{array}{cccc}{h}_{11}^{LS}\left(k-1\right)& {h}_{12}^{LS}\left(k-1\right)& \cdots & {h}_{1r}^{LS}\left(k-1\right)\\ {h}_{21}^{LS}\left(k-1\right)& {h}_{22}^{LS}\left(k-1\right)& \cdots & {h}_{2r}^{LS}\left(k-1\right)\\ \vdots & \vdots & \ddots & \vdots \\ {h}_{t1}^{LS}\left(k-1\right)& {h}_{t2}^{LS}\left(k-1\right)& \cdots & {h}_{tr}^{LS}\left(k-1\right)\end{array}\right]$$
(8)

The formulation of LSTM network is described as follows

$${F}_{t}= \sigma \left[ {w}_{F }\left({X}_{t }, {S}_{t-1}\right)+ {c}_{F}\right]$$
(9)
$${I}_{t}= \sigma \left[ {w}_{I }\left({X}_{t }, {S}_{t-1}\right)+ {c}_{I}\right]$$
(10)
$${\widetilde{V}}_{t}=\mathrm{tanh} \left[ {w}_{V }\left({X}_{t }, {S}_{t-1}\right)+ {c}_{V}\right]$$
(11)
$${V}_{t}= {F}_{t}* {V}_{t-1}+ {I}_{t}* {\widetilde{V}}_{t}$$
(12)
$${O}_{t}= \sigma \left[ {w}_{0 }\left({X}_{t }, {S}_{t-1}\right)+ {c}_{0}\right]$$
(13)
$${S}_{t}= {O}_{t}*\mathrm{tanh}\left({V}_{t}\right)$$
(14)

where \({F}_{t}\), \({I}_{t}\), \({\widetilde{V}}_{t}\) and \({O}_{t}\) represent the outputs of the forget gate, input gate, tanh layer and output gate respectively, \({V}_{t}\) denotes the current state of the cell and \({S}_{t}\) denotes the final hidden state output. \({w}_{F },{w}_{I },{w}_{V }\) and \({w}_{0}\) denote the weight parameters of the forget gate, input gate, tanh layer and output gate respectively. \({c}_{F} ,{c}_{I}, {c}_{V}\) and \({c}_{0}\) denote the bias values of the forget gate, input gate, tanh layer and output gate respectively. \(\sigma\) denotes the sigmoid function. If the output value of \({F}_{t}\) is one, all information in the previous cell state is retained; if it is zero, that information is discarded. Equations (10) and (11) select the new information, which Eq. (12) adds to the cell state. Equations (13) and (14) generate the output depending on the cell state, and \({S}_{t}\) is the overall output of the cell. In this work, \({S}_{t}\) is taken as the estimated current channel response \({H}^{LSTM}\left(k\right)\), which is defined as follows

$${H}^{LSTM}\left(k\right)=\left[\begin{array}{cccc}{h}_{11}^{LSTM}\left(k\right)& {h}_{12}^{LSTM}\left(k\right)& \cdots & {h}_{1r}^{LSTM}\left(k\right)\\ {h}_{21}^{LSTM}\left(k\right)& {h}_{22}^{LSTM}\left(k\right)& \cdots & {h}_{2r}^{LSTM}\left(k\right)\\ \vdots & \vdots & \ddots & \vdots \\ {h}_{t1}^{LSTM}\left(k\right)& {h}_{t2}^{LSTM}\left(k\right)& \cdots & {h}_{tr}^{LSTM}\left(k\right)\end{array}\right]$$
(15)

Finally, the loss function (MSE) of the structure is determined as follows,

$$LOSS= \sum {\left( {H}^{LSTM}\left(k\right)- \widehat{H}\left(k\right)\right)}^{2}$$
(16)

where \(\widehat{H}\left(k\right)\) denotes the desired channel output.
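
The gate equations (9) to (14) and the loss of Eq. (16) can be sketched as a single time step in NumPy. This is a minimal illustration under stated assumptions: each weight matrix acts on the concatenation of the input and the previous hidden state, the complex channel is flattened into real and imaginary parts, and the weights are random placeholders rather than trained values. The names `lstm_step` and `squared_error_loss` are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, s_prev, v_prev, w, c):
    """One LSTM step following Eqs. (9)-(14).

    w and c are dicts keyed by 'F', 'I', 'V', 'O'; each weight matrix acts
    on the concatenation [x_t, s_prev] (an assumed reading of w(X_t, S_{t-1})).
    """
    u = np.concatenate([x_t, s_prev])
    f_t = sigmoid(w["F"] @ u + c["F"])      # forget gate, Eq. (9)
    i_t = sigmoid(w["I"] @ u + c["I"])      # input gate, Eq. (10)
    v_cand = np.tanh(w["V"] @ u + c["V"])   # tanh layer output, Eq. (11)
    v_t = f_t * v_prev + i_t * v_cand       # cell state, Eq. (12)
    o_t = sigmoid(w["O"] @ u + c["O"])      # output gate, Eq. (13)
    s_t = o_t * np.tanh(v_t)                # hidden state output, Eq. (14)
    return s_t, v_t

def squared_error_loss(h_pred, h_target):
    """Sum of squared errors, Eq. (16)."""
    return float(np.sum(np.abs(h_pred - h_target) ** 2))

rng = np.random.default_rng(2)
n_in, n_hidden = 8, 8   # e.g. a 2x2 complex channel flattened to 8 real values
w = {k: 0.1 * rng.normal(size=(n_hidden, n_in + n_hidden)) for k in "FIVO"}
c = {k: np.zeros(n_hidden) for k in "FIVO"}

x_t = rng.normal(size=n_in)   # stand-in for the flattened LS response at k-1
s_t, v_t = lstm_step(x_t, np.zeros(n_hidden), np.zeros(n_hidden), w, c)
loss = squared_error_loss(s_t, x_t)
```

Because the hidden state is bounded to the interval (−1, 1) by the tanh and sigmoid functions, a practical model would normalize the channel responses or add a linear output layer; that choice is outside what the paper specifies.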

3.6 Channel estimation using hybrid PSO-Adam optimizer based RNN-LSTM

Although the LSTM gives good results, its prediction accuracy can be enhanced further. To this end, the weight parameters are chosen optimally using the hybrid PSO-Adam optimizer in this approach.

Basics of the PSO algorithm: PSO is chosen to optimize the weight parameters of the LSTM in this work because of its good convergence speed and adaptability. It is an evolutionary algorithm that seeks the optimal solution by imitating the movement of bird flocks. In this algorithm, each particle in the D-dimensional search space represents a candidate solution through its position. The position of the ith particle can be defined as follows,

$${\text{Y}}_{{\text{i}}} = \left( {{\text{Y}}_{{{\text{i1}}}} ,{\text{ Y}}_{{{\text{i2}}}} , \ldots ,{\text{ Y}}_{{{\text{iD}}}} } \right)$$
(17)

where Yi denotes the position of the ith particle.

The position of the solutions is updated according to the following equations,

$${V}_{i}\left(t+1\right)={V}_{i}\left(t\right)+\mathrm{rand}( )*{c}_{1}*\left({P}_{i}\left(t\right)-{Y}_{i}\left(t\right)\right)+\mathrm{rand}\left( \right)*{c}_{2}*\left({P}_{g}\left(t\right)-{Y}_{i}\left(t\right)\right)$$
(18)
$${\text{Y}}_{{\text{i}}} \left( {{\text{t}} + {1}} \right) = {\text{Y}}_{{\text{i}}} \left( {\text{t}} \right) + {\text{V}}_{{\text{i}}} \left( {{\text{t}} + {1}} \right)$$
(19)

where Pi(t) denotes the personal best position, Pg(t) denotes the global best position, c1 and c2 denote the learning factors and rand() denotes a random number within the range [0, 1].
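
The velocity and position updates of Eqs. (18) and (19) can be sketched in NumPy as follows. The sphere function used as the fitness, the swarm size, the learning factors and the helper name `pso_step` are illustrative assumptions; in the paper the fitness is the LSTM loss.

```python
import numpy as np

def pso_step(Y, V, P_best, g_best, c1, c2, rng):
    """One PSO update per Eqs. (18)-(19), applied to all particles at once."""
    r1 = rng.random(Y.shape)                       # rand() in [0, 1)
    r2 = rng.random(Y.shape)
    V_new = V + r1 * c1 * (P_best - Y) + r2 * c2 * (g_best - Y)   # Eq. (18)
    return Y + V_new, V_new                        # Eq. (19)

def sphere(Y):
    """Toy fitness (assumed): squared distance of each particle from the origin."""
    return np.sum(Y ** 2, axis=1)

rng = np.random.default_rng(3)
n_particles, D = 20, 5
Y = rng.uniform(-1.0, 1.0, (n_particles, D))
V = np.zeros_like(Y)
P_best, p_fit = Y.copy(), sphere(Y)
initial_best = p_fit.min()

for _ in range(30):
    g_best = P_best[np.argmin(p_fit)].copy()       # global best position
    Y, V = pso_step(Y, V, P_best, g_best, c1=0.5, c2=0.5, rng=rng)
    fit = sphere(Y)
    improved = fit < p_fit                         # update personal bests
    P_best[improved], p_fit[improved] = Y[improved], fit[improved]

final_best = p_fit.min()
```

Since the personal and global bests are only replaced by better positions, the best fitness never gets worse from one iteration to the next, even though Eq. (18) has no inertia weight to damp the velocities.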

Basics of the Adam optimizer: The Adam optimizer combines RMSprop [16] and stochastic gradient descent with momentum [17]. Like RMSprop, Adam scales the learning rate using the squared gradients; like momentum, it uses a moving average of the gradient. It is therefore an adaptive learning rate method. From estimates of the first and second moments of the gradient, Adam computes individual adaptive learning rates for the different parameters. The magnitudes of the parameter updates are invariant to rescaling of the gradient, and the step size is approximately bounded. Adam also works well with sparse gradients and naturally performs a form of step size annealing. In short, the Adam optimizer first computes the gradient (g) of the parameters. Second, it estimates the first moment E(g) and the second moment E(g²). The first moment smooths out erratic movements and helps to avoid stopping at local optima. The second moment ensures an upper bound on the step size. Next, bias-corrected estimates of both moments are computed. Finally, the updated parameter (\({\theta }_{t}\)) is obtained.
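
The sequence just described (gradient, moment estimates, bias correction, parameter update) can be sketched as a short NumPy routine. The sketch assumes the standard Adam form with constant decay rates and bias-correction factors 1 − γ^t; the quadratic objective, the hyperparameter values and the name `adam_minimize` are illustrative choices, not values from the paper.

```python
import numpy as np

def adam_minimize(grad_fn, theta0, beta=0.05, gamma1=0.9, gamma2=0.999,
                  eps=1e-8, n_steps=500):
    """Minimize an objective with Adam, given a function returning its gradient."""
    theta = np.asarray(theta0, dtype=float).copy()
    m = np.zeros_like(theta)                     # first-moment estimate
    v = np.zeros_like(theta)                     # second-moment estimate
    for t in range(1, n_steps + 1):
        g = grad_fn(theta)                       # gradient of the objective
        m = gamma1 * m + (1 - gamma1) * g        # biased first moment
        v = gamma2 * v + (1 - gamma2) * g ** 2   # biased second moment
        m_hat = m / (1 - gamma1 ** t)            # bias-corrected first moment
        v_hat = v / (1 - gamma2 ** t)            # bias-corrected second moment
        theta = theta - beta * m_hat / (np.sqrt(v_hat) + eps)
    return theta

# Toy objective (assumed): f(theta) = ||theta - target||^2, gradient 2 (theta - target).
target = np.array([1.0, -2.0, 0.5])
theta = adam_minimize(lambda th: 2.0 * (th - target), np.zeros(3))
```

Here `eps` is a small constant that keeps the division well defined when the second-moment estimate is close to zero.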

PSO-Adam optimizer based RNN-LSTM: To improve the search ability of the population, PSO and the Adam optimizer are combined in this work. In this algorithm, the Adam optimizer uses the solution space obtained from the PSO algorithm for further search, which leads to a new solution space.

In this work, the hybrid PSO-Adam optimizer is used to choose the optimal weight parameters of the RNN-LSTM. The phases of the PSO-Adam optimizer for weight parameter selection are described as follows:

Initialization: The particles or candidate solutions are initialized. In this approach, the weight parameters \({w}_{F },{w}_{I },{w}_{V }\) and \({w}_{0}\) of the LSTM are considered as solutions. The initialization of the candidate solutions is represented as follows:

$$Y_{i} = \left( {Y_{i1} , \, Y_{i2} , \ldots , \, Y_{iD} } \right)$$
(20)

where YiD can be defined as follows,

$${Y}_{iD}={\left({w}_{F },{w}_{I },{w}_{V }, {w}_{0 }\right)}_{iD}$$
(21)

Fitness calculation: The fitness of each initialized solution is calculated as follows,

$$F_{i} = {\text{Min}}(Loss(i))$$
(22)

where Loss(i) defines the loss function and is defined in Eq. (16).

Update the solution: In this approach, each solution is updated using PSO initially. Then, the obtained solution is updated using Adam optimizer. The steps of updating the solution are described as follows:

PSO:

Step 1: For each solution, the personal best (pbest) is calculated, and the global best (gbest) is calculated over all solutions.

Step 2: The position of each solution is updated using the following equation,

$${Y}^{\prime}=Y\left(t\right)+ {V}_{PSO}$$
(23)

where VPSO denotes the velocity obtained from Eq. (18).

Step 3: After updating the position of the solution using Eq. (23), the new solution is taken as the starting point \({\theta }_{0}= {Y}^{\prime}\). The position of the obtained solution is then updated using the Adam optimizer.

Adam optimizer:

Step 4: The running average coefficient \(({\gamma }_{1,t })\) of the first moment is decayed as follows,

$${\gamma }_{1,t }= {\gamma }_{1 }{\lambda }^{t-1}$$
(24)

where \({\gamma }_{1}\) and \(\lambda\) denote the decay rates for the estimation of the moment. \({\gamma }_{1}\) and \(\lambda\) are taken within [0, 1].

Step 5: The gradient (gt) of the stochastic objective at step t is calculated,

$${g}_{t }= {\nabla }_{\theta }{f}_{t} \left({\theta }_{t-1}\right)$$
(25)

where \({f}_{t}\left(\theta \right)\) denotes the stochastic objective function.

Step 6: The biased first moment estimate (mt) is updated as follows,

$${m}_{t}={\gamma }_{1,t}*{m}_{t-1}+{g}_{t}*\left(1-{\gamma }_{1,t}\right)$$
(26)

Step 7: The biased second moment estimate (vt) is updated as follows,

$${v}_{t}={\gamma }_{2}*{v}_{t-1}+{g}_{t}^{2}*\left(1-{\gamma }_{2}\right)$$
(27)

where \({\gamma }_{2}\) denotes the decay rate for the estimation of the second moment and is taken within [0, 1].

Step 8: The bias-corrected first moment (\({\widehat{m}}_{t}\)) is estimated as follows,

$${\widehat{m}}_{t}=\frac{{m}_{t}}{\left(1-{\gamma }_{1,t}\right)}$$
(28)

Step 9: The bias-corrected second moment (\({\widehat{v}}_{t}\)) is calculated as follows,

$${\widehat{v}}_{t}=\frac{{v}_{t}}{\left(1-{\gamma }_{2,t}\right)}$$
(29)

Step 10: The parameter \({\theta }_{t}\) is updated as follows,

$${\theta }_{t}={\theta }_{t-1}-\beta *\frac{{\widehat{m}}_{t}}{\sqrt{{\widehat{v}}_{t}}+\varepsilon }$$
(30)

where \(\beta\) represents the step size and \(\varepsilon\) denotes a small positive constant that prevents division by zero.

Step 11: Steps 4 to 10 are repeated until \({\theta }_{t}\) converges; the resulting parameter \({\theta }_{t}\) is then returned. The position of each particle or solution is updated using Eq. (31).

$${Y}^{\prime\prime}\left(t\right)={\theta }_{t}$$
(31)

Step 12: The fitness of Y′ and Y″ is calculated, and the solution with the better fitness is selected as the current position.

$${Y}_{i}\left(t+1\right)=\left\{\begin{array}{ll}{Y}^{\prime}, & f\left({Y}^{\prime}\right)<f\left({Y}^{\prime\prime}\right)\\ {Y}^{\prime\prime}, & f\left({Y}^{\prime}\right)\ge f\left({Y}^{\prime\prime}\right)\end{array}\right.$$
(32)

Termination: The solutions are updated iteratively until the optimal solution is found, at which point the algorithm terminates.
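
The phases above (PSO move, Adam refinement, fitness-based selection) can be put together in a compact sketch. This is an illustrative reading of the procedure, not the authors' implementation: a toy quadratic loss stands in for the LSTM loss of Eq. (16), Adam uses the standard bias correction 1 − γ^t with constant decay rates and a fixed number of refinement steps instead of a convergence test, and all names and hyperparameter values are assumptions.

```python
import numpy as np

def loss(theta, target):
    """Toy stand-in for the LSTM loss (assumed)."""
    return float(np.sum((theta - target) ** 2))

def grad(theta, target):
    return 2.0 * (theta - target)

def adam_refine(theta0, target, beta=0.05, g1=0.9, g2=0.999, eps=1e-8, n_steps=20):
    """A few Adam steps starting from the PSO position (Steps 4-11)."""
    theta = theta0.copy()
    m, v = np.zeros_like(theta), np.zeros_like(theta)
    for t in range(1, n_steps + 1):
        g = grad(theta, target)
        m = g1 * m + (1 - g1) * g
        v = g2 * v + (1 - g2) * g ** 2
        theta = theta - beta * (m / (1 - g1 ** t)) / (np.sqrt(v / (1 - g2 ** t)) + eps)
    return theta

def pso_adam(target, n_particles=10, n_iter=30, c1=0.5, c2=0.5, seed=0):
    rng = np.random.default_rng(seed)
    D = target.size
    Y = rng.uniform(-1.0, 1.0, (n_particles, D))      # initialization, Eq. (20)
    V = np.zeros_like(Y)
    p_best = Y.copy()
    p_fit = np.array([loss(y, target) for y in Y])    # fitness, Eq. (22)
    history = [p_fit.min()]
    for _ in range(n_iter):
        g_best = p_best[np.argmin(p_fit)].copy()
        for i in range(n_particles):
            V[i] = (V[i] + rng.random(D) * c1 * (p_best[i] - Y[i])
                    + rng.random(D) * c2 * (g_best - Y[i]))       # Eq. (18)
            y1 = Y[i] + V[i]                                      # Y', Eq. (23)
            y2 = adam_refine(y1, target)                          # Y'', Eq. (31)
            # Keep whichever candidate has the lower loss, Eq. (32).
            Y[i] = y1 if loss(y1, target) < loss(y2, target) else y2
            fit = loss(Y[i], target)
            if fit < p_fit[i]:
                p_best[i], p_fit[i] = Y[i].copy(), fit
        history.append(p_fit.min())
    return p_best[np.argmin(p_fit)], history

best, history = pso_adam(np.array([0.3, -0.7, 0.5, 0.1]))
```

The `history` list records the best loss after each iteration, which makes it easy to check that the hybrid search does not regress. Running Adam for every particle at every iteration is also the reason a hybrid of this kind costs more time than PSO or Adam alone.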

figure a

3.7 Channel estimation

Figure 2 shows the process of channel estimation or prediction using the proposed hybrid PSO-Adam optimizer based RNN-LSTM. As shown in the figure, in the training phase, the channel responses of the input pilot signals (pm) are estimated using the LS channel estimator and are denoted as HLS (k−1). These channel responses are given as input to the weight-optimized RNN-LSTM model, which is trained on these inputs.

Fig. 2 Proposed channel estimation process

This trained model is evaluated in the testing phase, in which the estimated channel responses of the pilot signals are given as input to the trained model. Depending on the input factors, the current channel response (H (k) or HLSTM (k)) is predicted or estimated at time period 'k' by minimizing the loss function.

4 Results and discussions

In the proposed scheme, the input signals are modulated using QPSK. The carrier frequency is 700 MHz, and a Rayleigh fading channel model is used in the simulation. The LS channel estimation scheme is used to estimate the channel responses of the signals at 't−1'. Of the history of channel responses, 80% is used for training and 20% for testing to predict the current channel response at 't'. Pilot lengths of 128, 136, 152 and 160 are considered for the antenna configurations Nt = 2, Nr = 2; Nt = 4, Nr = 4; Nt = 8, Nr = 8; and Nt = 10, Nr = 10 respectively.

4.1 Performance analysis

The performance of the proposed channel estimation scheme is evaluated in terms of BER and MSE as functions of SNR. The evaluation metrics of the proposed PSO-Adam-LSTM are compared with those of the Adam-LSTM, PSO-LSTM and conventional LSTM. In addition, the performance of the different channel estimation schemes is evaluated for various pilot lengths and numbers of antennas.
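
The two metrics can be computed as in the following NumPy sketch. It assumes a Gray-coded, unit-energy QPSK mapping with hard-decision demapping and a plain mean squared error between channel matrices; the helper names, the noise level and the number of bits are illustrative and are not the simulation settings of the paper.

```python
import numpy as np

def qpsk_modulate(bits):
    """Map bit pairs to unit-energy Gray-coded QPSK symbols (0 -> +, 1 -> -)."""
    b = bits.reshape(-1, 2)
    return ((1 - 2 * b[:, 0]) + 1j * (1 - 2 * b[:, 1])) / np.sqrt(2)

def qpsk_demodulate(symbols):
    """Hard-decision demapping that inverts qpsk_modulate."""
    bits = np.empty((symbols.size, 2), dtype=int)
    bits[:, 0] = symbols.real < 0
    bits[:, 1] = symbols.imag < 0
    return bits.reshape(-1)

def bit_error_rate(tx_bits, rx_bits):
    """Fraction of bits that differ between transmitter and receiver."""
    return float(np.mean(tx_bits != rx_bits))

def channel_mse(H_est, H_true):
    """Mean squared error between estimated and true channel matrices."""
    return float(np.mean(np.abs(H_est - H_true) ** 2))

rng = np.random.default_rng(4)
bits = rng.integers(0, 2, 2000)
symbols = qpsk_modulate(bits)

# Mild additive noise (assumed level), then detection and BER computation.
noise = 0.05 * (rng.normal(size=symbols.size) + 1j * rng.normal(size=symbols.size))
ber = bit_error_rate(bits, qpsk_demodulate(symbols + noise))
```

In a full simulation, the received symbols would first be equalized with the estimated channel, so that a more accurate channel estimate translates directly into a lower BER.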

4.2 The performance analysis based on various pilot lengths

The evaluation metrics of the different channel estimation schemes are analysed for various pilot lengths (PL = 128, 136, 152, 160). Figure 3 compares the BER and MSE of the different channel estimation schemes for PL = 128. As shown in Fig. 3a, the BER decreases as the SNR increases. Because the weight parameters of the LSTM are selected optimally using PSO, the BER of the PSO-LSTM is 42% lower than that of the conventional LSTM. Compared to PSO, the Adam optimizer is computationally efficient, and the BER of the Adam-LSTM is 47% lower than that of the PSO-LSTM. To attain a better prediction output, both PSO and the Adam optimizer are used in this approach to optimize the weight parameters of the LSTM. As a result, the BER of the PSO-Adam-LSTM is 79% and 88% lower than that of the Adam-LSTM and PSO-LSTM respectively. As shown in Fig. 3b, when PL = 128, the MSE of the PSO-Adam-LSTM is 74%, 87% and 91% lower than that of the Adam-LSTM, PSO-LSTM and LSTM respectively.

Fig. 3 (a) BER vs. SNR when PL = 128. (b) MSE vs. SNR when PL = 128

The comparison of the BER and MSE of the different channel estimation schemes for PL = 136 is shown in Fig. 4. As depicted in Fig. 4a, the BER of the PSO-Adam-LSTM is 76%, 83% and 91% lower than that of the Adam-LSTM, PSO-LSTM and LSTM respectively. The MSE of the PSO-Adam-LSTM is 73%, 82% and 86% lower than that of the Adam-LSTM, PSO-LSTM and LSTM respectively, as depicted in Fig. 4b. Figure 5 compares the BER and MSE of the different channel estimation schemes for PL = 152. As depicted in Fig. 5a and b, the PSO-Adam-LSTM outperforms the Adam-LSTM, PSO-LSTM and LSTM in terms of BER and MSE over varying SNR when PL is 152. As depicted in Fig. 6a, when PL = 160, the BER of the PSO-Adam-LSTM is 43%, 69% and 79% lower than that of the Adam-LSTM, PSO-LSTM and LSTM respectively. Likewise, the MSE of the PSO-Adam-LSTM is 53%, 73% and 78% lower than that of the Adam-LSTM, PSO-LSTM and LSTM respectively, as shown in Fig. 6b.

Fig. 4 (a) BER vs. SNR when PL = 136. (b) MSE vs. SNR when PL = 136

Fig. 5 (a) BER vs. SNR when PL = 152. (b) MSE vs. SNR when PL = 152

Fig. 6 (a) BER vs. SNR when PL = 160. (b) MSE vs. SNR when PL = 160

4.3 Performance analysis based on number of antennas

In this section, the evaluation metrics of the different channel estimation schemes are analysed for varying numbers of antennas. Figures 7, 8, 9 and 10 compare the BER and MSE of the different channel estimation schemes for varying numbers of antennas. As depicted in the figures, the proposed PSO-Adam-LSTM-based channel estimation scheme outperforms the schemes based on Adam-LSTM, PSO-LSTM and LSTM. When Nt = 2, Nr = 2, the BER of the PSO-Adam-LSTM is 51%, 65% and 75% lower, and its MSE is 6%, 43% and 47% lower, than that of the Adam-LSTM, PSO-LSTM and LSTM respectively, as depicted in Fig. 7a and b. Figure 8 shows the BER and MSE of the different channel estimation schemes when Nt = 4 and Nr = 4. As shown in Fig. 8a, at SNR = 25 dB, the BER of the PSO-Adam-LSTM is 54%, 68% and 79% lower than that of the Adam-LSTM, PSO-LSTM and LSTM respectively. The MSE of the proposed scheme at SNR = 25 dB is 7%, 50% and 60% lower than that of the Adam-LSTM, PSO-LSTM and LSTM respectively, as depicted in Fig. 8b. In Fig. 9a and b, the proposed PSO-Adam-LSTM scheme outperforms the Adam-LSTM, PSO-LSTM and LSTM based schemes in terms of BER and MSE when Nt = 8, Nr = 8. Figure 10 compares the BER and MSE of the different channel estimation schemes for Nt = 10 and Nr = 10. As shown in Fig. 10a, at SNR = 25 dB, the BER of the proposed channel estimation scheme is 55%, 79% and 84% lower than that of the Adam-LSTM, PSO-LSTM and LSTM respectively. As depicted in Fig. 10b, the MSE of the proposed channel estimation scheme is also lower than that of the Adam-LSTM, PSO-LSTM and LSTM. Figure 11 depicts the computational analysis of the different channel estimation techniques in terms of execution time.
As depicted in the figure, conventional schemes consume more execution time than deep learning-based channel estimation schemes because of their computational complexity. Nevertheless, the execution time of the PSO-Adam-LSTM is higher than that of the Adam-LSTM, PSO-LSTM and LSTM, because combining Adam and PSO increases the computational complexity.

Fig. 7 (a) BER vs. SNR when Nt = 2, Nr = 2. (b) MSE vs. SNR when Nt = 2, Nr = 2

Fig. 8 (a) BER vs. SNR when Nt = 4, Nr = 4. (b) MSE vs. SNR when Nt = 4, Nr = 4

Fig. 9 (a) BER vs. SNR when Nt = 8, Nr = 8. (b) MSE vs. SNR when Nt = 8, Nr = 8

Fig. 10 (a) BER vs. SNR when Nt = 10, Nr = 10. (b) MSE vs. SNR when Nt = 10, Nr = 10

Fig. 11 Complexity analysis in terms of execution time

4.4 Comparative analysis

Table 1 compares the proposed approach with the approaches in the previous literature. As shown in the table, Haythem Bany Salameh et al. [10] attained an MSE of 13 dB with an execution time of 9 s; compared with the proposed approach, their MSE is higher but their execution time is lower. The execution time of the approach presented by Changqing Luo et al. [11] was 4 s, the lowest among the schemes compared in the table. Compared with the proposed scheme, Yang Liu et al. [12] attained a lower MSE of −24 dB. Likewise, Imran Khan [13] attained the minimum BER of 0.0007. Compared with [10, 14], the proposed approach attains a lower MSE of −18 dB.

Table 1 Comparative analysis with previous literature

5 Conclusion

Channel estimation plays a predominant role in determining wireless system performance. Compared with the conventional pilot-aided channel estimation method, deep learning algorithms exhibit a notable improvement in channel predictability and a reduction in the computational complexity of 5G networks. Although the LS estimate is generally employed to obtain channel estimates because of its low cost, it yields a comparatively high estimation error when no prior knowledge of the channel is available. Although previous works give good results, they are inadequate for the precise and timely acquisition of massive CSI in 5G communication scenarios. To estimate accurate channel state information, or the channel response of the pilot data block, an optimized RNN-LSTM is used in this paper. The optimized deep learning-based estimator improves the estimate obtained by the LS approach. First, the history of channel responses of the pilot sequences at time 't-1' is estimated using the LS estimation scheme. Of this history, 80% is used for training and the remaining 20% for testing to predict the current channel response at 't'. The estimated channel responses are given as input to the proposed hybrid PSO-Adam optimizer-based LSTM, in which the weight parameters are optimized using PSO and the Adam optimizer. Finally, the channel response at time 't' is estimated or predicted using the optimized LSTM. The performance of the proposed scheme has been analysed for various pilot lengths and numbers of antennas. Simulation results showed that the PSO-Adam-LSTM outperforms the channel estimation schemes based on Adam-LSTM, PSO-LSTM and LSTM in terms of BER and MSE. A comparative analysis of the various channel estimation techniques in terms of execution time is also presented.
The results indicate that the proposed estimator performs better than the Adam-LSTM, PSO-LSTM and LSTM estimators, owing to better channel learning, which results in decreased estimation errors. Future work should focus on a robust learning architecture that further reduces the bit error rate and can be implemented on massive-connection platforms. Deep learning-based channel estimation would also benefit from publicly available prototypical datasets that cover several pilot types, channel conditions, antenna arrangements and scenarios in a comprehensive manner. To handle the time- and location-varying nature of the wireless environment, transfer learning is needed to cope with situations that were not encountered at training time.