1 Introduction

In wireless data communication systems, there is an increased requirement to boost data throughput and effectively use the available spectrum. One of the techniques that have been employed to attain these goals is the OFDM modulation technique [1, 2].

In most modern high-speed wireless communication technologies, OFDM is rapidly becoming the preferred modulation technique [3]. This is because of its robustness to multipath fading and its capability to alleviate the intersymbol interference (ISI) caused by wireless channel delay spread [4, 5].

Generally, CSE is vital in communication systems because it has a direct impact on system quality. CSE is a crucial challenge in OFDM systems since the channel response varies significantly over time owing to the mobility of the transmitter, receiver, or dispersion obstacles [6, 7].

CSE is achieved by inserting pilot carriers, which are known in advance on both sides but consume additional spectrum resources, during the transmission of OFDM signals. The channel state and its impacts must then be estimated and compensated at the receiving end to accurately retrieve the desired signal [8].

Although increasing the number of pilots in the OFDM symbol results in more accurate estimates, the inserted pilot signals occupy more spectrum resources and are more likely to be affected by noise, resulting in a degradation of original signal recovery and bandwidth loss [9].

In the context of traditional channel estimation techniques, the LS estimator is well-known for its low computational complexity, requiring no prior channel statistics. However, in many practical applications, especially for multipath channels, LS estimation produces relatively significant channel estimation errors [9, 10].

As an alternative, the MMSE estimation approach produces significantly higher channel estimation quality than the LS estimator. However, MMSE requires the operational noise power and the channel statistics, which gives it significant computational complexity [11, 12].
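For reference, the two conventional estimators are commonly written in the following textbook form. This form is not taken from the present paper, and the symbols are assumptions introduced here: \(X\) is the diagonal matrix of known pilot symbols, \(Y\) is the received pilot vector, \(R_{HH}\) is the channel autocorrelation matrix, and \(\sigma_{n}^{2}\) is the noise variance.

$$\hat{H}_{LS} = X^{-1}Y$$
$$\hat{H}_{MMSE} = R_{HH}\left( R_{HH} + \sigma_{n}^{2}\,(XX^{H})^{-1} \right)^{-1}\hat{H}_{LS}$$

The LS estimate needs only the pilots, whereas the MMSE estimate additionally needs \(R_{HH}\) and \(\sigma_{n}^{2}\) and involves a matrix inversion, which is the source of its higher complexity.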

DL techniques are a recent trend in various wireless communication applications [13,14,15,16,17]. Channel equalization, radio resource allocation, physical security, channel decoding, and channel estimation are just a few of the applications [18,19,20,21,22,23].

There are several factors supporting the use of DL in diverse fields. One of these critical facets is that DL-based algorithms are data-driven, making them more suited to handle real-world application challenges. Furthermore, the DL-based approaches have lower computational complexity because they consist of layers of basic operations, such as matrix–vector multiplications. Additionally, the implementation of DL algorithms can be highly parallelized and constructed simply using low-precision data types.

In terms of the CSE and SD applications, the authors in [24] proposed a deep neural network (DNN) model that employs hyperparameter optimization to estimate the CSI in OFDM wireless systems and perform symbol detection over the WINNER II channel. The proposed DNN model provides higher performance than the conventional MMSE estimation approach. The article [25] proposes an LSTM NN model to estimate the CSI in OFDM wireless systems. The comparative analysis revealed that the suggested estimator outperformed the MMSE and LS estimators with a restricted number of pilots and prior uncertainty in channel conditions. The authors in [26] proposed DNN-based CE for doubly selective fading channels. The simulation outcomes demonstrate that the suggested DNN estimator outperforms the linear MMSE (LMMSE) estimator in robustness and efficiency. The authors in [27] introduced two DNN model architectures for channel estimation in a 5G MIMO-OFDM system with frequency-selective fading. In addition, the performance of the proposed DNN estimators was compared to the traditional LMMSE and LS estimation methods in terms of bit error rate (BER) versus signal-to-noise ratio (SNR) and channel estimation errors. The suggested DNN-aided estimators outperformed the other methods in lowering channel estimation errors. The authors in [28] employed several DNN architectures, including a convolutional neural network (CNN), a BiLSTM, and a fully connected DNN, to aid the CSE procedure in a MIMO-OFDM system under various fading multipath channel models based on the TDL-C model outlined for 5G networks. The simulation results demonstrate that the proposed DNN estimators perform better than the traditional LMMSE and LS estimation techniques. The article [29] introduced the use of a one-dimensional (1D) CNN for CSE in the OFDM system. The simulation results showed that the proposed estimator outperformed the feed-forward neural network (FFNN), LS, and MMSE estimators in terms of BER and mean squared error (MSE). The authors in [30] proposed a recurrent neural network (RNN) with a BiLSTM architecture integrated with a CNN and a batch normalization (BN) layer for signal detection tasks in uplink OFDM systems over time-varying channels. The article [31] applied DL LSTM-based data-pilot aided (DPA) estimation, followed by temporal averaging (TA) processing as a noise alleviation technique, to the IEEE 802.11p standard and employed it in a vehicular channel scenario under different mobility conditions. The authors in [32] employed the DL BiLSTM architecture to perform symbol detection tasks in a MIMO-OFDM system and studied the impact of the number of pilots on the system’s performance.

To the best of the authors’ knowledge, none of the cited studies [24,25,26,27,28,29,30,31,32] addressed the impact of the absence of CP on the efficacy and accuracy of the employed DL methods, which is essential for improving spectrum efficiency, energy efficiency, and transmission data rates. Unlike other models in the DL domain, our proposed DL LSTM model uses the LSTM architecture for both CSE and SD tasks, rather than utilizing LSTM for CSE alone, as in [31], combining BiLSTM with other DL architectures for SD, as in [30], or incorporating other processing units for SD, as in [32]. This approach enhances receiver functionality while reducing computational complexity.

The simulation results show that the proposed DL LSTM estimator beats the conventional LS and MMSE channel estimation approaches, in addition to the DL BiLSTM model. Furthermore, the proposed estimator offers robust performance when the CP is omitted, limited training pilots are utilized, and without prior channel statistics knowledge.

The rest of the paper is structured as follows: Sect. 2 describes the study’s aims and methods and the architecture of the OFDM system used. Section 3 illustrates the proposed DLNN-based CSE framework and the model training approach. Section 4 presents the simulation results that examine the suggested estimator’s performance and compare it to the other estimators. Finally, Sect. 5 summarizes and concludes the paper, along with future directions.

2 Methods and aims

Wireless networks have become increasingly sophisticated. However, the latest wireless systems are designed utilizing mathematical models. These mathematical models vary based on the scenario and often fail to incorporate insights from past experiences or system trends, limiting their potential for offering general solutions. Applying ML approaches to wireless communication networks is the focus of significant research to overcome the above constraints. A generic learning system can be created using the prediction/estimation abilities of ML-based design. DL has emerged as a viable option as wireless transmission channels have become more complex and unpredictable. DL is ideally suited to problems like CSE because of the varying channel conditions and the necessity for estimation/training to determine channel parameters. In particular, the current study focuses on developing a low-spectrum receiver for OFDM wireless systems using DL LSTM RNNs. The most important contributions of this study are the following:

  • We utilize a DL LSTM-NN model to improve CSE in OFDM systems operating over Rayleigh fading channels. LSTM architectures allow for the storage and later use of data. When dealing with a sequence of data or a time series, this feature is helpful.

  • The proposed DL LSTM estimator is initially trained in offline mode using the data set results from the simulation. After that, the trained DL model is employed online to retrieve the transmitted data.

  • We evaluate the proposed DL LSTM estimator’s performance in terms of symbol error rate (SER) versus SNR in addition to comparing the efficiency of the proposed DL LSTM estimator to that of the traditional MMSE and LS channel estimation techniques.

  • Furthermore, we compare the proposed DL LSTM estimator to other DL options, including the BiLSTM model mentioned in [30, 32].

  • Under different CP durations and pilot densities, we examine the performance of the proposed DL LSTM channel estimator structure. Furthermore, the proposed CSE does not require knowledge of channel details. As a result, the proposed DL LSTM model offers a practical means of reducing the spectrum resources necessary for CSE.

In this study, the proposed DL LSTM-NN estimator is trained using the adaptive moment estimation (Adam) optimization approach. Additionally, the primary loss function in this investigation is cross-entropy.
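As a point of reference for the optimizer named above, a single Adam update can be sketched as follows. This is a minimal illustration of the standard Adam rule on a flat list of parameters; the function name `adam_step` and the default values (learning rate 0.001, decay rates 0.9 and 0.999) are the commonly quoted defaults and are assumptions here, not the training settings of this study.

```python
import math

def adam_step(params, grads, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update over flat lists of floats (hypothetical helper).

    m and v are the running first- and second-moment estimates,
    t is the 1-based step counter used for bias correction.
    """
    new_params, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(params, grads, m, v):
        m_i = beta1 * m_i + (1.0 - beta1) * g        # first-moment estimate
        v_i = beta2 * v_i + (1.0 - beta2) * g * g    # second-moment estimate
        m_hat = m_i / (1.0 - beta1 ** t)             # bias-corrected mean
        v_hat = v_i / (1.0 - beta2 ** t)             # bias-corrected variance
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_params, new_m, new_v

# Usage: first step on a single weight with gradient 2.0.
params, m, v = adam_step([1.0], [2.0], [0.0], [0.0], t=1)
print(params)
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so the parameter moves by roughly one learning rate regardless of the gradient's scale.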

2.1 System model description

This section provides a brief overview of the OFDM system. For the current study, we adopted an OFDM system with a single user. Using the architecture of the system depicted in Fig. 1, the transmitting and receiving components are identical to those used in conventional systems. Additional information about the processes carried out in the transceiver can be found in [22].

Fig. 1 The architecture of the OFDM system [22]
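The role of the CP in such a transceiver can be illustrated with a small, self-contained sketch. It assumes a toy setup that is not taken from the paper (8 subcarriers, a 3-tap channel, no noise, hypothetical helper names) and checks whether each received subcarrier equals the transmitted symbol multiplied by the channel frequency response.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(N^2) DFT; the inverse transform includes the 1/N factor."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n) for m in range(n))
        out.append(acc / n if inverse else acc)
    return out

def add_cp(block, cp_len):
    """Prepend the last cp_len time samples (cp_len = 0 adds nothing)."""
    return block[len(block) - cp_len:] + block

def apply_channel(signal, taps):
    """Linear convolution with the channel taps, truncated to the input length."""
    return [sum(taps[l] * signal[n - l] for l in range(len(taps)) if n - l >= 0)
            for n in range(len(signal))]

def max_subcarrier_error(symbols, taps, cp_len):
    """Largest |Y[k] - H[k] X[k]| over all subcarriers for one OFDM symbol."""
    n = len(symbols)
    tx = add_cp(dft(symbols, inverse=True), cp_len)   # IFFT, then CP insertion
    rx = apply_channel(tx, taps)
    y = dft(rx[cp_len:])                              # CP removal, then FFT
    h = dft(list(taps) + [0] * (n - len(taps)))       # channel frequency response
    return max(abs(y[k] - h[k] * symbols[k]) for k in range(n))

# Toy QPSK symbols and channel taps (illustrative values only).
symbols = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j, 1 + 1j, -1 - 1j, 1 - 1j, -1 + 1j]
taps = [1, 0.5, 0.25j]
print(max_subcarrier_error(symbols, taps, cp_len=2))  # CP covers the channel memory
print(max_subcarrier_error(symbols, taps, cp_len=0))  # no CP: the per-subcarrier model breaks
```

With a CP at least as long as the channel memory, the per-subcarrier model holds up to rounding error. Without it, the model no longer holds, which is the distortion a receiver operating without CP has to cope with.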

3 Deep learning neural networks-based channel estimation

This section provides a detailed explanation of the architecture of the proposed DL-based channel estimation approach. Then, we briefly describe how the training stages are conducted.

3.1 Proposed DL-based CE architecture

RNNs are designed to learn sequential data and have been shown to be highly effective in many time series applications. Nevertheless, when the sequence is much longer, several critical challenges emerge, such as long-term dependency issues and vanishing/exploding gradient issues. The LSTM architecture was presented as an enhanced version of RNN to address these issues [33]. Also, LSTMs can effectively use data from previous time sequences and deal with the long-term dependencies of time sequences, especially when making predictions and classifying based on sequence data. A typical LSTM-NN cell is described in Fig. 2, which consists of an input gate, an output gate, and a forget gate, as well as a memory cell. The LSTM-NN manages the flow of the training data through the mentioned gates by adding data selectively (input gate), discarding data (forget gate), or allowing it to pass to the next cell (output gate) [34].

Fig. 2 Multi-gated LSTM cell architecture [34]

Specifically, the forget gate first allows the LSTM network to discard unwanted data before it passes through the cell. This is achieved by using the current input \(\textit{x}_{t}\) and the cell output \(\textit{h}_{t-1}\) of the previous step. Meanwhile, the input gate determines the amount of information from the current cell input \(\textit{x}_{t}\) and the previous cell output \(\textit{h}_{t-1}\) that will be combined with the previous LSTM cell state \(\textit{c}_{t-1}\) to generate a new state for the cell \(\textit{c}_{t}\). Finally, the output gate determines the amount of information from the current cell input \(\textit{x}_{t}\) and the previous cell output \(\textit{h}_{t-1}\) that will be used with the current cell state \(\textit{c}_{t}\) to produce the current cell output \(\textit{h}_{t}\).


The mathematical functions describing the architecture and mechanism of the LSTM-NN cell are as follows [33]:

$$i_{t} = \sigma_{g}(w_{i}x_{t} + R_{i}h_{t-1} + b_{i}) \tag{1}$$
$$f_{t} = \sigma_{g}(w_{f}x_{t} + R_{f}h_{t-1} + b_{f}) \tag{2}$$
$$g_{t} = \sigma_{c}(w_{g}x_{t} + R_{g}h_{t-1} + b_{g}) \tag{3}$$
$$o_{t} = \sigma_{g}(w_{o}x_{t} + R_{o}h_{t-1} + b_{o}) \tag{4}$$
$$c_{t} = f_{t}\odot c_{t-1} + i_{t}\odot g_{t} \tag{5}$$
$$h_{t} = o_{t}\odot \sigma_{c}(c_{t}) \tag{6}$$

where i, f, g, o, \(\sigma _{g}\), \(\sigma _{c}\) and \(\odot\) indicate the input gate, forget gate, memory cell candidate, output gate, sigmoid (logistic) activation function, tanh (hyperbolic tangent) activation function, and element-by-element multiplication, respectively. \(W=\,[w_{i}\; w_{f}\; w_{g}\; w_{o}]^{T}\), \(R=\,[R_{i}\; R_{f}\; R_{g}\; R_{o}]^{T}\) and \(b=\,[b_{i}\; b_{f}\; b_{g}\; b_{o}]^{T}\) are the input weights, the recurrent weights, and the biases, respectively.
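Equations (1)–(6) translate directly into a single time step of the cell. The sketch below is a plain-Python illustration with vectors as lists and matrices as lists of rows; the function names and the dictionary layout of the parameters are assumptions made for readability.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(w, x, r, h, b):
    """Compute w x + R h + b for one gate (matrices given as lists of rows)."""
    return [sum(w_kj * x_j for w_kj, x_j in zip(w_row, x))
            + sum(r_kj * h_j for r_kj, h_j in zip(r_row, h)) + b_k
            for w_row, r_row, b_k in zip(w, r, b)]

def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step; params maps 'i', 'f', 'g', 'o' to (w, R, b) tuples."""
    def gate(name, activation):
        w, r, b = params[name]
        return [activation(z) for z in affine(w, x, r, h_prev, b)]

    i = gate('i', sigmoid)      # Eq. (1): input gate
    f = gate('f', sigmoid)      # Eq. (2): forget gate
    g = gate('g', math.tanh)    # Eq. (3): memory cell candidate
    o = gate('o', sigmoid)      # Eq. (4): output gate
    c = [f_k * c_k + i_k * g_k for f_k, c_k, i_k, g_k in zip(f, c_prev, i, g)]  # Eq. (5)
    h = [o_k * math.tanh(c_k) for o_k, c_k in zip(o, c)]                        # Eq. (6)
    return h, c

# Usage: 3 inputs, 2 hidden units, all-zero parameters.
zero = ([[0.0] * 3, [0.0] * 3], [[0.0] * 2, [0.0] * 2], [0.0, 0.0])
params = {name: zero for name in 'ifgo'}
h, c = lstm_step([1.0, 2.0, 3.0], [0.1, 0.2], [1.0, -2.0], params)
print(h, c)
```

With all parameters at zero, every sigmoid gate evaluates to 0.5 and the candidate to 0, so the cell state is simply halved. This makes the example easy to check by hand.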

Fig. 3 The proposed DL LSTM estimator layout with variant layers

For the channel estimation task, we employed a DL LSTM-based NN. The proposed estimator includes an input layer with a size of 256, followed by an LSTM layer with 16 hidden units. Afterward, the output of the LSTM layer passes through a fully connected layer with a size of 4, then through a softmax layer, and finally to a classifier. Figure 3 illustrates the architecture of the proposed DL LSTM estimator.
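To give a sense of the size of this network, the number of learnable parameters implied by the layer sizes above can be worked out as follows. The sketch assumes one bias vector per gate, as in Eqs. (1)–(4), and a standard fully connected layer with bias; the helper names are hypothetical.

```python
def lstm_param_count(input_size, hidden_units):
    """Four gates, each with input weights, recurrent weights, and a bias."""
    return 4 * hidden_units * (input_size + hidden_units + 1)

def dense_param_count(in_features, out_features):
    """Weight matrix plus one bias per output."""
    return out_features * (in_features + 1)

lstm_params = lstm_param_count(input_size=256, hidden_units=16)   # 4 * 16 * 273
fc_params = dense_param_count(in_features=16, out_features=4)     # 4 * 17
print(lstm_params, fc_params, lstm_params + fc_params)            # 17472 68 17540
```

The softmax and classification layers add no learnable parameters, so under these assumptions the whole estimator has roughly 17.5 thousand weights.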

3.2 Training of the proposed DL model

The proposed DL LSTM NN-based CSE is incorporated into the conventional OFDM system to estimate the channel conditions implicitly. DL approaches typically involve two phases: training the model and deploying the learned model.

In this study, the DLNN model was initially trained with simulated data offline before the implementation phase online. Specifically, during the offline training phase, the proposed CSE is trained with received OFDM signals that are created with diverse information sequences and under variant channel properties with particular statistical characteristics.

The training dataset is designed for a single-user OFDM system, where each OFDM frame includes both transmitted data symbols and pilots. The necessary training dataset consists of the received OFDM signal, which is corrupted by existing channel characteristics and noise, as well as the originally transmitted data.
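One way such a training pair could be assembled is sketched below. The layout is an assumption made for illustration: a frame of one received pilot symbol and one received data symbol over 64 subcarriers, with real and imaginary parts concatenated, which happens to give 256 real features. The text does not state the exact feature layout, and the function names are hypothetical.

```python
def frame_to_features(received_pilot, received_data):
    """Flatten complex received samples into real features: all real parts, then all imaginary parts."""
    samples = list(received_pilot) + list(received_data)
    return [s.real for s in samples] + [s.imag for s in samples]

def make_training_pair(received_pilot, received_data, sent_labels, subcarrier):
    """Pair the received frame with the class index sent on one chosen subcarrier."""
    features = frame_to_features(received_pilot, received_data)
    return features, sent_labels[subcarrier]

# Usage with placeholder values (64 subcarriers, 4 symbol classes).
pilot_rx = [complex(0.1 * k, -0.1 * k) for k in range(64)]
data_rx = [complex(-0.2 * k, 0.2 * k) for k in range(64)]
labels = [k % 4 for k in range(64)]
features, label = make_training_pair(pilot_rx, data_rx, labels, subcarrier=5)
print(len(features), label)
```

The network never sees an explicit channel estimate in this arrangement; it is given the distorted pilots and data together and learns the mapping to the transmitted class.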

During the online implementation phase, the previously trained offline CSE produces output representing the transmitted data without explicitly estimating the wireless channel.

Fig. 4 The proposed estimator’s training dataset generation and offline DL process

In this paper, the Adam optimizer is used to train the suggested estimator. It adjusts both biases and weights to minimize the difference between the estimator’s outputs and the actual sent data, as measured by a specific loss function. Cross-entropy (crossentropyex) is the main loss function used in this study to enhance training speed, which for C mutually exclusive classes can be expressed as [25]:

$$\mathrm{crossentropyex} = -\sum_{i=1}^{N}\sum_{j=1}^{C} X_{ij}(k)\log\left( \hat{X}_{ij}(k)\right) \tag{7}$$

where N represents the total number of samples, C denotes the total number of classes, \(X_{ij}\) denotes the transmitted data of the ith sample for the jth class, and \({\hat{X}}_{ij}\) is the proposed estimator’s output for sample i and class j. Figure 4 depicts the procedures for constructing training sets and conducting offline DL to produce a learned LSTM estimator.
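The loss in Eq. (7) can be evaluated numerically as in the following sketch, which assumes one-hot targets and softmax outputs stored as nested lists. The function name is illustrative, and the sum is left unnormalized, as in Eq. (7).

```python
import math

def cross_entropy(targets, predictions):
    """Sum of -X_ij * log(X_hat_ij) over samples i and classes j.

    Terms with a zero target are skipped, which also avoids log(0)
    for classes that the network assigns zero probability.
    """
    return -sum(t * math.log(p)
                for t_row, p_row in zip(targets, predictions)
                for t, p in zip(t_row, p_row)
                if t > 0)

# Usage: two samples, two classes.
targets = [[1, 0], [0, 1]]
predictions = [[0.5, 0.5], [0.25, 0.75]]
print(cross_entropy(targets, predictions))  # -(ln 0.5 + ln 0.75)
```

Only the probability assigned to the correct class contributes for each sample, so the loss shrinks toward zero as those probabilities approach one.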

4 Simulation results

In this study, the suggested DL LSTM-NN-based CSE is trained offline with generated datasets. It is then utilized to implicitly estimate the CSI and retrieve the transmitted data in an OFDM wireless communication system. A dataset for training and validation is created for a single subcarrier. The received OFDM frame comprises data symbols interspersed with pilot symbols. A comparative analysis of SER at different SNRs is conducted to assess the performance and efficiency of the proposed DL LSTM model in comparison to the conventional wireless channel estimation methods, MMSE and LS. Additionally, the proposed framework’s performance is compared to the DL BiLSTM model used in [30, 32]. The performance of the investigated estimators is evaluated at different CP lengths (16, 8, and 0) and with different numbers of pilots (8 and 64). The proposed CSE is trained using the Adam optimizer, with the cross-entropy loss function in the last classification layer. A Rayleigh fading channel with 24 paths is considered. Finally, all investigations assume a priori uncertainty about the channel model’s properties.
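The SER metric used throughout this section, and the conversion from an SNR value in dB to a noise variance, can be sketched as follows. The helper names are hypothetical, and a unit signal power is assumed by default.

```python
def symbol_error_rate(sent, detected):
    """Fraction of symbols whose detected class differs from the sent class."""
    if not sent or len(sent) != len(detected):
        raise ValueError("sequences must be non-empty and of equal length")
    errors = sum(1 for s, d in zip(sent, detected) if s != d)
    return errors / len(sent)

def noise_variance(snr_db, signal_power=1.0):
    """Noise variance that yields the requested SNR (in dB) for a given signal power."""
    return signal_power / (10 ** (snr_db / 10))

print(symbol_error_rate([0, 1, 2, 3], [0, 1, 2, 0]))  # 0.25
print(noise_variance(10))                              # 0.1
```

Sweeping `snr_db` over the range of interest and averaging the SER over many frames at each point produces curves of the kind compared in the figures.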

Table 1 shows the simulation settings for the applied OFDM system and the adopted channel model. Table 2 lists the specifications of the proposed DL LSTM-NN architecture and its related training settings that were determined through trial-and-error.

Table 1 Channel model and OFDM system settings
Table 2 The proposed DL LSTM-NN construction and the training parameters

With a limited number of pilots (8 pilots) and a CP length of 16, Fig. 5 shows that the proposed DL LSTM-based CSE beats the conventional estimators starting from 0 dB, while the LS and MMSE estimators completely lose their workability. Additionally, in the SNR range [0–7 dB], the proposed LSTM model is comparable to the BiLSTM model. Beyond this range, the proposed LSTM estimator beats the BiLSTM model.

The proposed LSTM estimator provides superior performance compared to the BiLSTM model when the length of CP decreases to 8, as shown in Fig. 6. Conversely, the SER curve for the conventional MMSE and LS estimators saturates at all SNR values.

In the simulation scenario with 8 pilots and no CP, the DL LSTM-based CSE still provides superior performance in comparison to the conventional estimators, starting at SNR = 6 dB, as shown in Fig. 7. It is also noticeable that, starting at 14 dB, the MMSE estimator performs better than the LS estimator, which offers the worst performance. On the other hand, the proposed LSTM estimator achieves the same performance as the BiLSTM model over an SNR range of [0–22 dB]. The BiLSTM model outperforms the LSTM estimator starting at 23 dB.

Fig. 5 SER performance of the proposed LSTM-based CSE and competitive estimators at 8 pilots and a CP length of 16

Fig. 6 SER performance of the proposed LSTM-based CSE and competitive estimators at 8 pilots and a CP length of 8

Fig. 7 SER performance of the proposed LSTM-based CSE and competitive estimators at 8 pilots and without CP

Fig. 8 SER performance of the proposed LSTM-based CSE at 8 pilots and CP lengths of 16, 8, and zero

As observed in Figs. 5, 6, and 7, the proposed DL LSTM-CSE/SD-based model outperforms the conventional estimators with any CP length and few pilots. Also, the proposed DL LSTM model outperforms the BiLSTM model with long/short CP and few pilots. In addition, it produces comparable performance to the BiLSTM model when the CP is omitted. The above demonstrates the effectiveness of the proposed DL LSTM-CSE/SD-based model in terms of performance and spectrum savings. Moreover, it reinforces the DL-based estimator’s outstanding generalization ability regarding the CP and the number of pilots.

The performance of the proposed DL LSTM-based CSE at 8 pilots and different CP lengths of 16, 8, and 0 is summarized in Fig. 8. We can observe that the proposed estimator with or without CP has identical performance at low SNRs [0–8 dB]. Also, the proposed estimator’s performance with CP has less variation than its performance without CP over the SNR ranges [8–14 dB]. The provided results demonstrate the effectiveness of the DL LSTM-based CSE under the conditions of few pilots and the absence of the CP.

When enough pilots (64 pilots) and a CP length of 16 are used, the proposed LSTM and DL BiLSTM models perform similarly over the SNR range [0–10 dB], as illustrated in Fig. 9. Beyond this range, the proposed LSTM estimator beats the BiLSTM model. In addition, the proposed LSTM estimator beats the conventional estimators. On the other hand, the MMSE estimator outperforms the LS estimator in this situation.

At a CP length of 8, Fig. 10 depicts the superiority of the DL LSTM-based CSE/SD over both the conventional estimators and the BiLSTM model at all SNRs. On the other hand, the MMSE estimator still outperforms the LS estimator.

In the simulation scenario of 64 pilots without CP, the proposed DL LSTM estimator attains superior performance over the conventional estimators, as described in Fig. 11. In contrast, the LS estimator provides the worst performance. On the other hand, the proposed LSTM estimator is on par with the BiLSTM model over an SNR range of [0–23 dB]. Furthermore, the proposed DL LSTM estimator outperforms the BiLSTM model starting at 24 dB.

Fig. 9 SER performance of the proposed LSTM and competitive estimators at 64 pilots and a CP length of 16

Fig. 10 SER performance of the proposed LSTM and competitive estimators at 64 pilots and a CP length of 8

Fig. 11 SER performance of the proposed LSTM and competitive estimators at 64 pilots and without CP

Fig. 12 SER performance of the proposed LSTM-based CSE at 64 pilots and CP lengths of 16, 8, and zero

It can be observed from Figs. 9, 10, and 11 that the proposed DL LSTM-based CSE provides the best performance in comparison to the MMSE and LS estimators in all CP scenarios. This is because of the DNN’s capability to learn and adjust to the properties of the wireless channel. Also, the proposed LSTM CSE/SD outperforms the DL BiLSTM model in most cases.

On the other hand, the LS estimator yields the worst SER performance in all scenarios since its estimation method doesn’t use any prior knowledge about the channel statistics. The MMSE estimator, in contrast, uses the mean and covariance matrices (channel statistics of the second order), which gives it better performance than its LS counterpart.

Figure 12 summarizes the performance of the proposed DL LSTM-CSE/SD-based model with 64 pilots and different CP lengths of 16, 8, and 0. It is clear that the proposed LSTM estimator with any CP length has comparable performance over the SNR ranges [0–20 dB]. With increasing SNR, the model with CP gained an advantage over the model without CP because the ISI increased. Also, in Figs. 8 and 12, it can be noticed that the CP length has the same effect on the proposed DL LSTM-based CSE/SD using either sufficient or fewer pilots.

Improving spectrum efficiency and transmission data rates in OFDM wireless communication systems has received significant attention in the literature. The proposed model attains this goal by reducing the length of CP to an acceptable level while delivering superior performance compared to conventional and other deep learning models, as demonstrated in Figs. 5–12.
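The spectrum saving from shortening the CP can be quantified with a short calculation. The sketch below assumes 64 subcarriers per OFDM symbol, an illustrative value that is not stated in the text above, and computes the fraction of each transmitted symbol that carries useful samples.

```python
def cp_efficiency(n_subcarriers, cp_len):
    """Fraction of the transmitted OFDM symbol not spent on the cyclic prefix."""
    return n_subcarriers / (n_subcarriers + cp_len)

# CP lengths examined in the simulations, with an assumed 64 subcarriers.
for cp_len in (16, 8, 0):
    print(cp_len, round(cp_efficiency(64, cp_len), 4))
```

Under this assumption, a CP of 16 spends one fifth of every symbol on the prefix, a CP of 8 spends one ninth, and omitting the CP recovers the full symbol duration for data and pilots.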

5 Conclusions and future work

This study develops a low-spectrum OFDM wireless receiver using DL LSTM RNNs. The simulation results show that the proposed DL-LSTM-based CSE/SD model is highly effective in distorted and interfering wireless communication channels. They also demonstrate the robustness of the proposed model and its superior performance compared to the conventional CSEs with a minimal number of pilots and no CP. The conventional LS and MMSE CSEs lose their workability with a limited number of pilots. Moreover, under the same simulation conditions, the proposed CSE model beats the DL BiLSTM peer model in most simulation scenarios, proving its efficacy both with and without CP, even with a limited number of pilots. The proposed LSTM model is recommended for OFDM wireless communication systems to optimize spectrum, energy, and data transfer rates. The following are some directions the authors propose for further studies:

  • Studying the performance of the proposed DL LSTM-based CSE/SD model for MIMO communication systems.

  • Investigating the effect of federated machine learning techniques on the performance of the proposed DL LSTM model.

  • Analyzing the efficiency of the proposed DL LSTM model using other optimization techniques, such as stochastic gradient descent with momentum (SGDM) and root mean square propagation (RMSProp).