1 Introduction

Visible light communication (VLC) represents a revolution in communication systems (Naveed et al. 2015). It provides high data transmission rates and bandwidth along with illumination in indoor environments. In addition, compared with other techniques such as radio frequency (RF) systems, VLC has high potential for positioning. Specifically, RF techniques suffer from limited positioning accuracy and coverage and from high interference (Matheus et al. 2019; Mousa et al. 2018).

The use of received signal strength (RSS) signals for indoor localization is a promising technology. Localization has recently become increasingly granular in the commercial and scientific spheres, and it encompasses a wide range of monitoring, surveillance, and tracking applications (Akter et al. 2018). In general, there are two types of localization techniques: range-based and range-free (Singh et al. 2015). Range-based schemes are, on average, more accurate than range-free methods. Range-based localization is used in a variety of technologies, including time of arrival (TOA), angle of arrival (AOA), and RSS approaches. Both TOA and AOA approaches deliver high accuracy at significant cost and complexity, whereas the RSS approach provides moderate accuracy at low cost (Koyuncu et al. 2010). Compared to traditional localization approaches that employ raw RSS signals, a positioning strategy based on averaged RSS reduces the obstacles in indoor localization.

The use of the extended Kalman filter (KF) in positioning systems has been demonstrated in (Vatansever et al. 2017; Zhitian et al. 2017; Eroglu et al. 2019; Shawky et al. 2020). The KF can be used in VLC localization to increase the accuracy of the system. Convolutional neural networks (CNNs) are used to model problems involving spatial image inputs and provide precise image classification results.

Deep learning (DL) models have recently been successfully used in a wide range of data-intensive applications, including robotics, tracking, navigation, object recognition, medical diagnosis, and image processing (Hossain et al. 2019). The main challenges of DL-based localization systems are the difficulty of training the models and the large amount of training data required (Yasir et al. 2015).

The localization method introduced in (Hoang et al. 2019) builds on the advantages of DL models. These models improve localization performance and have produced significant localization results for real-time implementation.

A hyper-parameter (HP) is a parameter in machine learning that must be fixed before the training process begins. Unlike model parameters (e.g., weights), which are learned during training, HPs (e.g., learning rate, batch size, and number of hidden nodes) cannot be learned during training. HPs can affect the quality of the model produced by the training process as well as the algorithm's time and memory requirements (Yang et al. 2020). As a result, HPs must be tuned to provide the best results for a given situation.

The impact of training techniques on an artificial neural network (ANN) equalizer in VLC systems employing a nature light source was investigated by Chaleshtori et al. (2020). Musumtci et al. (2018) investigated the design and implementation of machine learning-based demodulation algorithms in the physical layer of VLC systems. Irshad et al. (2021) created a decision tree approach for indoor localization in VLC networks and compared it to various machine learning classifiers. Alonso-González et al. introduced an indoor fingerprinting position estimation method based on DL models to predict the device position in a 3D environment (González et al. 1998).

This paper aims to improve the localization performance of indoor VLC systems by using DL models as prediction techniques in a two-dimensional positioning system. The proposed system uses averaged RSS to estimate the Cartesian coordinates \((x,y)\). The received signal power is used as the input to the DL models. In the averaging method, the position of the receiver is estimated with the RSS method several times, once per sample, and the estimates are then averaged over all samples. In a previous work, Shawky et al. improved the VLC localization system using Kalman filtering with averaging, without deep learning. In our work, we use the main DL models to enhance localization performance in indoor systems, leading to a more accurate and low-cost indoor localization technology. The KF is used to predict the received power for a certain number of samples, and the RSS method is then applied to the averaged received power. Using the KF can increase the accuracy of the positioning system. The KF algorithm is applied by adjusting the values of the KF parameters for the user so that the information signal is included in the positioning technique. The proposed methods are analyzed mathematically, considering both line-of-sight (LoS) propagation and first-reflection non-line-of-sight (NLoS) propagation. Moreover, an HP approach based on Bayesian optimization is applied to improve the performance of our framework.

Accuracy is the main factor used to evaluate the performance of the proposed techniques, in addition to the area under the curve (AUC), sensitivity (Se), precision (Pr), F1-score, root mean square error (RMSE), and training and testing time. Our proposed system is low-cost and offers high performance and low computational complexity, which makes a hardware implementation feasible.

The main contributions of the paper are summarized as follows:

We propose the design and analysis of an HP approach based on Bayesian optimization for indoor localization using DL models.

Two techniques are used for localizing the real track of the receiver: the RSS averaging technique and the KF with average RSS technique, in both LoS and NLoS links.

When compared to standard localization techniques, the proposed HP-RSS-KF-LoS-DLM methodology achieves higher localization accuracy and lower error.

This paper is structured as follows. The methodology of our framework is described in Sect. 2. The obtained results are presented and discussed in Sect. 3 to evaluate the performance and robustness of the system. Finally, Sect. 4 is devoted to the main conclusions.

2 Methodology

2.1 Indoor model

In a typical room, we consider an optical indoor VLC system with both LoS and NLoS propagation. Four LED transmitters are placed on the ceiling, located at \({T}_{x,i}=({x}_{i},{y}_{i},{z}_{i})\), \(i \in \{1, 2, 3, 4\},\) and one receiver, a photodetector (PD), is located at \({R}_{x}=({x}_{0},{y}_{0},{z}_{0})\).

2.1.1 LoS link

\({H}_{LoS}^{i}\) is the optical LoS link gain from LED \(i\) to the PD and can be expressed as (Ghassemlooy et al. 2013)

$$ H_{LoS}^{i} = \frac{m + 1}{2\pi d_{i}^{2}} \cos^{m}\left( \psi_{i} \right) A_{R} \cos\left( \varphi_{i} \right) T_{s}\left( \varphi_{i} \right) g\left( \varphi_{i} \right) $$
(1)

where \(m\) represents the Lambertian order, \({\psi }_{i}\) is the irradiance angle at the transmitter, \({\varphi }_{i}\) is the incidence angle at the receiver, \({d}_{i}\) is the distance between the receiver and transmitter \(i\), \({T}_{s} (\cdot )\) and \(g (\cdot )\) are, respectively, the gains of the optical filter and concentrator at the receiver (both assumed to be unity), and \({A}_{R}\) is the effective area of the PD.
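As a minimal sketch of Eq. (1), the LoS gain can be evaluated numerically as follows. The sketch assumes that the LED points straight down and the PD points straight up, so that both angles share the same cosine, and that the third coordinate is the height. The function name `los_gain` and the default values of the Lambertian order and PD area are hypothetical and are not taken from the simulation parameters of this paper.

```python
import math

def los_gain(tx, rx, m=1.0, area=1e-4, ts=1.0, g=1.0):
    """Evaluate Eq. (1) for one LED at tx and a PD at rx (coordinates in metres).

    Assumption: the LED faces down and the PD faces up, so the two
    angles in Eq. (1) are equal and cos(angle) = height / distance.
    """
    dx, dy, dz = tx[0] - rx[0], tx[1] - rx[1], tx[2] - rx[2]
    d = math.sqrt(dx * dx + dy * dy + dz * dz)
    cos_angle = dz / d
    lambertian = (m + 1) / (2 * math.pi * d ** 2) * cos_angle ** m
    return lambertian * area * cos_angle * ts * g

# Hypothetical geometry: LED 3 m above the PD.
gain_below = los_gain((0.0, 0.0, 3.0), (0.0, 0.0, 0.0))
gain_offset = los_gain((0.0, 0.0, 3.0), (1.5, 0.0, 0.0))
```

The gain is largest directly below the LED and falls off as the PD moves sideways, which is what makes the received power usable as a range measurement.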

2.1.2 NLoS link

For the first-reflection NLoS link with reflection point \(p= (x,y,z)\), the gain for transmitter \(i\) can be obtained as (Huang et al. 2017)

$$ H_{NLoS}^{ip} = \frac{m + 1}{2\pi D_{ip,1}^{2} D_{p,2}^{2}} \cos^{m}\left( \emptyset_{ip} \right) \cos\left( \alpha_{ip} \right) dA_{p}\, \rho \cos\left( \beta_{p} \right) \cos\left( \varphi_{p} \right) T_{s}\left( \varphi_{p} \right) g\left( \varphi_{p} \right) A_{R} $$
(2)

The total NLoS gain for transmitter \(i\), \(H_{NLoS}^{i}\), is obtained by summing \(H_{NLoS}^{ip}\) over the reflection points on all four walls of the room (Shchekotov 2014; Welch et al. 2006). Here, \({D}_{ip,1}\) is the distance between transmitter \(i\) and the reflection point \(p\), \({D}_{p,2}\) is the distance between the reflection point \(p\) and the receiver \({R}_{x}\), and \({\alpha }_{ip}\) and \({\beta }_{p}\) are the incidence and irradiance angles at the reflection point on the wall, respectively. \({\emptyset }_{ip}\) is the irradiance angle at transmitter \(i\) toward \(p\), \({\varphi }_{p}\) is the incidence angle at the receiver for the ray reflected from \(p\), \(\rho \) is the reflectivity of the wall, and \(d{A}_{p}\) is the reflecting area element at \(p\) on the wall. Figure 1 shows the LoS/NLoS channel model for the indoor VLC system.

Fig. 1 LoS/NLoS configurations in indoor VLC system

2.2 Localization method utilizing averaging RSS technique

The traditional trilateration localization method is applied to obtain the receiver location, using the RSS technique with three LED transmitters (Teruyama et al. 2013). The approach averages the predicted receiver location over a specified number of estimates to decrease the localization error. Figure 2 shows the block diagram of this approach.

Fig. 2 Methodology for localization based on RSS averaging

Using Eq. (1) and the RSS technique, the received LoS power from transmitter \(i \in \{1, 2, 3, 4\}\) can be written as

$$ P_{R,i} = \left( \frac{m + 1}{2\pi d_{i}^{2}} \cos^{m + 1}\left( \varphi_{i} \right) A_{R} \right) P_{T,i} $$
(3)

where \(P_{T,i}\) represents the transmitted power of the \(i^{th}\) LED.

Here, we assume \(\varphi_{i} = \psi_{i}\), and the cosine of this angle is calculated as in (Huang et al. 2017):

$$ \cos\left( \varphi_{i} \right) = \frac{V}{d_{i}} $$
(4)

where \(V\) is the vertical distance between the transmitter and the receiver and is assumed constant. The distance between transmitter \(i\) and the receiver can be obtained as (Shawky et al. 2020)

$$ d_{i} = \sqrt[{m + 3}]{{\frac{{\left( {m + 1} \right)V^{m + 1} A_{R} P_{T,i} }}{{2\pi P_{R,i} }}}} $$
(5)
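Substituting Eq. (4) into Eq. (3) gives \(P_{R,i} = (m+1)V^{m+1}A_{R}P_{T,i}/(2\pi d_{i}^{m+3})\), and solving for \(d_{i}\) yields Eq. (5). The short sketch below checks this round trip numerically. The function names and all numerical values (Lambertian order, PD area, height, transmitted power) are hypothetical and chosen only for illustration.

```python
import math

def received_power(d, v, p_t, m=1.0, area=1e-4):
    """Eq. (3) with cos(phi_i) = V / d_i substituted from Eq. (4)."""
    return (m + 1) / (2 * math.pi * d ** 2) * (v / d) ** (m + 1) * area * p_t

def distance_from_rss(p_r, v, p_t, m=1.0, area=1e-4):
    """Eq. (5): the (m + 3)-th root that inverts Eq. (3) for the distance."""
    numerator = (m + 1) * v ** (m + 1) * area * p_t
    return (numerator / (2 * math.pi * p_r)) ** (1.0 / (m + 3))

# Hypothetical link: 2 m vertical separation, 2.5 m true distance, 1 W LED.
d_true = 2.5
p_r = received_power(d_true, v=2.0, p_t=1.0)
d_est = distance_from_rss(p_r, v=2.0, p_t=1.0)
```

With a noise-free power value the recovered distance matches the true one, so any localization error in this model comes from noise and from the NLoS contribution rather than from the inversion itself.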

The total power, \(P_{R,i}\), collected at the receiver is obtained by taking the NLoS path into account, modifying Eq. (3) to:

$$ P_{R,i} = \left( H_{LoS}^{i} + H_{NLoS}^{i} \right) P_{T,i} $$
(6)
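The averaging scheme of this subsection can be sketched as follows. This is an illustrative sketch under stated assumptions, not the implementation used in this paper: the horizontal ranges are assumed to have been derived from Eq. (5) as \(r_{i} = \sqrt{d_{i}^{2} - V^{2}}\), the range noise is assumed Gaussian, the trilateration is solved by linearizing the three circle equations, and the LED layout, noise level, and function names are hypothetical.

```python
import math
import random

def trilaterate(leds, ranges):
    """Solve for (x, y) from three LED positions and horizontal ranges.

    Subtracting the first circle equation from the other two gives a
    2x2 linear system, solved here with Cramer's rule.
    """
    (x1, y1), (x2, y2), (x3, y3) = leds
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = ranges[0] ** 2 - ranges[1] ** 2 + x2 ** 2 - x1 ** 2 + y2 ** 2 - y1 ** 2
    b2 = ranges[0] ** 2 - ranges[2] ** 2 + x3 ** 2 - x1 ** 2 + y3 ** 2 - y1 ** 2
    det = a11 * a22 - a12 * a21  # zero only if the three LEDs are collinear
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det

def averaged_position(leds, range_samples):
    """Average the per-sample trilateration estimates."""
    estimates = [trilaterate(leds, r) for r in range_samples]
    n = len(estimates)
    return sum(x for x, _ in estimates) / n, sum(y for _, y in estimates) / n

# Hypothetical layout: three ceiling LEDs (horizontal coordinates in metres).
leds = [(1.0, 1.0), (4.0, 1.0), (1.0, 4.0)]
true_xy = (2.0, 3.0)
rng = random.Random(0)
true_ranges = [math.hypot(true_xy[0] - lx, true_xy[1] - ly) for lx, ly in leds]
# 200 noisy range triples; the Gaussian noise model is an assumption.
samples = [[r + rng.gauss(0.0, 0.05) for r in true_ranges] for _ in range(200)]
x_avg, y_avg = averaged_position(leds, samples)
```

Averaging over more samples reduces the spread of the final estimate, at the cost of a longer acquisition time per position fix.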

2.3 Localization using KF in conjunction with averaging

The KF is used to enhance the prediction of the receiver location. First, the KF estimates \(N\) samples of the received power. Second, the average of the estimated powers is computed. Finally, the location of the receiver is calculated from this average power using the RSS method. The block diagram of the KF with averaging technique is shown in Fig. 3.

Fig. 3 Proposed KF with averaging technique

The flowchart in Fig. 4 illustrates the stages of the KF with averaging (AVG) technique. The KF algorithm recursively estimates the state variables of the system in two phases: prediction and measurement (Chen et al. 2021; Bo Liu 2021).

Fig. 4 Flowchart for combining the KF algorithm and AVG method

2.4 Kalman algorithm

The channel is modeled as an auto-regressive (AR) process in the state-space model. AR models operate on the premise that past values affect current values. The scheme relies on the KF to enhance the accuracy of the estimation. In the KF, the state vector is denoted as \(x\). This state vector describes the received power and the number of samples used in the process. Starting from the estimate at step \(k-1\), \({x}_{k-1/k-1}\), the state at step \(k\), \({x}_{k/k-1}\), is calculated in the prediction and measurement stages as follows.

Prediction step:

$$ x_{k/k - 1} = F_{k} x_{k - 1/k - 1} + v_{k} $$
(7)

where \(F_{k}\) is the state transition matrix and \(v_{k}\) is the white process noise. The corresponding state covariance matrix is given by (Shawky et al. 2020)

$$ P_{k/k - 1} = { }F_{k} P_{k - 1/k - 1} F_{k}^{T} + Q_{k} $$
(8)

where \(Q_{k}\) represents the covariance of the process noise.

Measurement step:

The updated state, \(x_{k/k}\), and the updated state covariance, \(P_{k/k}\), can, respectively, be represented by

$$ x_{k/k} = x_{k/k - 1} + K_{k} \left( {z_{k} - H_{k} x_{k/k - 1} } \right) $$
(9)
$$ P_{k/k} = \left( {I - K_{k} H_{k} } \right)P_{k/k - 1} $$
(10)

where \(H_{k}\) denotes the observation model and \(K_{k}\) represents the Kalman gain, given by:

$$ K_{k} = P_{k/k - 1} H_{k}^{T} S_{k}^{ - 1} $$
(11)

Here, \(z_{k}\) is the measurement vector given by

$$ z_{k} = x_{k}^{T} + { }w_{k} $$
(12)

where \(w_{k}\) is the measurement noise.

Also, \(S_{k}\) represents the innovation covariance matrix, which relates the covariance of the state variables to the measurement vector as:

$$ S_{k} = \left( H_{k} P_{k/k - 1} H_{k}^{T} \right) + R_{k} $$
(13)

where \(R_{k}\) is the covariance of the observation noise.
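The recursion in Eqs. (7) to (13), followed by the averaging step of Sect. 2.3, can be sketched for a scalar received-power state as follows. This is an illustrative sketch rather than the configuration used in this paper: the scalar state, the values of \(F\), \(H\), \(Q\) and \(R\), the initial covariance, and the function name `kalman_average` are all assumptions.

```python
import random

def kalman_average(measurements, f=1.0, h=1.0, q=1e-5, r=1e-2, x0=0.0, p0=1.0):
    """Filter scalar power samples with a KF and return the mean estimate."""
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Prediction step, Eqs. (7)-(8); the zero-mean process noise v_k
        # does not appear in the predicted mean.
        x_pred = f * x
        p_pred = f * p * f + q
        # Measurement step, Eqs. (9)-(13).
        s = h * p_pred * h + r                # innovation covariance, Eq. (13)
        k = p_pred * h / s                    # Kalman gain, Eq. (11)
        x = x_pred + k * (z - h * x_pred)     # state update, Eq. (9)
        p = (1 - k * h) * p_pred              # covariance update, Eq. (10)
        estimates.append(x)
    # Averaging step: mean of the N filtered power estimates.
    return sum(estimates) / len(estimates)

# Hypothetical data: 100 noisy samples around a true power of 2.0 (arbitrary units).
rng = random.Random(1)
noisy_power = [2.0 + rng.gauss(0.0, 0.1) for _ in range(100)]
p_avg = kalman_average(noisy_power, x0=noisy_power[0])
```

The averaged power `p_avg` would then be passed to the RSS distance relation of Eq. (5) to obtain the range used for positioning.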

2.5 Proposed DL models based indoor positioning system

The DL models incorporate the benefits of an optimization approach to enhance the system performance (Chen et al. 2021). We build a hybrid network, HP-RSS-KF-LoS-DLM, which employs optimized DL models. In this section, we describe our technique for obtaining the optimal HP-RSS-KF-LoS-DLM configurations for target localization. As previously mentioned, the first stage is data set gathering using MATLAB. The second stage applies Bayesian optimization and the DL models to the gathered data set using Python. The proposed localization method identifies the user position using different strategies: average RSS based on DL models, average RSS with KF based on DL models, and a Bayesian approach combined with the DL models for HP optimization to enhance the performance of our framework. The proposed HP-RSS-KF-LoS-DLM-based localization technique is illustrated in Fig. 5. The system starts by gathering the RSS data; a normalization technique is then applied to the RSS data to center it on a mean value, \(\mu ,\) and to minimize redundancy. Finally, three different DL models, Yolo v3 (Adarsh et al. 2020), EfficientNetB3 (Ganesh et al. 2022) and DenseNet121 (Nandhini et al. 2022), are trained on the data.

Fig. 5 Proposed DL models based indoor positioning system

2.6 DL models with hyper-parameter optimization

As previously stated, the choice of HPs affects the performance of a model, and determining the ideal value for each HP is not easy. We therefore apply Bayesian optimization to tune the HPs of the DL models and check whether this brings any benefit. Both the Adam optimizer (Jais et al. 2019) and the stochastic gradient descent (SGD) optimizer (Ratre 2020) are subjected to HP tuning. For Yolo v3, the best combination for the Adam optimizer is a learning rate of 0.001954 and a beta 1 value of 0.854, which gives a loss metric of 2.34, while the best combination for the SGD optimizer is a learning rate of 0.01821 and a tuned momentum of 0.962, which gives a loss metric of 2.01, as shown in Table 1. All values in Table 1 were obtained from the authors' trials of the different algorithms to reach the optimum performance. Moreover, batch sizes of 64, 32 and 16 and 100, 150 and 200 epochs are used for Yolo v3, EfficientNetB3 and DenseNet121, respectively.

Table 1 Hyper-parameters values and loss metrics for both optimizers

It is observed that the loss metric ranges from 2.22 to 2.34 for the Adam optimizer and from 2.01 to 2.14 for the SGD optimizer. This indicates that the SGD optimizer performs better than the Adam optimizer on this dataset.

2.7 Localization process

As shown in Fig. 5, the localization method first employs the training RSS data to train the DL models. Following model weight initialization, the system employs the testing RSS data for localization. The DL model gathers the RSS information from the spatial domain and then starts the testing phase. The DL models predict user locations by using information in the temporal domain. The output of the DL models is the user's \(x\) and \(y\) position values.

3 Results and discussion

3.1 Evaluation metrics

To assess the robustness of the proposed technique, various DL models are utilized. Here, we evaluate the performance of indoor localization for several DL models based on different strategies.

The metric evaluation depends mainly on four quantities: the numbers of true positives (TP), true negatives (TN), false negatives (FN), and false positives (FP). The classification performance is reported in terms of accuracy (\(ACC\)), sensitivity (\(Se\)) or recall, precision (\(Pr\)), F1-score, \(AUC\), RMSE and computational time. The \(ACC\) is the rate of correct classification, \(Pr\) is the positive predictive value, i.e., the fraction of positive predictions that match the original value, and \(Se\) is the true positive rate. The F1-score is the harmonic mean of \(Pr\) and \(Se\) and thus balances the two. The \(AUC\) measures the entire two-dimensional area underneath the receiver operating characteristic (ROC) curve and provides an aggregate measure of performance across all possible classification thresholds. The RMSE is an error metric that gives a cumulative estimate of error; it is evaluated as the square root of the arithmetic mean of the squared errors in our dataset. These metrics are defined as follows (Muschelli 2020)

$$ ACC = \left( TP + TN \right)/\left( TP + FP + TN + FN \right) $$
(14)

$$ Pr = TP/\left( TP + FP \right) $$
(15)

$$ Se = TP/\left( TP + FN \right) $$
(16)

$$ F1 = 2\left( Pr \times Se \right)/\left( Pr + Se \right) $$
(17)

$$ RMSE = \sqrt{\frac{1}{k}\sum\limits_{j = 1}^{k} \left[ \left( \hat{x}_{j} - x_{j} \right)^{2} + \left( \hat{y}_{j} - y_{j} \right)^{2} \right]} $$
(18)

where \(\left( {\hat{x}_{j} ,\hat{y}_{j} } \right)\) and \(\left( {x_{j} ,y_{j} } \right)\) refer to \(j^{th}\) estimated and true locations, respectively, and k is the number of dataset points.
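As a small worked example of Eqs. (14) to (18), the metrics can be computed as follows. The confusion-matrix counts and the coordinates are hypothetical values chosen for illustration and are not results of this paper; the function names are likewise illustrative.

```python
import math

def classification_metrics(tp, tn, fp, fn):
    """Return (ACC, Pr, Se, F1) as defined in Eqs. (14)-(17)."""
    acc = (tp + tn) / (tp + fp + tn + fn)
    pr = tp / (tp + fp)
    se = tp / (tp + fn)
    f1 = 2 * (pr * se) / (pr + se)
    return acc, pr, se, f1

def rmse(estimated, true):
    """Eq. (18): root mean square of the 2D position errors."""
    k = len(true)
    total = sum((xe - xt) ** 2 + (ye - yt) ** 2
                for (xe, ye), (xt, yt) in zip(estimated, true))
    return math.sqrt(total / k)

# Hypothetical counts and positions (metres).
acc, pr, se, f1 = classification_metrics(tp=90, tn=95, fp=5, fn=10)
error = rmse([(0.0, 0.0), (3.0, 4.0)], [(0.0, 0.0), (0.0, 0.0)])
```

Because the F1-score is the harmonic mean of \(Pr\) and \(Se\), it always lies between the two values, which provides a quick consistency check on reported results.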

Now, simulation results for the proposed algorithm are presented and compared with those of the traditional systems. The main parameters used in the simulations for the VLC link are listed in Table 2.

Table 2 Simulation parameters (Shawky et al. 2020)

We start our simulation with HP tuning to assess the performance and accuracy of the proposed DL models. In HP tuning, we train the model with various HP settings to find the values that offer the best model performance. The HP values used in the proposed model were summarized in Table 1.

Table 3 shows that the training and testing durations of the proposed HP-RSS-KF-LoS-DLM model are shorter than those of the other models. Considering the localization capability of the suggested DL models, these computational durations are reasonable for indoor localization (Chatterjee et al. 2019).

Table 3 Time for all models to be trained and tested

Table 4 compares the different strategies: the average RSS technique, the average RSS technique with KF, the average RSS technique based on DL models, and the average RSS technique with KF based on HP-RSS-KF-LoS-DLM, for LoS and NLoS links.

Table 4 Performance of different strategies based on DL models

It is observed from the experimental results that the HP-RSS-KF-LoS-DLM with Yolo v3 achieves the best performance, with 99.99% accuracy, 99.98% AUC, 98.88% sensitivity, 98.98% precision, 99.97% F1-score and 0.112 RMSE. Note that the reported RMSE corresponds to the Yolo v3 model, which is found to perform best.

Our proposed framework is compared with others in the literature in Table 5. The results reveal that our proposed framework achieves superior performance in ACC, Pr, AUC, Se, F1-score and RMSE. The comparison shows an accuracy improvement of 1.29% to 4.04% and an RMSE improvement of 3.89% to 21.59%. The other evaluation indicators are also better in our work, by smaller margins. We also note that sensitivity is not reported in the compared literature. Together, these results indicate the superiority of our work.

Table 5 Comparison between our framework and others in the literature

4 Conclusion

In this paper, we introduced multiple techniques to enhance localization, namely the average RSS technique and the KF with average RSS technique, considering the effects of both LoS and NLoS links. The output of these techniques, the estimated track \((x,y)\) of the receiver, was the input of the DL-based localization system for the indoor VLC system. The viability of employing the average RSS combined with KF for indoor localization was tested experimentally. It is observed that HP optimization plays an important role in improving the performance of our proposed framework. According to our findings, the suggested HP-RSS-KF-LoS-DLM-based localization system achieves a reasonable localization accuracy for indoor localization.

Compared with previously published work, our proposed framework is found to have better performance. It achieves 99.99% accuracy, 99.98% AUC, 98.88% sensitivity, 98.98% precision, 99.97% F1-score, 0.112 RMSE and a testing time of 0.29 s.

Accordingly, our proposed system offers high accuracy, low complexity and a small error distance with a very short training time. This makes it suitable for inclusion in mobile devices. The proposed system can be scaled and applied to any VLC environment simply by estimating its RSS values. Moreover, the same idea can be used to build a 3D localization system.