1 Introduction

In the world of wireless and mobile communication, the radio and microwave frequency bands are becoming increasingly crowded. This congestion poses a challenge for existing systems. Free Space Optics (FSO) comes to the rescue with its substantial bandwidth, providing an appealing solution to the bandwidth crunch. Think of FSO as a counterpart to fiber technology: both offer extensive bandwidth for data transmission. The key difference lies in how they transmit data; FSO sends it through the air, while fiber encases it in glass.

FSO technology stands at the forefront of this revolution, harnessing the power of light to transmit data through the atmosphere. As we explore the intricacies of FSO, our focus extends beyond conventional Single-Input Single-Output (SISO) setups to the more advanced and robust Multiple-Input Multiple-Output (MIMO) configurations (Shrouk et al. 2021).

FSO finds applications in various areas, including metropolitan networks, communication between buildings, wireless backhaul systems, indoor links, fiber backup, accelerating services, ensuring security, military operations, satellite communication, and more. Lab demonstrations have showcased FSO’s capability to transmit massive data rates, reaching terabit levels. Commercially available products offer On-Off-Keying (OOK) modulated data bandwidth ranging from 100 Mbps up to 10 Gbps, depending on the link’s length (Uysal et al. 2014; Al-Gailani et al. 2021).

Despite its numerous advantages and applications, FSO links face challenges due to various weather conditions like fog (Light Fog (LF), Medium Fog (MF), and Heavy Fog (HF)), rain, dust, snow, and turbulence (Somia A. Mottaleb et al. 2021). The existing literature primarily focuses on turbulence caused by air heating from the sun. Researchers have proposed FSO channel models for both SISO and MIMO wireless systems to address this issue. Additionally, some suggested techniques, such as incorporating relays, aim to enhance the overall system performance.

In the journey of improving FSO performance, we are faced with the dynamic challenges of atmospheric conditions, hardware constraints, and the unpredictable nature of our environment. This paper embarks on a comprehensive exploration of not only the technological nuances of FSO, but also the integration of cutting-edge ML techniques to augment decision-making processes and instigate a paradigm shift in risk-based maintenance strategies.

The transition from SISO to MIMO FSO configurations represents a leap forward in reliability and performance. SISO systems rely on a single optical link between transmitter and receiver, while MIMO systems, with their spatially separated transceivers, offer increased redundancy and improved signal robustness. Our study delves into the comparative advantages of these configurations and analyzes their adaptability to various communication scenarios (Alaa et al. 2020; Asmaa et al. 2020).

ML, with its ability to unravel complex patterns from data, emerges as a game-changer in the optimization of FSO systems (Singh et al. 2023). Imagine a system that learns from its surroundings, adapts to atmospheric changes in real-time, and predicts link quality variations (Somia A.A El-Mottaleb et al. 2022a, b). This paper explores the integration of ML models to dynamically adjust FSO parameters, optimizing performance and ensuring seamless communication under diverse conditions. Moreover, this paper addresses the critical aspect of risk-based maintenance in FSO systems. The outdoor deployment and sensitivity to environmental factors necessitate a proactive approach to maintenance. By leveraging historical performance data and employing ML algorithms, we aim to predict potential failure points, recommend timely maintenance actions, and consequently, reduce downtime and enhance the overall lifespan of FSO installations.

The main contributions of this work are:

  1.

    A novel dataset is extracted from different FSO channels labelled LF SISO, MF SISO, HF SISO, LF MIMO, MF MIMO and HF MIMO. The features extracted are the Bit Error Rate (BER), Q-factor and Received Optical Power (ROP). Each snapshot represents a different range in meters.

  2.

    The classification of each label is based on an enhanced statistical preprocessing scheme that provides sufficient information to perform the classification task efficiently. The scheme is based on multiple filtering layers along with Principal Component Analysis (PCA), which checks for multicollinearity.

  3.

    The adoption of the Gaussian Process Classifier (GPC) in FSO during the classification of the residual components. The GPC outperforms the other classifiers in accuracy, precision, and recall.

  4.

    The differentiation in our analysis is based on the contextual metadata associated with each data point, which includes system configuration details (SISO or MIMO) and the specific link range. This metadata allows the ML model to contextualize the performance metrics (BER, Q-factor, and ROP) within the framework of the system’s physical configuration and operational range. We confirmed that the model distinguishes between the two system types under similar performance metrics.

The remainder of the paper is organized as follows: The established model of the proposed SISO/MIMO FSO model is described in Sect. 2.1. The data collection scheme with the data preprocessing methods are explained in Sect. 2.2. The GPC is then discussed in Sect. 2.3. Performance analysis of our proposed model is illustrated in Sect. 3. The Exploratory Data Analysis (EDA), the dimension reduction results, and the ML evaluation are discussed in Sects. 3.1, 3.2 and 3.3, respectively. Finally, Sect. 4 is devoted to the main conclusions.

2 Established model

In this section, the proposed model is discussed in two parts: the communication model is presented in Sect. 2.1, while the data collection is described in Sect. 2.2.

2.1 Proposed SISO and MIMO FSO Model

Figure 1 shows the layout of the SISO and MIMO FSO systems. Both systems have the same transmitter and receiver construction; the difference between them is the number of FSO channels used. The transmitter consists of three parts: a bits generator, an electrical modulator, and optical sources. The bits generator is a Pseudo Random Bit Sequence Generator (PRBSG) that generates 10 Gbps data, which is then fed to an OOK Non-Return-to-Zero (NRZ) electrical modulator to convert the binary sequence into an electrical signal. Since four channels are used in our model, four Continuous Wave (CW) lasers with wavelengths of 1550, 1550.8, 1551.6, and 1552.4 nm are utilized to generate the optical signals. Although it is common for the same wavelength to be reused in traditional MIMO setups, our MIMO system employs multiple channels, each operating at a specific wavelength, which allows multiple independent data streams to be transmitted simultaneously. The spatial multiplexing technique utilized in our work ensures that data from different users can be detected and separated effectively. Moreover, a Mach-Zehnder Modulator (MZM) is utilized to modulate the electrical signals onto the optical ones. Finally, the four optical signals from the four channels are multiplexed using a multiplexer before transmission over the FSO channel.

Fig. 1
figure 1

Layout of proposed FSO model a SISO and b MIMO

In the SISO FSO system, only a single FSO channel is used, while in the case of the MIMO FSO system, four channels are used. For both cases, the different attenuations of foggy weather are considered. Fog is a factor that affects the performance of the information signal. It arises when tiny water droplets disperse light in various directions, causing the light signal to scatter and diminish in both quality and intensity. Consequently, this leads to a degradation of the signal while propagating under this condition. The attenuation caused by fog varies with the fog level. According to (Kim et al. 2001; Chaudhary et al. 2014; Chaudhary et al. 2021), the attenuation for LF is 9 dB/km, whereas for MF it is 16 dB/km, and for HF it is 22 dB/km. The fog attenuation depends on the visibility, V, in km and the wavelength, λ, in nm, and can be expressed as (Kim et al. 2001; Grover et al. 2017)

$$\alpha = \frac{3.912}{V}{\left(\frac{\lambda }{550\ \text{nm}}\right)}^{-s}$$
(1)

where s is the size distribution of the scattering particles, which in the Kim model takes the values 1.6 for V > 50 km, 1.3 for 6 < V ≤ 50 km, 0.16V + 0.34 for 1 < V ≤ 6 km, V − 0.5 for 0.5 < V ≤ 1 km, and 0 for V < 0.5 km (Kim et al. 2001).
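As an illustration, the Kim model of Eq. (1) can be sketched in Python; the function names and the 1550 nm default wavelength are our own choices:

```python
import math

def kim_size_parameter(v_km: float) -> float:
    """Size-distribution parameter s of the Kim model versus visibility V (km)."""
    if v_km > 50:
        return 1.6
    if v_km > 6:
        return 1.3
    if v_km > 1:
        return 0.16 * v_km + 0.34
    if v_km > 0.5:
        return v_km - 0.5
    return 0.0

def fog_attenuation_db_per_km(v_km: float, wavelength_nm: float = 1550.0) -> float:
    """Specific fog attenuation alpha (dB/km) from Eq. (1)."""
    s = kim_size_parameter(v_km)
    return (3.912 / v_km) * (wavelength_nm / 550.0) ** (-s)
```

Dense fog (low visibility) yields the highest attenuation, consistent with the LF/MF/HF figures of 9, 16 and 22 dB/km quoted above.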

After propagation over the FSO channel, the signal is received at the receiver. In the case of the SISO FSO system, the received signal power, \({P}_{RX}^{SISO}\), can be expressed as (Chaudhary et al. 2021; Grover et al. 2017)

$${P}_{RX}^{SISO}={P}_{TX}^{SISO}{\left(\frac{{D}_{RX}}{{D}_{TX}+{\uptheta }d}\right)}^{2}{10}^{\frac{-\alpha d}{10}}$$
(2)

where \({D}_{TX}\) and \({D}_{RX}\) are the transmitter and receiver aperture diameters, respectively, \({P}_{TX}^{SISO}\) is the transmitted power, \({\uptheta }\) denotes the angle of beam divergence, and d is the propagation range.

In the case of the MIMO FSO system, the received power is expressed, according to the Beer-Lambert law, as (Chaudhary et al. 2021; Grover et al. 2017)

$${P}_{RX}^{xy}={P}_{TX}^{xy} {e}^{-{\alpha }_{xy}d}$$
(3)

where \({P}_{RX}^{xy}\) and \({P}_{TX}^{xy}\) are the received power at the \({x}^{th}\) receiver from the \({y}^{th}\) transmitter and the transmitted power from the \({y}^{th}\) transmitter to the \({x}^{th}\) receiver, respectively, and \({\alpha }_{xy}\) is the attenuation of the link between the \({x}^{th}\) receiver and the \({y}^{th}\) transmitter.
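A minimal sketch of Eqs. (2) and (3) follows; the function and argument names are ours, and all quantities are assumed to be in mutually consistent units (apertures and the beam-spread term θd in the same length unit, with α in dB/km for Eq. (2) and in km⁻¹ for Eq. (3)):

```python
import math

def p_rx_siso(p_tx, d_rx, d_tx, theta, alpha_db_per_km, d_km):
    """Eq. (2): geometric beam-spreading loss times fog attenuation (linear units)."""
    geometric = (d_rx / (d_tx + theta * d_km)) ** 2
    atmospheric = 10 ** (-alpha_db_per_km * d_km / 10)
    return p_tx * geometric * atmospheric

def p_rx_mimo_link(p_tx, alpha_per_km, d_km):
    """Eq. (3): Beer-Lambert law for one transmitter-receiver pair."""
    return p_tx * math.exp(-alpha_per_km * d_km)
```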

The received light signal is passed through a demultiplexer that splits it into four channels. The receiver of each channel is composed of a photodetector (PD) that converts the signal from the optical domain to the electrical one. Afterwards, the resultant signal from the PD passes through a low pass filter (LPF) to block the unwanted signal components and a BER analyzer to test the performance of the received data.

The signal to noise ratio (SNR) is expressed as (Fadhil et al. 2008; Aldhaibani et al. 2014; Ahmed et al. 2017; Abdelhamid et al. 2021)

$$ SNR= \frac{{\left({P}_{RX}\mathfrak{R}\right)}^{2}}{2qB\left({P}_{RX}\mathfrak{R}+{I}_{D}\right)+\frac{4{k}_{B}{T}_{a}B}{{R}_{l}}} $$
(4)

where \({P}_{RX}\) is the received power, which in the case of SISO equals \({P}_{RX}^{SISO}\) while for MIMO it equals \({\sum }_{\begin{array}{c}x=1\\ y=1\end{array}}^{\begin{array}{c}x=4\\ y=4\end{array}}{P}_{RX}^{xy}\), \(\mathfrak{R}\) is the PD responsivity, q is the electron charge, \({I}_{D}\) is the dark current, \({k}_{B}\) is the Boltzmann constant, \({T}_{a}\) is the absolute receiver temperature, \(B\) is the electrical bandwidth, and \({R}_{l}\) is the receiver load resistance.

The BER is expressed in terms of SNR as

$$\text{BER}= 0.5\, \text{erfc} \left(\sqrt{\frac{SNR}{8}}\right)$$
(5)

where erfc is the complementary error function.

Additionally, BER can be expressed as

$$\text{BER}= 0.5\left(p\left(0|1\right)+p\left(1|0\right)\right)$$
(6)

where p(0|1) is the probability of deciding bit “0” when bit “1” is received, while p(1|0) is the probability of deciding bit “1” when bit “0” is received.
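Equations (4) and (5) can be sketched as follows; the device values (responsivity, dark current, bandwidth, temperature, load resistance) are illustrative placeholders rather than the paper's simulation settings, and the shot-noise term is written as 2qB(ℜP_RX + I_D):

```python
import math

def snr(p_rx, resp=0.8, i_dark=1e-9, bandwidth=7.5e9, t_a=298.0, r_load=50.0):
    """Eq. (4): shot- and thermal-noise limited SNR (illustrative device values)."""
    q = 1.602e-19    # electron charge (C)
    k_b = 1.381e-23  # Boltzmann constant (J/K)
    signal = (p_rx * resp) ** 2
    shot = 2 * q * bandwidth * (p_rx * resp + i_dark)
    thermal = 4 * k_b * t_a * bandwidth / r_load
    return signal / (shot + thermal)

def ber(snr_value):
    """Eq. (5): BER = 0.5 * erfc(sqrt(SNR / 8))."""
    return 0.5 * math.erfc(math.sqrt(snr_value / 8))
```

Higher received power gives a higher SNR and hence a lower BER, which is the behavior the EDA in Sect. 3.1 observes across the fog conditions.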

2.2 Data collection

In this work, the data are obtained through a pipeline in which snapshots are taken from the FSO communication channel in the form of structured data extracted from a Data Lake. The data consist of the BER, Q-factor and ROP for the following channel types: LF MIMO, MF MIMO, HF MIMO, LF SISO, MF SISO and HF SISO. Each snapshot is taken periodically with a constant change in distance of less than 1 m. The dataset totals 3000 snapshots, with an equiprobable number of data points per class, i.e., 500 data points for each. Figure 2 shows the preprocessing scheme, illustrating the data pipeline and the Extract, Transform and Load (ETL) process.

Fig. 2
figure 2

Preprocessing scheme

The depicted preprocessing workflow begins with the aggregation of structured and unstructured data in a Data Lake, from which a Data Extractor retrieves the relevant information. This extraction is followed by the simultaneous processing of data across four distinct channels, each channel yielding measurements of BER, Q-factor, and ROP. After this channel-specific processing, the data undergoes a transformation to standardize the various data streams, thus structuring the data into a format primed for analysis.

A critical step in this workflow is the evaluation of feature correlation: should any feature pair exhibit a correlation coefficient equal to or exceeding 75%, the redundant feature is discarded to mitigate multicollinearity. Features that pass this correlation threshold are earmarked as candidates for inclusion in the ML model.
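This correlation filter can be sketched with pandas; the 75% threshold comes from the text, while the helper name is our own:

```python
import numpy as np
import pandas as pd

def drop_correlated(df: pd.DataFrame, threshold: float = 0.75) -> pd.DataFrame:
    """Drop one feature from every pair whose absolute correlation >= threshold."""
    corr = df.corr().abs()
    # Keep only the upper triangle so each pair is inspected once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] >= threshold).any()]
    return df.drop(columns=to_drop)
```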

The PCA is used as a preprocessing layer in ML models, executing dimensionality reduction by projecting data into fewer dimensions. This process effectively reduces the quantity of predictive variables, leading to a decrease in computational demand and complexity. By simplifying the data structure, the PCA enhances model efficiency, enabling faster processing times and more manageable data handling. Additionally, it can improve the model performance by mitigating the risk of overfitting, as fewer variables often lead to a more generalized model. This makes the PCA an essential tool in handling large datasets, where reducing dimensions without losing significant information is crucial (Zaki et al. 2023).

The result of the PCA is distilling the dataset down to two principal components. The resulting dataset is partitioned into training and testing sets, with 30% allocated for testing, to furnish a robust foundation for model training and subsequent validation. This careful orchestration of preprocessing not only streamlines the dataset but also enhances the predictive prowess and clarity of the resulting analytical models.
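The PCA reduction and the 70/30 split described above can be sketched with scikit-learn; the synthetic matrix here merely stands in for the real filtered snapshots:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
X = rng.normal(size=(3000, 12))      # placeholder for the 12 filtered feature columns
y = np.repeat(np.arange(6), 500)     # six balanced channel labels (500 points each)

# Distill the dataset down to two principal components.
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X)

# 70% training / 30% testing, keeping the classes balanced in both sets.
X_train, X_test, y_train, y_test = train_test_split(
    X_pca, y, test_size=0.30, stratify=y, random_state=42)
```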

2.3 Gaussian process classifier

A GPC is a Bayesian classification method that uses Gaussian processes to make probabilistic predictions. Unlike many other classifiers, the GPC can provide a measure of certainty or uncertainty regarding its predictions, which can be crucial in many applications. The Gaussian process is essentially a prior over functions; once the data are observed, this prior is updated to a posterior. In the context of classification, this process defines a distribution over the possible functions that map the input data to output labels. This allows us to make predictions for unseen data and to assess the uncertainty of those predictions (Nguyen et al. 2019).

The GPC models the probability of a class as a Gaussian process and uses the Laplace approximation to make predictions. It can be used for binary as well as multi-class classification problems. For multi-class problems, one common approach is to use several binary one-versus-rest classifiers; this is the approach taken by Scikit-Learn's GPC. The performance of a GPC heavily depends on the choice of the kernel function, which defines the covariance structure of the data. Common kernels used in GPC include the Radial Basis Function (RBF), Dot Product, Matern, Rational Quadratic, and White Kernel. Each kernel has its own set of hyper parameters, such as the length scale in the RBF kernel, which need to be optimized for the task at hand. Optimization of these hyper parameters is typically done by maximizing the log marginal likelihood of the training data (Yang et al. 2021).
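A minimal scikit-learn sketch of the one-versus-rest GPC with an RBF kernel follows; the toy three-class data are our own and stand in for the PCA components:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=[0, 0], scale=0.3, size=(50, 2)),
               rng.normal(loc=[2, 2], scale=0.3, size=(50, 2)),
               rng.normal(loc=[0, 3], scale=0.3, size=(50, 2))])
y = np.repeat([0, 1, 2], 50)

# One-vs-rest multi-class GPC; the RBF length scale is tuned internally by
# maximizing the log marginal likelihood with L-BFGS-B.
gpc = GaussianProcessClassifier(kernel=1.0 * RBF(length_scale=1.0),
                                multi_class="one_vs_rest", random_state=0)
gpc.fit(X, y)
proba = gpc.predict_proba(X[:5])   # calibrated class probabilities per sample
```

The `predict_proba` output is what gives the GPC its advertised measure of certainty for each prediction.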

The GPC takes the input and output training sets as its data, while its hyper parameters are as follows:

  • Kernel Function: This is a function \(k\left(\mathbf{x},{\mathbf{x}}^{{\prime }}\right)\) which defines the covariance between two points x and x′ in the input space. The choice of kernel function and its hyper parameters are crucial as they define the properties of the functions that the GP can model.

    For example, the RBF (Gaussian) kernel used in this work has the form (Zaki et al. 2022a, b)

    $$ k\left( {{\mathbf{x}},{\mathbf{x}}^{{\prime }} } \right) = \sigma _{f}^{2} {\text{exp}}\left( { - \frac{{\left\| {{\mathbf{x}} - {\mathbf{x}}^{{\prime }} } \right\|^{2} }}{{2l^{2} }}} \right) $$
    (7)

    where \({\sigma }_{f}^{2}\)​ is the variance (scale factor) and l is the length scale hyper parameter.

  • Kernel Hyper Parameters Depending on the chosen kernel, one or more of the following hyper parameters can be taken into consideration:

    • Length scale l for the RBF kernel.

    • Smoothness parameter ν for the Matérn kernel.

    • Scale factor \({\sigma }_{f}^{2}\)​.

  • Noise Level If the observations are noisy, the kernel is often augmented with an independent identically distributed noise term \({\sigma }_{n}^{2}\) which represents the noise variance in the target data. The modified kernel function with noise is

    $${k}_{y}\left(\mathbf{x},{\mathbf{x}}^{{\prime }}\right)=k\left(\mathbf{x},{\mathbf{x}}^{{\prime }}\right)+{\sigma }_{n}^{2}\delta \left(\mathbf{x},{\mathbf{x}}^{{\prime }}\right)$$
    (8)

    where δ(\(\mathbf{x},{\mathbf{x}}^{{\prime }}\)) is the Kronecker delta function which is 1 if x = x′ and 0 otherwise.

  • Prior Mean and Covariance Typically represented as m(x) and K(X, X), respectively, where X is the matrix of input features. The mean is often assumed to be zero, i.e. \(m\left(\mathbf{x}\right)=0\).

  • Optimizer The specific optimization algorithm is not a hyper parameter represented mathematically within the model itself; rather, it is used to optimize the hyper parameters, often by maximizing the log marginal likelihood log p(y∣X), where y is the vector of target values. The limited-memory Broyden-Fletcher-Goldfarb-Shanno with Box constraints (L-BFGS-B) algorithm is adopted for this task.

Let \(f:{\mathbb{R}}^{n}\to \mathbb{R}\) be the objective function to be minimized.

\(x\in {\mathbb{R}}^{n}\) is the vector of parameters or variables to be optimized.

\(\nabla f\left(x\right)\) represents the gradient of \(f\) at \(x\), a vector in \({\mathbb{R}}^{n}\).

The bounds on \(x\) are given as \(l\) (lower bound) and \(u\) (upper bound) for each component.

\(H\) approximates the inverse of the Hessian matrix.

Steps:

  1.

    Initialization:

    • Start with an initial estimate \({x}_{0}\) .

    • Initialize \({H}_{0}\) as an identity matrix or another suitable form.

  2.

    Iterative Process: For each iteration \(k\), do the following:

    2.1

      Compute gradient

      Evaluate \({g}_{k}=\nabla f\left({x}_{k}\right)\).

    2.2

      Calculate search direction

      Compute the search direction \({p}_{k}\) using \({H}_{k}\) and \({g}_{k}\), typically \({p}_{k}=-{H}_{k}{g}_{k}\).

    2.3

      Line search

      Perform a line search to find an acceptable step length \({\alpha }_{k}\) along \({p}_{k}\).

      Update \({x}_{k+1}={x}_{k}+{\alpha }_{k}{p}_{k}\), ensuring that \({x}_{k+1}\) is within the bounds \([l,u]\).

    2.4

      Update gradient and hessian

      Evaluate \({g}_{k+1}=\nabla f\left({x}_{k+1}\right)\).

      Compute \({s}_{k}={x}_{k+1}-{x}_{k}\) and \({y}_{k}={g}_{k+1}-{g}_{k}\).

      Update the approximate Hessian \({H}_{k+1}\) using \({s}_{k}\) and \({y}_{k}\) based on the L-BFGS formula, which typically involves a series of vector updates without forming \(H\) explicitly.

  3.

    Convergence check

    Check if the stopping criterion is met (e.g., \( \left\| {g_{k+1} } \right\| \) is sufficiently small).

    If not, go back to step 2.
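The steps above correspond to what SciPy's L-BFGS-B implementation carries out internally; here is a toy sketch on a hypothetical quadratic objective (our own example, not the GPC marginal likelihood):

```python
import numpy as np
from scipy.optimize import minimize

def f(x):
    """Toy objective: unconstrained minimum at (1, -2)."""
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2

def grad_f(x):
    """Analytic gradient supplied to the optimizer."""
    return np.array([2.0 * (x[0] - 1.0), 2.0 * (x[1] + 2.0)])

# Box constraints [0, 5] on both components; the unconstrained optimum
# x1 = -2 is therefore clipped to the lower bound 0.
res = minimize(f, x0=np.zeros(2), jac=grad_f, method="L-BFGS-B",
               bounds=[(0.0, 5.0), (0.0, 5.0)])
```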

Jitter: A small value \(\epsilon\) is added to the diagonal of the kernel matrix for numerical stability:

$$ K^{\prime}\left( {{\mathbf{X}},{\mathbf{X}}} \right) = K\left( {{\mathbf{X}},{\mathbf{X}}} \right) + \epsilon I $$
(9)

where \(I\) is the identity matrix.
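The effect of the jitter term in Eq. (9) can be seen on a deliberately singular kernel matrix; the ε value below is an arbitrary small constant:

```python
import numpy as np

def add_jitter(K: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Eq. (9): K' = K + eps * I, regularizing the kernel matrix."""
    return K + eps * np.eye(K.shape[0])

K = np.ones((3, 3))          # rank-1 kernel matrix of three identical points
try:
    np.linalg.cholesky(K)    # fails: K is singular
    cholesky_ok = True
except np.linalg.LinAlgError:
    cholesky_ok = False

L = np.linalg.cholesky(add_jitter(K))   # succeeds once the diagonal is lifted
```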

These hyper parameters are usually learned from the data by maximizing the marginal likelihood, which is carried out using the L-BFGS-B optimization technique.

2.3.1 Gaussian Process Classifier Pseudocode

It is described as follows:

figure a

3 Results and discussion

In this section, the Exploratory Data Analysis (EDA) is discussed along with the communication channels. Moreover, the evaluation of the ML model is depicted in this part.

3.1 EDA

The statistical distributions of the ROP, maximum Q-factor and log(BER) for each user are shown in Figs. 3, 4 and 5, where the counts for each channel are stacked for each user.

Fig. 3
figure 3

Received optical power for each user

Fig. 4
figure 4

Maximum Q-factor for each user

Fig. 5
figure 5

Minimum BER for each user

For each user, the mean ROP value ranges from −20 to −18 dB, as shown in Fig. 3, according to the highest occurrence of ROP within that range. The ROP reaches −12 dB for 3% of the data in the MIMO cases under the different fog conditions, reflecting the higher number of transmitters and receivers. The HF SISO system has the lowest ROP, around −26 dB to −19 dB, whereas the HF MIMO case can surpass −19 dB for each user and reach −12 dB. The distribution follows a normal shape with 20 bins.

While the exact statistical distribution of the maximum Q-factor cannot be discerned from visual inspection alone, the data exhibit characteristics consistent with a log-normal or other positively skewed distribution ranging from 2 to 10. The lack of sharp peaks and the long right tails in the histograms suggest that the underlying distribution may not be normal. The MIMO cases achieve higher Q-factors and higher occurrence than the SISO systems for each user, which is consistent with the higher ROP of each MIMO user. The HF MIMO system ranges in Q-factor from 6 to 10, while the highest occurrences across the MIMO systems lie at higher Q-factor values.

In Fig. 5, the x-axis is the minimum BER, a critical performance metric in communication systems; a lower BER indicates a higher-quality communication link. The y-axis is the count, representing the number of simulation runs that resulted in a particular minimum BER. The MIMO configurations typically exhibit the best BER performance, indicating that MIMO systems are more suitable for high-quality communication, possibly due to less interference or better channel conditions, though they may have a limited range.

3.2 Dimension reduction results

From Fig. 2 of the data preprocessing scheme, the structured data is formed of 12 columns consisting of the three parameters ROP, Q-factor and BER, each considered for four different users. This data then passes through a filter that removes features whose pairwise correlation exceeds 75%. The resultant features are then reduced orthogonally using PCA, converting the 12 features into 2 principal components: PC1 and PC2. The total explained variance reaches 99%, which indicates that PC1 and PC2 together retain a high amount of the information. Figure 6 displays the PCA output.

Fig. 6
figure 6

PCA results of PC1 and PC2

In Fig. 6, PC1 and PC2 are plotted for each channel and each of the four users. Table 1 shows the statistical distribution of each component.

Table 1 Statistics of PC1 and PC2 training data

The values indicate that the resultant PC1 and PC2 are on different scales in terms of standard deviation but have approximately equal means.

3.3 ML evaluation

Here, the ML model is evaluated through the accuracy, precision, recall, and time taken metrics. The balanced accuracy is the same as the accuracy since the data is balanced and equiprobable across classes, with 500 data points per class; hence, the testing data is also equiprobable. Figure 7 shows the balanced accuracy comparison for different supervised ML algorithms along with the GPC.

Fig. 7
figure 7

Balanced accuracy evaluation for different ML algorithms

The y-axis represents the balanced accuracy, starting at 94%, while the x-axis lists the model names. In comparison with other classifiers such as different Support Vector Classifiers (SVCs), the Decision Tree Classifier, Label Propagation and Label Spreading, the GPC shows the highest accuracy, greater than 99%, in classifying each label, where each label represents a communication channel. However, accuracy alone is not a sufficient metric for the GPC, and precision and recall likewise give little insight into the model complexity. To address this, timing is taken into consideration; Fig. 8 shows the time taken to fit and predict.

Fig. 8
figure 8

Time taken evaluation for different ML algorithms

The y-axis is the time taken, in seconds, to fit and predict the model with the 3000 data points. Among the SVCs, the highest time is taken by NuSVC, equivalent to 0.2 s, while the Linear SVC is 10 times faster; their accuracies lie between 96% and 97%. The GPC takes nearly 0.15 s, which is less than NuSVC and is the second-highest time taken.
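The fit-and-predict timing comparison can be reproduced in outline as follows; the synthetic data and default model settings are our own, so the absolute times will differ from Fig. 8:

```python
import time
import numpy as np
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.svm import LinearSVC, NuSVC
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(loc=c, scale=0.4, size=(100, 2))
               for c in ([0, 0], [3, 0], [0, 3])])
y = np.repeat([0, 1, 2], 100)

timings = {}
for name, model in [("GPC", GaussianProcessClassifier(random_state=1)),
                    ("NuSVC", NuSVC()),
                    ("LinearSVC", LinearSVC()),
                    ("DecisionTree", DecisionTreeClassifier(random_state=1))]:
    start = time.perf_counter()
    model.fit(X, y).predict(X)           # time both fitting and prediction
    timings[name] = time.perf_counter() - start
```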

In this study, since the problem formulation of the communication scenario is to minimize the classification error in order to avoid data loss, the GPC is the best classifier to use as the ML model for this task.

Figure 9 shows the decision boundaries performed by the GPC on the PCA result.

Fig. 9
figure 9

PCA output dimensions with GPC boundaries

The GPC shows reasonable decision boundaries that correctly classify each channel: MF SISO, LF SISO, LF MIMO, HF SISO, HF MIMO and MF MIMO. The classification accuracy is displayed in Table 2.

Table 2 Output classification report of GPC for each communication type

In an ML classification report, the term support denotes the number of test samples for each class, and the accuracy indicates how correctly the model performs.

Since the metrics show an accuracy greater than 99% on the test data, a cross-validation test is used to inspect the dataset and check for overfitting. The training data is split into 5 folds, and each fold in turn is used to validate the training accuracy. Table 3 shows the cross-validation test output.

Table 3 Cross-validation test results on the training accuracy

The training accuracy is evaluated on each fold, and the resulting accuracy, averaged over the 5 folds, is 0.99875 (99.875%). This indicates that the model is robust and stable, with no overfitting or underfitting, which validates the high test accuracy of the GPC in Table 2.
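The 5-fold check described above can be sketched with scikit-learn's `cross_val_score`; the well-separated synthetic clusters are our own stand-in, so the near-perfect scores here follow from the construction rather than from the paper's dataset:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(7)
centers = [[0, 0], [3, 0], [0, 3], [3, 3], [0, 6], [3, 6]]
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(60, 2)) for c in centers])
y = np.repeat(np.arange(6), 60)      # six balanced classes, as in the dataset

# Stratified 5-fold cross-validation of the training accuracy.
scores = cross_val_score(GaussianProcessClassifier(random_state=7), X, y, cv=5)
mean_acc = scores.mean()
```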

4 Conclusion

This study focused on the performance evaluation of FSO communication systems, particularly under different fog conditions. In low fog environments, the SISO configuration has been found to be effective in mitigating signal degradation. As the fog density increases to moderate levels, the advantages of MIMO systems, especially their spatial diversity, have been observed to enhance link robustness significantly.

In scenarios of high fog density, where traditional FSO configurations face limitations, the incorporation of ML algorithms has been explored. Predictive models, utilizing historical weather data, have been employed for the dynamic optimization of communication parameters. This approach has ensured efficient and reliable links under challenging conditions.

A key aspect of this research has involved the classification of different weather channels to facilitate the optimal selection of communication channels. An enhanced preprocessing scheme, combined with a GPC, has been introduced. The obtained results reveal that the GPC, with its ability to map nonlinear functions, achieved a classification accuracy greater than 99%, surpassing techniques such as SVM and ensemble learning. Moreover, the GPC demonstrates efficient execution timing, requiring only 0.15 s for the fit and predict functions, which is faster than the NuSVC. For the sake of comparison, the NuSVC, a member of the SVM family, was used during evaluation to assess the effectiveness of the GPC in performing the FSO channel classification task.

In summary, the findings of this research have significant implications for the implementation of FSO systems in fog-prone areas. The study illustrates how traditional FSO configurations, augmented by ML, can enhance communication resilience in adverse weather conditions. This research contributes to the field of optical communications by providing insights into how environmental challenges can be addressed through a combination of established and emerging technologies.