Simultaneous monitoring of the values of CD, Crosstalk and OSNR phenomena in the physical layer of the optical network using CNN

The aim of the research was to explore the possibilities of using the Asynchronous Delay Tap Sampling (ADTS) and Convolutional Neural Network (CNN) methods to monitor simultaneously occurring phenomena in the physical layer of the optical network. The ADTS method was used to create data sets showing combinations of Chromatic Dispersion (CD), Crosstalk and Optical Signal to Noise Ratio (OSNR) as optical disturbances in graphic form. Data were generated for a bit rate of 10 Gbit/s, Non-return-to-zero On–off keying (NRZ-OOK) and Differential Phase Shift Keying (DPSK) modulation, and bit delays of 1 bit, 0.5 bit and 0.25 bit. A total of 6 data sets of 62,000 images each were obtained. The learning process was carried out for 50 and 1000 epochs. From the obtained learning results, the models with the best R² matching factor were selected. The learned models were then used to study the recognition of the three phenomena simultaneously. The tests were carried out on sets of 2500 images containing combinations of interference in the following ranges: 400–1600 ps/nm for CD and 10–30 dB for Crosstalk and OSNR. Very good results were obtained for recognizing simultaneously occurring phenomena using models trained for 1000 epochs. An accuracy of over 99% was obtained for CD and Crosstalk for both modulations. For OSNR, slightly weaker results were obtained, above 96% in most cases. For models trained for 50 epochs, very good results were obtained for CD (over 99%). For Crosstalk, weaker results were obtained for OOK modulation. Poor results were obtained for OSNR, where recognition accuracy ranged from 50 to 80%, depending on the type of modulation and bit delay.
Based on the conducted research, it was established that the use of the ADTS and CNN methods enables monitoring of simultaneously occurring CD, Crosstalk and OSNR interference in the physical layer of the optical network while maintaining the requirements for Optical Performance Monitoring systems. These requirements are met for network models trained for 1000 epochs.


Introduction
Optical Performance Monitoring (OPM) is the basic mechanism for managing high-capacity optical networks (from 10 Gbit/s) based on high-speed transmission and multiplexing technologies. The task of OPM is to assess the quality of the transmission channel by measuring optical parameters, without the need for direct analysis of the transmitted bit sequence. The ability to monitor parameters in the physical layer strongly depends on how the network is designed, i.e., on its architecture and transmission technologies. For the management, monitoring and diagnostics of optical links, the key measurable parameters are primarily Chromatic Dispersion (CD) (Kaminow et al. 2008; Petersen et al. 2006), Crosstalk and Optical Signal to Noise Ratio (OSNR) (Khan et al. 2011). The CD parameter is crucial when modernizing a network, as it enables proper tuning of devices or increasing the range, bit rate and distance. In turn, Crosstalk and OSNR are important in the design and maintenance of optical links, because their values directly affect the error rate in the optical channel. Required accuracy levels for monitoring individual parameters have been defined for the OPM mechanism. For CD, the accuracy should be within 2% of the real value, while for Crosstalk and OSNR it should be within 0.5 dB (Dahan et al. 2011). The first available techniques did not allow many parameters to be monitored at the same time, hence the need for special techniques dedicated to each parameter separately. Currently, a very important direction in OPM is the development of techniques that allow several parameters to be measured simultaneously with a single technique (Chan 2010). Using one technique would significantly reduce operating costs and measurement time, reduce system complexity and increase its speed.
These criteria are met by the electronic Asynchronous Delay Tap Sampling (ADTS) method, which allows simultaneously occurring disturbances to be monitored in graphical form. This graphical form is called a phase portrait. The phase portrait shows the signal distortions caused by the occurring phenomena. Images obtained using the ADTS method (Dods et al. 2006) must be further analyzed to obtain accurate numerical values of specific disturbances. Various image analysis techniques exist, e.g., Support Vector Machine, Hough Transformation, Hausdorff Measure or Artificial Neural Network (Chan 2010; Dods et al. 2006; Zhao et al. 2009). However, they have limitations, mainly a narrow range of recognized values, and thus do not meet the requirements of OPM. Table 1 presents the measurable disturbances for the mentioned image analysis techniques and the ranges of their measurability.
The aim of the work was to explore the possibilities of using the ADTS and Convolutional Neural Network (CNN, a Deep Learning method) methods to monitor simultaneously occurring phenomena in the physical layer of the optical network. Several recent papers present results of research on the use of CNN in OPM (Zhang et al. 2018; Cheng et al. 2019; Wang et al. 2017a, b; Wan et al. 2019; Fan et al. 2019; Fan et al. 2018). Each work presented tests for different modulation formats (OOK, DPSK, QPSK, PAM4, PAM8, 8QAM, 16QAM and others) and different transmission speeds (10, 20, 30, 60 Gb/s and others). Some of the works also used other techniques such as Asynchronous Amplitude Histogram (Cheng et al. 2019), Diagram Analyzer (Wang et al. 2017a, b) and Eye Diagram (Wang et al. 2017a, b). Only in two works were at least three parameters monitored simultaneously (Fan et al. 2019, 2018). In the first study, OSNR was tested in the range of 10–28 dB with a 2 dB step, CD in the range of 0–450 ps/nm with a 50 ps/nm step and DGD (Differential Group Delay) in the range of 0–10 ps with a 1 ps step, obtaining average monitoring errors of 0.81 dB for OSNR, 1.52 ps/nm for CD and 0.32 ps for DGD. The tests were carried out for data sets consisting of 1100 phase portraits for each speed and type of modulation. The CNN learning process was carried out in the range of 60 to 125 epochs. In the second work, the same parameters were measured, but for other modulations and speeds, obtaining mean monitoring errors of 0.73 dB for OSNR, 1.34 ps/nm for CD and 0.47 ps for DGD. That study used data sets consisting of 990 images per case, and the mentioned accuracy was obtained for network learning in the range of 35 to 50 epochs. In the other works mentioned above, the focus was on monitoring only one phenomenon, OSNR (Cheng et al. 2019; Wang et al. 2017a, b; Wan et al. 2019), and in some works also on (simultaneously or alternately) recognizing modulation formats and transmission speed. In (Cheng et al. 2019; Wang et al. 2017a, b; Wan et al. 2019), in order to better reflect real conditions, transmission interference was introduced into the simulation through the CD phenomenon. However, this parameter was not monitored.

Table 1 Range of interference measurability for image analysis techniques (Chan 2010)
Data analysis technique | Formats demonstrated | Impairments demonstrated (range)
Support vector machine (Dods et al. 2006; Anderson et al. 2007; Clarke et al. 2009; Beaman et al. 2009) | 10G NRZ, 40G NRZ-DPSK, 40G RZ-DQPSK, 80 | OSNR (10, 30) dB, CD (-1400, 1400) ps/nm, PMD (0, 60) ps, In-band crosstalk (15, 25) dB, Filter offset (-12, 12) GHz
Hough transformation (Maruta et al. 2008; Kitayama et al. 2007) | 10G RZ-DPSK, 20G RZ-DQPSK, 40G NRZ-DPSK | OSNR (8.7, 35) dB, CD (-600, 600) ps/nm
Homodyne detection (Choi et al. 2009) | 10G NRZ-DPSK | OSNR (10, 30) dB, CD (0, 800) ps/nm
Hausdorff measure (Zhao et al. 2008) | 10G NRZ OOK, 40G NRZ-DPSK | CD (0, 400) ps/nm
Artificial neural networks (Jargon et al. 2009) | 10G NRZ | OSNR (15, 30) dB, CD (0, 55) ps/nm, PMD (0, 10) ps
The results presented in the discussed works show the great potential of CNN in OPM. Only in two works are three phenomena monitored simultaneously, while in the other studies only the OSNR parameter is monitored, with the modulation format and/or transmission speed additionally recognized. None of the papers pays much attention to the number of epochs, which is a very important parameter of the network learning process. Increasing the number of epochs decreases the average recognition error; however, this alone does not guarantee that the models can later be used to recognize phenomena from external data sets while maintaining the requirements of OPM. In all works on the use of CNN in OPM, the network learning process was stopped once high recognition accuracy was obtained on the test data set, at a number of epochs from 30 to 100. High accuracy of model learning, however, does not imply high accuracy in external applications that use a data set different from the validation data. Assessing this requires additional research on external data, which was missing in the mentioned works. In addition, the training sets in the above-mentioned works contained too few images, the measurable ranges were too small and the steps between successive values of individual disturbances were too large. Network models trained in this way may prove inaccurate, as indicated by the conclusions in (Cheng et al. 2019), where the average OSNR recognition error increased from 0.58 dB to 0.97 dB in the presence of CD in the range from 0 to 1600 ps/nm.
The article presents a brief description of the ADTS and CNN methods, how to generate a training data set for the CNN method, and the network settings for which long learning processes were carried out. The results of network learning models and results of recognizing simultaneously occurring disturbances were presented, using previously learned models.

Asynchronous Delay Tap Sampling
The ADTS method (Khan et al. 2011; Dods et al. 2006) allows direct measurement of signal distortion without the need to recover the synchronization clock. In this method, the received signal is demodulated and converted from the optical to the electrical domain. Then, the electrical signal is split into two feed lines, one of which introduces a physical delay ∆τ (Fig. 1). Thus, the delay line propagates a delayed copy of the original signal. Both the original and the delayed signals are directed to the two inputs of the analyzer, where they are sampled. Each sampling instant yields the coordinates of a single point.
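The pairing of a signal with its delayed copy can be illustrated with a minimal NumPy sketch. It is not the simulation setup used in this work: the toy NRZ waveform, the function names `delay_tap_pairs` and `phase_portrait`, the number of samples per bit and the 64 × 64 histogram size are all assumptions made only for illustration.

```python
import numpy as np

def delay_tap_pairs(signal, samples_per_bit, delay_bits, n_pairs, rng):
    """Return n_pairs of (x, y), where y is the signal delayed by delay_bits."""
    delay = int(round(delay_bits * samples_per_bit))  # tap delay in samples
    # Asynchronous sampling: sampling instants are not locked to the bit clock.
    idx = rng.integers(delay, len(signal), size=n_pairs)
    x = signal[idx]          # non-delayed tap
    y = signal[idx - delay]  # delayed copy of the same waveform
    return x, y

def phase_portrait(x, y, bins=64, value_range=(-0.5, 1.5)):
    """Accumulate (x, y) pairs into a 2-D histogram image scaled to [0, 1]."""
    hist, _, _ = np.histogram2d(x, y, bins=bins, range=[value_range, value_range])
    return hist / hist.max()

rng = np.random.default_rng(0)
samples_per_bit = 16                      # assumed oversampling factor
bits = rng.integers(0, 2, size=2000)
# Toy NRZ-OOK waveform (assumption): rectangular pulses smoothed by a
# moving average, with a small amount of additive Gaussian noise.
waveform = np.repeat(bits.astype(float), samples_per_bit)
kernel = np.ones(samples_per_bit // 2) / (samples_per_bit // 2)
waveform = np.convolve(waveform, kernel, mode="same")
waveform += rng.normal(0.0, 0.02, size=waveform.shape)

x, y = delay_tap_pairs(waveform, samples_per_bit, delay_bits=0.5,
                       n_pairs=20000, rng=rng)
image = phase_portrait(x, y)              # 64 x 64 array usable as an image
```

Changing `delay_bits` to 1, 0.5 or 0.25 changes the shape of the resulting portrait, which is why a separate data set is needed for each bit delay.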
The obtained sample pairs are used to create dot plots in which the sample values x_i and y_i are the coordinates of a particular point. The shape of the dot plots is affected by various signal parameters, and therefore the method can be used to monitor, among others, CD, Crosstalk and OSNR (Dahan et al. 2011). Methods based on ADTS allow signals of different bit rates and modulation formats to be monitored (Figs. 1 and 2). The measurement system is not simple, because it requires precise matching of the delay ∆τ used in the delay line. This matching is done based on the second sample pair. Nevertheless, a significant advantage of this method is its asynchronous capability (Khan et al. 2011; Chan 2010; Dods et al. 2006). In the ADTS method it is significant that each parameter has a different effect on the shape of the dot plots. For a delay ∆τ the following distortions may be observed: CD bends the upper right edge of the plot towards the center, Crosstalk creates smaller windows with a signal waveform, and Amplified Spontaneous Emission (ASE) noise blurs the edges of the plot (Fig. 3). The simulations were carried out using VPI photonics software. The test scenario was carried out for a bit rate of 10 Gbit/s, On-off keying (OOK) and Differential Phase Shift Keying (DPSK) modulation and a wavelength of 1550 nm. The length of the optical fiber used in the simulations was varied in the range of 1 to 125 km (the dispersion intensity was regulated by changing the length of the optical fiber). Different models were used to simulate the noisy waveforms for both modulations. Referring to Fig. 1, the tested modulations differed in the "Transmitter" and "Receiver" parts, which were adjusted appropriately for the OOK and DPSK modulations. All options in the transmitter and receiver that might have caused interference other than CD, Crosstalk and OSNR were disabled.
For this reason, the transmitter and receiver
Figure 2 presents undisturbed phase portraits generated by the ADTS method for OOK and DPSK modulation and for three bit-delay values. Figure 3 presents the impact of individual phenomena on the shape of the phase portrait. As the distortion value increases, each phenomenon distorts the portrait in its own particular area. Under each portrait is a brief description of the disturbance.
The paper presents studies of the simultaneous monitoring of the three mentioned phenomena, CD, Crosstalk and OSNR, with medium intensity of impact. According to ITU-T recommendations (Rec. G.697 2016), these three phenomena occur in real systems for bit rates up to 10 Gbit/s; therefore, it was decided to test only selected basic modulations (OOK, DPSK) for bit rates up to 10 Gbit/s. The available scientific papers describing the ADTS

Fig. 3 Impact of individual phenomena on the shape of the phase portrait. CD: the signal "waves" on two sides and the diagonal. Crosstalk: creates smaller windows with a signal waveform. OSNR: ASE noise extends the signal band on the diagonal and two sides.

Convolutional neural networks
Convolutional neural networks (CNNs) are artificial neural networks that use the convolution operation to process high-dimensional inputs (e.g., images). CNNs were introduced by LeCun (LeCun et al. 1998). The convolution operation uses several filters (feature maps), each of which processes a portion of the pixel map. These filters are learned by the network. Next, the outputs of the filters are subsampled and provided to the next convolutional layer. This process may be repeated several times. The output of the last convolutional layer is provided to a fully connected layer of neurons, and finally to the output of the network. CNNs provide superior performance in image recognition tasks. In the proposed approach, machine learning methods were used to estimate particular parameters in OPM systems. In principle, information about CD, Crosstalk and OSNR may be retrieved from simple signal representations such as delay-tap plots. Methods for automated detection of certain impairments in the optical system are desirable. In data science, tasks of this kind are referred to as classification problems. Moreover, such a system may not only classify certain optical impairments, but also provide quantitative information about them. Such a task is a regression problem, i.e., providing a quantitative estimate of a certain feature. In this work we focus on estimating CD, Crosstalk and OSNR based on data sets from the ADTS method. A CNN was used as the regression model for this problem.
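The filter, subsampling and activation steps described above can be sketched in a few lines of NumPy. This is only an illustration of a single convolutional stage, not the network used in this work: the random filters stand in for learned ones, and the function names, the 32 × 32 input and the use of "valid" padding are assumptions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv2d_valid(image, filters):
    """Cross-correlate a (H, W) image with (K, k, k) filters, 'valid' padding.

    CNN libraries usually implement "convolution" as cross-correlation,
    i.e. without flipping the kernel; the same convention is used here.
    """
    k = filters.shape[-1]
    windows = sliding_window_view(image, (k, k))         # (H-k+1, W-k+1, k, k)
    return np.einsum("hwij,fij->fhw", windows, filters)  # (K, H-k+1, W-k+1)

def relu(x):
    """Rectified linear unit."""
    return np.maximum(x, 0.0)

def max_pool2x2(maps):
    """Subsample each feature map by taking the maximum over 2x2 blocks."""
    f, h, w = maps.shape
    h2, w2 = h // 2, w // 2
    blocks = maps[:, :h2 * 2, :w2 * 2].reshape(f, h2, 2, w2, 2)
    return blocks.max(axis=(2, 4))

rng = np.random.default_rng(0)
image = rng.random((32, 32))            # stand-in for a phase portrait
filters = rng.normal(size=(4, 3, 3))    # random 3x3 filters instead of learned ones
features = max_pool2x2(relu(conv2d_valid(image, filters)))
```

In a trained network the filter values are adjusted during learning, and the pooled feature maps are passed to the next convolutional layer or flattened for the fully connected layers.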

Data preprocessing
The ADTS method provides a stream of (x, y) pairs, where x is the sampled value of the non-delayed signal and y is the sampled value of the signal delayed by τ bits. The delay-tap plots are plots generated from an N-element series of (x, y) pairs. The CNN uses images from the ADTS method as the input data, which are 2-

Learning process
A regression model using a CNN was built. The implementation was based on the TensorFlow (Abadi et al. 2015), TFLearn and Scikit-learn libraries. There are no standards determining the best CNN settings (e.g., the convolutional layers, number of epochs, fully connected layers or snapshot step value), so before starting the tests several dozen empirical trials of the network were carried out to determine the set of settings adopted in the main part of the study. In this way, two values of the number of epochs were selected for the research: 50 and 1000. Fifty epochs is the smallest number of epochs for which a satisfactory model fit coefficient R² (above 0.98) was obtained after the network training process was completed. A high matching factor R² was also observed for 1000 epochs. Moreover, beyond this value a regular decrease in R² was observed, which is undesirable. One epoch represents one iteration over the entire data set. Using an NVidia GeForce 1080Ti graphics card with 11 GB of memory, the network learning process was shortened to several minutes for 50 epochs and to 120 min for 1000 epochs. A standard network architecture was used: the first two layers were convolutional layers with max pooling. We use 32 filters of size 3 × 3 and 64 filters of size 3 × 3 for the first and second layer, respectively. Both layers use rectified linear units (ReLU). Next, there are two fully connected layers using tanh activation, followed by the regression output. In the architecture notation (1), ℂ denotes convolution, ℙ pooling and ℝ regression, while n_1 and n_2 are the numbers of nodes in the first and second fully connected layer. The first bracket is responsible for feature extraction, the second for classification and the last for regression. The remaining learning parameters were as follows: learning rate = 0.0001, batch size = 100, and snapshot step value = 1000.
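To make the layer sequence concrete, the sketch below traces output shapes and trainable parameter counts through such a network. Several values are assumptions not given in the text: the 64 × 64 × 1 input size, "same" padding for the convolutions, 2 × 2 pooling with stride 2, a single regression output, and the example node counts `n1=128` and `n2=64`. The function name `trace_architecture` is likewise hypothetical.

```python
import math

def trace_architecture(height, width, channels, n1, n2):
    """Return (layer name, output shape, trainable parameters) for each layer."""
    layers = []
    shape = (height, width, channels)
    for name, n_filters in (("conv1", 32), ("conv2", 64)):
        # 3x3 convolution; 'same' padding keeps height and width (assumption).
        params = 3 * 3 * shape[2] * n_filters + n_filters  # weights + biases
        shape = (shape[0], shape[1], n_filters)
        layers.append((name, shape, params))
        # 2x2 max pooling with stride 2 halves height and width (assumption).
        shape = (math.ceil(shape[0] / 2), math.ceil(shape[1] / 2), shape[2])
        layers.append((name.replace("conv", "pool"), shape, 0))
    units = shape[0] * shape[1] * shape[2]  # flattened feature vector
    for name, n_out in (("fc1_tanh", n1), ("fc2_tanh", n2), ("regression", 1)):
        layers.append((name, (n_out,), units * n_out + n_out))
        units = n_out
    return layers

# Hypothetical input size and node counts, chosen only for illustration.
layers = trace_architecture(64, 64, 1, n1=128, n2=64)
for name, shape, params in layers:
    print(f"{name:12s} {str(shape):16s} {params}")
total = sum(params for _, _, params in layers)
```

Under these assumptions almost all parameters sit in the first fully connected layer, which is one reason why the fully connected node counts are a natural setting to vary between learning runs.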
We split the data sets into training and verification sets in the ratio 90% training, 10% verification. For each data set the network was trained for 50 and 1000 epochs. After many learning attempts with different settings and network combinations, the best-fit models were selected. Most of the best models were obtained for fully connected layer sets number 3 and 5 (10 models each) and number 1 (7 models). Table 2 presents the best R² coefficients for the estimated impairments for the individual settings assumed during learning. The coefficient of determination R² was used to determine the quality of the estimated impairments. R² is one of the basic measures of the quality of model fit. Values in the range from 0.9 to 1.0 mean a very good fit; the closer R² is to 1.0, the better the fit of the model. R² is defined as (R2 score, the coefficient of determination 2021):

$$R^{2}(y, \hat{y}) = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^{2}}{\sum_{i=1}^{n} (y_i - \bar{y})^{2}}, \qquad \bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i,$$

where ŷ_i is the predicted value of the i-th sample, y_i is the corresponding true value and n is the number of samples.

The next section presents the results of recognizing the numerical values of simultaneous CD, Crosstalk and OSNR interference. For the estimation process, the models with the highest R² fit factor were used, which were obtained after long-lasting learning processes with empirically selected settings.
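The standard coefficient of determination and the 90/10 split can be checked numerically with a short sketch. The CD values below are hypothetical numbers chosen for the example, not results from this work, and `split_counts` is a helper name introduced here; the `r2_score` function mirrors the usual definition rather than reproducing any particular library implementation.

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    return 1.0 - ss_res / ss_tot

def split_counts(n_images, train_fraction=0.9):
    """Number of images in the training and verification parts."""
    n_train = int(round(n_images * train_fraction))
    return n_train, n_images - n_train

# Hypothetical CD labels and estimates in ps/nm, for illustration only.
cd_true = [400, 800, 1200, 1600]
cd_pred = [410, 790, 1210, 1590]
r2 = r2_score(cd_true, cd_pred)

n_train, n_verify = split_counts(62000)
```

A model that always predicts the mean of the labels gives R² = 0, and a perfect model gives R² = 1, so values above 0.98 indicate that the residual error is small compared with the spread of the labels.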

Measuring methodology
To conduct the research, it was necessary to find a method that would allow simultaneous monitoring of several phenomena occurring in the physical layer of the optical network. The ADTS method allows impairments to be visualized in graphic form. Each phenomenon affects the shape of a different fragment of the image, which in theory allows these phenomena to be distinguished. Using the ADTS method, 6 data sets of 62,000 images each were built. The data sets were built for OOK and DPSK modulation and three bit delays. Each data set had three types of labels (CD, Crosstalk, OSNR) describing the impairment values for individual images. The data sets were then used as input data for the CNN (the so-called training and validation sets). Each data set was used three times, each time with a different label, to teach the network to recognize one of the three phenomena (CD, Crosstalk, OSNR). The network learning process for each data set and each impairment was carried out for 7 settings of the fully connected parameter (Sect. 3.2) and for 2 numbers of epochs, 50 and 1000. A total of 252 network learning processes were carried out (6 data sets × 3 impairments × 7 fully connected settings × 2 numbers of epochs), from which one best model was selected for each impairment, modulation and bit delay. Considering all combinations, 18 models were obtained. Each model is a matrix storing the weights learned by the CNN during the learning process. Figure 6 is an overview diagram showing the process of generating data sets and the CNN learning process. The best models obtained at this stage are then used to estimate the impairments occurring in the physical layer of the optical network (blue arrows). Figure 7 shows the module responsible for estimating simultaneously occurring impairments. The previously selected models with the best R² coefficient are sorted into groups of three (CD, Crosstalk and OSNR models) depending on the type of modulation and the bit delay. For each group, the appropriate data sets, containing combinations of impairments in the form of two-dimensional image matrices, are fed to the input of each model. Each model separately processes the input matrix and estimates the numerical value of its impairment.
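The bookkeeping of the learning runs and of the model selection can be sketched as follows. The enumeration reproduces the counts stated above; the R² scores are placeholder numbers generated only so that the selection logic can run, and the names `runs`, `select_best` and the indexing of the fully connected settings from 1 to 7 are assumptions.

```python
from itertools import product

modulations = ("OOK", "DPSK")
bit_delays = (1.0, 0.5, 0.25)
impairments = ("CD", "Crosstalk", "OSNR")
fc_settings = range(1, 8)      # 7 fully connected settings (indices are placeholders)
epoch_settings = (50, 1000)

# One learning run per combination of data set, label and network setting.
runs = list(product(modulations, bit_delays, impairments,
                    fc_settings, epoch_settings))

def select_best(results):
    """Keep the run with the highest R2 per (modulation, delay, impairment)."""
    best = {}
    for (mod, delay, imp, fc, epochs), r2 in results.items():
        key = (mod, delay, imp)
        if key not in best or r2 > best[key][1]:
            best[key] = ((fc, epochs), r2)
    return best

# Placeholder scores, NOT measured values: they only exercise select_best.
results = {run: 0.90 + 0.01 * run[3] for run in runs}
best = select_best(results)

# Group the selected models into triples (CD, Crosstalk, OSNR) per data set.
groups = {}
for (mod, delay, imp), model in best.items():
    groups.setdefault((mod, delay), {})[imp] = model
```

Each entry of `groups` corresponds to one modulation and bit delay and holds the three models that are applied to the same input image during estimation.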

Results
Using the learned models discussed in the previous section, research was carried out on the recognition of simultaneously occurring phenomena on prepared data sets. The data sets had 2500 images containing combinations of CD, Crosstalk and OSNR phenomena in the range of 400–1600 ps/nm for CD and 10–30 dB for Crosstalk and OSNR. Table 3 presents the accuracy of recognizing individual phenomena using the previously learned models. The accuracy for each phenomenon was calculated using the following formula (Accuracy score 2021):

$$\mathrm{accuracy}(y, \hat{y}) = \frac{1}{n_{\mathrm{samples}}} \sum_{i=0}^{n_{\mathrm{samples}}-1} 1(\hat{y}_i = y_i),$$

where ŷ_i is the predicted value of the i-th sample, y_i is the corresponding true value, n_samples is the number of samples in the data set and 1(·) is the indicator function.

The obtained results show a very high accuracy of recognition of individual phenomena for selected cases. For models trained for 1000 epochs, almost perfect recognition results, above 99.3%, were obtained for OOK and DPSK modulation and all bit delays for the CD and Crosstalk phenomena. For OSNR, very good results were also obtained: above 96% for OOK modulation and above 93.4% for DPSK modulation. For models trained for 50 epochs, very good results, above 99.8%, were obtained for CD for both modulations. High recognition results, above 99.8%, were also obtained for Crosstalk with DPSK modulation. For OOK modulation, results above 92.2% were obtained. Clearly weaker results were obtained for OSNR: from 45.6% for DPSK modulation and from 68.3% for OOK modulation. The presented results show a clearly higher accuracy of recognizing simultaneously occurring phenomena for models trained for 1000 epochs. Such models ensure high repeatability of results and allow the phenomena to be recognized while meeting the requirements of OPM systems.

Fig. 6 Scheme for generating data sets and the CNN learning process
Fig. 7 Scheme of impairments estimation using learned models
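The accuracy measure, i.e., the fraction of samples whose predicted value equals the true value, can be sketched as below. Because a regression model returns continuous values, the example first rounds the estimates to a label grid; this rounding step, the 2 dB grid and the OSNR numbers are assumptions made for illustration and are not taken from this work.

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted value equals the true value."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return float(np.mean(y_pred == y_true))

def snap_to_grid(values, step):
    """Round continuous estimates to the nearest multiple of step (assumption)."""
    return np.round(np.asarray(values, dtype=float) / step) * step

# Hypothetical OSNR labels and model outputs in dB, for illustration only.
true_osnr = np.array([10, 12, 14, 16, 18])
estimated = np.array([10.2, 11.9, 14.4, 17.2, 18.1])
acc = accuracy(true_osnr, snap_to_grid(estimated, step=2))
```

In this toy example four of the five estimates fall on the correct grid value, so the accuracy is 0.8; the estimate of 17.2 dB is rounded to 18 dB and counted as incorrect.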
For models trained for 50 epochs, the OPM requirements for OSNR are not met. It should be noted that training models for 50 epochs gave very high R² coefficients. However, the high accuracy of the learned models did not translate into high accuracy in recognizing simultaneously occurring phenomena (primarily in the case of OSNR). Tables 4–9 present selected phase portraits from the conducted tests along with the values of the occurring disturbances and the results of estimating these disturbances using CNN models trained for 1000 epochs. It is worth noting that in some cases the estimated value is more than the reference value, and

Conclusion
The main purpose of the research was to check the possibility of using the ADTS and CNN methods to monitor the simultaneous occurrence of CD, Crosstalk and OSNR phenomena in the physical layer of the optical network. The tests conducted for OOK and DPSK modulation and for 1 bit, 0.5 bit and 0.25 bit delays gave very good recognition results for combinations of three simultaneously occurring phenomena in the range of 400–1600 ps/nm for CD and 10–30 dB for Crosstalk and OSNR. Several data sets, each containing 2500 images, were used to test the accuracy of phenomenon recognition. For these data sets, the recognition accuracy was 99.9% for CD, 99.3% for Crosstalk and 95.6% for OSNR (for most data sets). The research was carried out for network models trained for 1000 epochs. The time needed to train a single model on a workstation with an NVidia GTX 1080Ti graphics card (11 GB RAM) was about 120–140 min. Each model was built from a data set of 62,000 cases of combinations of three simultaneously occurring phenomena. Research was also carried out for network models trained for 50 epochs. For these models, very good results were obtained for CD for both modulations and for Crosstalk for DPSK modulation. In the other variants, poor results were obtained, especially for OSNR. The tests confirmed the possibility of using the ADTS and CNN methods to monitor simultaneously occurring phenomena in the physical layer of the optical network while maintaining the requirements for OPM mechanisms. The methods can be used for both OOK and DPSK modulations at 10 Gbit/s. There were no clear differences in the accuracy of phenomenon recognition for different bit delay values. To obtain high repeatability and accuracy of phenomenon recognition, models trained for an appropriate number of epochs should be used.
It should be remembered that high model accuracy in the network learning process, which is determined on the basis of a test/validation data set, does not guarantee that the model is suitable for use on external data sets. Models with high learning accuracy should be verified on further data sets. If the results are unsatisfactory, the learning process should be repeated with an increased number of epochs. Empirical research should be carried out to determine the best number of epochs. The next stage of the research will be to examine the exact impact of the number of epochs on the fit factor R² of the learned models and on the accuracy of recognizing individual phenomena using the learned models.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.