Analysis of a Frequency Response of a Noisy Optical Network for Its Self-adaptation

We study the quality of frequency response in a noisy optical network. Such a response can be useful in traditional frequency-domain industrial loop controllers. In particular, we analyse a (step, frequency) response of a simulated computer network, where the stimulus is one of the coefficients which regulate the network’s strategy of packet transmission, and the response is the network’s momentary performance. This way, we find a frequency range, where an instantaneous dependence between the stimulus and the response can direct a self-adaptation scheme of the proposed strategy due to changing network conditions. To stay in the safe limits of the network’s behaviour, we make the stimulus weak. We use a bursty traffic model to test the limits of this approach. We use a model of an optical ring of an experimental NGREEN network developed at NOKIA. The discussed technique was capable of optimising the network’s behaviour.


Introduction
Frequency-domain adaptation methods employed in loop controllers [2,3] are rarely found in computer networks [13,20], even though the topic of a self-adapting computer network [17], a derivative of the more general vision of adaptable systems [7,11], is increasingly discussed. One of the reasons might be that traffic conditions in computer networks can be very variable, in effect inducing a strong noise on the system, which in itself may show a non-linear and nondeterministic behaviour; these factors decrease the quality of frequency response. Together with the requirement of a proven reliability of a computer network, this may lead to a preference for precise, mathematically proven methods, like the many TCP flow control techniques.
To estimate the robustness of the loop control methods in question, we test here the quality of a frequency-domain response on a highly noisy model of an experimental optical network NGREEN [4,6], facing constant traffic change. A sinusoidal perturbation of the system constantly introduces small modifications into the main strategy of self-adaptation of the network [4], which already contains its own feedback mechanisms. Possible resulting oscillations in the quality of the behaviour of the system are then detected, a response useful in loop control algorithms like the widely known Ziegler-Nichols method. For example, we may apply a low-amplitude sine wave to some strategy parameter a (so that it still oscillates around its original value) and search for a response wave of a similar frequency in some quality function q. To test the limits of obtaining a response useful for adapting a, we use a highly variable, noisy traffic generation.
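The idea of perturbing a with a low-amplitude sine wave and searching for a response of the same frequency in q can be illustrated by a minimal sketch. It is not the simulator used in this paper: the function toy_quality is a hypothetical stand-in for the network, and the frequency, amplitude, and noise level are illustrative assumptions. The detection correlates q with a complex exponential at the stimulus frequency, which is equivalent to reading a single bin of a Fourier transform.

```python
import cmath
import math
import random


def detect_response(q_samples, times, f):
    """Correlate samples of q with a complex exponential at frequency f.

    Returns (amplitude, phase) of the component of q at f; the phase is
    measured relative to the stimulus sin(2*pi*f*t). The samples should
    cover an integer number of stimulus periods.
    """
    mean_q = sum(q_samples) / len(q_samples)
    acc = 0j
    for q, t in zip(q_samples, times):
        acc += (q - mean_q) * cmath.exp(-2j * math.pi * f * t)
    z = 2.0 * acc / len(q_samples)
    # For q = B*sin(2*pi*f*t + phi), z equals B*exp(j*(phi - pi/2)),
    # so multiplying by j recovers the phase phi.
    return abs(z), cmath.phase(z * 1j)


def toy_quality(a, noise):
    # Hypothetical system: q grows linearly with a and is buried in noise.
    return 3.0 * a + noise


random.seed(1)
f = 50.0              # stimulus frequency in Hz (illustrative)
dt = 1e-4             # sampling step in seconds
n = 20000             # 2 s of samples, i.e. 100 full stimulus periods
a0, amp = 1.0, 0.05   # nominal value of a and a small relative amplitude
times = [k * dt for k in range(n)]
a_values = [a0 * (1.0 + amp * math.sin(2.0 * math.pi * f * t)) for t in times]
q_values = [toy_quality(a, random.gauss(0.0, 1.0)) for a in a_values]
amplitude, phase = detect_response(q_values, times, f)
```

In this toy case, the response wave has an amplitude several times smaller than the standard deviation of the noise, yet averaging over many periods still recovers an amplitude close to the true one and a phase close to zero, i.e., a positive relation between a and q.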
The paper is constructed as follows. Section 3 specifies the terminology, and then Sect. 4 briefly describes the network architecture which we simulate to test our method. Section 5 defines the model of traffic applied. In Sect. 6, we analyse step and frequency responses of the system in question; finally, in Sect. 7, we observe the optimisation process. Section 8 concludes the paper. This article is part of the topical collection "Modelling methods in Computer Systems, Networks and Bioinformatics" guest edited by Erol Gelenbe.

Similar Work
Implementation of PID-based loop controllers in optical networks is limited. In [18,19], the authors present methods of control which manage both computing and network resourcing. This is possible via managing certain tasks in optical grid networks and establishing dynamic light paths in wavelength-division multiplexing networks. In [9,10], bandwidth allocation is managed to minimize delay in passive optical networks. Their strategy demonstrated a better performance and robustness than previously existing dynamic bandwidth allocation algorithms.

Terminology
Let there be a system S whose designer does not know how a set of its parameters P_i, i = 1, 2, …, K, should follow different operational conditions (like the traffic in the case of a computer network). Yet, the designer knows that some of these parameters may vary within a certain domain D with no critical change of the system's behaviour, which must remain robust. The designer may choose a mean point within the domain in question and leave it like that, which is a common practice. But why not take advantage of the arbitrariness provided by D, and instead make the point {P_1, P_2, …, P_K} oscillate in a working device, looking at the same time at its momentary performance? The testing might possibly discover some dynamically changing sweet spot. How the sweet spot changes its position would not be explicitly defined; the system would just follow its own diagnostics in quasi-real time, hence a self-adaptation. This is opposed to, e.g., following an equation known a priori, like that of the RTT-based retransmission timer defined in RFC 6298 [15]. A small D poses extra difficulties: P_i must oscillate with a small amplitude, likely much smaller than D, because to compare the quality of different regions within D, a normal approach is to study them separately. Thus, the emitter signal is small. The receiver signal, e.g., some performance criterion like a queue length, might thus get lost in the noise produced by, amongst others, a varying traffic. As we are interested in the limits of the presented approach, we actually want a low signal-to-noise ratio. To this end, we will use a dedicated traffic generator, which constantly mixes several traffic streams to increase the heterogeneity. A sophisticated computer network often has internal optimisation mechanisms, which tune the network in a considerable range. An oscillation with a small D and thus a limited tuning range might be a secondary optimisation mechanism; in the case studied further, it is literally a tuning of tuning, i.e., searching for an optimal strategy of a primary tuning.

Studied System: An Optical NGREEN Ring
We have chosen an experimental optical network with an average complexity of packet management: an NGREEN ring of coloured packets [6]. It is primarily intended for metropolitan aggregation networks based on 5G [12], but due to its low cost, an adaptation to the bursty traffic found, e.g., in data centres is considered [4]. We will improve that adaptation by the said tuning of tuning: one of the strategy-regulating coefficients, which is normally constant, will oscillate.
We will briefly describe the traits of the architecture relevant to the topic of this paper; the details can be found in [4]. We simulate a unidirectional optical ring divided into M = 1000 fixed-size slots of S = 12,500 bytes each. There are N = 10 nodes N_i, i = 1, …, N, randomly distributed along the ring. A slot passes a node in the optical medium during s_t = 1 μs, which gives a transfer rate of 100 Gbps. Although the slots (or the packets in them) are coloured, to greatly decrease the costs, there is only a single, common optical gate/amplifier (SOA) [16] for all wavelengths. This means that a slot can only be dropped in its totality, as the wavelengths are all either amplified or suppressed. This trait makes the presence of multiple wavelengths irrelevant to the optimisation method presented further.
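The quoted transfer rate follows directly from the slot size and the slot time. The short sketch below reproduces this arithmetic with integers; the constant and function names are ours, and the ring traversal time is a derived quantity that assumes the M slots occupy the ring contiguously.

```python
SLOT_BYTES = 12_500   # S: slot size in bytes
SLOT_TIME_US = 1      # s_t: time for a slot to pass a node, in microseconds
NUM_SLOTS = 1000      # M: number of slots along the ring


def transfer_rate_bps(slot_bytes, slot_time_us):
    """Bits carried per second when one slot passes a node every slot_time_us."""
    return slot_bytes * 8 * 1_000_000 // slot_time_us


def ring_traversal_us(num_slots, slot_time_us):
    """Time for a slot to travel once around the ring (derived, an assumption)."""
    return num_slots * slot_time_us


rate = transfer_rate_bps(SLOT_BYTES, SLOT_TIME_US)      # bits per second
traversal = ring_traversal_us(NUM_SLOTS, SLOT_TIME_US)  # microseconds
```

Here, 12,500 bytes are 100,000 bits, and one such slot per microsecond gives 100 Gbps; under the stated assumption, a slot needs 1000 μs to circle the ring.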
Along the ring, there also exists a bidirectional control channel, which allows the nodes to exchange control information or statistics. The channel is essential for the optimisation method presented in [4]: it turns out that if each node broadcasts its momentary electronic input queue size every 1 μs in the two available directions, an effective optimisation, called GLOBAL there, can be employed. The name comes from the fact that each node N_i uses global, if delayed, information about the momentary size in bytes of the input queues S_j, j = 1, …, N, of all nodes along the ring, and not only its own local information about S_i.
It is important to consider a bidirectional control channel, as it substantially decreases the delays of the propagation of S_i. To keep the paper focused (no need to consider the routing) and to speed up the simulation, we do not take into account the bidirectional transmission possible for regular packets (as in some variants of NGREEN), assuming that for certain traffic models, the second ring would behave symmetrically to the unidirectional one considered.
We consider a 10 Gbps Ethernet card at each node; together, the nodes can thus produce V_max = 100 Gbps, the theoretical maximum transmission speed of the (unidirectional) ring. Let a load L represent a traffic of LV_max on average. As logically no packet is transmitted around the whole ring, depending on the traffic model and the packet strategy, L can have a value of up to 1 without a resulting packet loss.

SN Computer Science
We consider that the ring (except for the special control channel) is in the broadcast-and-select (B&S) mode [1], where a slot can be filled only by a single node until it is dropped in its entirety after the last of its possibly multiple destinations. To increase performance, packets can be segmented, i.e., their subsequent fragments can be scattered along multiple slots. There is a predefined function (1), used by each N_i for each of the packets to be inserted into the ring, where the empirical constant d = 1 × 10^5 is meant to improve the performance of direct optimisations like the Nelder-Mead method [14] by increasing the isotropy of the optimisation space: local quality maxima are not flattened or squashed in the direction of the three coefficients multiplied by queue-size-related values. The equation gives a "good" timer expiration value t_max, i.e., a packet age after which the packet is unconditionally inserted into the ring as soon as possible, even if the resulting slot fill ratio were not 100%. N_i and N_t are, respectively, the source and the destination node of a packet p (still in the input buffer of N_i) whose t_max is calculated. S_mean(N_i, N_t) and S_max(N_i, N_t) are, respectively, the mean and the maximum buffer size in nodes within the future path of p, computed on the basis of the most current information about S_i known by a node. Note that for p, we do not take into account N_t and further nodes, which in the case of the B&S mode can be seen as a simplification: as said, a slot is removed from the ring after the last destination of the packets within it, and thus, a packet with a destination N_t may still load the ring even after N_t is reached.
There is an intermediary function t_lim in (1). In a simple case, it would just clamp t_opt into a domain which makes sense for t_max; in particular, it would prevent t_max from being negative, which makes no sense for an expiration time. We have two reasons for not just clamping the value:
• The coefficients a, a_S, a_mean, a_max are normally meant to be constants predefined once in a simulation, using a direct optimisation method. For such methods, it is essential that there are no flat quality regions in the optimisation space, as it is the differences in the values of the quality function that direct the optimisation; otherwise, it might "get lost", which may lead to, e.g., a runaway effect resulting in nonsensically large values of the optimised coefficients.
• Instead, a new application is found for these regions: a temporal accumulation of t_max.
For this, we introduce an accumulation value C_i for each N_i, with the following interpretation, allowing an optional temporal inertia when estimating t_max:
• a negative C_i means that t_max should also be small in some short period in the future, i.e., something occurred which is a strong predictor that there will not be a need for a larger t_max soon;
• on the contrary, a positive C_i indicates a strong predictor that a large t_max will be required for some time.
Let t_lim(t) control both t_max and C_i as follows: C_i stores either an "excessive" or an "insufficient" expiration time, these two extremes being delimited by t_l and t_h, and the stored value is reused as soon as possible. We limit the inertia by I_C = 1 ms (a considerable period in the case of NGREEN), as we assume that a long-term prediction of network conditions is not robust; thus, a long-term shaping of t_max by C_i is not needed, and something not needed but active may potentially be harmful. We allow t_l = 0, which amounts to an opportunistic mode, i.e., any packet is inserted into the optical ring as soon as possible. On the basis of practical considerations found in [1], we set t_h = 100 μs, as higher expiration times were not found usable in the case of NGREEN.
The discussed run-once optimisation normally settles a, a_S, a_mean and a_max forever. We regard these four values as a representation of a possible strategy of packet management. So defined, the strategy is constant. To make it oscillate, we could constantly change one or more of these coefficients in a production NGREEN ring. In what follows, we will apply different oscillations to a_mean. The response in question is Ŝ_mean, a global average of the evaluations of S_mean in (1) within an averaging period defined further; we assume that Ŝ_mean is inversely related to quality. Ŝ_mean is not instantly available at a node: a delay of the order of s_t M/2 (we divide by 2 as the control channel is bidirectional) is needed to exchange the data required for the calculation of Ŝ_mean by each node.
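To give a feeling for the scale of this delay, the sketch below evaluates s_t M/2 for the ring parameters given earlier and converts it into a phase lag at a given stimulus frequency. Treating the exchange as a pure delay of exactly s_t M/2 is a simplifying assumption made only for this illustration (the text gives just an order of magnitude), and the function names are ours.

```python
import math


def control_delay_s(slot_time_s, num_slots):
    """Order-of-magnitude delay s_t * M / 2 for exchanging queue sizes."""
    return slot_time_s * num_slots / 2.0


def pure_delay_phase_lag(delay_s, f_hz):
    """Phase lag in radians of a pure delay at frequency f: 2*pi*f*delay."""
    return 2.0 * math.pi * f_hz * delay_s


delay = control_delay_s(1e-6, 1000)          # s_t = 1 us, M = 1000
lag_10hz = pure_delay_phase_lag(delay, 10.0)  # illustrative low frequency
```

With s_t = 1 μs and M = 1000, the delay is 500 μs; at a 10 Hz stimulus, such a delay alone would shift the phase by only about 0.03 rad, and the shift grows linearly with the frequency.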

Traffic Model
As we talk about a constant adaptation of strategy in noisy conditions, let us make the traffic model very variable, much more variable than a bare CAIDA trace [5] used in [4]. We will make a mixture of two very different CAIDA traces and, for contrast, of a node-symmetric traffic given by a Poisson process, with packets of an equal size. All three sources have the same mean packet size, and the CAIDA traces are individually processed as described in [4]. In the mixing function, m_n is a normalisation factor, so that ∑_i p_i is always 1 for m_n = 1 and there is no normalisation at all for m_n = 0; see Fig. 1b for an example of the mixing function. We use a rather high m_n = 0.95, because even such a traffic was difficult to handle by the network at large L, compared to a single CAIDA trace. This is because, given the different probability distributions in Fig. 1a and the burstiness of the CAIDA sources, the momentary rate of input traffic still varies a lot at individual nodes.
Instead of using a stochastic mathematical equation as a source of k_i, we took three more CAIDA traces, calculated their instantaneous rates, multiplied the time by m_s = 500 (i.e., made them m_s times slower, compared to the three mixed traces), and normalised them to have a mean of s_NUM^−1 each. This hopefully better reflects phenomena like self-similarity, found in real telecommunication networks.
The traffic with the varying mixing function applied will further be called MIXED. For contrast, we will also use its homogeneous variant POISSON, where all packets from the Poisson source are accepted and the two CAIDA sources are ignored.

Responses to Stimuli
Let the stimulus be a rectangular or a sine wave of a frequency f. To gather a mean (step, frequency) response of the network, nodes will use local histograms that store S_mean against mod(t, f^−1), where t is the time of an evaluation of (1); a histogram is thus swept with a frequency of f, similarly to an oscilloscope and for the same purpose: to observe a waveform of that frequency. These histograms will then be integrated over all nodes into a response given by Ŝ_mean.
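The histograms described above can be sketched as follows. This is a minimal illustration, not the data structure of the simulator: the class name, the number of bins, and the choice of averaging the values within a bin are our assumptions.

```python
import math


class FoldedHistogram:
    """Accumulates samples against mod(t, 1/f), like an oscilloscope sweep."""

    def __init__(self, f, bins):
        self.period = 1.0 / f        # sweep period f^-1 in seconds
        self.sums = [0.0] * bins     # per-bin sum of the stored values
        self.counts = [0] * bins     # per-bin number of the stored values

    def add(self, t, value):
        # Fraction of the sweep period elapsed at time t, in [0, 1).
        fraction = math.fmod(t, self.period) / self.period
        index = min(int(fraction * len(self.sums)), len(self.sums) - 1)
        self.sums[index] += value
        self.counts[index] += 1

    def merge(self, other):
        # Integrate the histogram of another node (same f and bins assumed).
        for i in range(len(self.sums)):
            self.sums[i] += other.sums[i]
            self.counts[i] += other.counts[i]

    def means(self):
        # Mean stored value per bin; None where a bin received no samples.
        return [s / c if c else None for s, c in zip(self.sums, self.counts)]
```

Each node would call add(t, S_mean) at every evaluation, and merging the per-node histograms would yield the averaged waveform; samples taken a whole number of periods apart fall into the same bin, which is what averages the noise out.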
In the simulations that follow, let the total simulated time during which packets arrive whose statistics are taken into account be equal to T_s = 10 s, unless otherwise stated. The actual simulated time lasts until the last of these packets leaves the network.

Step Response
To get some insights, we first analyse a response to a rectangular wave, emitted, as said, through a_mean. Let the coefficient vary between 0 and twice its optimal (predefined) value a_mean^opt.
Fig. 2 Step response: the optimal a_mean is multiplied by a square waveform of levels 0, 2, and a given frequency. Vertical dashed lines show, respectively, the beginning and the end of the higher value of a_mean
To see clearly different elements in the network's behaviour, Fig. 2a shows a case of a moderate oscillation frequency, a low traffic noise, and a moderate L. Due to propagation delays along the control channel, Ŝ_mean is updated progressively, which is seen in the slopes in the response, found directly after each flip of a_mean. Each of the two slopes is followed by an overshoot and a slight ringing, a shape similar to that expected from a series RLC circuit with near-critical damping. Several factors play a role here: an instantaneous increase of a_mean is reflected by an increased t_max; this slows down the insertion of packets into the ring. This in turn increases the queues, which, due to the positive a_mean, cause a further increase of t_max (predefined constants and arguments in (1) are positive). This produces effects which, in contrast, promote an increased packet insertion:
• the ring becomes empty, and the free slots reach the nodes which follow ring-wise;
• the longer queues make a 100% fill ratio of a slot more probable, thus there is less need for waiting for t_max.
These factors may decrease Ŝ_mean, which decreases t_max, etc. As seen, there exists a set of complex circular dependencies, which may make it difficult to predict a better strategy. Yet, all in all, we see that in the analysed case, a stabilised high a_mean increases buffer sizes and vice versa. We suppose that buffer sizes are roughly proportional to the mean latency, an essential quality trait of a network. We thus conclude that for the operating point in Fig. 2a, the mean value of a_mean should be decreased. In Fig. 2b, compared to (a), the traffic MIXED is used and the load is also increased. Due to the burstiness and heterogeneity of the traffic, we superimposed Ŝ^0_mean, which is like Ŝ_mean but with a constant a_mean (a fixed strategy). The visible unevenness of Ŝ^0_mean shows that at this operating point, the noisiness of the response is considerable.
The bottom row in Fig. 2 illustrates another problem: the two responses seem very similar, but the mechanisms behind them are radically different. In (c), a_mean and Ŝ_mean are positively related: what is seen in the plot are small parts of the slopes in (a). This is why the curve changes direction exactly when a_mean changes, like in (a). In (d), similarly to (b), a_mean and Ŝ_mean are negatively related, yet the phase delay between them makes (d) mimic (c); this is betrayed by a change in the response direction seen slightly before the stimulus flips. This illustrates potential difficulties in analysing the character of the response by, e.g., an FFT. Note that the relatively high frequency of 977 Hz in (d) also brings high phase delays, but, on the other hand, Ŝ^0_mean is almost flat there, and thus, the noise originating from the variable traffic is supposed to be low for that frequency. This trade-off may signify that there is some optimal frequency between the two considered here.

Frequency Response
Fig. 3 Bode diagrams of an NGREEN ring against different L and traffic models, A = 1. As the stimulus and the response are in different units, we normalised the magnifications by comparing them at high frequencies and assuming 0 dB at low frequencies
The step response revealed traps which we also need to avoid in the case of a sinusoidal oscillation of a_mean. We would rather use the latter, as we are wary of potential abrupt changes produced in the network by a rectangular oscillation (see the overshoots in Fig. 2), and we also expect a sinusoidal oscillation to produce a more focused answer in the frequency domain used further. For an amplitude of A = 1, let the coefficient a_mean oscillate in the same range from 0 to 2a_mean^opt as in Sect. 6.1. Similarly to what we did in Sect. 6.1, let us start with low loads and the traffic POISSON. The respective Bode diagram, in the top row of Fig. 3, shows, like Fig. 2a, a characteristic similar to that of a low-pass filter made of passive electrical components. Due to the high speed of NGREEN, the part of the phase delay induced mostly by network delays (like the transmission of packets) should be negligible at very low frequencies. We see that for these frequencies, the low noise of the traffic POISSON allows an observed total phase delay of ≈ 0, which shows that a_mean and Ŝ_mean are positively related. At about 300 Hz, the relation between a_mean and Ŝ_mean should not radically change due to the small D, and thus, the phase delays observed there are mostly caused by the network delays. At higher frequencies, besides an increasing phase delay, we possibly observe various aliasing effects, resulting from network delays or ringing frequencies being similar to the frequencies of the stimulus. In any case, the uninterpretability of the phase makes these regions unusable for our purposes.
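The two stimuli can be written down explicitly. The sketch below assumes that the sine stimulus is parametrised as a_mean^opt (1 + A sin 2πft), which for A = 1 spans the stated range from 0 to 2a_mean^opt, and that the rectangular wave starts with its higher level; both choices, like the function names, are our assumptions.

```python
import math


def square_stimulus(t, f, a_opt):
    """a_mean alternating between 2*a_opt (first half-period) and 0."""
    in_first_half = math.fmod(t * f, 1.0) < 0.5
    return 2.0 * a_opt if in_first_half else 0.0


def sine_stimulus(t, f, a_opt, amplitude=1.0):
    """a_mean oscillating around a_opt with a relative amplitude A."""
    return a_opt * (1.0 + amplitude * math.sin(2.0 * math.pi * f * t))
```

With amplitude=0.5, the same function covers the range from 0.5a_mean^opt to 1.5a_mean^opt, i.e., a weaker and thus less intrusive stimulus.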
The subsequent Bode diagram for the traffic MIXED shows a considerable noise at frequencies < 100 Hz for the assumed T_s. The noise could be reduced by longer simulation times, but we want a_mean to be able to traverse a considerable part of D within a time shorter than T_s, so that the self-adaptation reacts fast; the noise is thus something to take into account instead. An apparent source of the noise might be Ŝ^0_mean, considerably uneven compared to the response, as seen in Fig. 2b.
The other cause might be a disappearing relation between a_mean and Ŝ_mean, seen between L = 0.7 and L = 0.8 in Fig. 4. As a result, the oscillation of a_mean may have no considerable effect on the network, so whatever the response is, it can be just noise with no relation to the phase of a_mean. The noise seems to slowly disappear at > 250 Hz; what is seen instead at ≈ 300 Hz and for larger L is a positive phase, which would correspond to delays > π. Yet, we cannot explain it by network delays, as they were small for the traffic POISSON at the same frequencies, and we find it unlikely that they radically changed due to the traffic MIXED (slots propagate at the same speed and the phase delay is similar for high frequencies). We find it more probable that the phase at ≈ 300 Hz is a transition between (1) a mixture of ≈ 0 and ≈ π, characteristic for a_mean and Ŝ_mean being positively/negatively related, and (2) ≈ −π/2, observed at frequencies of about 1000 Hz, where the phase delays converge and the magnifications decrease almost uniformly for all L (thus for all relations between a_mean and Ŝ_mean), which suggests that the oscillation of a_mean is too fast at these frequencies to considerably affect Ŝ_mean. The hypothesis that the phases at ≈ 300 Hz are not just a random noise residue is strengthened by the Bode diagram for a simulation time of 4T_s, where the noise seems to decrease even earlier, at ≈ 200 Hz.
The region of 350 Hz seems to be optimal: the phases are considerably less noisy compared to the lower frequencies, they are not uniform for all L as in the case of higher frequencies, and their values can be roughly mapped to the relation between a_mean and Ŝ_mean seen in Fig. 4.

Optimisation
Observing the Bode diagrams in Fig. 3 and the phases in Fig. 4, especially at the tipping point for the traffic MIXED between L = 0.7 and L = 0.8, we classify the phase of the response as follows:
• let a_mean be increased by a step iff the phase is in (3π/8, π⟩;
• let a_mean be decreased by a step iff the phase is in (−π/2, 3π/8⟩;
• otherwise, let a_mean not be modified, as we cannot say anything decided enough about this range of delays.
In this simple demonstrative scheme, we do not take the amplification into account. Yet, if the received signal is weak or nonexistent, this may make the strategy just drift randomly. Note that we allow an arbitrary value of t_max, which obviously may lie outside any reasonable limits of D. This is done on purpose: we want to see if the accumulation of t_max, discussed in Sect. 4, helps in directing the optimisation, and also what happens with the quality if there is a possible runaway.
Figure 5a shows an optimisation process, where after each F = 27 oscillations of 354 Hz, an averaged response is processed with a Fast Fourier Transform, and then a_mean is modified with a step of a_step = a_mean^opt/20, depending on the phase. F and a_step were found experimentally as providing an adaptation time of the order of T_s, so that the level of noise which we considered in the averaged responses is more or less apt. Longer adaptation times would likely decrease the noise; see the case of 2T_s in Fig. 4, where the phase increases the most between the two borderline cases of L = 0.7 and L = 0.8. We also reduced the oscillation amplitude from A = 1 to 0.5 to better keep to the idea of a minimally intrusive modification of strategy. The adaptation of a_mean roughly follows the direction towards a smaller Ŝ_mean depicted by the triangles in Fig. 4, except for the borderline cases where a_mean changes undecidedly.
Figure 6 shows the performance of the frequency-domain strategy tuning, compared to the original strategy GLOBAL: given the limited effect of a_mean seen in the exaggerated Fig. 4, the tuning works well as a "tuning of tuning", even for the smaller oscillation amplitude. Predictably, there is no improvement for the borderline cases, i.e., a_mean is meaningless there. An occasional runaway effect, which makes a_mean < −2 × 10^−4 for L = 0.5 (compare Fig. 6 to Fig. 5), is detrimental. This shows the importance of keeping the strategy within D to keep the system robust; in Fig. 5b, we prevent a_mean from being negative.
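The classification rule and a single adaptation step can be sketched as below. The sketch assumes that the two phase intervals read (3π/8, π⟩ and (−π/2, 3π/8⟩, with the phase expressed in (−π, π⟩, and that the step equals a_mean^opt/20; the function names and the optional clamping flag, which mimics the variant of Fig. 5b, are ours.

```python
import math


def classify_phase(phase):
    """Return +1 (increase), -1 (decrease) or 0 (keep) for a phase in (-pi, pi]."""
    if 3.0 * math.pi / 8.0 < phase <= math.pi:
        return 1    # a_mean and the response are negatively related
    if -math.pi / 2.0 < phase <= 3.0 * math.pi / 8.0:
        return -1   # a_mean and the response are positively related
    return 0        # undecided range of delays: leave a_mean as it is


def adapt(a_mean, phase, a_opt, clamp_non_negative=False):
    """Apply one adaptation step of size a_opt / 20 in the classified direction."""
    step = a_opt / 20.0
    new_value = a_mean + classify_phase(phase) * step
    if clamp_non_negative:
        new_value = max(0.0, new_value)  # keep the strategy out of negative a_mean
    return new_value
```

In such a loop, adapt would be called once per averaged response, i.e., after each F oscillations, with the phase read from the Fourier transform at the stimulus frequency. Since the magnitude is ignored, a response consisting of pure noise still moves a_mean whenever its random phase falls into one of the two decided intervals, which is the random drift mentioned above.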

Discussion
The results point to a potential usefulness of traditional frequency-domain control loop methods in optimising a noisy telecommunication infrastructure, if a suitable range of frequencies is chosen. To increase robustness, we would consider an automated online method.
We work on a variant of the method with frequency modulation, where to test a response at some frequency f , we use a sequence of frequencies close to f . A cross-correlation technique similar to [8] might then be used to increase the resistance to noise, and thus improve the response quality.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.