We have implemented the approach described in Sect. 3 in different ways. The most important distinction between these approaches is the type of filter used to reduce the noise in the signal and to smooth the time series of observed data volume used for scaling, i.e., the concrete function used for f. Stemming from the field of signal processing, a common approach to separating noise from signal is to employ a low-pass filter [5]. We seek to improve the detection of edges and the separability in the Fourier domain [17] by proposing two non-linear filters: TVD [21] and EKF [15].
4.1 Linear Smoothing
Amongst the most basic methods in signal processing is linear smoothing (LS). Its essence is smoothing a noisy signal by setting each time series element to the arithmetic mean of its neighbors. In scenarios where live data is processed, only the past neighbors can be used, i.e., the window ends at the current element. Therefore, in its general variant, for a time series \(v_0, v_1, \dots , v_n\) and a given window width w, each filtered element \(\overline{v}_t\) is set to the following:
$$\begin{aligned} f_V(t) = \overline{v}_t = \frac{1}{w}\sum _{x = t - w + 1}^{t} v_x \end{aligned}$$
(5)
Alternative versions weight more recent elements more strongly or apply exponential smoothing. However, since all of these methods essentially compute a mean over a window of past elements, we implemented LS as a baseline reference. A major flaw of all LS algorithms is that they do not detect edges well. In the context of elasticity of stream processing, this means that changes in the volume are not detected immediately, and thus scaling operations are delayed by design.
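For illustration, the causal variant of LS can be sketched as follows (a minimal NumPy sketch of our own; the function name and the truncation of the window at the start of the series are illustrative assumptions, not details of the original implementation):

```python
import numpy as np

def linear_smoothing(v, w):
    """Causal moving average: each element becomes the mean of the
    last w samples, i.e., the window ends at the current element."""
    v = np.asarray(v, dtype=float)
    out = np.empty_like(v)
    for t in range(len(v)):
        lo = max(0, t - w + 1)  # truncate the window at the series start
        out[t] = v[lo:t + 1].mean()
    return out

# A sudden edge (0 -> 4) is reported only gradually, illustrating
# the scaling delay discussed above.
print(linear_smoothing([0, 0, 0, 4, 4, 4], 2))  # [0. 0. 0. 2. 4. 4.]
```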
4.2 Total Variation Denoising
A more advanced approach to smoothing is the one originally proposed in [21], commonly called TVD [22], or ROF, after the authors’ names [20]. The basic notion is that the total variation of a signal is to be minimized. Intuitively, TVD aims to remove the variation induced by noise while keeping the denoised signal as close to the original signal as possible, with respect to the least squares distance function. TVD is insensitive to the frequency ranges of noise and signal, making it more suitable for detecting sudden changes in near-real time than linear methods such as low-pass and high-pass filters or Fourier transforms.
Similarly to LS, TVD has one hyperparameter. In the case of TVD, this hyperparameter \(\alpha \) determines the degree of smoothing: \(\alpha = 0\) indicates no smoothing at all, i.e., the output of TVD is equal to its input, while larger values of \(\alpha \) yield stronger smoothing, which converges towards a steady state, the denoised signal, as \(\alpha \rightarrow \infty \) [21].
In its essence, the underlying TVD minimization problem proposed in the original work [21] is based on the assumption that the functional
$$\begin{aligned} v(x, y) = \overline{v}(x, y) + n(x, y) \end{aligned}$$
(6)
expresses the raw signal v as a function of the actual (smooth) signal \(\overline{v}\), and n, the additive noise. Following this, the minimization problem is stated as a problem of minimizing the variation (i.e., the integral of changes in gradients):
$$\begin{aligned} \text {minimize} \int _\varOmega \sqrt{v_x^2 + v_y^2} \end{aligned}$$
(7)
where \(\varOmega \) is the variable domain, and \(v_x\), \(v_y\) denote the partial derivatives of v with respect to x and y. Two additional constraints provided in [22], binding the mean and variance of the raw and the reproduced signal to each other, are not shown here.
In our application of TVD, we have no multivariate functions, i.e., our v, \(\overline{v}\), and n only depend on one (discrete) variable, which is the time t. Thus, we do not need to apply partial derivatives. Since we record discrete, digital measurements, our definition of variation is also discretized and reduced, as shown in (8).
$$\begin{aligned} \text {minimize} \sum _{x = 1}^n | v_x - v_{x - 1} | \end{aligned}$$
(8)
We have used this minimization problem, together with the original constraints, and applied the majorization-minimization algorithm described in [22], which majorizes the total variation term by a quadratic function, a methodology described in [7].
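A compact sketch of this procedure for the 1-D case is shown below (our own illustrative NumPy code using a dense solver for readability; the actual implementation and its constraint handling may differ, and the small eps guard against division by zero is an addition of ours):

```python
import numpy as np

def tvd_mm(y, alpha, n_iter=50, eps=1e-10):
    """1-D total variation denoising via majorization-minimization:
    minimize 0.5 * ||y - x||^2 + alpha * sum_i |x[i] - x[i-1]|."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    D = np.diff(np.eye(n), axis=0)  # first-difference operator, (n-1) x n
    DDT = D @ D.T
    x = y.copy()
    for _ in range(n_iter):
        # Majorize each |t| term by a quadratic around the current iterate,
        # which turns every MM step into a linear system solve.
        Lam = np.diag(np.abs(D @ x) + eps) / alpha
        x = y - D.T @ np.linalg.solve(Lam + DDT, D @ y)
    return x
```

Unlike LS, the result preserves sharp level changes: denoising a step-like signal flattens the noise around the edge while leaving the edge itself largely intact.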
4.3 Extended Kalman Filter
The EKF is a nonlinear generalization of the Kalman filter [15]. Kalman-type filters work by defining state transition and state observation models, taking into account the noise and its (co-)variance. Again, since we do not have a multivariate function, we only have one variable, which simplifies the computation.
The EKF is based on the notion that there is a transition model F and an observation model H:
$$\begin{aligned} \frac{dx}{dt}&= F(t)x + G(t)c(t)\end{aligned}$$
(9)
$$\begin{aligned} z(t)&= H(t)x(t) + n(t) \end{aligned}$$
(10)
where F(t) denotes the state transition, G(t) is the control (input) transition, c(t) is the control function, i.e., the input applied to the system, and x is the state. H(t) is the observation model, i.e., the measurement transformation, n(t) is the additive noise, and z is the observed state.
In our application, we have simplified the model in that we do not apply any input to the system, but only observe it. Thus, the entire term G(t)c(t) can be eliminated. As state x in the EKF notation, we have used the current volume (v in our notation), as well as the derivative (i.e., change in time, \(v'\)) of the current volume. Therefore, in our application of EKF, \(x = \left[ \begin{array}{l}v \\ v'\end{array}\right] \).
The term z(t) from the EKF notation corresponds to the resulting, filtered volume measurement \(\overline{v}\) in our notation. We have used this model in order to account for an unknown input in the estimation. In our case, the unknown input is the actual reason for the volume change, which is a factor we are not able to (generally) include in our model. We therefore let the EKF estimate the change in volume \(v'\) using only measurable data [8]. As the state transition, we use a matrix applying \(v'\) to v, i.e., we assume that without further input, the volume change is constantly applied to the volume. The source of the change itself is, in this model, part of the noise, i.e., n(t).
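Under these assumptions the state transition is linear, so the recursion reduces to the standard Kalman update; the following NumPy sketch illustrates such a filter with state \(x = [v, v']^\top \) (the noise covariances q and r are illustrative choices of ours, not values from our evaluation):

```python
import numpy as np

def kalman_volume_filter(measurements, dt=1.0, q=1e-3, r=1.0):
    """Filter volume measurements with state x = [v, v']: the volume
    change v' is constantly applied to v, and the cause of the change
    itself is absorbed into the noise term."""
    F = np.array([[1.0, dt], [0.0, 1.0]])  # state transition: v <- v + dt * v'
    H = np.array([[1.0, 0.0]])             # we observe only the volume v
    Q = q * np.eye(2)                      # process noise covariance
    R = np.array([[r]])                    # measurement noise covariance
    x = np.array([[measurements[0]], [0.0]])
    P = np.eye(2)
    filtered = []
    for z in measurements:
        # predict
        x = F @ x
        P = F @ P @ F.T + Q
        # update
        S = H @ P @ H.T + R                # innovation covariance
        K = P @ H.T @ np.linalg.inv(S)     # Kalman gain
        x = x + K @ (np.array([[z]]) - H @ x)
        P = (np.eye(2) - K @ H) @ P
        filtered.append(x[0, 0])
    return np.array(filtered)
```

Because the state includes the derivative \(v'\), the filter can follow a steadily growing volume without the systematic lag of a plain moving average.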