
1 Introduction

In the recent past, physics-informed data science has become a focus of research activities, e.g., [9]. It appears under different names, e.g., physics informed [12], hybrid learning [13], physics-based [17], etc.; but always with the same basic idea: embedding physical principles into the data science algorithms. The goal is to ensure that the results obtained obey the laws of physics and/or are based on physically relevant features. Discontinuities in the observations of continuous systems violate some very basic physics, and for this reason their detection is of fundamental importance. Consider Newton's second law of motion,

$$\begin{aligned} F(t) = \frac{\,\mathrm {d}}{\,\mathrm {d}t} \left\{ m(t) \, \frac{\,\mathrm {d}}{\,\mathrm {d}t}y(t) \right\} = \dot{m}(t)\,\dot{y}(t) + m(t)\,\ddot{y}(t). \end{aligned}$$
(1)

Any discontinuities in the observations of m(t), \(\dot{m}(t)\), y(t), \(\dot{y}(t)\) or \(\ddot{y}(t)\) indicate a violation of some basic principle: either the observation is incorrect, or something unexpected is happening in the system. Consequently, detecting discontinuities is of fundamental importance in physics-based data science. A function s(x) is said to be \(C^n\) discontinuous if \(s \in C^{n-1}{\setminus } C^n\), that is, if s(x) has continuous derivatives up to and including order \(n-1\), but the n-th derivative is discontinuous. Due to the discrete and finite nature of the observational data, only jump discontinuities in the n-th derivative are considered; asymptotic discontinuities are not. Furthermore, in more classical data modelling, \(C^n\) jump discontinuities form the basis for the locations of knots in B-spline models of observational data [15].

1.1 State of the Art

There are numerous approaches in the literature dealing with estimating regression functions that are smooth except at a finite number of points. These approaches can be classified into four groups: local polynomial methods, spline-based methods, kernel-based methods and wavelet methods. The approaches also vary with respect to the available a priori knowledge about the number of points of discontinuity or the derivative in which these discontinuities appear. For a good literature review of these methods, see [3]. The method presented in this paper relates both to local polynomial and to spline-based methods; however, the new approach requires no a priori knowledge about the data.

In the local polynomial literature, namely in [8] and [14], ideas similar to the ones presented here are investigated. In these papers, local polynomial approximations from the left and the right side of the point in question are used. The major difference is that neither of these methods uses constraints to ensure that the local polynomial approximations enforce continuity of the lower derivatives, which is done in this paper. As such, they use different residuals to determine the existence of a change point. Using constrained approximation ensures that the underlying physical properties of the system are taken into consideration, which is one of the main advantages of the approach presented here. Additionally, in the aforementioned papers, it is not clear whether only collocative points are considered as possible change points, or whether interstitial points are also considered. This distinction between collocative and interstitial points is of great importance. Fundamentally, the method presented here can be applied to discontinuities at either type of location. However, it is assumed here that discontinuities only make sense between the sampled (collocative) points, i.e., that the discontinuities are interstitial.

In [11], on the other hand, one polynomial is used instead of two, and the focus is mainly on detecting \(C^0\) and \(C^1\) discontinuities. Additionally, the number of change points must be known a priori, so only their locations are approximated; this required a priori knowledge makes the method unsuitable for real sensor-based system observation.

In the spline-based literature there are heuristic methods (top-down and bottom-up) as well as optimization methods. For a more detailed state of the art on splines, see [2]. Most heuristic methods use a discrete geometric measure to decide whether a point is a knot, such as discrete curvature, kink angle, etc., and then use some (mostly arbitrary) threshold to improve the initial knot set. In the method presented here, which falls under the category of bottom-up approaches, the selection criterion is based on calculus and statistics; this allows the fundamental physical laws governing the system to be incorporated in the model, while also ensuring mathematical relevance and rigour.

1.2 The New Approach

This paper presents a new approach to detecting \(C^n\) discontinuities in observational data. It uses constrained coupled polynomial approximation to obtain two estimates for the \(n^\text {th}\) Taylor coefficients and their uncertainties at every interstitial point. This corresponds to approximating the local function by polynomials, once from the left \(\mathsf {f}(x,\varvec{\alpha })\) and once from the right \(\mathsf {g}(x,\varvec{\beta })\). The constraints couple the polynomials to ensure that \(\alpha _i = \beta _i \,\,\, \text {for every}\, i \in [0 \ldots n-1]\). In this manner the approximations are \(C^{n-1}\) continuous at the interstitial points, while delivering an estimate for the difference in the \(n^\text {th}\) Taylor coefficients. All the derivations for the coupled constrained approximations and the numerical implementations are presented. Both the approximation and extrapolation residuals are derived. It is proven that the discontinuities must lie at local positive peaks in the extrapolation error. The new approach is verified both with known synthetic data and with real sensor data obtained from observing the operation of heavy machinery.

2 Detecting \(C^n\) Discontinuities

Discrete observations \(s(x_i)\) of a continuous system s(x) are, by their very nature, discontinuous at every sample. Consequently, a measure of discontinuity, together with its uncertainty, is required; this provides the basis for further analysis.

The observations are considered to be the collocative points, denoted by \(x_i\) and collectively by the vector \(\varvec{x}\); however, we wish to estimate the discontinuity at the interstitial points, denoted by \(\zeta _i\) and collectively as \(\varvec{\zeta }\). Using interstitial points ensures that each data point is used for only one polynomial approximation at a time. Furthermore, in the case of sensor data, one expects the discontinuities to happen between samples. Consequently, the data is segmented at the interstitial points, i.e., between the samples. This requires the use of interpolating functions, and in this work we have chosen to use polynomials.

Polynomials have been chosen because of their approximating, interpolating and extrapolating properties when modelling continuous systems: the Weierstrass approximation theorem [16] states that if f(x) is a continuous real-valued function defined on the real interval \(x \in [a, b]\), then for every \(\varepsilon > 0\) there exists a polynomial p(x) such that for all \(x \in [a, b]\), the supremum norm \(\Vert f(x) - p(x)\Vert _{\infty } < \varepsilon \). That is, any continuous function f(x) can be approximated by a polynomial to an arbitrary accuracy \(\varepsilon \), given a sufficiently high degree.
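This behaviour is easy to demonstrate numerically. The following sketch (our own illustration; the function name, interval and tolerance are assumptions, not from the paper) raises the degree of a least-squares polynomial fit until the sampled sup-norm error falls below a given \(\varepsilon \):

```python
import numpy as np

def min_degree_for_eps(f, a=-1.0, b=1.0, eps=1e-3, max_deg=30):
    """Illustrative sketch of the Weierstrass theorem: increase the degree of
    a least-squares polynomial fit until the sampled sup-norm error of the
    fit drops below eps on [a, b]."""
    x = np.linspace(a, b, 500)
    y = f(x)
    for d in range(1, max_deg + 1):
        p = np.polynomial.Polynomial.fit(x, y, d)
        if np.max(np.abs(y - p(x))) < eps:
            return d
    return None  # eps not reached within max_deg

# A smooth function needs only a modest degree to reach eps = 1e-3 on [-1, 1].
deg = min_degree_for_eps(np.exp)
```

For smooth functions the required degree grows slowly as \(\varepsilon \) shrinks; near a discontinuity, by contrast, no single polynomial of moderate degree suffices, which motivates the left/right split used below.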

The basic concept (see Fig. 1) to detect a \(C^n\) discontinuity is: to approximate the data to the left of an interstitial point by the polynomial \(\mathsf {f}(x,\varvec{\alpha })\) of degree \(d_L\) and to the right by \(\mathsf {g}(x,\varvec{\beta })\) of degree \(d_R\), while constraining these approximations to be \(C^{n-1}\) continuous at the interstitial point. This approximation ensures that,

$$\begin{aligned} \mathsf {f}^{(k-1)}(\zeta _i) = \mathsf {g}^{(k-1)}(\zeta _i), \quad \text { for every } k\, \in \left[ 1 \ldots n\right] . \end{aligned}$$
(2)

while yielding estimates for \(\mathsf {f}^{(n)}(\zeta _i)\) and \(\mathsf {g}^{(n)}(\zeta _i)\), together with estimates for their variances \(\lambda _{f(\zeta _i)}\) and \(\lambda _{g(\zeta _i)}\). This corresponds exactly to estimating the Taylor coefficients of the function twice for each interstitial point, i.e., once from the left and once from the right. If they differ significantly, then the function's \(n^\text {th}\) derivative is discontinuous at this point. The Taylor series of a function f(x) around the point a is defined as,

Fig. 1.

Schematic of a finite set of discrete observations (dotted circles) of a continuous function. The span of the observation is split into a left and right portion at the interstitial point (circle), with lengths \(l_L\) and \(l_R\) respectively. The left and right sides are considered to be the functions f(x) and g(x); modelled by the polynomials \(\mathsf {f}(x,\varvec{\alpha })\) and \(\mathsf {g}(x,\varvec{\beta })\) of degrees \(d_L\) and \(d_R\).

$$\begin{aligned} f(x) = \sum _{k=0}^{\infty }\frac{f^{\left( k\right) }\left( a\right) }{k!}\left( x-a\right) ^k \end{aligned}$$
(3)

for each x for which the infinite series on the right hand side converges. Furthermore, any function which is \(n+1\) times differentiable can be written as

$$\begin{aligned} f(x) = \tilde{\mathsf {f}}(x) + R(x) \end{aligned}$$
(4)

where \(\tilde{\mathsf {f}}(x)\) is an \(n^\text {th}\) degree polynomial approximation of the function f(x),

$$\begin{aligned} \tilde{\mathsf {f}}(x) = \sum _{k=0}^{n}\frac{f^{\left( k\right) }\left( a\right) }{k!}\left( x-a\right) ^k \end{aligned}$$
(5)

and R(x) is the remainder term. The Lagrange form of the remainder R(x) is given by

$$\begin{aligned} R(x) = \frac{f^{\left( n+1\right) }\left( \xi \right) }{\left( n+1\right) !}\left( x-a\right) ^{n+1} \end{aligned}$$
(6)

where \(\xi \) is a real number between a and x.

A Taylor expansion around the origin (i.e. \(a = 0\) in Eq. 3) is called a Maclaurin expansion; for more details, see [1]. In the rest of this work, the \(n^\text {th}\) Maclaurin coefficient for the function f(x) will be denoted by

$$\begin{aligned} t_{f}^{(n)} \triangleq \frac{f^{\left( n\right) }\left( 0\right) }{n!}. \end{aligned}$$
(7)

The coefficients of a polynomial \(\mathsf {f}(x,\varvec{\alpha }) = \alpha _n x^n\,+\,\ldots \,+\,\alpha _1 x\,+ \alpha _0\) are closely related to the coefficients of the Maclaurin expansion of this polynomial. Namely, it is easy to prove that

$$\begin{aligned} \alpha _k = t_\mathsf {f}^{(k)}, \quad \text { for every } k\, \in \left[ 0 \ldots n\right] . \end{aligned}$$
(8)

A prudent selection of a common local coordinate system, setting the interstitial point as the origin, ensures that the coefficients of the left and right approximating polynomials correspond to the derivative values at this interstitial point. Namely, one gets a very clear relationship between the coefficients of the left and right polynomial approximations, \(\varvec{\alpha }\) and \(\varvec{\beta }\), their Maclaurin coefficients, \(t_{\mathsf {f}}^{(n)}\) and \(t_{\mathsf {g}}^{(n)}\), and the values of the derivatives at the interstitial point

$$\begin{aligned} t_{\mathsf {f}}^{(n)} = \alpha _n = \frac{\mathsf {f}^{\left( n\right) }\left( 0\right) }{n!} \quad \text { and } \quad t_{\mathsf {g}}^{(n)} = \beta _n = \frac{\mathsf {g}^{\left( n\right) }\left( 0\right) }{n!}. \end{aligned}$$
(9)

From Eq. 9 it is clear that performing a left and right polynomial approximation at an interstitial point is sufficient to get the derivative values at that point, as well as their uncertainties.
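The identity in Eq. 8 is straightforward to verify numerically. In the sketch below (our own illustration), the ascending coefficients of a polynomial expressed in local coordinates are compared against its derivatives at the origin:

```python
import math
import numpy as np

# Check Eq. 8 numerically: for a polynomial written in a coordinate system
# with the point of interest at the origin, coefficient k equals f^(k)(0)/k!.
p = np.polynomial.Polynomial([2.0, -1.0, 0.5, 0.25])  # 2 - x + 0.5x^2 + 0.25x^3
maclaurin = [p.deriv(k)(0.0) / math.factorial(k) for k in range(4)]
# maclaurin now reproduces p.coef exactly
```

This is precisely why the interstitial point is taken as the local origin: the leading coefficients of the left and right fits then directly estimate the one-sided \(n^\text {th}\) derivatives at that point.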

3 Constrained and Coupled Polynomial Approximation

The goal here is to obtain \(\varDelta t_{\mathsf {fg}}^{\left( n\right) } \triangleq t_{\mathsf {f}}^{\left( n\right) } - t_{\mathsf {g}}^{\left( n\right) }\) via polynomial approximation. To this end two polynomial approximations are required; whereby, the interstitial point is used as the origin in the common coordinate system, see Fig. 1. The approximations are coupled [6] at the interstitial point by constraining the coefficients such that \(\alpha _i = \beta _i, \, \text {for every} \, i \in [0\ldots n-1]\). This ensures that the two polynomials are \(C^{n-1}\) continuous at the interstitial points. This also reduces the degrees of freedom during the approximation and with this the variance of the solution is reduced. For more details on constrained polynomial approximation see [4, 7].

To remain fully general, a local polynomial approximation of degree \(d_L\) is performed to the left of the interstitial point with the support length \(l_L\) creating \(\mathsf {f}(x,\varvec{\alpha })\); similarly to the right \(d_R\), \(l_R\), \(\mathsf {g}(x,\varvec{\beta })\). The x coordinates to the left, denoted as \(\varvec{x}_L\) are used to form the left Vandermonde matrix \(\varvec{V}_L\), similarly \(\varvec{x}_R\) form \(\varvec{V}_R\) to the right. This leads to the following formulation of the approximation process,

$$\begin{aligned} \varvec{y}_L = \varvec{V}_L \, \varvec{\alpha } \quad \text { and }\quad \varvec{y}_R = \varvec{V}_R \, \varvec{\beta }. \end{aligned}$$
(10)
$$\begin{aligned} \begin{bmatrix} \varvec{V}_L &{} \varvec{0} \\ \varvec{0} &{} \varvec{V}_R \end{bmatrix} \, \begin{bmatrix} \varvec{\alpha } \\ \varvec{\beta } \end{bmatrix} = \begin{bmatrix} \varvec{y}_L \\ \varvec{y}_R \end{bmatrix} \end{aligned}$$
(11)

A \(C^{n-1}\) continuity implies \(\alpha _i = \beta _i, \,\text {for every}\, i \in [0\ldots n-1]\) which can be written in matrix form as

$$\begin{aligned} \begin{bmatrix} \varvec{I}&{}\varvec{0}&{}-\varvec{I}&{}\varvec{0} \end{bmatrix} \, \begin{bmatrix} \varvec{\alpha } \\ \varvec{\beta } \end{bmatrix} = \varvec{0}. \end{aligned}$$
(12)

Defining

$$\begin{aligned} \varvec{V} = \begin{bmatrix} \varvec{V}_L &{} \varvec{0} \\ \varvec{0} &{} \varvec{V}_R \end{bmatrix}, \quad \varvec{\gamma } = \begin{bmatrix} \varvec{\alpha } \\ \varvec{\beta } \end{bmatrix}, \quad \varvec{y} = \begin{bmatrix} \varvec{y}_L \\ \varvec{y}_R \end{bmatrix}, \quad \varvec{C} = \begin{bmatrix} \varvec{I}&{}\varvec{0}&{}-\varvec{I}&{}\varvec{0} \end{bmatrix}, \end{aligned}$$

we obtain the task of least squares minimization with homogeneous linear constraints,

$$\begin{aligned} \min _{\varvec{\gamma }} \Vert \varvec{y} - \varvec{V} \, \varvec{\gamma } \Vert _2^2 \quad \text {such that} \quad \varvec{C} \, \varvec{\gamma } = \varvec{0}. \end{aligned}$$
(13)

Clearly \(\varvec{\gamma }\) must lie in the null-space of \(\varvec{C}\); now, given \(\varvec{N}\), an orthonormal basis for \(\mathop {\mathrm {null}}\left\{ \varvec{C}\right\} \), we obtain,

$$\begin{aligned} \varvec{\gamma } = \varvec{N} \, \varvec{\delta }. \end{aligned}$$
(14)

Back-substituting into Eq. 13 yields,

$$\begin{aligned} \min _{\varvec{\delta }} \Vert \varvec{y} - \varvec{V} \, \varvec{N} \, \varvec{\delta } \Vert _2^2 \end{aligned}$$
(15)

The least squares solution to this problem is,

$$\begin{aligned} \varvec{\delta } = \left( \varvec{V} \, \varvec{N} \right) ^+ \, \varvec{y}, \end{aligned}$$
(16)

and consequently,

$$\begin{aligned} \varvec{\gamma } = \varvec{N} \, \varvec{\delta } = \varvec{N} \, \left( \varvec{V} \, \varvec{N} \right) ^+ \, \varvec{y}. \end{aligned}$$
(17)

Formulating the approximation in the above manner ensures that the difference in the Taylor coefficients can be simply computed as

$$\begin{aligned} \varDelta t_{\mathsf {fg}}^{\left( n\right) } = t_{\mathsf {f}}^{\left( n\right) } - t_{\mathsf {g}}^{\left( n\right) } = \alpha _n - \beta _n. \end{aligned}$$
(18)

Now defining \(\varvec{d} = [1, \, \varvec{0}_{d_L}, \, -1, \, \varvec{0}_{d_R}]^\mathrm {T}\), \(\varDelta t_{\mathsf {fg}}^{\left( n\right) }\) is obtained from \(\varvec{\gamma }\) as

$$\begin{aligned} \varDelta t_{\mathsf {fg}}^{\left( n\right) } = \varvec{d}^{\text {T}}\varvec{\gamma } = \varvec{d}^{\text {T}}\varvec{N} \, \left( \varvec{V} \, \varvec{N} \right) ^+ \, \varvec{y}. \end{aligned}$$
(19)

3.1 Covariance Propagation

Defining \(\varvec{K} = \varvec{N} \, \left( \varvec{V} \, \varvec{N} \right) ^+\) yields \(\varvec{\gamma } = \varvec{K} \, \varvec{y}\). Then, given the covariance of \(\varvec{y}\), i.e., \(\varvec{\varLambda }_{\varvec{y}}\), one gets

$$\begin{aligned} \varvec{\varLambda }_{\varvec{\gamma }} = \varvec{K} \, \varvec{\varLambda }_{\varvec{y}} \, \varvec{K}^{\text {T}}, \end{aligned}$$
(20)

which, for i.i.d. observation noise with variance \(\sigma ^2\), i.e., \(\varvec{\varLambda }_{\varvec{y}} = \sigma ^2 \varvec{I}\), reduces to

$$\begin{aligned} \varvec{\varLambda }_{\varvec{\gamma }} = \sigma ^2 \, \varvec{K} \, \varvec{K}^{\text {T}}. \end{aligned}$$
(21)

Additionally, from Eq. 19 one can derive the variance of the difference in the Taylor coefficients

$$\begin{aligned} \varvec{\varLambda }_{\varvec{\varDelta }} = \varvec{d}^{\text {T}}\varvec{\varLambda }_{\varvec{\gamma }}\varvec{d}. \end{aligned}$$
(22)

Keep in mind that, if one uses approximating polynomials of degree n to determine a discontinuity in the \(n^{\text {th}}\) derivative, as done so far, \(\varvec{\varLambda }_{\varvec{\varDelta }}\) is just a scalar and corresponds to the variance of \(\varDelta t_{\mathsf {fg}}^{\left( n\right) }\).
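The propagation above amounts to a few matrix products. The sketch below (our own example; the sample points, noise level and selection vector are assumptions, and the coefficient ordering is ascending rather than the paper's) computes the variance of the jump for a small \(n = 1\) case:

```python
import numpy as np
from scipy.linalg import null_space

# Sketch of the covariance propagation in Eqs. 20-22 for degree-1 fits left
# and right of the local origin, coupled by alpha_0 = beta_0 (n = 1).
x_left = np.array([-0.3, -0.2, -0.1])
x_right = np.array([0.1, 0.2, 0.3])
V_L = np.vander(x_left, 2, increasing=True)
V_R = np.vander(x_right, 2, increasing=True)
V = np.block([[V_L, np.zeros((3, 2))], [np.zeros((3, 2)), V_R]])
C = np.array([[1.0, 0.0, -1.0, 0.0]])       # alpha_0 = beta_0
N = null_space(C)
K = N @ np.linalg.pinv(V @ N)               # gamma = K y
sigma = 0.05
Lambda_y = sigma**2 * np.eye(6)             # i.i.d. observation noise
Lambda_gamma = K @ Lambda_y @ K.T           # Eq. 20
d = np.array([0.0, 1.0, 0.0, -1.0])         # picks alpha_1 - beta_1 here
var_delta = d @ Lambda_gamma @ d            # Eq. 22, a scalar in this case
```

The resulting scalar `var_delta` is what permits a significance test on \(\varDelta t_{\mathsf {fg}}^{\left( n\right) }\) at each interstitial point.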

4 Error Analysis

In this paper we consider three measures for error:

  1. the norm of the approximation residual;

  2. the combined approximation and extrapolation error;

  3. the extrapolation error.

4.1 Approximation Error

The residual vector has the form

$$\begin{aligned} \varvec{r} =\varvec{y}-\varvec{V}\varvec{\gamma } =\begin{bmatrix} \varvec{y}_L-\varvec{V}_L\varvec{\alpha }\\ \varvec{y}_R-\varvec{V}_R\varvec{\beta } \end{bmatrix}. \end{aligned}$$

The approximation error is calculated as

$$\begin{aligned}&E_a = \Vert \varvec{r}\Vert _2^2 = \Vert \varvec{y}_L-\varvec{V}_L\varvec{\alpha }\Vert _2^2 + \Vert \varvec{y}_R-\varvec{V}_R\varvec{\beta }\Vert _2^2\\ =&\left( \varvec{y}_L-\varvec{V}_L\varvec{\alpha }\right) ^\mathrm {T}\left( \varvec{y}_L-\varvec{V}_L\varvec{\alpha }\right) + \left( \varvec{y}_R-\varvec{V}_R\varvec{\beta }\right) ^\mathrm {T}\left( \varvec{y}_R-\varvec{V}_R\varvec{\beta }\right) \\ =&\varvec{y}^{\text {T}}\varvec{y} - 2\varvec{\alpha }^{\text {T}}\varvec{V}^{\text {T}}_L\varvec{y}_L + \varvec{\alpha }^{\text {T}}\varvec{V}^{\text {T}}_L\varvec{V}_L\varvec{\alpha } - 2\varvec{\beta }^{\text {T}}\varvec{V}^{\text {T}}_R\varvec{y}_R + \varvec{\beta }^{\text {T}}\varvec{V}^{\text {T}}_R\varvec{V}_R\varvec{\beta }. \end{aligned}$$
Fig. 2.

Schematic of the approximations around the interstitial point. Red: left polynomial approximation \(\mathsf {f}(x,\varvec{\alpha })\); dotted red: extrapolation of \(\mathsf {f}(x,\varvec{\alpha })\) to the RHS; blue: right polynomial approximation, \(\mathsf {g}(x,\varvec{\beta })\); dotted blue: extrapolation of \(\mathsf {g}(x,\varvec{\beta })\) to the LHS; \(\varepsilon _i\) is the vertical distance between the extrapolated value and the observation. The approximation is constrained with the conditions: \(\mathsf {f}(0,\varvec{\alpha }) = \mathsf {g}(0,\varvec{\beta })\) and \(\mathsf {f}'(0,\varvec{\alpha }) = \mathsf {g}'(0,\varvec{\beta })\). (Color figure online)

4.2 Combined Error

The basic concept, which can be seen in Fig. 2, is as follows: the left polynomial \(\mathsf {f}\left( x,\varvec{\alpha }\right) \), which approximates over the values \(\varvec{x}_L\), is extended to the right and evaluated at the points \(\varvec{x}_R\). Analogously, the right polynomial \(\mathsf {g}\left( x,\varvec{\beta }\right) \) is evaluated at the points \(\varvec{x}_L\). If there is no \(C^n\) discontinuity in the system, the polynomials \(\mathsf {f}\) and \(\mathsf {g}\) must be equal, and consequently the extrapolated values will not differ significantly from the approximated values.

Analytical Combined Error. The extrapolation error in a continuous case, i.e. between the two polynomial models, can be computed with the following 2-norm,

$$\begin{aligned} \varepsilon _x = \int _{x_{min}}^{x_{max}} \left\{ \mathsf {f}(x,\varvec{\alpha }) - \mathsf {g}(x,\varvec{\beta }) \right\} ^2 \, \,\mathrm {d}x. \end{aligned}$$
(23)

Given the constraints, which ensure that \(\alpha _i = \beta _i \,\text { for } i \in [0 \ldots n-1]\), we obtain,

$$\begin{aligned} \varepsilon _x = \int _{x_{min}}^{x_{max}} \left\{ (\alpha _{n} - \beta _{n} ) \, x^{n} \right\} ^2 \, \,\mathrm {d}x. \end{aligned}$$
(24)

Expanding and performing the integral yields,

$$\begin{aligned} \varepsilon _x = (\alpha _{n} - \beta _{n})^2 \, \left\{ \frac{x_{max}^{2n + 1} - x_{min}^{2n + 1}}{2n + 1} \right\} \end{aligned}$$
(25)

Fixed values of \(x_{min}\) and \(x_{max}\) across a single computation imply that the factor,

$$\begin{aligned} k = \frac{x_{max}^{2n + 1} - x_{min}^{2n + 1}}{2n + 1} \end{aligned}$$
(26)

is a constant. Consequently, the extrapolation error is directly proportional to the square of the difference in the Taylor coefficients,

$$\begin{aligned} \varepsilon _x \propto \, \left( \alpha _n - \beta _n\right) ^2 \propto \, \left\{ \varDelta t_{\mathsf {fg}}^{\left( n\right) }\right\} ^2. \end{aligned}$$
(27)

Numerical Combined Error. In the discrete case, one can write the errors of \(\mathsf {f}(x,\varvec{\alpha })\) and \(\mathsf {g}(x,\varvec{\beta })\) as

$$\begin{aligned} \varvec{e}_\mathsf {f} = \varvec{y}-\mathsf {f}(\varvec{x},\varvec{\alpha }) \quad \text {and} \quad \varvec{e}_\mathsf {g} = \varvec{y}-\mathsf {g}(\varvec{x},\varvec{\beta }) \end{aligned}$$
(28)

respectively. Consequently, one could define an error function as

$$\begin{aligned}&E_{\mathsf {f}\mathsf {g}} = \Vert \varvec{e}_\mathsf {f}-\varvec{e}_\mathsf {g}\Vert _2^2 = \Vert (\alpha _{n} - \beta _{n} ) \, \varvec{z}\Vert _2^2 = (\alpha _{n} - \beta _{n} )^2 \, \varvec{z}^{\text {T}} \varvec{z} = (\alpha _{n} - \beta _{n} )^2\sum _i {x_i^{2n}} \end{aligned}$$
(29)

where \(\varvec{z}\) is the vector with entries \(z_i = x_i^{n}\). From these calculations it is clear that in the discrete case the error is also directly proportional to the square of the difference in the Taylor coefficients, and that \( E_{\mathsf {f}\mathsf {g}} \propto \varepsilon _x\). This shows that the numerical computation is consistent with the analytical continuous error.

4.3 Extrapolation Error

One could also define a different kind of error, based just on the extrapolative properties of the polynomials. Namely, using the notation from the beginning of Sect. 3, one defines

$$\begin{aligned} \varvec{r}_{e\mathsf {f}} = \varvec{y}_L-\mathsf {g}(\varvec{x}_L,\varvec{\beta }) = \varvec{y}_L-\varvec{V}_L\varvec{\beta } \quad \text { and } \quad \varvec{r}_{e\mathsf {g}} = \varvec{y}_R-\mathsf {f}(\varvec{x}_R,\varvec{\alpha }) = \varvec{y}_R-\varvec{V}_R\varvec{\alpha } \end{aligned}$$

and then calculates the error as

$$\begin{aligned}&E_{e} = \varvec{r}^{\text {T}}_{e\mathsf {f}}\varvec{r}_{e\mathsf {f}} + \varvec{r}^{\text {T}}_{e\mathsf {g}}\varvec{r}_{e\mathsf {g}}\\ =&\left( \varvec{y}_L-\varvec{V}_L\varvec{\beta }\right) ^\mathrm {T} \left( \varvec{y}_L-\varvec{V}_L\varvec{\beta }\right) + \left( \varvec{y}_R-\varvec{V}_R\varvec{\alpha }\right) ^\mathrm {T} \left( \varvec{y}_R-\varvec{V}_R\varvec{\alpha }\right) \\ =&\varvec{y}^{\text {T}}\varvec{y} -2\varvec{\beta }^{\text {T}}\varvec{V}^{\text {T}}_L \varvec{y}_L + \varvec{\beta }^{\text {T}}\varvec{V}^{\text {T}}_L\varvec{V}_L\varvec{\beta } - 2\varvec{\alpha }^{\text {T}}\varvec{V}^{\text {T}}_R\varvec{y}_R + \varvec{\alpha }^{\text {T}}\varvec{V}^{\text {T}}_R\varvec{V}_R\varvec{\alpha }. \end{aligned}$$

In the example in Sect. 5, it will be seen that there is no significant numerical difference between these two errors.
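All three measures reduce to sums of squared residuals. The helper below is our own sketch (it assumes equal left and right degrees, local coordinates, and ascending coefficient ordering), evaluating \(E_a\), \(E_{\mathsf {f}\mathsf {g}}\) and \(E_e\) for given coupled coefficient vectors:

```python
import numpy as np

def error_measures(x_left, y_left, x_right, y_right, alpha, beta):
    """Sketch of the three error measures of Sect. 4 (our own helper)."""
    V_L = np.vander(np.asarray(x_left), len(alpha), increasing=True)
    V_R = np.vander(np.asarray(x_right), len(beta), increasing=True)
    # approximation error E_a: each fit against its own data
    E_a = np.sum((y_left - V_L @ alpha) ** 2) + np.sum((y_right - V_R @ beta) ** 2)
    # combined error E_fg: pointwise difference of the two polynomials (Eq. 29);
    # beta - alpha is nonzero only in the n-th entry due to the coupling
    d = np.asarray(beta) - np.asarray(alpha)
    E_fg = np.sum((V_L @ d) ** 2) + np.sum((V_R @ d) ** 2)
    # extrapolation error E_e: each polynomial against the *other* side's data
    E_e = np.sum((y_left - V_L @ beta) ** 2) + np.sum((y_right - V_R @ alpha) ** 2)
    return E_a, E_fg, E_e
```

For exactly linear data \(y = x\) on the left and \(y = 2x\) on the right, with coupled fits \(\varvec{\alpha } = (0, 1)\) and \(\varvec{\beta } = (0, 2)\), the approximation error vanishes and \(E_{\mathsf {f}\mathsf {g}} = E_e\), mirroring the observation that the two errors need not differ significantly.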

5 Numerical Testing

The numerical testing is performed with: synthetic data from a piecewise polynomial, where the locations of the \(C^n\) discontinuities are known; and with real sensor data emanating from the monitoring of heavy machinery.

5.1 Synthetic Data

In the literature on splines, functions of the type \(y\left( x\right) = e^{-x^2}\) are commonly used. However, this function is analytic and \(C^{\infty }\) continuous; consequently, it is not a suitable test function. Fig. 3 shows a piecewise polynomial with a similar shape, but with \(C^2\) discontinuities at known locations. The algorithm was applied to the synthetic data from this piecewise polynomial with added Gaussian noise (\(\sigma = 0.05\)); the results for a single case can be seen in Fig. 3. Additionally, a Monte Carlo simulation with \(m=10000\) iterations was performed and the results of the algorithm were compared to the true locations of the two known knots. The mean errors in the locations of the knots are \(\mu _1 = (5.59 \pm 2.05) \times 10^{-4}\) and \(\mu _2 = (-4.62 \pm 1.94)\times 10^{-4}\), both with \(95 \%\) confidence. Errors on the scale of \(10^{-4}\), for a support with range \([0,\,1]\) and \(5 \%\) noise amplitude, can be considered a highly satisfactory result.
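The scanning procedure itself can be sketched compactly. The example below is our own simplification for the \(C^0\) case (where no coupling constraints are needed): it slides over the interstitial points of a noisy step signal, fits local linear polynomials in local coordinates on each side, and locates the discontinuity at the peak of \(|\alpha _0 - \beta _0|\). All parameters (window width, noise level, step location) are assumptions for illustration:

```python
import numpy as np

# Sketch: scan interstitial points and estimate the C^0 jump as the
# difference of left/right local linear fits evaluated at each interstitial
# point (a simplification of the coupled scheme for n = 0).
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 101)
y = np.where(x < 0.4, x, x + 0.5) + rng.normal(0, 0.01, x.size)  # step near 0.4

w = 5  # samples per side
jumps = []
for i in range(w, x.size - w):
    zeta = 0.5 * (x[i - 1] + x[i])  # interstitial point, taken as local origin
    a = np.polynomial.polynomial.polyfit(x[i - w:i] - zeta, y[i - w:i], 1)
    b = np.polynomial.polynomial.polyfit(x[i:i + w] - zeta, y[i:i + w], 1)
    jumps.append(a[0] - b[0])       # alpha_0 - beta_0
i_hat = w + int(np.argmax(np.abs(jumps)))
zeta_hat = 0.5 * (x[i_hat - 1] + x[i_hat])  # detected interstitial location
```

On this signal the peak of the jump measure lands at the interstitial point straddling the step, consistent with the sub-sample accuracy reported above for the full \(C^2\) scheme.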

Fig. 3.

A piecewise polynomial of degree \(d=2\), created from the knots sequence \(\varvec{x}_k = [0, 0.3, 0.7, 1]\) with the corresponding values \(\varvec{y}_k = [0, 0.3, 0.7, 1]\). The end points are clamped with \(y'(x)_{0,1} = 0\). Gaussian noise is added with \(\sigma = 0.05\). Top: the circles mark the known points of \(C^2\) discontinuity; the blue and red lines indicate the detected discontinuities; additionally the data has been approximated by the b-spline (red) using the detected discontinuities as knots. Bottom: shows \(\varDelta t^{(n)}_{\mathsf {fg}} = t^{(n)}_\mathsf {f} - t^{(n)}_\mathsf {g}\), together with the two identified peaks. (Color figure online)

Fig. 4.

The top-most graph shows a function y(x), together with the detected \(C^1\) discontinuity points. The middle graph shows the difference in the Taylor polynomials \(\varDelta t_{\mathsf {fg}}^{\left( n\right) }\) calculated at every interstitial point. The red and blue circles mark the relevant local maxima and minima of the difference respectively. According to this, the red and blue lines are drawn in the top-most graph. The bottom graph shows the approximation error evaluated at every interstitial point. (Color figure online)

Fig. 5.

The two error functions, \(E_e\) and \(E_{\mathsf {f}\mathsf {g}}\) as defined in Sect. 4, for the example from Fig. 4. One can see that the location of the peaks doesn’t change, and the two errors don’t differ significantly.

5.2 Sensor Data

The algorithm was also applied to a set of real-world sensor data emanating from the monitoring of heavy machinery. The original data set can be seen in Fig. 4 (top). It has many local peaks and periods of little or no change, so the algorithm was used to detect discontinuities in the first derivative, in order to determine the peaks and phases. The peaks in the Taylor differences were used in combination with the peaks of the extrapolation error to determine the points of discontinuity. A peak in the Taylor differences means that the Taylor coefficients at that interstitial point differ significantly from those at neighbouring interstitial points. However, if there is no peak in the extrapolation error at the same location, then the peak found by the Taylor differences is deemed insignificant, since a single polynomial could model both the left and right values, and as such the peak is not a discontinuity. Additionally, it can be seen in Fig. 5 that both the extrapolation error and the combined error, as defined in Sect. 4, have peaks at the same locations, and as such the results they provide do not differ significantly.

6 Conclusion and Future Work

It may be concluded from the results achieved that coupled constrained polynomial approximation yields a good method for detecting \(C^n\) discontinuities in discrete observational data of continuous systems. Local peaks in the square of the difference of the Taylor coefficients provide a relative measure for determining the locations of discontinuities.

Current investigations indicate that the method can be implemented directly as a convolutional operator, which will yield a computationally efficient solution. The use of discrete orthogonal polynomials [5, 10] is being tested as a means of improving the sensitivity of the results to numerical perturbations.