1 Introduction

It is well known that Bayesian filtering provides a general recursive solution to the filtering problem by inferring the posterior probability density function (PDF) of a state space model with hidden states [1,2,3,4,5]. For a linear Gaussian system, the well-known Kalman filter (KF) efficiently and accurately yields the analytical solution of the Bayesian inference [6,7,8]. However, for a nonlinear/non-Gaussian system, the analytical solution of the Bayesian inference cannot be obtained by propagating a finite set of parameters. Therefore, several approximate methods have been proposed to solve for the posterior PDF and estimate the state in engineering applications [9,10,11,12].

According to their approximation principles, existing nonlinear filtering methods can be grouped into local inference and global inference [13, 14]. These approaches usually revolve around two core problems: (1) how to accurately describe the posterior PDF at the current step, so as to obtain an estimate close to the Bayesian inference; (2) how to accurately transfer the posterior PDF from the current step to the next, so as to ensure the continuous execution of recursive Bayesian inference.

For local inference, the typical representatives are the Kalman-like filters, including the extended KF (EKF) [15,16,17], the unscented KF (UKF) [18, 19], the cubature KF (CKF) [20, 21] and the quadrature KF (QKF) [13, 22]. All of these methods use Gaussian distributions to approximate the real posterior PDF, thereby addressing core problem (1); they differ in how they address core problem (2). The EKF linearizes a nonlinear system with a first-order Taylor series expansion, so that the posterior PDF is transferred through the linearized part of the system. The UKF describes a Gaussian distribution using 2d + 1 sigma points before transferring the posterior PDF, where d denotes the dimension of the state. In the CKF, the third-degree spherical–radial cubature rule is used to handle the uncertainty of a nonlinear system, so the calculation of the cubature point set becomes the key to transferring the posterior PDF. Similarly, the QKF uses a quadrature rule to obtain the sigma point set and transfer the posterior PDF. References [14, 23] approximate the posterior PDF using a multiparameter mixture distribution; the resulting filter is referred to as the belief condensation filter (BCF). In this method, the distribution model is preset, and an optimization algorithm based on the Kullback–Leibler divergence [24] is used to calculate the parameters of the distribution. All the above methods use finite-parameter distributions to describe the real posterior PDF.

For global inference, a typical representative is the family of sequential Monte Carlo (SMC)-based filtering methods [16, 25, 26]. As the major branch of the SMC methods, the particle filter (PF) uses many weighted particles to describe the posterior PDF and simulate the propagation of the probability distribution [27, 28]. Weighted particles can flexibly describe an arbitrary distribution, which gives the PF great potential for solving complex nonlinear/non-Gaussian filtering problems [29, 31]. However, the development of the PF has long been hindered by particle degeneracy, sample impoverishment and the curse of dimensionality, which can degrade accuracy and efficiency and even lead to filtering divergence [32,33,34,35,36]. In addition, computational complexity is a crucial concern in the PF, and it can significantly restrict the applicability of nonlinear filtering methods [37, 38]. Given these advantages and disadvantages, researchers have proposed several techniques to enhance the stability, accuracy and efficiency of the PF, mainly including resampling techniques [39,40,41,42,43,44], improvements to the proposal distribution [45,46,47] and integration techniques [48, 49]. For mixed linear/nonlinear state space models, the Rao-Blackwellized PF (RBPF) uses the KF and the PF to estimate the linear and nonlinear parts of the state, respectively; this alleviates the curse of dimensionality suffered by the PF [11, 50]. Furthermore, a stratification technique is used in [51] to compress the statistical information contained in a set of random particles: the number of particles is compressed according to the probability of different regions, so as to improve the filtering efficiency. In summary, the underlying idea of PF-like methods is that the high probability region of the posterior PDF dominates the performance of filtering.
Therefore, improving PF-like methods essentially amounts to finding ways to use particles more efficiently when describing the posterior PDF. Another typical representative of global inference is the grid-based filter (GF) [9, 10]. Like the KF, the GF is an optimal filtering method, but only when the state space is discrete and consists of a finite number of states. In this case, the integrals of recursive Bayesian filtering can be evaluated numerically as a discrete, nonlinear convolution [52,53,54]. However, the application scope of the GF is quite limited, so it received little attention from scholars for a long time.

Generally speaking, local inference approximates the posterior PDF using one known distribution or a mixture of several. Its advantage is its small computational cost and hence high efficiency. Among global inference methods, the SMC methods approximate the posterior PDF by accumulating a large number of weighted particles and transfer it by the Monte Carlo method; they require much more computation than local inference methods. The GF uses numerical integration to describe and transfer the posterior PDF, but it places strict constraints on the system. Nevertheless, the global inference methods offer better accuracy and robustness when dealing with some particularly difficult nonlinear filtering problems.

In this paper, a novel nonparametric filtering method, called the confidence partitioning sampling filter (CPSF), is proposed. We absorb the SMC idea of focusing on the high probability regions. On this basis, the concept of the confidence probability space (CPS) is defined and used to select the high probability region of the posterior PDF as the posterior CPS. Then, grid samples are generated in the posterior CPS. The posterior probability of each grid sample, also called the weight of the sample, is calculated from an approximate expression of the posterior PDF, where this expression is obtained by combining the numerical integration idea of the GF with the posterior-PDF transfer methods of local inference. Finally, the weighted grid samples describe the posterior PDF at that time step. The transfer of the posterior PDF revolves around obtaining this approximate expression. The main contributions of this study are as follows:

  1. The CPSF framework is established as an approach to solving the generic nonlinear filtering problem in theory. First, the concept of the CPS is defined to maximally compress the state space under a given confidence condition. Second, the partitioning sampling technique (PST) is performed in the posterior CPS to obtain weighted grid samples describing the posterior PDF. The properties of the impulse function under integration guide us to the approximate expression of the posterior PDF, which is used in the PST. Finally, the CPSF framework is formed by selecting the posterior CPS and executing the PST repeatedly, which approximates the Bayesian inference.

  2. The SI_CPSF algorithm is proposed as a specific nonlinear filtering algorithm. To overcome the difficulty of obtaining the real posterior CPS in the CPSF framework, the space intersection (SI) method is proposed to obtain an approximate posterior CPS. This yields an executable nonlinear filtering algorithm, called the SI_CPSF algorithm, for solving generic nonlinear filtering problems.

  3. The approximation error between the CPSF framework and the recursive Bayesian filter is analyzed theoretically. The error is divided into two parts, the space truncation error and the quantization error, which are controlled by two parameters of the CPSF framework: the confidence and the sampling interval. Furthermore, the consistency of the CPSF framework with the Bayesian filter is proved theoretically.

  4. The performance of the proposed algorithm, including its accuracy, efficiency and robustness, is evaluated in four representative simulation experiments.

The structure of this paper is as follows: In Sect. 2, the problem model and Bayesian filtering are introduced. In Sect. 3, the CPSF framework and the SI_CPSF algorithm, which is derived from the SI technique, are presented in detail. In Sect. 4, the approximation error of the CPSF framework with respect to the Bayesian filter is discussed at length, and the consistency of the CPSF framework with the Bayesian filter is proved. In Sect. 5, simulation results describing the performance of the SI_CPSF algorithm are provided. In Sect. 6, the paper is concluded.

2 Problem statement

The discrete state space model of the general filtering problem at time step k can be described as follows:

$${\mathbf{x}}_{k} = {\mathbf{f}}\left( {{\mathbf{x}}_{k - 1} } \right) + {\mathbf{u}}_{k}$$
(1)
$${\mathbf{y}}_{k} = {\mathbf{h}}\left( {{\mathbf{x}}_{k} } \right) + {\mathbf{v}}_{k}$$
(2)

where \({\mathbf{x}}_{k}\) and \({\mathbf{y}}_{k}\) are the state and the observation, respectively. \({\mathbf{u}}_{k}\) and \({\mathbf{v}}_{k}\) are noise sequences with \({\mathbf{u}}_{k} \sim p_{{\mathbf{u}}} \left( {\mathbf{x}} \right)\) and \({\mathbf{v}}_{k} \sim p_{{\mathbf{v}}} \left( {\mathbf{y}} \right)\), independent of each other. \({\mathbf{f}}\) and \({\mathbf{h}}\) denote the state model and the observation model, respectively. Alternatively, the complete filtering process can be described as a transition of probability distributions: the state transition distribution \(p\left( {{\mathbf{x}}_{k} \left| {{\mathbf{x}}_{k - 1} } \right.} \right)\) can be obtained from (1) and \(p_{{\mathbf{u}}} \left( {\mathbf{x}} \right)\), and the likelihood distribution \(p\left( {{\mathbf{y}}_{k} \left| {{\mathbf{x}}_{k} } \right.} \right)\) can be obtained from (2) and \(p_{{\mathbf{v}}} \left( {\mathbf{y}} \right)\).

The objective of recursive Bayesian filtering is to obtain the posterior PDF at the current time step and thereby achieve the state estimation [42]. Recursive Bayesian filtering can be divided into two steps: prediction and update.

The prediction step infers the prior distribution at time step k, as follows:

$$p\left( {{\mathbf{x}}_{k} \left| {{\mathbf{y}}_{1:k - 1} } \right.} \right) = \int {p\left( {{\mathbf{x}}_{k} \left| {{\mathbf{x}}_{k - 1} } \right.} \right)p\left( {{\mathbf{x}}_{k - 1} \left| {{\mathbf{y}}_{1:k - 1} } \right.} \right)d{\mathbf{x}}_{k - 1} }$$
(3)

The update step infers the posterior distribution at time step k, as follows:

$$p\left( {{\mathbf{x}}_{k} \left| {{\mathbf{y}}_{1:k} } \right.} \right) = \frac{{p\left( {{\mathbf{y}}_{k} \left| {{\mathbf{x}}_{k} } \right.} \right)p\left( {{\mathbf{x}}_{k} \left| {{\mathbf{y}}_{1:k - 1} } \right.} \right)}}{{p\left( {{\mathbf{y}}_{k} \left| {{\mathbf{y}}_{1:k - 1} } \right.} \right)}}$$
(4)
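As a concrete illustration of the recursion (3)–(4), the prediction and update steps can be carried out numerically on a fixed grid. The following is a minimal sketch assuming a hypothetical scalar model \(x_k = 0.9x_{k-1} + u_k\), \(y_k = x_k + v_k\) with Gaussian noises; all constants are illustrative and not taken from this paper.

```python
import math

def gauss(x, mu, sigma):
    # Univariate Gaussian density.
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

dx = 0.1
grid = [i * dx for i in range(-100, 101)]            # discretized state space
post = [gauss(x, 0.0, 1.0) for x in grid]            # posterior at step k-1

def bayes_step(post, y, q=0.5, r=0.5):
    # Prediction (3): discrete convolution with the transition density.
    prior = [sum(gauss(x, 0.9 * xp, q) * pj * dx for xp, pj in zip(grid, post))
             for x in grid]
    # Update (4): multiply by the likelihood and normalize.
    upd = [gauss(y, x, r) * p for x, p in zip(grid, prior)]
    z = sum(upd) * dx
    return [p / z for p in upd]

post = bayes_step(post, y=0.3)
est = sum(x * p * dx for x, p in zip(grid, post))    # posterior-mean estimate
```

The quadratic cost of the convolution in the prediction step is exactly the practical limitation of grid-based filtering mentioned below for the GF.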

3 Confidence partitioning sampling filtering

In this section, the CPSF method is proposed to overcome the integration difficulty caused by a nonlinear system in the filtering process. The main idea of the method is that the high probability region of the posterior PDF dominates the performance of the filtering. Therefore, the overall structure of the CPSF method consists of finding the high probability region and describing the posterior PDF within it.

3.1 Partitioning sampling technique

Definition 1

Suppose that \(p\left( {\mathbf{x}} \right)\) is the PDF of a distribution. Its complete probability space is called \({{\mathbb{C}}}_{{p\left( {\mathbf{x}} \right)}}^{0}\). \({{\mathbb{C}}}_{{p\left( {\mathbf{x}} \right)}}^{\alpha }\) is a bounded subspace of it, which satisfies

$${\text{minimize}}\;S$$
(5)
$$s.t.\;\;\int_{{{{\mathbb{C}}}_{{p\left( {\mathbf{x}} \right)}}^{\alpha } }} {p\left( {\mathbf{x}} \right)d{\mathbf{x}}} = 1 - \alpha$$
(6)

where \(S = \int_{{{{\mathbb{C}}}_{{p\left( {\mathbf{x}} \right)}}^{\alpha } }} {{\mathbf{1}}d{\mathbf{x}}}\), \(0 \le \alpha \le 1\). Then, \({{\mathbb{C}}}_{{p\left( {\mathbf{x}} \right)}}^{\alpha }\) is called the CPS of \(p\left( {\mathbf{x}} \right)\) under the confidence \(1 - \alpha.\)

In \({{\mathbb{C}}}_{{p\left( {\mathbf{x}} \right)}}^{\alpha }\), the sampling interval \({{\varvec{\uptau}}}\) is used to divide the space into L blocks, where \({{\varvec{\uptau}}} = \left[ {\tau_{1} ,\tau_{2} , \cdots ,\tau_{D} } \right]^{T}\) and \(D\) is the dimension. Then, a sample is deployed at the center of each block, yielding the sample set \({\hat{\mathbf{X}}}\), where \({\hat{\mathbf{X}}} \equiv \left[ {{\hat{\mathbf{x}}}_{1} ,{\hat{\mathbf{x}}}_{2} , \cdots ,{\hat{\mathbf{x}}}_{L} } \right]^{T}\). The weight of each sample is calculated as follows:

$${{\varvec{\upomega}}} \equiv \left[ {\omega_{1} ,\omega_{2} , \cdots \omega_{L} } \right]^{T}$$
(7)
$$\omega_{l} = \frac{{p\left( {{\hat{\mathbf{x}}}_{l} } \right)}}{{\sum\nolimits_{l = 1}^{L} {p\left( {{\hat{\mathbf{x}}}_{l} } \right)} }}$$
(8)

The above process is called the partitioning sampling technique (PST). The detailed steps are shown in Table 1.

Table 1 Partitioning sampling technique
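The partitioning and weighting of Definition 1 and Eqs. (7)–(8) can be sketched in one dimension as follows; the pdf, the interval taken as the CPS and the value of \(\tau\) are illustrative assumptions.

```python
import math

def pst(pdf, lo, hi, tau):
    """Divide the CPS [lo, hi] into L blocks of width tau, place one sample
    at each block centre and weight it according to Eq. (8)."""
    L = max(1, math.ceil((hi - lo) / tau))
    samples = [lo + (l + 0.5) * tau for l in range(L)]
    raw = [pdf(x) for x in samples]
    total = sum(raw)
    weights = [v / total for v in raw]    # normalized weights, Eq. (8)
    return samples, weights

# Standard Gaussian; [-3.5, 3.5] is taken as (slightly more than) the CPS
# under confidence 1 - alpha for alpha = 0.001.
pdf = lambda x: math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)
X, w = pst(pdf, -3.5, 3.5, tau=0.5)
```

Note that only the weights carry the probability information; the sample positions are fixed by the grid.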

Theorem 1

Suppose that \(p\left( {\mathbf{x}} \right)\) is the PDF of a distribution. \(\left\{ {{\hat{\mathbf{X}}},{{\varvec{\upomega}}}} \right\}_{L}\) is obtained using the PST from \(p\left( {\mathbf{x}} \right)\). Then,

$$p\left( {\mathbf{x}} \right) \propto \mathop {\lim }\limits_{{\alpha ,{{\varvec{\uptau}}} \to {\mathbf{0}}}} \sum\nolimits_{l = 1}^{ + \infty } {\omega_{l} \delta \left( {{\mathbf{x}} - {\hat{\mathbf{x}}}_{l} } \right)}$$

Proof

See Appendix A.

For instance, suppose \(p\left( {\mathbf{x}} \right)\sim N\left( {\mathbf{0},\,diag\left( {\left[ {1,1} \right]} \right)} \right)\). Then \({{\mathbb{C}}}_{{p\left( {\mathbf{x}} \right)}}^{\alpha }\) is bounded by a circle of equal probability density in two-dimensional space. Let \(\alpha = 0.001\); the radius of the circle is:

$$r = \sqrt { - 2\ln (\alpha )} \approx 3.717$$

Let \({{\varvec{\uptau}}} = \left[ {0.5,0.5} \right]^{T}\); the processing procedure of the PST can be seen intuitively in Fig. 1. Finally, \(\left\{ {{\hat{\mathbf{X}}},{{\varvec{\upomega}}}} \right\}_{L}\) can be used to express \(p\left( {\mathbf{x}} \right)\) approximately in discrete form:

$$p\left( {\mathbf{x}} \right) \propto \sum\nolimits_{l = 1}^{L} {\omega_{l} \delta \left( {{\mathbf{x}} - {\hat{\mathbf{x}}}_{l} } \right)}$$
(9)

where \(\delta \left( \cdot \right)\) denotes the Dirac delta function.

Fig. 1
figure 1

The processing procedure of the PST

In this instance, the subspace \({{\mathbb{C}}}_{{p\left( {\mathbf{x}} \right)}}^{\alpha }\) contains \(\left( {1 - \alpha } \right) \times 100\%\) of the probability information. Simultaneously, \({{\mathbb{C}}}_{{p\left( {\mathbf{x}} \right)}}^{\alpha }\) is the minimum space under the condition of ignoring \(\alpha \times 100\%\) of the probability information. For a general probability model (e.g., a Gaussian distribution, a Gaussian mixture distribution, etc.), the probability space can be greatly compressed into a bounded probability space \({{\mathbb{C}}}_{{p\left( {\mathbf{x}} \right)}}^{\alpha }\) while losing only a negligible amount of probability information. This makes it possible to implement the PST [55, 56].
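The worked example can be checked numerically. The sketch below deploys block centres on a \(\tau\)-grid, keeps those inside the circle of radius \(r\), and verifies that the retained blocks capture roughly \(1-\alpha\) of the probability mass; the bounding-grid construction is our assumption, not a prescription of the paper.

```python
import math

alpha = 0.001
r = math.sqrt(-2 * math.log(alpha))   # radius of the CPS circle, about 3.717
tau = 0.5

# Deploy block centres on a tau-grid and keep those inside the circle.
n = math.ceil(r / tau)
centres = [(i * tau, j * tau)
           for i in range(-n, n + 1) for j in range(-n, n + 1)
           if math.hypot(i * tau, j * tau) <= r]

pdf = lambda x, y: math.exp(-0.5 * (x * x + y * y)) / (2 * math.pi)
raw = [pdf(x, y) for x, y in centres]
total = sum(raw)
w = [v / total for v in raw]          # weights as in Eq. (8)
captured = total * tau * tau          # midpoint-rule estimate of 1 - alpha
```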

3.2 Confidence partitioning sampling filtering framework

In this subsection, we illustrate the CPSF framework using the Bayesian inference from time step \(k - 1\) to \(k\). Suppose that the posterior PDF at time step \(k - 1\) can be described by a set of weighted grid samples \(\left\{ {{\hat{\mathbf{X}}}_{k - 1} ,{{\varvec{\upomega}}}_{k - 1} } \right\}_{{N_{k - 1} }}\), where \(N_{k - 1}\) denotes the sample number and \({\hat{\mathbf{X}}}_{k - 1} \equiv \left[ {{\hat{\mathbf{x}}}_{k - 1,1} ,{\hat{\mathbf{x}}}_{k - 1,2} , \ldots ,{\hat{\mathbf{x}}}_{{k - 1,N_{k - 1} }} } \right]^{T}\) is deployed evenly in the bounded space \({{\mathbb{C}}}_{{p\left( {{\mathbf{x}}_{k - 1} \left| {{\mathbf{y}}_{1:k - 1} } \right.} \right)}}^{\alpha }\). Therefore, all the probability information is carried by the weights.

Then, the posterior PDF at time step \(k - 1\) can be expressed discretely as follows:

$$p\left( {{\mathbf{x}}_{k - 1} \left| {{\mathbf{y}}_{1:k - 1} } \right.} \right) \propto \sum\nolimits_{n = 1}^{{N_{k - 1} }} {\omega_{k - 1,n} \delta \left( {{\mathbf{x}}_{k - 1} - {\hat{\mathbf{x}}}_{k - 1,n} } \right)}$$
(10)

In (10), the posterior PDF at time step \(k - 1\) is expressed by \(N_{k - 1}\) weighted impulse functions. The prior PDF at time step k can then be regarded as the result of these impulse functions acting on \({\mathbf{f}}\left( \cdot \right)\). The partial prior PDF contributed by a single impulse function can be expressed, according to (3), as follows:

$$p_{n} \left( {{\mathbf{x}}_{k} \left| {{\mathbf{y}}_{1:k - 1} } \right.} \right) \propto \int {p\left( {{\mathbf{x}}_{k} \left| {{\mathbf{x}}_{k - 1} } \right.} \right)\omega_{k - 1,n} \delta \left( {{\mathbf{x}}_{k - 1} - {\hat{\mathbf{x}}}_{k - 1,n} } \right)d{\mathbf{x}}_{k - 1} }$$
(11)

Meanwhile, (11) can be rewritten according to the sampling property of the impulse function:

$$p_{n} \left( {{\mathbf{x}}_{k} \left| {{\mathbf{y}}_{1:k - 1} } \right.} \right) \propto \omega_{k - 1,n} p\left( {{\mathbf{x}}_{k} \left| {{\hat{\mathbf{x}}}_{k - 1,n} } \right.} \right)$$
(12)

Furthermore, \(p\left( {{\mathbf{x}}_{k} \left| {{\hat{\mathbf{x}}}_{k - 1,n} } \right.} \right)\) is a translated copy of \(p_{{\mathbf{u}}} \left( {\mathbf{x}} \right)\), shifted by \({\vec{\mathbf{x}}}_{k,n}\). Concretely, it can be expressed as follows:

$$p\left( {{\mathbf{x}}_{k} \left| {{\hat{\mathbf{x}}}_{k - 1,n} } \right.} \right) = p_{{\mathbf{u}}} \left( {{\mathbf{x}}_{k} - {\vec{\mathbf{x}}}_{k,n} } \right)$$
(13)

where \({\vec{\mathbf{x}}}_{k,n} = {\mathbf{f}}\left( {{\hat{\mathbf{x}}}_{k - 1,n} } \right)\). Substituting (13) into (12) yields:

$$p_{n} \left( {{\mathbf{x}}_{k} \left| {{\mathbf{y}}_{1:k - 1} } \right.} \right) \propto \omega_{k - 1,n} p_{{\mathbf{u}}} \left( {{\mathbf{x}}_{k} - {\vec{\mathbf{x}}}_{k,n} } \right)$$

Now, the partial prior PDF is described by the PDF of the process noise. Moreover, the complete prior PDF can be obtained according to the additive property of the impulse function, as follows:

$$p\left( {{\mathbf{x}}_{k} \left| {{\mathbf{y}}_{1:k - 1} } \right.} \right) \propto \sum\nolimits_{n = 1}^{{N_{k - 1} }} {\omega_{k - 1,n} p_{{\mathbf{u}}} \left( {{\mathbf{x}}_{k} - {\vec{\mathbf{x}}}_{k,n} } \right)}$$
(14)

and the posterior PDF at time step k can be expressed according to (4) and (14) as follows:

$$\begin{gathered} p\left( {{\mathbf{x}}_{k} \left| {{\mathbf{y}}_{1:k} } \right.} \right) \propto p\left( {{\mathbf{y}}_{k} \left| {{\mathbf{x}}_{k} } \right.} \right)\sum\nolimits_{n = 1}^{{N_{k - 1} }} {\omega_{k - 1,n} p_{{\mathbf{u}}} \left( {{\mathbf{x}}_{k} - {\vec{\mathbf{x}}}_{k,n} } \right)} \hfill \\ \;\;\;\;\;\;\;\;\;\;\;\;\;\; \propto p_{{\mathbf{v}}} \left( {{\mathbf{y}}_{k} - {\mathbf{h}}\left( {{\mathbf{x}}_{k} } \right)} \right)\sum\nolimits_{n = 1}^{{N_{k - 1} }} {\omega_{k - 1,n} p_{{\mathbf{u}}} \left( {{\mathbf{x}}_{k} - {\vec{\mathbf{x}}}_{k,n} } \right)} \hfill \\ \end{gathered}$$
(15)

In (15), the approximate expression of the posterior PDF is constructed from the process noise model, the observation noise model and the weights of the grid samples at time step k − 1. The posterior CPS can, in theory, be obtained from (15). Then, the PST can be executed to obtain the weighted grid samples \(\left\{ {{\hat{\mathbf{X}}}_{k} ,{{\varvec{\upomega}}}_{k} } \right\}_{{N_{k} }}\) at time step k. The posterior PDF at time step k can then be described in discrete form, as follows:

$$p\left( {{\mathbf{x}}_{k} \left| {{\mathbf{y}}_{1:k} } \right.} \right) \propto \sum\nolimits_{n = 1}^{{N_{k} }} {\omega_{k,n} \delta \left( {{\mathbf{x}}_{k} - {\hat{\mathbf{x}}}_{k,n} } \right)}$$
(16)

The state at time step k can be estimated:

$${\overline{\mathbf{x}}}_{k} = \left( {{{\varvec{\upomega}}}_{k} } \right)^{T} {\hat{\mathbf{X}}}_{k}$$
(17)

After that, recursive Bayesian filtering is approximately executed by repeatedly obtaining the approximate expression of the posterior PDF, finding the posterior CPS and performing the PST.
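One recursion of Eqs. (15)–(17) can be sketched for a scalar model as follows; the model \(f\), \(h\), the Gaussian noise densities and both grids are hypothetical, and the new grid over the posterior CPS is assumed to be given.

```python
import math

def gauss(e, s):
    # Zero-mean Gaussian density with standard deviation s, evaluated at e.
    return math.exp(-0.5 * (e / s) ** 2) / (s * math.sqrt(2 * math.pi))

def cpsf_step(X_prev, w_prev, grid_k, y, f, h, sigma_u, sigma_v):
    """Weight each new grid sample by Eq. (15), normalize, and return the
    state estimate of Eq. (17)."""
    x_pred = [f(x) for x in X_prev]                      # one-step transfers
    w = []
    for x in grid_k:
        prior = sum(wp * gauss(x - xp, sigma_u)          # mixture of Eq. (14)
                    for wp, xp in zip(w_prev, x_pred))
        w.append(gauss(y - h(x), sigma_v) * prior)       # likelihood factor, Eq. (15)
    z = sum(w)
    w = [v / z for v in w]
    est = sum(v * x for v, x in zip(w, grid_k))          # Eq. (17)
    return w, est

# Hypothetical scalar model f(x) = 0.9 x, h(x) = x, Gaussian noises.
X_prev = [i * 0.25 for i in range(-12, 13)]              # previous grid samples
raw = [gauss(x, 1.0) for x in X_prev]
w_prev = [v / sum(raw) for v in raw]
grid_k = [i * 0.25 for i in range(-16, 17)]              # grid over the posterior CPS
w_k, est = cpsf_step(X_prev, w_prev, grid_k, y=0.5,
                     f=lambda x: 0.9 * x, h=lambda x: x,
                     sigma_u=0.5, sigma_v=0.5)
```

The double loop over the old and new samples reflects the \(O(N_{k-1} \times N_k)\) cost discussed in Remark 1.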

This method, which approximates Bayesian filtering, is called the CPSF framework. In this framework, we give two descriptive forms of the posterior PDF: the approximate expression (see (15)) and the weighted grid samples \(\left\{ {{\hat{\mathbf{X}}}_{k} ,{{\varvec{\upomega}}}_{k} } \right\}_{{N_{k} }}\). The transfer of the posterior PDF is achieved by repeatedly executing two steps: finding the posterior CPS and performing the PST. The structure of the CPSF framework is shown in Fig. 2, and the filter steps are shown in detail in Table 2.

Fig. 2
figure 2

Structure of the CPSF framework

Table 2 The CPSF framework

Remark 1

Compared with the PF, the CPSF framework adds a step for calculating the sample weights, mainly reflected in (15); the computational complexity of this part is about \(O\left( {N_{k - 1} \times N_{k} } \right)\). However, the sample positions do not need to carry the probability information, which greatly reduces the required number of samples and can improve the efficiency of filtering. The efficiency is evaluated using simulation experiments in Sect. 5.

Remark 2

Generally speaking, the posterior PDF yields a better estimate than the prior PDF or the likelihood PDF, which means it is more compressible than the prior or the likelihood in most cases. For a relatively stationary filtering system, the size of the CPS is relatively stable over the whole filtering process, so the number of samples obtained by the PST is relatively stable for a fixed \({{\varvec{\uptau}}}\). For a non-stationary filtering system, the size of the CPS may change with the scale of the posterior distribution; in this case, the number of samples may fluctuate greatly over the filtering process for a fixed \({{\varvec{\uptau}}}\), and the sampling interval can be adjusted dynamically according to the size of the posterior CPS.

Remark 3

In the CPSF framework, the selection of the two parameters (\(\alpha\) and \({{\varvec{\uptau}}}\)) depends on the following principles: (i) The selection of \(\alpha\) depends on the engineering requirements for the credibility of the filtering results, since specific applications usually place strict demands on the confidence level of the results. For instance, in civil aviation, the ICAO GNSS SARPs specify an integrity requirement of \(10^{-7}\) for CAT I operations. (ii) The selection of the sampling interval depends on the size of the confidence space once the filter enters the steady-state working phase and on the requirements for computational timeliness.

3.3 CPSF algorithm based on space intersection

In the previous subsection, we proposed the CPSF framework based on the concept of the CPS and the properties of the impulse function. For a simple distribution, the CPS can be obtained easily according to (15). For a complex nonlinear system, however, it is difficult to obtain the posterior CPS directly. To ensure that the CPSF framework can be implemented for an arbitrary nonlinear system, the space intersection (SI) technique is suggested in this subsection for roughly estimating the posterior CPS, which yields a nonlinear filtering algorithm called the SI_CPSF algorithm.

Suppose that \({{\mathbb{C}}}_{{p_{u} }}^{\alpha }\) is the CPS of the process noise model and \(\left\{ {{\hat{\mathbf{X}}}_{k - 1} ,{{\varvec{\upomega}}}_{k - 1} } \right\}_{{N_{k - 1} }}\) is the set of weighted grid samples describing the posterior PDF at time step \(k - 1\), which satisfies

$${\hat{\mathbf{X}}}_{k - 1} \in {{\mathbb{C}}}_{{p\left( {{\mathbf{x}}_{k - 1} \left| {{\mathbf{y}}_{1:k - 1} } \right.} \right)}}^{\alpha }$$

After one time step transfer of the system model, \({\vec{\mathbf{X}}}_{k}\) can be obtained, where

$${\vec{\mathbf{X}}}_{k} = \left[ {{\vec{\mathbf{x}}}_{k,1} ,\;{\vec{\mathbf{x}}}_{k,2} , \ldots ,\;{\vec{\mathbf{x}}}_{{k,N_{k - 1} }} } \right]^{T}$$

Furthermore, each sample \({\hat{\mathbf{x}}}_{k - 1,n}\) in \({\hat{\mathbf{X}}}_{k - 1}\) is transferred into the space \({\vec{\mathbf{x}}}_{k,n} + {{\mathbb{C}}}_{{p_{{\mathbf{u}}} }}^{\alpha }\) with confidence \(1 - \alpha\). Therefore, the estimate of the prior CPS can be described as follows:

$${{\tilde{\mathbb{C}}}}_{{p\left( {{\mathbf{x}}_{k} \left| {{\mathbf{y}}_{1:k - 1} } \right.} \right)}}^{\alpha } = \mathop \cup \limits_{n = 1}^{{N_{k - 1} }} \left( {{\vec{\mathbf{x}}}_{k,n} + {{\mathbb{C}}}_{{p_{{\mathbf{u}}} }}^{\alpha } } \right)$$
(18)

The likelihood CPS is \({{\mathbb{C}}}_{{p\left( {{\mathbf{y}}_{k} \left| {{\mathbf{x}}_{k} } \right.} \right)}}^{\alpha }\). As \(p\left( {{\mathbf{y}}_{k} \left| {{\mathbf{x}}_{k} } \right.} \right)\) is determined by the observation and its noise model, the CPS of the likelihood distribution can be obtained conveniently. The rough estimate of the posterior CPS is defined as:

$${{\tilde{\mathbb{C}}}}_{{p\left( {{\mathbf{x}}_{k} \left| {{\mathbf{y}}_{1:k} } \right.} \right)}}^{\alpha } = {{\tilde{\mathbb{C}}}}_{{p\left( {{\mathbf{x}}_{k} \left| {{\mathbf{y}}_{1:k - 1} } \right.} \right)}}^{\alpha } \cap {{\mathbb{C}}}_{{p\left( {{\mathbf{y}}_{k} \left| {{\mathbf{x}}_{k} } \right.} \right)}}^{\alpha }$$
(19)

As the posterior PDF can be approximated by (15), the PST can be performed in \({{\tilde{\mathbb{C}}}}_{{p\left( {{\mathbf{x}}_{k} \left| {{\mathbf{y}}_{1:k} } \right.} \right)}}^{\alpha }\). Then, \(\left\{ {{\tilde{\mathbf{X}}}_{k} ,{\tilde{\mathbf{\omega }}}_{k} } \right\}_{{\tilde{N}_{k} }}\) is obtained according to the chosen sampling interval.
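Under the simplifying assumption that every CPS is a one-dimensional interval, Eqs. (18)–(19) can be sketched as follows; the paper's CPSs need not be intervals, so this only illustrates the union-then-intersection idea, and all numbers are hypothetical.

```python
def prior_cps(x_pred, noise_cps):
    """Eq. (18): union of the process-noise CPS translated to each one-step
    transfer; represented here by its 1D bounding interval."""
    lo_u, hi_u = noise_cps
    return (min(x_pred) + lo_u, max(x_pred) + hi_u)

def posterior_cps(prior, likelihood):
    """Eq. (19): intersect the prior CPS estimate with the likelihood CPS."""
    lo = max(prior[0], likelihood[0])
    hi = min(prior[1], likelihood[1])
    if lo > hi:
        raise ValueError("empty intersection: CPSs do not overlap")
    return (lo, hi)

# Hypothetical numbers: one-step transfers in [1.0, 2.0], process-noise CPS
# [-1.5, 1.5], likelihood CPS centred on the observation.
prior = prior_cps([1.0, 1.4, 2.0], (-1.5, 1.5))
post = posterior_cps(prior, (0.8, 4.2))
```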

Proposition 1

Suppose \({{\mathbb{C}}}_{{p\left( {{\mathbf{x}}_{k - 1} \left| {{\mathbf{y}}_{1:k - 1} } \right.} \right)}}^{\alpha }\) and \(\left\{ {{\hat{\mathbf{X}}}_{k - 1} ,{{\varvec{\upomega}}}_{k - 1} } \right\}_{{N_{k - 1} }}\) are the real posterior CPS and the weighted grid samples at time step \(k - 1\), respectively. When \(p\left( {{\mathbf{x}}_{k - 1} \left| {{\mathbf{y}}_{1:k - 1} } \right.} \right)\) is propagated through an arbitrary nonlinear system \({\mathbf{f}}\left( \cdot \right)\), the lower bound of the confidence of \({{\tilde{\mathbb{C}}}}_{{p\left( {{\mathbf{x}}_{k} \left| {{\mathbf{y}}_{1:k} } \right.} \right)}}^{\alpha }\) approximates \(1 - \left( {3\alpha - \alpha^{2} } \right)\).

Proof

See Appendix B.

For the estimate of the posterior CPS, a lower bound can be determined: \({{\mathbb{C}}}_{{p\left( {{\mathbf{x}}_{k} \left| {{\mathbf{y}}_{1:k} } \right.} \right)}}^{{3\alpha - \alpha^{2} }} \subset {{\tilde{\mathbb{C}}}}_{{p\left( {{\mathbf{x}}_{k} \left| {{\mathbf{y}}_{1:k} } \right.} \right)}}^{\alpha }\). In practice, this slight loosening of the confidence lower bound has little effect on the state estimate, as \(\alpha\) is very small. However, changes in the relative position of the likelihood CPS and the prior CPS can affect the upper bound, which makes it impossible to determine. As a consequence, the weighted grid samples may contain samples with extremely low weights, which would greatly limit the efficiency of filtering. Fortunately, for the weighted grid samples, the probability information is reflected entirely in the weights: the samples are evenly distributed within the approximate posterior CPS, and each sample represents only a subspace of it. Therefore, we take the weight as the evaluation index and discard the lowest-weighted samples:

$$\left\{ {{\hat{\mathbf{X}}}_{k} ,{{\varvec{\upomega}}}_{k} } \right\}_{{N_{k} }} \Leftarrow \left\{ {{\tilde{\mathbf{X}}}_{k} ,{\tilde{\mathbf{\omega }}}_{k} } \right\}_{{\tilde{N}_{k} }}$$
(20)

In this procedure, we discard the \(\overline{N}_{k}\) samples with the lowest weights (where \(N_{k} = \tilde{N}_{k} - \overline{N}_{k}\)), which satisfy:

$$\sum\nolimits_{n = 1}^{{\overline{N}_{k} }} {\tilde{\omega }_{k,n} } \le \alpha,$$

and normalize the weights of the remaining samples to obtain \({{\varvec{\upomega}}}_{k}\). Then, the weighted grid sample set \(\left\{ {{\hat{\mathbf{X}}}_{k} ,{{\varvec{\upomega}}}_{k} } \right\}_{{N_{k} }}\) is obtained, which roughly belongs to \({{\mathbb{C}}}_{{p\left( {{\mathbf{x}}_{k} \left| {{\mathbf{y}}_{1:k} } \right.} \right)}}^{\alpha }\). This specific filtering algorithm is called the SI_CPSF algorithm. Its block diagram is shown in Fig. 3, and the detailed algorithm flow is shown in Table 3.
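The discarding rule above, with its cumulative-weight condition bounded by \(\alpha\) and the final renormalization of Eq. (20), can be sketched as follows; the sample values and weights are illustrative.

```python
def discard_low_weight(X, w, alpha):
    """Discard the lowest-weighted samples whose cumulative weight stays
    at or below alpha, then renormalize the survivors (Eq. (20))."""
    order = sorted(range(len(w)), key=lambda i: w[i])   # ascending by weight
    cum, drop = 0.0, set()
    for i in order:
        if cum + w[i] > alpha:
            break
        cum += w[i]
        drop.add(i)
    X_keep = [x for i, x in enumerate(X) if i not in drop]
    w_keep = [v for i, v in enumerate(w) if i not in drop]
    z = sum(w_keep)
    return X_keep, [v / z for v in w_keep]

# Hypothetical weighted grid samples; the two lightest fall below alpha.
X = [0.0, 0.5, 1.0, 1.5, 2.0]
w = [0.001, 0.004, 0.395, 0.4, 0.2]
Xk, wk = discard_low_weight(X, w, alpha=0.01)
```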

Fig. 3
figure 3

Block diagram of the SI_CPSF algorithm

Table 3 SI_CPSF algorithm

4 Approximate error of the CPSF method to Bayesian filter

In this section, the approximation error between the CPSF method and Bayesian filtering is analyzed theoretically, and the consistency of the CPSF method with Bayesian filtering is further proved.

The filtering process from time step \(k - 1\) to \(k\) is considered as an example: The posterior PDF at time step \(k - 1\) is \(p\left( {{\mathbf{x}}_{k - 1} \left| {{\mathbf{y}}_{1:k - 1} } \right.} \right)\). \({{\mathbb{C}}}_{{p\left( {{\mathbf{x}}_{k - 1} \left| {{\mathbf{y}}_{1:k - 1} } \right.} \right)}}^{0}\) is the complete state space including two parts: \({{\mathbb{C}}}_{{p\left( {{\mathbf{x}}_{k - 1} \left| {{\mathbf{y}}_{1:k - 1} } \right.} \right)}}^{\alpha }\) and \({{\overline{\mathbb{C}}}}_{{p\left( {{\mathbf{x}}_{k - 1} \left| {{\mathbf{y}}_{1:k - 1} } \right.} \right)}}^{\alpha }\), where \({{\overline{\mathbb{C}}}}_{{p\left( {{\mathbf{x}}_{k - 1} \left| {{\mathbf{y}}_{1:k - 1} } \right.} \right)}}^{\alpha }\) is the complementary set of \({{\mathbb{C}}}_{{p\left( {{\mathbf{x}}_{k - 1} \left| {{\mathbf{y}}_{1:k - 1} } \right.} \right)}}^{\alpha }\). Therefore:

$${{\mathbb{C}}}_{{p\left( {{\mathbf{x}}_{k - 1} \left| {{\mathbf{y}}_{1:k - 1} } \right.} \right)}}^{\alpha } \cap {{\overline{\mathbb{C}}}}_{{p\left( {{\mathbf{x}}_{k - 1} \left| {{\mathbf{y}}_{1:k - 1} } \right.} \right)}}^{\alpha } = \emptyset$$
$${{\mathbb{C}}}_{{p\left( {{\mathbf{x}}_{k - 1} \left| {{\mathbf{y}}_{1:k - 1} } \right.} \right)}}^{\alpha } \cup {{\overline{\mathbb{C}}}}_{{p\left( {{\mathbf{x}}_{k - 1} \left| {{\mathbf{y}}_{1:k - 1} } \right.} \right)}}^{\alpha } = {{\mathbb{C}}}_{{p\left( {{\mathbf{x}}_{k - 1} \left| {{\mathbf{y}}_{1:k - 1} } \right.} \right)}}^{0}$$

Moreover,

$$\int_{{{{\mathbb{C}}}_{{p\left( {{\mathbf{x}}_{k - 1} \left| {{\mathbf{y}}_{1:k - 1} } \right.} \right)}}^{\alpha } }} {p\left( {{\mathbf{x}}_{k - 1} \left| {{\mathbf{y}}_{1:k - 1} } \right.} \right)d{\mathbf{x}}_{k - 1} } = 1 - \alpha$$
$$\int_{{{{\overline{\mathbb{C}}}}_{{p\left( {{\mathbf{x}}_{k - 1} \left| {{\mathbf{y}}_{1:k - 1} } \right.} \right)}}^{\alpha } }} {p\left( {{\mathbf{x}}_{k - 1} \left| {{\mathbf{y}}_{1:k - 1} } \right.} \right)d{\mathbf{x}}_{k - 1} = \alpha }$$

The approximation error \(\varepsilon\) between the CPSF method and the Bayesian filter is defined as follows:

$$\varepsilon = {\mathbf{x}}_{k,bayes} - \sum\nolimits_{i = 1}^{{N_{k} }} {\omega_{k,i} {\mathbf{x}}_{k,i} }$$
(21)

where \({\mathbf{x}}_{k,bayes}\) denotes the Bayesian estimation at time step k.

For convenience, we define: \(p_{k}^{li} = p\left( {{\mathbf{y}}_{k} \left| {{\mathbf{x}}_{k} } \right.} \right)\), \(p_{k}^{tr} = p\left( {{\mathbf{x}}_{k} \left| {{\mathbf{x}}_{k - 1} } \right.} \right)\), \(p_{k}^{pr} = p\left( {{\mathbf{x}}_{k} \left| {{\mathbf{y}}_{1:k - 1} } \right.} \right)\), \(p_{k}^{po} = p\left( {{\mathbf{x}}_{k} \left| {{\mathbf{y}}_{1:k} } \right.} \right)\) and \(p_{k - 1}^{po} = p\left( {{\mathbf{x}}_{k - 1} \left| {{\mathbf{y}}_{1:k - 1} } \right.} \right)\).

Then,

$$\begin{aligned} {\mathbf{x}}_{k,bayes} & = \int_{{{{\mathbb{C}}}_{{p_{k}^{po} }}^{0} }} {{\mathbf{x}}_{k} p_{k}^{po} d{\mathbf{x}}_{k} } \\ & = \int_{{{{\mathbb{C}}}_{{p_{k}^{po} }}^{\alpha } }} {{\mathbf{x}}_{k} p_{k}^{po} d{\mathbf{x}}_{k} } + \int_{{{{\overline{\mathbb{C}}}}_{{p_{k}^{po} }}^{\alpha } }} {{\mathbf{x}}_{k} p_{k}^{po} d{\mathbf{x}}_{k} } \\ & = {{\int_{{{{\mathbb{C}}}_{{p_{k}^{po} }}^{\alpha } }} {{\mathbf{x}}_{k} p_{k}^{po} d{\mathbf{x}}_{k} } } \mathord{\left/ {\vphantom {{\int_{{{{\mathbb{C}}}_{{p_{k}^{po} }}^{\alpha } }} {{\mathbf{x}}_{k} p_{k}^{po} d{\mathbf{x}}_{k} } } {\left( {1 - \alpha } \right)}}} \right. \kern-0pt} {\left( {1 - \alpha } \right)}} + {{\varvec{\upxi}}}_{1} \\ \end{aligned}$$
(22)

where

$${{\varvec{\upxi}}}_{1} = \int_{{{{\overline{\mathbb{C}}}}_{{p_{k}^{po} }}^{\alpha } }} {{\mathbf{x}}_{k} p_{k}^{po} d{\mathbf{x}}_{k} } - \frac{\alpha }{1 - \alpha }\int_{{{{\mathbb{C}}}_{{p_{k}^{po} }}^{\alpha } }} {{\mathbf{x}}_{k} p_{k}^{po} d{\mathbf{x}}_{k} }$$
(23)

In (22), the first term is considered as the approximate expectation of \(p\left( {{\mathbf{x}}_{k} \left| {{\mathbf{y}}_{1:k} } \right.} \right)\) in \({{\mathbb{C}}}_{{p_{k}^{po} }}^{\alpha }\), and the second term is a truncation error. In (23), the first term is the error caused by discarding \({{\overline{\mathbb{C}}}}_{{p_{k}^{po} }}^{\alpha }\), and the second term compensates the first one, owing to the normalization of the posterior PDF at time step k within \({{\mathbb{C}}}_{{p_{k}^{po} }}^{\alpha }\). After introducing (3) and (4) into the first term of (22), we have:

$$\begin{aligned} & \int_{{{{\mathbb{C}}}_{{p_{k}^{po} }}^{\alpha } }} {\frac{{{\mathbf{x}}_{k} p_{k}^{po} }}{1 - \alpha }d{\mathbf{x}}_{k} } \\ & \quad = \frac{{\int_{{{{\mathbb{C}}}_{{p_{k}^{po} }}^{\alpha } }} {{\mathbf{x}}_{k} p_{k}^{li} \int_{{{{\mathbb{C}}}_{{p_{k}^{pr} }}^{0} }} {p_{k}^{tr} p_{k - 1}^{po} d{\mathbf{x}}_{k - 1} } d{\mathbf{x}}_{k} } }}{{\left( {1 - \alpha } \right)p\left( {{\mathbf{y}}_{k} \left| {{\mathbf{y}}_{1:k - 1} } \right.} \right)}} \\ & \quad = \frac{{\int_{{{{\mathbb{C}}}_{{p_{k}^{po} }}^{\alpha } }} {{\mathbf{x}}_{k} p_{k}^{li} \left( {\int_{{{{\mathbb{C}}}_{{p_{k}^{pr} }}^{\alpha } }} {p_{k}^{tr} p_{k - 1}^{po} d{\mathbf{x}}_{k - 1} } + \int_{{{{\overline{\mathbb{C}}}}_{{p_{k}^{pr} }}^{\alpha } }} {p_{k}^{tr} p_{k - 1}^{po} d{\mathbf{x}}_{k - 1} } } \right)d{\mathbf{x}}_{k} } }}{{\left( {1 - \alpha } \right)p\left( {{\mathbf{y}}_{k} \left| {{\mathbf{y}}_{1:k - 1} } \right.} \right)}} \\ & \quad = {{\int_{{{{\mathbb{C}}}_{{p_{k}^{po} }}^{\alpha } }} {{\mathbf{x}}_{k} p_{k}^{li} Ad{\mathbf{x}}_{k} } } \mathord{\left/ {\vphantom {{\int_{{{{\mathbb{C}}}_{{p_{k}^{po} }}^{\alpha } }} {{\mathbf{x}}_{k} p_{k}^{li} Ad{\mathbf{x}}_{k} } } {\left[ {\left( {1 - \alpha } \right)p\left( {{\mathbf{y}}_{k} \left| {{\mathbf{y}}_{1:k - 1} } \right.} \right)} \right]}}} \right. \kern-0pt} {\left[ {\left( {1 - \alpha } \right)p\left( {{\mathbf{y}}_{k} \left| {{\mathbf{y}}_{1:k - 1} } \right.} \right)} \right]}} + {{\varvec{\upxi}}}_{2} \\ \end{aligned}$$
(24)

where

$$A = {{\int_{{{{\mathbb{C}}}_{{p_{k}^{pr} }}^{\alpha } }} {p_{k}^{tr} p_{k - 1}^{po} d{\mathbf{x}}_{k - 1} } } \mathord{\left/ {\vphantom {{\int_{{{{\mathbb{C}}}_{{p_{k}^{pr} }}^{\alpha } }} {p_{k}^{tr} p_{k - 1}^{po} d{\mathbf{x}}_{k - 1} } } {\left( {1 - \alpha } \right)}}} \right. \kern-0pt} {\left( {1 - \alpha } \right)}}$$
(25)
$${{\varvec{\upxi}}}_{2} = {{\int_{{{{\mathbb{C}}}_{{p_{k}^{po} }}^{\alpha } }} {{\mathbf{x}}_{k} p_{k}^{li} Bd{\mathbf{x}}_{k} } } \mathord{\left/ {\vphantom {{\int_{{{{\mathbb{C}}}_{{p_{k}^{po} }}^{\alpha } }} {{\mathbf{x}}_{k} p_{k}^{li} Bd{\mathbf{x}}_{k} } } {\left[ {\left( {1 - \alpha } \right)p\left( {{\mathbf{y}}_{k} \left| {{\mathbf{y}}_{1:k - 1} } \right.} \right)} \right]}}} \right. \kern-0pt} {\left[ {\left( {1 - \alpha } \right)p\left( {{\mathbf{y}}_{k} \left| {{\mathbf{y}}_{1:k - 1} } \right.} \right)} \right]}}$$
(26)
$$B = \int_{{{{\overline{\mathbb{C}}}}_{{p_{k}^{pr} }}^{\alpha } }} {p_{k}^{tr} p_{k - 1}^{po} d{\mathbf{x}}_{k - 1} } - \alpha \int_{{{{\mathbb{C}}}_{{p_{k}^{pr} }}^{\alpha } }} {p_{k}^{tr} p_{k - 1}^{po} d{\mathbf{x}}_{k - 1} }$$
(27)

The first term of (24) is the approximate expectation of the posterior PDF when the posterior PDF in the posterior CPS at time step k−1 is transferred to that of time step k. \(A\) denotes the prior PDF restricted to \({{\mathbb{C}}}_{{p_{k}^{pr} }}^{\alpha }\) and renormalized. The second term of (24) is another truncation error. The magnitude of \({{\varvec{\upxi}}}_{2}\) is mainly dominated by (27): when \(\alpha \to 0\), both terms in (27) tend to 0, and hence \({{\varvec{\upxi}}}_{2}\) tends to 0.

The first term of (24) describes the expectation of the continuous posterior PDF in the posterior CPS. After discretization, we have:

$$\begin{aligned} & \frac{{\int_{{{{\mathbb{C}}}_{{p_{k}^{po} }}^{\alpha } }} {{\mathbf{x}}_{k} p_{k}^{li} Ad{\mathbf{x}}_{k} } }}{{\left( {1 - \alpha } \right)p\left( {{\mathbf{y}}_{k} \left| {{\mathbf{y}}_{1:k - 1} } \right.} \right)}} \\ & \quad = \frac{{\int_{{{{\mathbb{C}}}_{{p_{k}^{po} }}^{\alpha } }} {{\mathbf{x}}_{k} p_{k}^{li} \sum\nolimits_{i = 1}^{ + \infty } {\omega_{k - 1,i} p_{{\mathbf{u}}} \left( {{\mathbf{x}}_{k} - {\hat{\mathbf{x}}}_{k,i} } \right)} d{\mathbf{x}}_{k} } }}{{\left( {1 - \alpha } \right)p\left( {{\mathbf{y}}_{k} \left| {{\mathbf{y}}_{1:k - 1} } \right.} \right)}} \\ & \quad = \int_{{{{\mathbb{C}}}_{{p_{k}^{po} }}^{\alpha } }} {{\mathbf{x}}_{k} \frac{{p_{{\mathbf{v}}} \left( {{\mathbf{y}}_{k} - {\mathbf{h}}\left( {{\mathbf{x}}_{k} } \right)} \right)\sum\nolimits_{i = 1}^{ + \infty } {\omega_{k - 1,i} p_{{\mathbf{u}}} \left( {{\mathbf{x}}_{k} - {\hat{\mathbf{x}}}_{k,i} } \right)} }}{{\left( {1 - \alpha } \right)p\left( {{\mathbf{y}}_{k} \left| {{\mathbf{y}}_{1:k - 1} } \right.} \right)}}d{\mathbf{x}}_{k} } \\ & \quad = \sum\nolimits_{j = 1}^{ + \infty } {\omega_{k,j} {\mathbf{x}}_{k,j} } \;\;\;\left( {{{\varvec{\uptau}}} \to 0} \right) \\ \end{aligned}$$
(28)

Combining (22), (24), and (28):

$${\mathbf{x}}_{k,bayes} = \sum\nolimits_{j = 1}^{ + \infty } {\omega_{k,j} {\mathbf{x}}_{k,j} } + {{\varvec{\upxi}}}_{1} + {{\varvec{\upxi}}}_{2}$$

Thus,

$$\begin{aligned} \varepsilon& = {\mathbf{x}}_{k,bayes} - \sum\nolimits_{i = 1}^{{N_{k} }} {\omega_{k,i} {\mathbf{x}}_{k,i} } \\ &= {{\varvec{\upxi}}}_{1} + {{\varvec{\upxi}}}_{2} + \sum\nolimits_{j = 1}^{ + \infty } {\omega_{k,j} {\mathbf{x}}_{k,j} } - \sum\nolimits_{i = 1}^{{N_{k} }} {\omega_{k,i} {\mathbf{x}}_{k,i} }\\ \end{aligned}$$

Define

$${{\varvec{\upxi}}}_{3} = \sum\nolimits_{j = 1}^{ + \infty } {\omega_{k,j} {\mathbf{x}}_{k,j} } - \sum\nolimits_{i = 1}^{{N_{k} }} {\omega_{k,i} {\mathbf{x}}_{k,i} }$$

Then,

$$\varepsilon = {{\varvec{\upxi}}}_{1} + {{\varvec{\upxi}}}_{2} + {{\varvec{\upxi}}}_{3}$$

Both \({{\varvec{\upxi}}}_{1}\) and \({{\varvec{\upxi}}}_{2}\) are truncation errors caused by discarding the low-probability space. Specifically, \({{\varvec{\upxi}}}_{1}\) is the state estimation error due to the absence of \({{\overline{\mathbb{C}}}}_{{p\left( {{\mathbf{x}}_{k} \left| {{\mathbf{y}}_{1:k} } \right.} \right)}}^{\alpha }\), and \({{\varvec{\upxi}}}_{2}\) is the state estimation error due to the absence of \({{\overline{\mathbb{C}}}}_{{p\left( {{\mathbf{x}}_{k - 1} \left| {{\mathbf{y}}_{1:k - 1} } \right.} \right)}}^{\alpha }\). \({{\varvec{\upxi}}}_{3}\) is the quantization error, which is dominated by the sampling interval. According to the above derivation, the approximation error between the CPSF method and the Bayesian filter consists of two parts: the space truncation error (\({{\varvec{\upxi}}}_{1}\) and \({{\varvec{\upxi}}}_{2}\)) and the quantization error (\({{\varvec{\upxi}}}_{3}\)). Note that the magnitude of the space truncation error can be adjusted through the value of \(\alpha\), and the magnitude of the quantization error can be adjusted through the choice of \({{\varvec{\uptau}}}\). Moreover, when \(\alpha \to 0\), \({{\varvec{\upxi}}}_{1} \to 0\) and \({{\varvec{\upxi}}}_{2} \to 0\); when \({{\varvec{\uptau}}} \to 0\), \({{\varvec{\upxi}}}_{3} \to 0\). Therefore, the CPSF method and the Bayesian filter are theoretically consistent. In addition, we have the following Proposition for a particular posterior PDF.

Proposition 2

When the posterior PDF is always a unimodal symmetric function, then \({{\varvec{\upxi}}}_{1} = 0.\)

Proof

See Appendix C.
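Proposition 2 can also be checked numerically. The sketch below is our own illustration (the function `xi1` evaluates (23) on a grid): \({{\varvec{\upxi}}}_{1}\) vanishes up to discretization error for a symmetric Gaussian posterior, but not for a skewed Gamma-shaped one.

```python
import numpy as np

# Illustrative check of Proposition 2 (not from the paper): evaluate the
# truncation error xi_1 of Eq. (23) on a grid. For a unimodal symmetric
# posterior xi_1 vanishes (up to discretization); for a skewed one it does not.
def xi1(x, pdf, alpha):
    dx = x[1] - x[0]
    mass = pdf * dx
    order = np.argsort(mass)[::-1]             # build the CPS from densest cells
    cum = np.cumsum(mass[order])
    keep = order[: np.searchsorted(cum, 1.0 - alpha) + 1]
    in_cps = np.zeros(x.shape, dtype=bool)
    in_cps[keep] = True
    first_moment_out = np.sum(x[~in_cps] * mass[~in_cps])
    first_moment_in = np.sum(x[in_cps] * mass[in_cps])
    return first_moment_out - alpha / (1.0 - alpha) * first_moment_in

x = np.linspace(-20.0, 40.0, 40001)
gauss = np.exp(-0.5 * (x - 2.0) ** 2) / np.sqrt(2.0 * np.pi)  # symmetric, mean 2
skew = np.where(x > 0.0, x * np.exp(-np.abs(x)), 0.0)          # Gamma(2,1), skewed
print(xi1(x, gauss, 1e-3))   # close to zero
print(xi1(x, skew, 1e-3))    # clearly nonzero
```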

Remark 3

The selection of \(\alpha\) determines the size of the posterior CPS, which affects the stability of the CPSF method. The smaller the value of \(\alpha\), the better the stability of the CPSF method, but the greater the computational burden. The magnitude of the sampling interval determines the number of samples, which affects the accuracy of the CPSF method. The smaller the sampling interval, the better the accuracy of the CPSF method.

5 Numerical simulation

In this section, the performance of the SI_CPSF algorithm, including its accuracy, efficiency, and robustness, is evaluated using four representative simulation experiments.

5.1 Simulation experiment 1

Consider a classical two-dimensional linear Gaussian filtering system [57]

$$\left\{ {\begin{array}{*{20}c} {{\mathbf{x}}_{k} = {\mathbf{Fx}}_{k - 1} + {\mathbf{u}}_{k - 1} } \\ {y_{k} = {\mathbf{Hx}}_{k} + v_{k} \;\;\;} \\ \end{array} } \right.$$

where \({\mathbf{x}}_{k}\) and \(y_{k}\) denote the state and measurement at time step k, respectively; \({\mathbf{F}} = \left[ {\begin{array}{*{20}c} {\cos \left( \theta \right)} & { - \sin \left( \theta \right)} \\ {\sin \left( \theta \right)} & {\cos \left( \theta \right)} \\ \end{array} } \right]\), \({\mathbf{H}} = \left[ {\begin{array}{*{20}c} 1 & 1 \\ \end{array} } \right]\); and \({\mathbf{u}}_{k - 1}\) and \(v_{k}\) denote the process noise sequence and measurement noise sequence, respectively. In this experiment, the system parameters are set as follows: \(\theta = {\pi \mathord{\left/ {\vphantom {\pi {18}}} \right. \kern-0pt} {18}}\); \({\mathbf{x}}_{0} = \left[ {\begin{array}{*{20}c} 1 & 1 \\ \end{array} } \right]^{T}\); \(p_{{\mathbf{u}}} \sim N\left( {0,diag\left( {\left[ {\begin{array}{*{20}c} 1 & 1 \\ \end{array} } \right]} \right)} \right)\); \(p_{v} \sim N\left( {0,0.1} \right)\). In this setting, the CPS of the process noise is a disk bounded by an equal-probability circle, so the approximate prior CPS is the union of several such disks. The likelihood CPS is a band bounded by two parallel straight lines. The posterior CPS is the intersection of the prior CPS and the likelihood CPS.

In this linear Gaussian system, the performance of the KF, the PF, and the proposed SI_CPSF was compared. As the KF provides the optimal Bayesian estimate in this case, the deviation of each algorithm from the KF estimate is evaluated using the root-mean-square error (RMSE), as follows:

$${\text{RMSE}}_{{{\text{KF}}}} = \sqrt {\frac{{\sum\nolimits_{i = 1}^{M} {\sum\nolimits_{k = 1}^{K} {\left( {{\hat{\mathbf{x}}}_{i,k} - {\mathbf{x}}_{i,k}^{{{\text{KF}}}} } \right)^{2} } } }}{M \times K}}$$

where \(M\) denotes the number of Monte Carlo runs and \(K\) the number of time steps.
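As a runnable sketch of this baseline (the horizon, random seed, and filter initialization below are our own assumptions), the following simulates the rotation system and runs the KF; in the paper's metric, each competing algorithm's estimates would be compared against the KF estimates produced this way, while here the KF's own error against the true state is reported as a sanity check:

```python
import numpy as np

# Hedged sketch of simulation experiment 1: the rotation system with
# F = [[cos t, -sin t], [sin t, cos t]], H = [1, 1], and a standard Kalman
# filter as the optimal Bayesian baseline. Initialization is assumed.
rng = np.random.default_rng(0)
theta = np.pi / 18
F = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
H = np.array([[1.0, 1.0]])
Q = np.eye(2)            # process noise covariance
R = np.array([[0.1]])    # measurement noise covariance

x = np.array([1.0, 1.0])             # true initial state
m, P = np.zeros(2), 10 * np.eye(2)   # assumed filter initialization
K_steps, err = 100, []
for _ in range(K_steps):
    x = F @ x + rng.multivariate_normal(np.zeros(2), Q)      # true state
    y = H @ x + rng.normal(0.0, np.sqrt(R[0, 0]))            # measurement
    m, P = F @ m, F @ P @ F.T + Q                            # KF predict
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    m = m + K @ (y - H @ m)                                  # KF update
    P = (np.eye(2) - K @ H) @ P
    err.append(np.sum((m - x) ** 2))
rmse = np.sqrt(np.mean(err))
print(rmse)
```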

For the SI_CPSF algorithm, we chose \(\alpha = 0.0001\) and two sampling intervals: \({{\varvec{\uptau}}} = \left[ {\begin{array}{*{20}c} {0.8} & {0.8} \\ \end{array} } \right]^{T}\) and \({{\varvec{\uptau}}} = \left[ {\begin{array}{*{20}c} {0.3} & {0.3} \\ \end{array} } \right]^{T}\). For the PF (the SIR-PF was used in this study), 1000 and 5000 particles are used, respectively. Figure 4 shows the estimated approximate posterior CPS at a certain time step and the process of weighted grid sample generation and contraction. Figure 5 shows that the SI_CPSF obtains results closer to the KF than the PF does. Figure 6 shows that the SI_CPSF requires a very small number of samples. Figures 6 and 7 show that the number of samples required by the SI_CPSF, and hence the computational burden, increases as the sampling interval decreases. However, Fig. 7 shows that the time consumption of the SI_CPSF with \({{\varvec{\uptau}}} = \left[ {\begin{array}{*{20}c} {0.3} & {0.3} \\ \end{array} } \right]^{T}\) is much smaller than that of the PF with 5000 particles.

Fig. 4

The sampling mechanism of the SI_CPSF algorithm in one step

Fig. 5

Estimated relative errors compared with KF estimation for the different algorithms

Fig. 6

Samples number for SI_CPSF with different \({{\varvec{\uptau}}}\)

Fig. 7

Computational time cost for the different algorithms

To evaluate the stability of the proposed algorithm, we ran 5000 Monte Carlo rounds; the results are listed in Table 4. The RMSEs of the KF are the smallest. The RMSEs of the SI_CPSF decrease as the sampling interval decreases, and the RMSEs of the PF are always larger than those of the SI_CPSF. Compared with the PF, the SI_CPSF adds a sample-weight calculation step, so for the same number of samples its computational burden is bound to be much greater than that of the PF. However, the SI_CPSF needs only 50 samples to obtain a better estimate than the PF with 5000 particles when \({{\varvec{\uptau}}} = \left[ {\begin{array}{*{20}c} {0.8} & {0.8} \\ \end{array} } \right]^{T}\). This large reduction in sample number effectively improves the filtering efficiency: Table 4 shows that the time cost of the SI_CPSF with \({{\varvec{\uptau}}} = \left[ {\begin{array}{*{20}c} {0.8} & {0.8} \\ \end{array} } \right]^{T}\) is about 1/126 of that of the PF with 5000 particles. It is worth noting that the posterior PDF always satisfies a Gaussian distribution here, so \({{\varvec{\upxi}}}_{1} = 0\) for the SI_CPSF according to Proposition 2. Therefore, the deviation of the SI_CPSF from the Bayesian filter mainly comes from \({{\varvec{\upxi}}}_{2}\) and \({{\varvec{\upxi}}}_{3}\). In addition, Table 4 shows that the RMSEs decrease only slightly when the sampling interval decreases from 0.8 to 0.3, which indicates that the quantization error has very little effect on the RMSEs. At this point, a huge computational cost might be paid for improving the accuracy by reducing the sampling interval, so enlarging the posterior CPS by decreasing \(\alpha\) might be worth trying. Of course, a huge computational cost might also be paid when the posterior CPS grows beyond a certain extent. In short, the improvement of filtering accuracy is usually not linear in the increase in computational burden. Nevertheless, the SI_CPSF clearly approximates the recursive Bayesian filter faster and with less computation than the PF in this case.

Table 4 The performance of different filter algorithms with different parameters in linear/Gaussian system

5.2 Simulation experiment 2

To evaluate the performance of the SI_CPSF in a general nonlinear system, a typical one-dimensional nonlinear Gaussian system was considered [14]

$$\left\{ {\begin{array}{*{20}c} {x_{k + 1} = \frac{1}{2}x_{k} + 25\frac{{x_{k} }}{{1 + x_{k}^{2} }} + 8\cos \left( {1.2k} \right) + u_{k} } \\ {y_{k} = \frac{{x_{k}^{2} }}{20} + v_{k} \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;} \\ \end{array} } \right.$$

The system parameters were set as follows: \(x_{0} = 1\); \(p_{u} \left( x \right)\sim N\left( {0,10} \right)\); and \(p_{v} \left( x \right)\sim N\left( {0,1} \right)\). In this setting, the CPS of the process noise is an interval \(\left[ {\begin{array}{*{20}c} { - x_{\alpha }^{u} } & {x_{\alpha }^{u} } \\ \end{array} } \right]\), where \(x_{\alpha }^{u}\) is determined by \(\alpha\). The approximate prior CPS is the union of several shifted intervals. The likelihood CPS is:

$$\left[ {\begin{array}{*{20}c} { - x_{\alpha }^{u} - \sqrt {20y_{k} } } & {x_{\alpha }^{u} - \sqrt {20y_{k} } } \\ \end{array} } \right] \cup \left[ {\begin{array}{*{20}c} { - x_{\alpha }^{u} + \sqrt {20y_{k} } } & {x_{\alpha }^{u} + \sqrt {20y_{k} } } \\ \end{array} } \right]$$

The posterior CPS is the intersection of the prior CPS and the likelihood CPS.
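For intuition, the weighted-grid-sample representation used by the CPSF can be mimicked by a brute-force point-mass filter on a fixed grid. This is our own simplification, not the SI_CPSF itself (no CPS truncation, so every grid cell is carried at every step; grid range, spacing, and seed are illustrative assumptions):

```python
import numpy as np

# Illustrative brute-force point-mass filter for the benchmark system above.
# The posterior is a set of weighted grid samples, the same representation
# the SI_CPSF contracts to its posterior CPS.
rng = np.random.default_rng(1)
x_grid = np.linspace(-50.0, 50.0, 1001)            # fixed grid, spacing 0.1
w = np.exp(-0.5 * (x_grid - 1.0) ** 2)             # assumed prior ~ N(1, 1)
w /= w.sum()
qv, rv = 10.0, 1.0                                 # process/measurement variances
x_true, err = 1.0, []
for k in range(1, 61):
    f = 0.5 * x_true + 25.0 * x_true / (1.0 + x_true**2) + 8.0 * np.cos(1.2 * k)
    x_true = f + rng.normal(0.0, np.sqrt(qv))
    y = x_true**2 / 20.0 + rng.normal(0.0, np.sqrt(rv))
    # predict: push weights through the transition density on the grid
    fx = 0.5 * x_grid + 25.0 * x_grid / (1.0 + x_grid**2) + 8.0 * np.cos(1.2 * k)
    trans = np.exp(-0.5 * (x_grid[:, None] - fx[None, :]) ** 2 / qv)
    w = trans @ w
    # update: multiply by the bimodal likelihood (log-domain for stability)
    ll = -0.5 * (y - x_grid**2 / 20.0) ** 2 / rv
    w = w * np.exp(ll - ll.max())
    w /= w.sum()
    err.append(np.sum(w * x_grid) - x_true)        # posterior-mean error
rmse = np.sqrt(np.mean(np.square(err)))
print(rmse)
```

Because the grid covers the full state space, this sketch handles the bimodal likelihood correctly but wastes most of its samples on low-probability cells; the CPSF's contribution is precisely to avoid that waste.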

The EKF, the PF and the SI_CPSF are used to estimate the state sequence \(\left\{ {x_{k} } \right\}_{k = 1}^{60}\). For the SI_CPSF, we chose \(\alpha \in \left\{ {10^{ - 4} ,10^{ - 7} } \right\}\) and \(\tau \in \left\{ {0.3,0.8} \right\}\), which forms four parameter sets. For the PF, 500 and 1000 particles are used, respectively. It is worth noting that this state equation is highly nonlinear and the likelihood PDF is bimodal, so most local inference methods are invalid for this filtering example. Meanwhile, many improved PFs (e.g., the extended PF and the unscented PF) also do not work well [14, 58].

Figure 8 shows that the PF and the SI_CPSF obtain estimates with comparable accuracy. The EKF is much less accurate, owing to the bimodal character of the likelihood distribution. For time steps 21–23, Fig. 9 shows the evolution of the PDF during filtering, using the distribution obtained by the PF with 10,000 particles as a reference. It can be seen from Fig. 9 that the posterior CPS accurately locates the high-probability region of the posterior PDF. Combined with the expression of the posterior PDF in (15), this means the SI_CPSF can describe the posterior PDF with a small number of weighted grid samples. Figures 10, 11 and 12 show the filtering performance based on the Monte Carlo simulation with 30,000 instantiations. Figure 10 shows that the computation of the SI_CPSF is much less than that of the PF. It can be seen from Fig. 11 that the SI_CPSF and the PF have similar state estimation results, which also verifies the consistency of the CPSF method with Bayesian filtering. To verify the robustness of the CPSF method, Fig. 12 (left) shows the cumulative distribution function (CDF) of the state errors for the different algorithms. As the error curves of the different algorithms are very close, the CDF curve of the PF with 1000 particles is used as the reference line. Figure 12 (right) shows that the robustness of the PF increases with the particle number: the pink curve is above the green curve in the large-error region (error greater than 16). The robustness of the SI_CPSF is always better than that of the two PFs. In descending order of robustness, the SI_CPSF curves are the yellow, black, cyan, and red curves, which shows that the change of \(\alpha\) has a significant impact on the robustness of filtering. When the error is about 13, the SI_CPSF curves from top to bottom are the yellow, cyan, black, and red curves; in this region, the sampling interval becomes the dominant factor affecting the error. When the error is about 4, the red curve is the worst, being influenced by both the sampling interval and \(\alpha\).

Fig. 8

Estimated relative errors for different algorithms in one simulation

Fig. 9

Weighted samples at time steps for SI_CPSF

Fig. 10

Computational time cost for the different algorithms

Fig. 11

Comparison of the RMSEs for different algorithms

Fig. 12

Comparison of the CDFs between PF and the proposed algorithm

Table 5 shows that the RMSEs of the SI_CPSF are all better than those of the two PFs, with a lower computational burden. Note that the RMSE of the SI_CPSF with \(\alpha = 10^{ - 4}\) and \(\tau = 0.3\) is equal to that of the SI_CPSF with \(\alpha = 10^{ - 7}\) and \(\tau = 0.8\), yet their computational burdens are different. This shows that the filtering efficiency can be further improved by jointly adjusting the sampling interval and \(\alpha\).

Table 5 Comparison of the filter accuracy and time cost for different algorithms in one-dimensional nonlinear system

5.3 Simulation experiment 3

To evaluate the performance of the SI_CPSF in a multidimensional general nonlinear system, the Lorenz system is considered [59].

$$\begin{gathered} \left[ {\begin{array}{*{20}c} {\dot{x}_{1} } \\ {\dot{x}_{2} } \\ {\dot{x}_{3} } \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} {\lambda \left( { - x_{1} + x_{2} } \right)} \\ {\beta x_{1} - x_{2} - x_{1} x_{3} } \\ { - \gamma x_{3} + x_{1} x_{2} } \\ \end{array} } \right] + {{\varvec{\Gamma}}}\left( t \right) \hfill \\ y_{k} = \sqrt {\left( {x_{1} \left( {t_{k} } \right)} \right)^{2} + \left( {x_{2} \left( {t_{k} } \right)} \right)^{2} + \left( {x_{3} \left( {t_{k} } \right)} \right)^{2} } + v_{k} \hfill \\ \end{gathered}$$

The system parameters were set as follows: \({\mathbf{x}}_{0} = \left[ { - 100,\;100,\;50} \right]^{T}\), \(\lambda = 10,\;\beta = 28,\;\gamma = {8 \mathord{\left/ {\vphantom {8 3}} \right. \kern-0pt} 3}\), \(p_{{\mathbf{u}}} \sim N\left( {0,diag\left( {\left[ {\begin{array}{*{20}c} 1 & 1 & 1 \\ \end{array} } \right]} \right)} \right)\); \(p_{v} \sim N\left( {0,1} \right)\). In this setting, the CPS of the process noise is a ball bounded by an equal-probability sphere, so the approximate prior CPS is the union of several such balls. The likelihood CPS is a shell between two concentric spheres: \(y_{k} - r_{\alpha } \le r \le y_{k} + r_{\alpha }\), where \(r_{\alpha }\) is determined by \(\alpha\). The PF and the SI_CPSF are used to estimate the state sequence \(\left\{ {{\mathbf{x}}_{k} } \right\}_{k = 1}^{100}\). For the PF, the particle numbers are 500 and 10,000, respectively. For the SI_CPSF, we chose \(\alpha = 10^{ - 4}\) with \(\tau = \left[ {\begin{array}{*{20}c} {2.5} & {2.5} & {2.5} \\ \end{array} } \right]^{T}\) and \(\tau = \left[ {\begin{array}{*{20}c} {1.5} & {1.5} & {1.5} \\ \end{array} } \right]^{T}\), which form two parameter sets. This experiment is also a typical general nonlinear filtering scenario, in which most local inference methods and improved PFs do not perform well.
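A minimal sketch of generating this system's trajectory follows, using the standard Lorenz drift and a fourth-order Runge–Kutta step; the step size `dt` and the discrete injection of the process noise are our own assumptions, as the paper does not specify its discretization:

```python
import numpy as np

# Illustrative trajectory generator for the Lorenz system of experiment 3.
lam, beta, gamma = 10.0, 28.0, 8.0 / 3.0

def drift(x):
    # Standard Lorenz drift terms
    return np.array([lam * (x[1] - x[0]),
                     beta * x[0] - x[1] - x[0] * x[2],
                     x[0] * x[1] - gamma * x[2]])

def rk4_step(x, dt):
    k1 = drift(x)
    k2 = drift(x + 0.5 * dt * k1)
    k3 = drift(x + 0.5 * dt * k2)
    k4 = drift(x + dt * k3)
    return x + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

rng = np.random.default_rng(2)
x = np.array([-100.0, 100.0, 50.0])     # x0 from the experiment
dt = 0.001                              # assumed step size
for _ in range(100):
    x = rk4_step(x, dt) + rng.normal(0.0, np.sqrt(dt), 3)   # discretized noise
    y = float(np.linalg.norm(x)) + rng.normal(0.0, 1.0)     # range measurement
print(x, y)
```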

Figure 13 shows the weighted grid samples generated in the posterior CPS for the SI_CPSF with different sampling intervals. As can be seen, the samples are evenly distributed in the space, and their probability information is entirely carried by the weights. Figure 14 shows that the RMSEs of the SI_CPSF with the two sampling intervals are similar to those of the PF with 500 and 10,000 particles, respectively. Figure 15 shows that the time consumption of the SI_CPSF with \(\tau = 1.5\) is less than that of the PF with 10,000 particles. Figure 16 shows the number of samples required by the SI_CPSF at different sampling intervals: under both parameter sets it is much smaller than the particle number of the PF. This is an important reason why the SI_CPSF is superior to the PF in filtering efficiency, as seen in Fig. 15. Similar to experiment 2, we obtain Fig. 18 from Fig. 17. The robustness of the two SI_CPSFs is better than that of the two PFs, respectively.

Fig. 13

The sampling mechanism of the SI_CPSF algorithm in one step

Fig. 14

Comparison of the RMSEs for different algorithms

Fig. 15

Computational time cost for the different algorithms

Fig. 16

The average samples number for SI_CPSF with different parameters

Fig. 17

Comparison of the CDFs for PF and the proposed algorithm

Fig. 18

Comparison of the CDFs between PF and the proposed algorithm

Table 6 shows the accuracy and efficiency of the filtering based on the Monte Carlo simulation with 5000 instantiations. The SI_CPSF with \(\tau = 1.5\) achieves better filtering accuracy with a lower computational burden than the PF with 10,000 particles. Compared with the PF, the simulation thus shows that the SI_CPSF converges faster toward the Bayesian filter.

Table 6 Comparison of the filter accuracy and time cost for different algorithms in three-dimensional nonlinear system

5.4 Simulation experiment 4

To evaluate the performance of the SI_CPSF in a real-life scenario, the proposed algorithm is verified on UAV target tracking with radar. The dynamic model adopted in this work is:

$${\mathbf{p}}_{k} = \left[ {\begin{array}{*{20}c} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ \end{array} } \right]{\mathbf{p}}_{k - 1} + \left[ {\begin{array}{*{20}c} {v_{k - 1}^{x} \Delta t} \\ {v_{k - 1}^{y} \Delta t} \\ {v_{k - 1}^{z} \Delta t} \\ \end{array} } \right] + {\mathbf{u}}_{k - 1}$$

where \({\mathbf{p}}_{k} = \left[ {p_{k}^{x} ,p_{k}^{y} ,p_{k}^{z} } \right]^{T}\); \(p_{k}^{x}\), \(p_{k}^{y}\) and \(p_{k}^{z}\) are the positions of the aircraft in the three directions at time step k; \(v_{k}^{x}\), \(v_{k}^{y}\) and \(v_{k}^{z}\) are the real velocities along the three position directions (provided by the IMU) at time step k; and \({\mathbf{u}}_{k - 1}\) is a zero-mean Gaussian process characterizing the IMU measurement noise.

The dynamic measurement model characterizing the radar system is:

$$Z_{k} = \sqrt {\left( {p_{k}^{x} - p_{radar}^{x} } \right)^{2} + \left( {p_{k}^{y} - p_{radar}^{y} } \right)^{2} + \left( {p_{k}^{z} - p_{radar}^{z} } \right)^{2} } + v_{k}$$

where \(\left[ {p_{radar}^{x} ,p_{radar}^{y} ,p_{radar}^{z} } \right]^{T}\) is the known coordinate vector of the radar, and \(v_{k}\) is a zero-mean Gaussian process characterizing the radar measurement noise.

The system parameters were set as follows: \({\mathbf{p}}_{0} = \left[ {2600,2800,800} \right]^{T}\), \({\mathbf{p}}_{radar} = \left[ {3500,2300,0} \right]^{T}\), \({\mathbf{u}}_{k - 1} \sim N\left( {0,diag\left( {\left[ {\begin{array}{*{20}c} 1 & 1 & 1 \\ \end{array} } \right]} \right)} \right)\); \(v_{k} \sim N\left( {0,0.1} \right)\). We chose \(\alpha = 10^{ - 4}\) with \(\tau = \left[ {\begin{array}{*{20}c} 1 & 1 & 1 \\ \end{array} } \right]^{T}\) and \(\tau = \left[ {\begin{array}{*{20}c} {0.8} & {0.8} & {0.8} \\ \end{array} } \right]^{T}\), which form two parameter sets for the SI_CPSF. For the PF, the particle numbers are 500 and 10,000, respectively. In this scenario, we ran 100 rounds of Monte Carlo simulations. The average number of samples used by the proposed algorithm during the recursive process is shown in Fig. 19. It can be seen that, in the initial stage, the number of samples steadily increases and eventually stabilizes. This is because the IMU errors accumulate continuously during target tracking while the observations continually correct the state; the variance of the posterior distribution of the target state therefore grows continuously before reaching a steady state, and the proposed algorithm closely tracks this process. Figure 20 shows the RMSEs of the CPSFs and PFs based on the Monte Carlo simulation with 1000 instantiations. Increasing the number of particles in the PF from 3000 to 10,000 has a negligible impact on the filtering accuracy, while the computational cost increases significantly. When the CPSF sampling interval is set to 1, its performance is similar to that of the PF with 10,000 particles; when the sampling interval is set to 0.8, its performance is significantly better. Table 7 provides a further comparison: with a sampling interval of 1, the CPSF uses an average of just over 1500 samples, and its computational cost is much lower than that of the PF with 10,000 particles; with a sampling interval of 0.8, the CPSF improves the filtering accuracy by nearly 10% compared with the PF with 10,000 particles, still at a lower computational cost.
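A hedged sketch of this scenario with a bootstrap PF, the baseline the paper compares against, follows; the velocity profile, time step, horizon, and particle initialization are our own illustrative assumptions:

```python
import numpy as np

# Illustrative bootstrap PF for the UAV tracking model above.
rng = np.random.default_rng(3)
p_true = np.array([2600.0, 2800.0, 800.0])
p_radar = np.array([3500.0, 2300.0, 0.0])
dt, N, r_var = 1.0, 1000, 0.1
particles = p_true + rng.normal(0.0, 5.0, size=(N, 3))  # assumed initial cloud
for _ in range(50):
    v = np.array([30.0, -20.0, 5.0])                    # assumed IMU velocity
    p_true = p_true + v * dt + rng.normal(0.0, 1.0, 3)              # true motion
    z = np.linalg.norm(p_true - p_radar) + rng.normal(0.0, np.sqrt(r_var))
    particles = particles + v * dt + rng.normal(0.0, 1.0, size=(N, 3))  # predict
    r = np.linalg.norm(particles - p_radar, axis=1)
    ll = -0.5 * (z - r) ** 2 / r_var                    # range log-likelihood
    w = np.exp(ll - ll.max())                           # log-domain for stability
    w /= w.sum()
    particles = particles[rng.choice(N, N, p=w)]        # multinomial resampling
est = particles.mean(axis=0)
print(np.linalg.norm(est - p_true))
```

The precise range measurement concentrates the weights on a thin spherical shell, which is exactly the degeneracy that forces the PF to use many particles and that the CPSF's posterior CPS is designed to sidestep.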

Fig. 19

The average samples number for SI_CPSF with different parameters for target tracking of the UAV with radar

Fig. 20

Comparison of the CDFs for PF and the proposed algorithm for target tracking of the UAV with radar

Table 7 Comparison of the filter accuracy and time cost for different algorithms in target tracking scenario

6 Conclusion

The CPSF method was proposed in this paper to solve the generic nonlinear filtering problem. The method absorbs the idea of the SMC method that the high-probability region of the posterior PDF dominates the filtering performance, while the transfer of the posterior PDF is based on the numerical-integration ideas of the GF and local inference. The posterior CPS is used to select the high-probability region, and the weighted grid samples in the posterior CPS are calculated by the PST. The approximate expression of the posterior PDF, which is key to the PST implementation, is obtained by using the properties of the impulse function in the integral operation. On this basis, we proposed the CPSF framework for solving the generic nonlinear filtering problem. As the posterior CPS is difficult to find directly, the SI method was proposed to obtain an approximate posterior CPS, which forms the SI_CPSF algorithm. The approximation error of the CPSF framework relative to the recursive Bayesian filter was analyzed, and their consistency was proved. Four typical simulation experiments were used to evaluate the performance of the proposed method.