Nonlinear filtering via stochastic PDE projection on mixture manifolds in \(L^2\) direct metric
Abstract
We examine some differential geometric approaches to finding approximate solutions to the continuous time nonlinear filtering problem. Our primary focus is a new projection method for the optimal filter infinite-dimensional stochastic partial differential equation (SPDE), based on the direct \(L^2\) metric and on a family of normal mixtures. This results in a new finite-dimensional approximate filter based on the differential geometric approach to statistics. We compare this new filter to earlier projection methods based on the Hellinger distance/Fisher metric and exponential families, and compare the \(L^2\) mixture projection filter with a particle method with the same number of parameters, using the Lévy metric. We discuss differences between projecting the SPDE for the normalized density, known as the Kushner–Stratonovich equation, and the SPDE for the unnormalized density, known as the Zakai equation. We prove that for a simple choice of the mixture manifold the \(L^2\) mixture projection filter coincides with a Galerkin method, whereas for more general mixture manifolds the equivalence does not hold and the \(L^2\) mixture filter is more general. We study particular systems that may illustrate the advantages of this new filter over other algorithms when comparing outputs with the optimal filter. We finally consider a specific software design that is suited for a numerically efficient implementation of this filter and provide numerical examples. We leverage an algebraic ring structure by proving that in the presence of a given structure in the system coefficients the key integrations needed to implement the new filter equations can be executed offline.
Keywords
Direct \(L^2\) metric · Exponential families · Finite-dimensional families of probability distributions · Fisher information · Hellinger distance · Lévy metric · Mixture families · Stochastic filtering · Galerkin methods

1 Introduction
In the nonlinear filtering problem, one observes a system whose state is known to follow a given stochastic differential equation. The observations that have been made contain an additional noise term, so one cannot hope to know the true state of the system. However, one can reasonably ask what is the probability density over the possible states conditional on the given observations. When the observations are made in continuous time, the probability density follows a stochastic partial differential equation known as the Kushner–Stratonovich equation. This can be seen as a generalization of the Fokker–Planck equation that expresses the evolution of the density of a diffusion process. Thus, the problem we wish to address boils down to finding approximate solutions to the Kushner–Stratonovich equation. The literature on stochastic filtering for nonlinear systems is vast, and it is impossible to do justice to all past contributions here. For a proof of the fact that the solution is infinite dimensional see, for example, Hazewinkel et al. [26] for the cubic sensor, whereas for special nonlinear systems still admitting an exact finite-dimensional filter see, for example, Beneš [14] and Beneš and Elliott [15]. For a more complete treatment of the filtering problem from a mathematical point of view see, for example, Liptser and Shiryayev [37]. See Jazwinski [29] for a more applied perspective and comprehensive insight.
The main idea we will employ is inspired by the differential geometric approach to statistics developed in [5] and [44]. The idea of applying this approach to the filtering problem was first sketched in [25]. One thinks of the probability distribution as evolving in an infinite-dimensional space \({\mathcal {P}}\) which is in turn contained in some Hilbert space H. One can then think of the Kushner–Stratonovich equation as defining a vector field in \({\mathcal {P}}\): the integral curves of the vector field should correspond to the solutions of the equation. To find approximate solutions to the Kushner–Stratonovich equation, one chooses a finite-dimensional submanifold M of H and approximates the probability distributions as points in M. At each point of M, one can use the Hilbert space structure to project the vector field onto the tangent space of M. One can now attempt to find approximate solutions to the Kushner–Stratonovich equations by integrating this vector field on the manifold M. This mental image is slightly inaccurate. The Kushner–Stratonovich equation is a stochastic PDE rather than a PDE, so one should imagine some kind of stochastic vector field rather than a smooth vector field. Thus, in this approach we hope to approximate the infinite-dimensional stochastic PDE by solving a finite-dimensional stochastic ODE on the manifold. Note that our approximation will depend upon two choices: the choice of manifold M and the choice of Hilbert space structure H. In this paper, we will consider two possible choices for the Hilbert space structure: the direct \(L^2\) metric on the space of probability distributions; the Hilbert space structure associated with the Hellinger distance and the Fisher information metric. Our focus will be on the direct \(L^2\) metric since projection using the Hellinger distance has been considered before.
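To make the projection step concrete, the following toy sketch (ours, not the paper's algorithm) projects an arbitrary \(L^2\) element onto the tangent space of a two-parameter Gaussian family at a fixed point: the tangent vectors \(v_i = \partial p/\partial \theta_i\) are approximated by finite differences, and the tangent-space coordinates are obtained by solving the Gram system. The grid, the family, and the element F being projected are illustrative choices.

```python
import math

def gaussian(x, mu, s):
    # normal density N(mu, s^2)
    return math.exp(-(x - mu) ** 2 / (2 * s * s)) / (s * math.sqrt(2 * math.pi))

def inner(f, g, xs, dx):
    # direct L^2 inner product <f, g>, approximated on a grid
    return sum(f(x) * g(x) for x in xs) * dx

xs = [-10 + 0.01 * i for i in range(2001)]
dx = 0.01
mu, s, eps = 0.0, 1.0, 1e-5

# tangent vectors v_i = dp/dtheta_i by central finite differences in (mu, s)
v = [lambda x: (gaussian(x, mu + eps, s) - gaussian(x, mu - eps, s)) / (2 * eps),
     lambda x: (gaussian(x, mu, s + eps) - gaussian(x, mu, s - eps)) / (2 * eps)]

F = lambda x: x * gaussian(x, 0.5, 1.2)   # an arbitrary L^2 element to project

# solve the 2x2 Gram system h a = b for the tangent-space coordinates a
h = [[inner(v[i], v[j], xs, dx) for j in range(2)] for i in range(2)]
b = [inner(F, v[i], xs, dx) for i in range(2)]
det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
a = [(b[0] * h[1][1] - b[1] * h[0][1]) / det,
     (h[0][0] * b[1] - h[1][0] * b[0]) / det]
```

By construction, the residual F minus its projection is orthogonal to both tangent vectors, which is a useful correctness check for any implementation of this step.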
As we shall see, the choice of the “best” Hilbert space structure is determined by the manifold one wishes to consider—for manifolds associated with exponential families of distributions the Hellinger metric leads to the simplest equations, whereas the direct \(L^2\) metric works well with mixture distributions. It was proven in [18] that the projection filter in Hellinger metric on exponential families is equivalent to the classical assumed density filters. In this paper, we show that the projection filter for basic mixture manifolds in \(L^2\) metric is equivalent to a Galerkin method. This only holds for very basic mixture families, however, and the \(L^2\) projection method turns out to be more general than the Galerkin method. We will write down the stochastic ODE determined by the geometric approach when \(H=L^2\) and show how it leads to a numerical scheme for finding approximate solutions to the Kushner–Stratonovich equations in terms of a mixture of normal distributions. We will call this scheme the \(L^2\) normal mixture projection filter or simply the L2NM projection filter. The stochastic ODE for the Hellinger metric was considered in [17, 19] and [18]. In particular, a precise numerical scheme is given in [19] for finding solutions by projecting onto an exponential family of distributions. We will call this scheme the Hellinger exponential projection filter or simply the HE projection filter. We will compare the results of a C++ implementation of the L2NM projection filter with a number of other numerical approaches including the HE projection filter and the optimal filter. We can measure the goodness of our filtering approximations thanks to the geometric structure, and in particular, the precise metrics we are using on the spaces of probability measures. What emerges is that the two projection methods produce excellent results for a variety of filtering problems. 
The results appear similar for both projection methods; which of the two gives more accurate results depends upon the problem. As we shall see, however, the L2NM projection approach can be implemented more efficiently. In particular, one needs to perform numerical integration as part of the HE projection filter algorithm, whereas all integrals that occur in the L2NM projection can be evaluated analytically. We also compare the L2NM filter to a particle filter with the best possible combination of particles with respect to the Lévy metric. Introducing the Lévy metric is needed because particles’ densities do not compare well with smooth densities when using \(L^2\)-induced metrics. We show that, given the same number of parameters, the L2NM may outperform a particle-based system. We should also mention that the systems we analyse here are one-dimensional, and that we plan to address higher-dimensional systems in a subsequent paper.
The paper is structured as follows: In Sect. 2, we introduce the nonlinear filtering problem and the infinite-dimensional stochastic PDE (SPDE) that solves it. In Sect. 3, we introduce the geometric structure we need to project the filtering SPDE onto a finite-dimensional manifold of probability densities. In Sect. 4, we perform the projection of the filtering SPDE according to the L2NM framework and also recall the HE-based framework. In Sect. 5, we prove equivalence between the projection filter in \(L^2\) metric for basic mixture families and the Galerkin method. In Sect. 6, we briefly introduce the main issues in the numerical implementation and then focus on software design for the L2NM filter. In Sect. 7, a second theoretical result is provided, showing a particularly convenient structure for the projection filter equations for specific choices of the system properties and of the mixture manifold. In Sect. 8, we look at numerical results, whereas in Sect. 9 we compare our outputs with a particle method. In Sect. 10, we compare our filters with a robust implementation of the optimal filter based on Hermite functions. Section 11 concludes the paper.
2 The nonlinear filtering problem with continuous time observations
In the other approach to robust filtering, due originally to Balakrishnan [8], one tries to model the observation process directly with a white noise error term. Although modelling white noise directly is intuitively appealing, it brings a host of mathematical complications. Kallianpur and Karandikar [31] developed the theory of nonlinear filtering in this framework, while [7] extends this second approach to the case of correlated state and observation noises.
Going back to the projection filter, one could consider using a projection method to find approximate solutions for \(r_t\). However, as our primary interest is in finding good approximations to \(p_t\) rather than \(r_t\), we believe that projecting the KS equation is the more promising approach. For comparison purposes, we have implemented below the robust optimal filter using the method of Luo and Yau [38, 39, 40], who put emphasis on real-time implementation.
3 Statistical manifolds
3.1 Families of distributions
As discussed in the introduction, the idea of a projection filter is to approximate solutions to the Kushner–Stratonovich Eq. (2) using a finite-dimensional family of distributions.
Example 3.1
Example 3.2
Mixture families have a long tradition in filtering. Alspach and Sorenson [4] already highlight that Gaussian sums work well in nonlinear filtering. Ahmed [2] points out that Gaussian densities are dense in \(L^2\), meaning that, with a sufficiently large number of components, a mixture of Gaussian densities can reproduce most features of square-integrable densities.
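This density property is easy to illustrate numerically (our toy computation, not from the paper): a least-squares fit of a handful of fixed Gaussian bumps already tracks a bimodal target density in \(L^2\), and the residual shrinks as components are added. The centres, widths, and target below are arbitrary choices.

```python
import math

def bump(x, mu, s=0.7):
    # Gaussian basis bump N(mu, s^2)
    return math.exp(-(x - mu) ** 2 / (2 * s * s)) / (s * math.sqrt(2 * math.pi))

def target(x):
    # bimodal density: equal-weight mixture of N(-2, 1) and N(2, 1)
    return 0.5 * bump(x, -2.0, 1.0) + 0.5 * bump(x, 2.0, 1.0)

def solve(A, rhs):
    # small dense linear solver: Gaussian elimination with partial pivoting
    n = len(rhs)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            fac = M[r][c] / M[c][c]
            M[r] = [M[r][j] - fac * M[c][j] for j in range(n + 1)]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][j] * x[j] for j in range(r + 1, n))) / M[r][r]
    return x

def l2_fit_error(centers):
    # least-squares weights for fixed bumps, then the L^2 norm of the residual
    xs = [-8 + 0.02 * i for i in range(801)]
    k = len(centers)
    phi = [[bump(x, c) for c in centers] for x in xs]
    A = [[sum(phi[i][p] * phi[i][q] for i in range(len(xs))) for q in range(k)]
         for p in range(k)]
    rhs = [sum(phi[i][p] * target(xs[i]) for i in range(len(xs))) for p in range(k)]
    w = solve(A, rhs)
    res = sum((target(x) - sum(w[j] * phi[i][j] for j in range(k))) ** 2
              for i, x in enumerate(xs)) * 0.02
    return math.sqrt(res)

err1 = l2_fit_error([0.0])                          # one component: poor fit
err5 = l2_fit_error([-4.0, -2.0, 0.0, 2.0, 4.0])    # five components: much closer
```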
3.2 Two Hilbert spaces of probability distributions
3.3 The tangent space of a family of distributions
One might be tempted to talk about submanifolds of the space of probability distributions, but one should be careful. The spaces \({\mathcal {H}}\) and \({\mathcal {D}}\) are not open subsets of \(L^2_H\) and \(L^2_D\) and so do not have any obvious Hilbert-manifold structure. To see why, one may perturb a positive density by a negative spike with an arbitrarily small area and obtain a function arbitrarily close to the density in \(L^2\) norm but not almost surely positive, see [6] for a graphic illustration.
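The negative-spike argument can be checked numerically. In this sketch (ours), we subtract a triangular spike of height h and half-width w from a standard normal density: the result is negative at the origin for every w, while the exact \(L^2\) norm of the perturbation, \(h\sqrt{2w/3}\), can be made arbitrarily small by shrinking w.

```python
import math

def p(x):
    # standard normal density, p(0) ~ 0.3989
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def spike(x, h, w):
    # triangular spike of height h and half-width w centred at the origin
    return h * max(0.0, 1.0 - abs(x) / w)

h = 1.0  # taller than p(0), so (p - spike)(0) < 0 regardless of w
# exact L^2 norm of the spike: h * sqrt(integral of (1 - |x|/w)^2) = h*sqrt(2w/3)
norms = [h * math.sqrt(2 * w / 3) for w in (1e-2, 1e-4, 1e-6)]
# the norms shrink toward zero, yet p - spike fails positivity for every w,
# so positivity is not an L^2-open condition
```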
3.4 The Fisher information metric
Example 3.3
3.5 The direct \(L^2\) metric
The ideas from the previous section can also be applied to the direct \(L^2\) metric. This gives a different Riemannian metric on the manifold. We will write \(h=h_{ij}\) to denote the \(L^2\) metric when written with respect to a particular parameterization.
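As a sanity check on this metric (our sketch, not the paper's code), one can compute the entry \(h_{11}=\int (\partial p/\partial \mu)^2\,\mathrm{d}x\) for the Gaussian family \(N(\mu,\sigma^2)\) numerically and compare it with the closed form \(1/(4\sqrt{\pi}\,\sigma^3)\), which follows from the formula for the product of two Gaussian densities.

```python
import math

def gauss(x, mu, s):
    # normal density N(mu, s^2)
    return math.exp(-(x - mu) ** 2 / (2 * s * s)) / (s * math.sqrt(2 * math.pi))

mu, s, eps, dx = 0.0, 1.3, 1e-5, 0.005
xs = [-10 + dx * i for i in range(int(20 / dx) + 1)]

def dp_dmu(x):
    # tangent vector dp/dmu by a central finite difference
    return (gauss(x, mu + eps, s) - gauss(x, mu - eps, s)) / (2 * eps)

# L^2 metric entry h_11 = <dp/dmu, dp/dmu>, by quadrature and in closed form
h11_numeric = sum(dp_dmu(x) ** 2 for x in xs) * dx
h11_closed = 1.0 / (4 * math.sqrt(math.pi) * s ** 3)
```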
Example 3.4
4 The projection filter
Proposition 4.1
This is the promised finitedimensional stochastic differential equation for \(\theta \) corresponding to \(L^2\) projection.
If preferred, one could project the Kushner–Stratonovich equation using the Hellinger metric instead. This yields the following stochastic differential equation derived originally in [19]:
Proposition 4.2
Note that the inner products in this equation are the direct \(L^2\) inner products: we are simply using the \(L^2\) inner product notation as a compact notation for integrals.
It is now possible to explain why we resorted to the KS equation for the optimal filter rather than the Zakai equation in deriving the projection filter.
Proposition 4.3
Projecting the Zakai unnormalized density SPDE in Hellinger metric onto the statistical manifold \(p(\cdot ,\theta )\) results in the same projection filter equation as projecting the normalized density KS SPDE.
The main focus of this paper, however, is the \(d_D\) projection filter (9). For this filter, the nonlinear terms do have an impact. In fact, it is easy to adapt the derivation of the \(d_D\) filter to the Zakai equation, which leads to the following new filter.
Proposition 4.4
This filter is clearly different from (9). We have implemented (11) for the sensor case, using a simple variation on the numerical algorithms below. We found that (11) gives slightly worse results than (9) for the \(L^2\) residual of the normalized density. This can be explained simply by the fact that if we want to approximate p in a given norm, then we should project an equation for p, whereas if we wish to approximate q, we should project an equation for q. The fact that \(\sqrt{q}\) has variable \(L^2\) norm in time is not relevant for the projection in the Hellinger metric, while it is for the \(d_D\) metric, where the lack of normalization of the Zakai equation density plays a role in the projection.
5 Equivalence with assumed density filters and Galerkin methods
The projection filter with specific metrics and manifolds can be shown to be equivalent to earlier filtering algorithms. In particular, while the \(d_H\) metric leads to the Fisher information and to an equivalence between the projection filter and assumed density filters (ADFs) when using exponential families, see [18], the \(d_D\) metric for simple mixture families is equivalent to a Galerkin method, as we show now following the second named author's preprint [16]. For applications of Galerkin methods to nonlinear filtering, we refer for example to [9, 24, 27, 41]. Ahmed [2], Chapter 14, Sections 14.3 and 14.4, summarizes the Galerkin method for the Zakai equation, see also [3]. Pardoux [43] uses Galerkin techniques to analyse existence of solutions for the nonlinear filtering SPDE. Bensoussan et al. [10] adopt the splitting up method. We also refer to Frey et al. [23] for Galerkin methods applied to the extended Zakai equation for diffusive and point process observations.
Theorem 5.1
For simple mixture families (14), the \(d_D\) projection filter (9) coincides with a Galerkin method (13) where the basis functions are the mixture components q.
However, this equivalence holds only for the case where the manifold on which we project is the simple mixture family (14). More complex families, such as the ones we will use in the following, will not allow for a Galerkin-based filter, and only the \(L^2\) projection filter can be defined there. Note also that even in the simple case (14) our \(L^2\) Galerkin/projection filter will be different from the Galerkin projection filter seen, for example, in [9], because we use Stratonovich calculus to project the Kushner–Stratonovich equation in \(L^2\) metric. In [9], the Itô version of the Kushner–Stratonovich equation is used instead for the Galerkin method, but since Itô calculus does not work on manifolds, due to the second-order term moving the dynamics out of the tangent space, we use the Stratonovich version instead. The Itô-based and Stratonovich-based Galerkin projection filters will therefore differ for simple mixture families, and again, only the second one can be defined for manifolds of densities beyond the simplest mixture family.
6 Numerical software design
UML for the FunctionRing interface
FunctionRing  

+ add(\(f_1\) : Function, \(f_2\) : Function) : Function  
+ multiply(\(f_1\) : Function, \(f_2\) : Function) : Function  
+ multiply(s : Real, f : Function) : Function  
+ differentiate(f : Function) : Function  
+ integrate(f : Function) : Real  
+ evaluate(f : Function) : Real  
+ constantFunction(s : Real) : Function 
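To make the interface concrete, here is a minimal Python sketch of a FunctionRing. The concrete ring chosen here, plain polynomials on a fixed interval, is our simplification for illustration; the ring of interest to the filter is richer (it also contains Gaussian terms, integrated over the whole real line). We also pass the evaluation point explicitly to evaluate, which the UML leaves implicit.

```python
class PolyRing:
    """Functions are polynomials stored as coefficient lists [c0, c1, ...]."""

    def __init__(self, a=-1.0, b=1.0):
        self.a, self.b = a, b   # integration interval

    def add(self, f1, f2):
        n = max(len(f1), len(f2))
        return [(f1[i] if i < len(f1) else 0.0) + (f2[i] if i < len(f2) else 0.0)
                for i in range(n)]

    def multiply(self, f1, f2):
        if isinstance(f1, (int, float)):          # scalar overload
            return [f1 * c for c in f2]
        out = [0.0] * (len(f1) + len(f2) - 1)     # polynomial product
        for i, ci in enumerate(f1):
            for j, cj in enumerate(f2):
                out[i + j] += ci * cj
        return out

    def differentiate(self, f):
        return [i * c for i, c in enumerate(f)][1:] or [0.0]

    def integrate(self, f):                       # definite integral over [a, b]
        F = lambda x: sum(c * x ** (i + 1) / (i + 1) for i, c in enumerate(f))
        return F(self.b) - F(self.a)

    def evaluate(self, f, x):
        return sum(c * x ** i for i, c in enumerate(f))

    def constantFunction(self, s):
        return [float(s)]
```

For instance, with the default interval, integrating the square of f(x) = x gives \(\int_{-1}^{1} x^2\,\mathrm{d}x = 2/3\).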
UML for the Manifold interface
Manifold  

+ getRing() : FunctionRing  
+ getDensity(\(\theta \)) : Function  
+ computeTangentVectors(\(\theta \) : Point) : Function*  
+ updatePoint(\(\theta \) : Point, \(\Delta \theta \) : Real*) : Point  
+ finalizePoint(\(\theta \) : Point) : Point 
The other key abstraction is the Manifold. We give a UML representation of this abstraction in Table 2. For readers unfamiliar with UML, we remark that the \(*\) symbol can be read “list”. For example, the computeTangentVectors function returns a list of functions. The Manifold uses some convenient internal representation for a point, the most obvious representation being simply the m-tuple \((\theta _1, \theta _2, \ldots , \theta _m)\). On request, the Manifold is able to provide the density associated with any point, represented as an element of the FunctionRing. In addition, the Manifold can compute the tangent vectors at any point. The computeTangentVectors method returns a list of elements of the FunctionRing corresponding to each of the vectors \(v_i = \frac{\partial p}{\partial \theta _i}\) in turn. If the point is represented as a tuple \(\theta =(\theta _1, \theta _2, \ldots , \theta _m)\), the method updatePoint simply adds the components of the tuple \(\Delta \theta \) to each of the components of \(\theta \). If a different internal representation is used for the point, the method should make the equivalent change to this internal representation. The finalizePoint method is called by our algorithm at the end of every time step. At this point, the Manifold implementation can choose to change its parameterization for the state. Thus, the finalizePoint method allows us (in principle at least) to use a more sophisticated atlas for the manifold than just a single chart. One should not draw too close a parallel between these computing abstractions and similarly named mathematical abstractions. For example, the space of objects that can be represented by a given FunctionRing does not need to form a differential ring despite the differentiate method. This is because the differentiate function will not be called infinitely often by the algorithm below, so the functions in the ring do not need to be infinitely differentiable.
Similarly the finalizePoint method allows the Manifold implementation more flexibility than simply changing chart. From one time step to the next, it could decide to use a completely different family of distributions. The interface even allows the dimension to change from one time step to the next. We do not currently take advantage of this possibility, but adaptively choosing the family of distributions would be an interesting topic for further research.
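A toy Manifold for the single-Gaussian family \(N(\theta_1, \theta_2^2)\) might look as follows (our sketch: densities are returned as plain Python callables rather than FunctionRing elements, and tangent vectors are approximated by finite differences in the parameters). The finalizePoint here merely clips the standard deviation away from zero, a crude stand-in for a genuine change of chart.

```python
import math

class GaussianManifold:
    """Single-Gaussian family N(theta[0], theta[1]^2), theta = [mu, sigma]."""
    EPS = 1e-6

    def getDensity(self, theta):
        mu, s = theta
        return lambda x: (math.exp(-(x - mu) ** 2 / (2 * s * s))
                          / (s * math.sqrt(2 * math.pi)))

    def computeTangentVectors(self, theta):
        # v_i = dp/dtheta_i by central finite differences in the parameters
        vecs = []
        for i in range(2):
            up = list(theta); up[i] += self.EPS
            dn = list(theta); dn[i] -= self.EPS
            pu, pd = self.getDensity(up), self.getDensity(dn)
            vecs.append(lambda x, pu=pu, pd=pd: (pu(x) - pd(x)) / (2 * self.EPS))
        return vecs

    def updatePoint(self, theta, dtheta):
        return [t + d for t, d in zip(theta, dtheta)]

    def finalizePoint(self, theta):
        # keep the parameterization valid: the standard deviation stays positive
        return [theta[0], max(theta[1], 1e-3)]
```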
7 The case of normal mixture families
Theorem 7.1
Let \(\theta \) be a parameterization for a family of probability distributions all of which can be written as a mixture of at most i Gaussians. Let f, \(a=\sigma ^2\) and b be functions in the ring \({\mathcal {R}}\). In this case, one can carry out the direct \(L^2\) projection algorithm for the problem given by Eq. (1) using analytic formulae for all the required integrations.
Although the condition that f, a and b lie in \({\mathcal {R}}\) may seem somewhat restrictive, when this condition is not met one could use Taylor expansions to find approximate solutions, although in that case rigorous convergence results would need to be established. While the choice of parameterization does not affect the choice of FunctionRing, it does affect the numerical behaviour of the algorithm. In particular, if one chooses a parameterization whose domain is a proper subset of \({\mathbb {R}}^m\), the algorithm will break down the moment the point \(\theta \) leaves the domain. With this in mind, in the numerical examples given later in this paper we parameterize normal mixtures of k Gaussians with a parameterization defined on the whole of \({\mathbb {R}}^n\). We describe this parameterization below.
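One illustrative way to obtain such an unconstrained parameterization (not necessarily the paper's exact choice) is to encode the mixture weights through a softmax with one logit pinned to zero and the standard deviations through exponentials, so that every point of \(\mathbb{R}^{3k-1}\) yields a valid mixture:

```python
import math

def unpack(theta, k):
    """theta = (a_1..a_{k-1}, mu_1..mu_k, r_1..r_k) -> (weights, means, stds)."""
    a = theta[:k - 1]
    e = [math.exp(v) for v in a] + [1.0]   # softmax with the last logit fixed at 0
    z = sum(e)
    weights = [v / z for v in e]           # positive and summing to one, always
    means = theta[k - 1:2 * k - 1]         # unconstrained
    stds = [math.exp(r) for r in theta[2 * k - 1:3 * k - 1]]  # always positive
    return weights, means, stds

def mixture_density(theta, k, x):
    w, m, s = unpack(theta, k)
    return sum(wi * math.exp(-(x - mi) ** 2 / (2 * si * si))
               / (si * math.sqrt(2 * math.pi))
               for wi, mi, si in zip(w, m, s))
```

With this encoding, the projected stochastic ODE can be integrated in the raw parameters without ever hitting a domain boundary.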
7.1 Comparison with the Hellinger exponential (HE) projection algorithm

In [17], only the special case of the cubic sensor was considered. It was clear that one could in principle adapt the algorithm to cope with other problems, but there remained symbolic manipulation that would have to be performed by hand. Our algorithm automates this process using the FunctionRing abstraction.

When one projects onto an exponential family, the stochastic term in Eq. (10) simplifies to a term with constant coefficients. This means it can be viewed equally well as either an Itô or a Stratonovich SDE. The practical consequence of this is that the HE algorithm can use the Euler–Maruyama scheme rather than the Stratonovich–Heun scheme to solve the resulting stochastic ODEs. Moreover, in this case, the Euler–Maruyama scheme coincides with the generally more precise Milstein scheme.
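The remark about the two calculi coinciding can be seen directly in code. In this sketch (our toy SDE with drift \(a(x)=-x^3\) and a constant diffusion coefficient, not the filter equations themselves), the Euler–Maruyama and Stratonovich–Heun steps use exactly the same stochastic increment \(b\,\Delta W\); they differ only in the higher-order averaging of the drift, not in any Itô–Stratonovich correction term.

```python
import math, random

def a(x):
    return -x ** 3          # toy drift

b = 0.5                     # CONSTANT diffusion coefficient: (1/2) b b' = 0

def euler_maruyama(x, dt, dW):
    return x + a(x) * dt + b * dW

def stratonovich_heun(x, dt, dW):
    xp = x + a(x) * dt + b * dW                      # predictor step
    return x + 0.5 * (a(x) + a(xp)) * dt + b * dW    # drift-averaged corrector

random.seed(0)
x_em = x_h = 1.0
for _ in range(1000):
    dW = random.gauss(0.0, math.sqrt(0.001))
    x_em = euler_maruyama(x_em, 0.001, dW)
    x_h = stratonovich_heun(x_h, 0.001, dW)
# the two paths stay close: with constant b, both schemes apply the identical
# stochastic increment at every step
```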
In the case of the cubic sensor, the HE algorithm requires one to numerically evaluate integrals such as
$$\begin{aligned} \int _{-\infty }^{\infty } x^n \exp ( \theta _1 + \theta _2 x + \theta _3 x^2 + \theta _4 x^4) \,{\mathrm {d}}x \end{aligned}$$
where the \(\theta _i\) are real numbers. Performing such integrals numerically considerably slows the algorithm. In effect, one ends up using a rather fine discretization scheme to evaluate the integral, and this somewhat offsets the hoped-for advantage over a finite difference method.
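A brute-force evaluation of this kind of integral (with toy parameter values chosen by us, in particular \(\theta_4<0\) so that the integrand is integrable) already takes tens of thousands of integrand evaluations per integral for modest accuracy, which is the cost the L2NM filter avoids:

```python
import math

def he_integral(n, t1, t2, t3, t4, L=6.0, m=20000):
    """Midpoint-rule approximation of the exponential-family moment integral
    over [-L, L]; the integrand decays like exp(t4 * x^4), so t4 must be < 0."""
    dx = 2 * L / m
    total = 0.0
    for i in range(m):
        x = -L + (i + 0.5) * dx
        total += x ** n * math.exp(t1 + t2 * x + t3 * x * x + t4 * x ** 4)
    return total * dx

val = he_integral(2, 0.0, 0.1, 0.5, -1.0)   # one of many integrals per time step
```

For n = 0 and \(\theta = (0,0,0,-1)\), the exact value is \(2\,\Gamma(5/4)\), which the quadrature reproduces only after all 20,000 evaluations.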
8 Numerical results
1. A finite difference method using a fine grid, which we term the exact filter. Various convergence results are known for this method ([35] and [36]). In the simulations shown below, we use a grid with 1000 points on the x-axis and 5000 time points. In our simulations, we could not visually distinguish the resulting graphs when the grid was refined further, justifying us in considering this to be extremely close to the exact result. The precise algorithm used is as described in the section on “Partial Differential Equations Methods” in Chapter 8 of Bain and Crisan [12].
2. The extended Kalman filter (EK). This is a somewhat heuristic approach to solving the nonlinear filtering problem, but it works well so long as the system is almost linear. It is implemented essentially by linearising all the functions in the problem and then using the exact Kalman filter to solve this linear problem; the details are given in [12]. The EK filter is widely used in applications and so provides a standard benchmark. However, it is well known that it can give wildly inaccurate results for nonlinear problems, so it should be unsurprising to see that it performs badly for most of the examples we consider.
3. The HE projection filter. In fact, we have implemented a generalization of the algorithm given in [19] that can cope with filtering problems where b is an arbitrary polynomial, \(\sigma \) is constant and \(f=0\). Thus, we have been able to examine the performance of the exponential projection filter over a slightly wider range of problems than have previously been considered.
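For orientation, a minimal one-dimensional continuous-time EKF of the kind described in item 2 can be sketched as follows (our toy Euler discretization with illustrative coefficients, not the benchmarked implementation from [12]): the filter propagates a Gaussian with mean m and variance P, linearising the sensor b around the current mean.

```python
import math, random

def f(x):  return -0.5 * x   # toy linear drift
def fp(x): return -0.5       # its derivative
def b(x):  return x * x      # quadratic sensor
def bp(x): return 2 * x      # its derivative

sigma, dt, steps = 0.6, 0.001, 2000
random.seed(1)               # reproducible noise path
x, m, P = 1.0, 0.8, 0.5      # true state, filter mean, filter variance
for _ in range(steps):
    dW = random.gauss(0.0, math.sqrt(dt))
    dV = random.gauss(0.0, math.sqrt(dt))
    x += f(x) * dt + sigma * dW              # simulate the signal
    dY = b(x) * dt + dV                      # observation increment (unit noise)
    K = P * bp(m)                            # gain from the linearised sensor
    m += f(m) * dt + K * (dY - b(m) * dt)    # mean update
    P += (2 * fp(m) * P + sigma ** 2 - K * K) * dt  # Riccati variance update
```

Note that with the quadratic sensor the linearisation bp(m) vanishes at m = 0, which is one source of the poor EK performance reported below.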
8.1 The linear filter
The first test case we have examined is the linear filtering problem. In this case, the probability density will be a Gaussian at all times—hence if we project onto the twodimensional family consisting of all Gaussian distributions there should be no loss of information. Thus, both projection filters should give exact answers for linear problems. This is indeed the case and gives some confidence in the correctness of the computer implementations of the various algorithms.
8.2 The quadratic sensor
8.3 The cubic sensor
9 Comparison with particle methods
10 Comparison with robust Zakai implementation using Hermite functions
Luo and Yau [38, 39, 40] propose solving the filtering problem in real time by solving the robust Zakai equation using a spectral method based on Hermite functions. We have implemented such a filter. We found that, when one uses 45 Hermite basis functions as recommended in the papers, this approach produces excellent results which are essentially indistinguishable from our own “exact” filter implementation. This provides further evidence for the effectiveness of Luo and Yau’s approach.
For one-dimensional problems, this Hermite-based spectral method is more than sufficient to solve the filtering problem in real time. However, if we apply exactly the same approach to a filtering problem with an n-dimensional state space, one might estimate that we would need \(45^n\) basis functions. This is the familiar curse of dimensionality. Therefore, it is interesting to know how the Hermite spectral method degrades as one reduces the number of basis functions. One might expect the approach to deteriorate gradually as the number of basis functions is decreased, but in fact our numerical results indicate that it fails rapidly once the number of basis functions used is dropped to about 25. We have not plotted the results since they are more easily summarized verbally: with 45 basis functions the Hermite spectral method is excellent (as shown also by Luo and Yau above), whereas with fewer than 20 basis functions it does not provide a useful approximation.
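The truncation effect can be reproduced in a few lines (our toy target, a shifted Gaussian density, rather than the paper's filtering example): the \(L^2\) residual of the truncated Hermite function expansion drops sharply once enough basis functions are used to resolve the off-centre bump.

```python
import math

def hermite_functions(x, N):
    """psi_0 .. psi_{N-1} at x, via the stable recurrence
    psi_{n+1} = x*sqrt(2/(n+1))*psi_n - sqrt(n/(n+1))*psi_{n-1}."""
    psi = [math.pi ** -0.25 * math.exp(-x * x / 2)]
    if N > 1:
        psi.append(math.sqrt(2.0) * x * psi[0])
    for n in range(1, N - 1):
        psi.append(x * math.sqrt(2.0 / (n + 1)) * psi[n]
                   - math.sqrt(n / (n + 1.0)) * psi[n - 1])
    return psi

def truncation_error(N, mu=2.0):
    # L^2 residual of the N-term Hermite expansion of the density N(mu, 1)
    f = lambda x: math.exp(-(x - mu) ** 2 / 2) / math.sqrt(2 * math.pi)
    dx, xs = 0.01, [(-12 + 0.01 * i) for i in range(2401)]
    coeff = [0.0] * N
    for x in xs:                       # expansion coefficients c_n = <f, psi_n>
        psi = hermite_functions(x, N)
        for n in range(N):
            coeff[n] += f(x) * psi[n] * dx
    err2 = 0.0
    for x in xs:                       # residual of the reconstructed expansion
        psi = hermite_functions(x, N)
        approx = sum(coeff[n] * psi[n] for n in range(N))
        err2 += (f(x) - approx) ** 2 * dx
    return math.sqrt(err2)
```

With 20 basis functions the residual is negligible, while with 5 it is orders of magnitude larger, mirroring the qualitative behaviour described above.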
By contrast, the projection filter above is able to find reasonably accurate solutions using only a 5-dimensional manifold. We may summarize the different approaches as follows. Spectral methods such as Luo and Yau’s provide an approach that gives a practical way of finding real-time solutions in very low dimensions. Particle filters provide a method of finding approximate solutions in large dimensions but are difficult to apply in real time, even for low-dimensional systems. Projection methods provide a promising avenue for finding approximate solutions to medium-dimensional filtering problems in real time.
11 Conclusions
Projection onto a family of normal mixtures using the \(L^2\) metric (L2NM) allows one to approximate the solutions of the nonlinear filtering problem with surprising accuracy using only a small number of component distributions. In this regard, this filter behaves in a very similar fashion to the projection onto an exponential family using the Hellinger metric that has been considered previously. The L2NM projection filter has one important advantage over the Hellinger exponential (HE) projection filter: for problems with polynomial coefficients all required integrals can be calculated analytically. Problems with more general coefficients can be addressed using Taylor series. One expects this to translate into a better performing algorithm, particularly if the approach is extended to higher-dimensional problems.
We tested both filters against the optimal filter in simple but interesting systems, and provided a metric to compare the performance of each filter with the optimal one.
We also tested both filters against a particle method, showing that with the same number of parameters the L2NM filter outperforms the best possible particle method in the Lévy metric.
We designed a software structure and populated it with models that make the L2NM filter quite appealing from a numerical and computational point of view.
Areas of future research that we hope to address include: the relationship between the projection approach and existing numerical approaches to the filtering problem; the convergence of the algorithm; improving the stability and performance of the algorithm by adaptively changing the parameterization of the manifold; numerical simulations in higher dimensions; and, more generally, whether a new type of projection, building on the Itô stochastic calculus structure, could be suited to deriving approximate equations.
References
1. Aggrawal J (1974) Sur l'information de Fisher. In: Kampé de Fériet J (ed) Théories de l'Information. Springer-Verlag, Berlin, pp 111–117
2. Ahmed NU (1998) Linear and nonlinear filtering for scientists and engineers. World Scientific, Singapore
3. Ahmed NU, Radaideh S (1997) A powerful numerical technique solving the Zakai equation for nonlinear filtering. Dyn Control 7(3):293–308
4. Alspach DL, Sorenson HW (1972) Nonlinear Bayesian estimation using Gaussian sum approximations. IEEE Trans Automat Control AC-17(4):439–448
5. Amari S (1985) Differential-geometrical methods in statistics. Lecture Notes in Statistics. Springer-Verlag, Berlin
6. Armstrong J, Brigo D (2013) Stochastic filtering via \(L^2\) projection on mixture manifolds with computer algorithms and numerical examples. Available at arXiv.org
7. Bagchi A, Karandikar RL (1994) White noise theory of robust nonlinear filtering with correlated state and observation noises. Syst Control Lett 23:137–148
8. Balakrishnan AV (1980) Nonlinear white noise theory. In: Krishnaiah PR (ed) Multivariate Analysis V. North-Holland, Amsterdam, pp 97–109
9. Beard R, Gunther J (1997) Galerkin approximations of the Kushner equation in nonlinear estimation. Working paper, Brigham Young University
10. Bensoussan A, Glowinski R, Rascanu A (1990) Approximation of the Zakai equation by the splitting up method. SIAM J Control Optim 28(6):1420–1431
11. Billingsley P (1999) Convergence of probability measures. Wiley, New York
12. Bain A, Crisan D (2010) Fundamentals of stochastic filtering. Springer-Verlag, Heidelberg
13. Barndorff-Nielsen OE (1978) Information and exponential families. Wiley, New York
14. Beneš VE (1981) Exact finite-dimensional filters for certain diffusions with nonlinear drift. Stochastics 5:65–92
15. Beneš VE, Elliott RJ (1996) Finite-dimensional solutions of a modified Zakai equation. Math Signals Syst 9(4):341–351
16. Brigo D (2012) The direct \(L^2\) geometric structure on a manifold of probability densities with applications to filtering. Available at arXiv.org
17. Brigo D, Hanzon B, LeGland F (1998) A differential geometric approach to nonlinear filtering: the projection filter. IEEE Trans Autom Control 43:247–252
18. Brigo D, Hanzon B, Le Gland F (1999) Approximate nonlinear filtering by projection on exponential manifolds of densities. Bernoulli 5:495–534
19. Brigo D (1996) Filtering by projection on the manifold of exponential densities. PhD thesis, Free University of Amsterdam
20. Burrage K, Burrage PM, Tian T (2004) Numerical methods for strong solutions of stochastic differential equations: an overview. Proc R Soc Lond A 460:373–402
21. Clark JMC (1978) The design of robust approximations to the stochastic differential equations of nonlinear filtering. In: Skwirzynski JK (ed) Communication systems and random process theory, NATO Advanced Study Institute Series. Sijthoff and Noordhoff, Alphen aan den Rijn
22. Davis MHA (1980) On a multiplicative functional transformation arising in nonlinear filtering theory. Z Wahrsch Verw Geb 54:125–139
23. Frey R, Schmidt T, Xu L (2013) On Galerkin approximations for the Zakai equation with diffusive and point process observations. SIAM J Numer Anal 51(4):2036–2062
24. Germani A, Picconi M (1984) A Galerkin approximation for the Zakai equation. In: Thoft-Christensen P (ed) System modelling and optimization (Copenhagen, 1983). Lecture Notes in Control and Information Sciences, vol 59. Springer-Verlag, Berlin, pp 415–423
25. Hanzon B (1987) A differential-geometric approach to approximate nonlinear filtering. In: Dodson CTJ (ed) Geometrization of statistical theory. ULMD Publications, University of Lancaster, pp 219–223
26. Hazewinkel M, Marcus SI, Sussmann HJ (1983) Nonexistence of finite dimensional filters for conditional statistics of the cubic sensor problem. Syst Control Lett 3:331–340
27. Ito K (1996) Approximation of the Zakai equation for nonlinear filtering. SIAM J Control Optim 34(2):620–634
28. Jacod J, Shiryaev AN (1987) Limit theorems for stochastic processes. Grundlehren der Mathematischen Wissenschaften, vol 288. Springer-Verlag, Berlin
29. Jazwinski AH (1970) Stochastic processes and filtering theory. Academic Press, New York
30. Fujisaki M, Kallianpur G, Kunita H (1972) Stochastic differential equations for the non linear filtering problem. Osaka J Math 9(1):19–40
31. Kallianpur G, Karandikar RL (1983) A finitely additive white noise approach to nonlinear filtering. Appl Math Optim 10:159–186
32. Khasminskii RZ (1980) Stochastic stability of differential equations. Alphen aan den Rijn
33. Kloeden PE, Platen E (1999) Numerical solution of stochastic differential equations. Springer, Berlin
34. Kormann K, Larsson E (2014) A Galerkin radial basis function method for the Schrödinger equation. SIAM J Sci Comput 35(6):A2832–A2855
35. Kushner HJ (1990) Weak convergence methods and singularly perturbed stochastic control and filtering problems. Systems & Control: Foundations & Applications, vol 3. Birkhäuser, Boston
36. Kushner HJ, Huang H (1986) Approximate and limit results for nonlinear filters with wide bandwidth observation noise. Stochastics 16(1–2):65–96
 37.Liptser RS, Shiryayev AN (1978) Statistics of random processes I, general theory. Springer Verlag, BerlinzbMATHGoogle Scholar
 38.Luo X, Yau SST (2012) A novel algorithm to solve the robust DMZ equation in real time. In: Proceedings of the 51th Conference in Decision and Control (CDC), pp 606–611Google Scholar
 39.Luo X, Yau SST (2013) Complete real time solution of the general nonlinearfiltering problem without memory. IEEE Trans Automat Control 58(10):2563–2578MathSciNetCrossRefGoogle Scholar
 40.Luo X, Yau SST (2013) Hermite spectral method to 1D forward Kolmogorov equation and its application to nonlinear filtering problems. IEEE Trans Automat Control 58(10):2495–2507MathSciNetCrossRefGoogle Scholar
 41.Nowak LD, PaslawskaPoludniak M, Twardowska K (2010) On the convergence of the WaveletGalerkin method for nonlinear filtering. Int J Appl Math Comput Sci 20(1):93–108MathSciNetCrossRefzbMATHGoogle Scholar
 42.Ocone D (1988) Probability densities for conditional statistics in the cubic sensor problem. Math Control Signals Syst 1(2):183–202MathSciNetCrossRefzbMATHGoogle Scholar
 43.Pardoux E (1979) Stochastic partial differential equations and filtering of diffusion processes. Stochastics 3:127–167MathSciNetCrossRefzbMATHGoogle Scholar
 44.Pistone G, Sempi C (1995) An Infinite dimensional geometric structure on the space of all the probability measures equivalent to a given one. Ann Stat 23(5):1543–1561MathSciNetCrossRefzbMATHGoogle Scholar
Copyright information
Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.