Abstract
This paper derives error bounds for regression in continuous time over subsets of certain types of Riemannian manifolds. The regression problem is typically driven by a nonlinear evolution law taking values on the manifold, and it is cast as one of optimal estimation in a reproducing kernel Hilbert space. A new notion of persistency of excitation (PE) is defined for the estimation problem over the manifold, and rates of convergence of the continuous-time estimates are derived using the PE condition. We discuss and analyze two approximation methods for the exact regression solution. We conclude the paper with numerical simulations that illustrate the qualitative character of the computed function estimates. Examples of function estimates generated over a trajectory of the Lorenz system and based on experimental motion capture data are included.
Data availability
The datasets generated during and/or analyzed during the current study are available from the corresponding author upon reasonable request.
References
Bai, S., Wang, J., Chen, F., Englot, B.: Information-theoretic exploration with Bayesian optimization. In: 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 1816–1822. IEEE (2016)
Berlinet, A., Thomas-Agnan, C.: Reproducing kernel Hilbert spaces in probability and statistics. Springer Science & Business Media (2011)
Burns, J., Estes, B., Guo, J., Kurdila, A.J., Liu, R., Paruchuri, S.T., Powell, N.: Approximation of koopman operators: domain exploration. Submitted to the 2022 CDC (2022)
Ciarlet, P.G.: Linear and nonlinear functional analysis with applications, vol. 130. SIAM (2013)
Cucker, F., Zhou, D.X.: Learning Theory: An Approximation Theory Viewpoint. Cambridge University Press (2007)
DeVore, R., Kerkyacharian, G., Picard, D., Temlyakov, V.: Mathematical methods for supervised learning. IMI Preprints 22, 1–51 (2004)
DeVore, R.A., Lorentz, G.G.: Constructive approximation, vol. 303. Springer (1993)
Engl, H.W., Hanke, M., Neubauer, A.: Regularization of inverse problems, vol. 375. Springer Science & Business Media (1996)
Farrell, J.A., Polycarpou, M.M.: Adaptive approximation based control: unifying neural, fuzzy and traditional adaptive approximation approaches, vol. 48. Wiley (2006)
Foster, D., Sarkar, T., Rakhlin, A.: Learning nonlinear dynamical systems from a single trajectory. Learning for Dynamics and Control, PMLR (2020)
Fukuchi, C.A., Fukuchi, R.K., Duarte, M.: A public dataset of overground and treadmill walking kinematics and kinetics in healthy individuals. PeerJ 6, e4640 (2018)
Fuselier, E., Wright, G.B.: Scattered data interpolation on embedded submanifolds with restricted positive definite kernels: Sobolev error estimates. SIAM J. Numer. Anal. 50(3), 1753–1776 (2012)
Gao, T., Kovalsky, S.Z., Daubechies, I.: Gaussian process landmarking on manifolds. SIAM J. Math. Data Sci. 1(1), 208–236 (2019)
Guo, J., Kepler, M.E., Paruchuri, S.T., Wang, H., Kurdila, A.J., Stilwell, D.J.: Strictly decentralized adaptive estimation of external fields using reproducing kernels. arXiv preprint arXiv:2103.12721 (2021)
Guo, J., Paruchuri, S.T., Kurdila, A.J.: Approximations of the reproducing kernel Hilbert space (RKHS) embedding method over manifolds. In: 2020 59th IEEE Conference on Decision and Control (CDC), pp. 1596–1601. IEEE (2020)
Guo, J., Paruchuri, S.T., Kurdila, A.J.: Persistence of excitation in uniformly embedded reproducing kernel Hilbert (RKH) spaces. In: 2020 American Control Conference (ACC), pp. 4539–4544. IEEE (2020)
Gyorfi, L., Kohler, M., Krzyzak, A., Walk, H.: A Distribution-Free Theory of Nonparametric Regression. Springer (2002)
Hangelbroek, T., Narcowich, F.J., Ward, J.D.: Polyharmonic and related kernels on manifolds: interpolation and approximation. Found. Comput. Math. 12(5), 625–670 (2012)
Hovakimyan, N., Cao, C.: \(\cal{L}_1\) Adaptive Control Theory: Guaranteed Robustness with Fast Adaptation. SIAM (2010)
Ioannou, P.A., Sun, J.: Robust adaptive control. Courier Corporation (2012)
Krstic, M., Kanellakopoulos, I., Kokotovic, P.: Nonlinear and Adaptive Control Design. Wiley (1995)
Kurdila, A.J., Guo, J., Paruchuri, S.T., Bobade, P.: Persistence of excitation in reproducing kernel Hilbert spaces, positive limit sets, and smooth manifolds. arXiv preprint arXiv:1909.12274 (2019)
Liu, G.H., Theodorou, E.A.: Deep learning theory review: An optimal control and dynamical systems perspective. arXiv preprint arXiv:1908.10920 (2019)
Narendra, K.S., Annaswamy, A.M.: Stable Adaptive Systems. Dover (1989)
Paruchuri, S.T., Guo, J., Kurdila, A.: Kernel center adaptation in the reproducing kernel Hilbert space embedding method. arXiv preprint arXiv:2009.02867 (2020)
Paruchuri, S.T., Guo, J., Kurdila, A.: Sufficient conditions for parameter convergence over embedded manifolds using kernel techniques. arXiv preprint arXiv:2009.02866 (2020)
Paruchuri, S.T., Guo, J., Kurdila, A.: Kernel center adaptation in the reproducing kernel Hilbert space embedding method. Int. J. Adapt. Control Signal Process. 36(7), 1562–1583 (2022)
Paruchuri, S.T., Guo, J., Kurdila, A.J.: Sufficient conditions for parameter convergence over embedded manifolds using kernel techniques. IEEE Trans. Autom. Control 68(2), 753–765 (2022)
Pietsch, A.: Approximation spaces. J. Approx. Theory 32(2), 115–134 (1981)
Powell, N., Kurdila, A.J.: Learning theory for estimation of animal motion submanifolds. In: 2020 59th IEEE Conference on Decision and Control (CDC), pp. 4941–4946. IEEE (2020)
Powell, N., Kurdila, A.J.: Distribution-free learning theory for approximating submanifolds from reptile motion capture data. Comput. Mech. 68(2), 337–356 (2021)
Powell, N., Liu, B., Kurdila, A.J.: Koopman methods for estimation of animal motions over unknown, regularly embedded submanifolds. arXiv preprint arXiv:2203.05646 (2022)
Li, Q., E, W.: Machine learning and dynamical systems. SIAM News, pp. 5–7 (2021)
Rosasco, L., Belkin, M., Vito, E.D.: On learning with integral operators. J. Mach. Learn. Res. 11, 905–934 (2010)
Sastry, S., Bodson, M.: Adaptive control: stability, convergence and robustness. Courier Corporation (2011)
Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
Temlyakov, V.: Multivariate approximation, vol. 32. Cambridge University Press (2018)
Vapnik, V.N.: An overview of statistical learning theory. IEEE Trans. Neural Netw. 10(5), 988–999 (1999)
Vito, E.D., Rosasco, L., Toigo, A.: Learning sets with separating kernels. Appl. Comput. Harmon. Anal., pp. 185–217 (2014)
Walker, J.: Dynamical Systems and Evolution Equations: Theory and Applications. Springer (2013)
Wendland, H.: Scattered data approximation, vol. 17. Cambridge University Press (2004)
Rasmussen, C.E., Williams, C.K.I.: Gaussian processes for machine learning. MIT Press (2006)
Wittwar, D.W., Santin, G., Haasdonk, B.: Interpolation with uncoupled separable matrix-valued kernels. arXiv preprint arXiv:1807.09111 (2018)
Funding
The authors declare that no funds, grants, or other support were received during the preparation of this manuscript.
Ethics declarations
Conflict of interest
The authors have no relevant financial or non-financial interests to disclose.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
1.1 Background on Galerkin approximations
Let U be a real Hilbert space, \(A\in {\mathcal {L}}(U)\) be a bounded linear operator on U, \(b\in U\) be a fixed element of U, and suppose we seek the \(u\in U\) that satisfies the operator equation
\( Au=b. \qquad (19) \)
It is customary to establish the existence and uniqueness of the solution of this equation by studying the associated bilinear form \(a(\cdot ,\cdot ):U\times U \rightarrow \mathbb {R}\) given by \(a(u,v):=(Au,v)\) for all \(u,v\in U\). The operator equation above is then equivalent to finding the \(u\in U\) for which
\( a(u,v)=(b,v) \quad \text {for all } v\in U. \)
The Lax–Milgram theorem given below stipulates a concise pair of conditions that ensure the well-posedness of the operator equation.
Theorem 3
(Lax–Milgram theorem [4]) The bilinear form \(a(\cdot ,\cdot ):U\times U \rightarrow \mathbb {R}\) is bounded if there is a constant \(C_1>0\) such that
\( |a(u,v)|\le C_1 \Vert u\Vert _U \Vert v\Vert _U \quad \text {for all } u,v\in U, \)
and it is coercive if there is a constant \(C_2>0\) such that
\( a(u,u)\ge C_2\Vert u\Vert _U^2 \quad \text {for all } u\in U. \)
If the bilinear form \(a(\cdot ,\cdot )\) is bounded and coercive, then \(A^{-1}\in {\mathcal {L}}(U)\) and there is a unique solution \(u\in U\) to Eq. (19).
Proof
The first condition above, continuity of the bilinear form, ensures that \(A\in {\mathcal {L}}(U)\) by definition. The coercivity condition implies that the nullspace of A is just \(\{0\}\). As a result, we know that A is one-to-one. This means that the operator \(A^{-1}:\text {range}(A)\rightarrow U\) is well defined. From the coercivity condition, we also conclude that
\( C_2\Vert A^{-1}b\Vert _U^2 \le a(A^{-1}b,A^{-1}b)=(b,A^{-1}b)\le \Vert b\Vert _U \Vert A^{-1}b\Vert _U, \quad \text {so that} \quad \Vert A^{-1}b\Vert _U \le \tfrac{1}{C_2}\Vert b\Vert _U \)
for every \(b\in \text {range}(A)\). This means that \(A^{-1}\in {\mathcal {L}}(\text {range}(A),U)\) and \(\Vert A^{-1}\Vert \le 1/C_2\).
One implication of the fact that \(A^{-1}\!\in \! {\mathcal {L}}(\text {range}(A),U)\) is that \(\text {range}(A)\) is closed. Suppose that \(\{b_k\}_{k\in \mathbb {N}}\subset \text {range}(A)\) and \(b_k\rightarrow {\bar{b}}\). By construction there is a sequence \(\{u_k\}_{k\in \mathbb {N}}\subset U\) such that \(Au_k=b_k\). But we have
\( C_2\Vert u_k-u_j\Vert _U^2 \le a(u_k-u_j,u_k-u_j)=(b_k-b_j,u_k-u_j)\le \Vert b_k-b_j\Vert _U \Vert u_k-u_j\Vert _U, \)
and \(\{u_k\}_{k\in \mathbb {N}}\) is a Cauchy sequence in the complete space U. There is a limit \(u_k\rightarrow {\bar{u}}\in U\). By the continuity of the operator A, we know that \(A{\bar{u}}={\bar{b}}\), hence \({\bar{b}}\in \text {range}(A)\). The range of A is consequently closed.
It only remains to show that \(\text {range}(A)=U\). Suppose to the contrary there is a \({\bar{b}} \ne 0\) with \({\bar{b}} \in (\overline{\text {range}(A)})^\perp \). Since \(\mathcal {N}(A^*) = (\overline{\text {range}(A)})^\perp \), we know that
\( A^*{\bar{b}}=0, \quad \text {and hence} \quad \langle A{\bar{b}},{\bar{b}}\rangle _U = \langle {\bar{b}},A^*{\bar{b}}\rangle _U = 0. \)
By the coercivity condition, we must have \(0=\langle A{\bar{b}},{\bar{b}}\rangle _U\ge C_2 \Vert {\bar{b}}\Vert _U^2 > 0\). But this is a contradiction, and \(\text {range}(A)\) is all of U. \(\square \)
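As a purely illustrative aside, and not part of the paper's development, the bound \(\Vert A^{-1}\Vert \le 1/C_2\) obtained in the proof can be checked numerically in a hypothetical finite-dimensional setting: take \(U=\mathbb {R}^6\) with the Euclidean inner product and \(A=S+K\), with S symmetric positive definite and K skew-symmetric, so that \(a(u,v):=(Au,v)\) is bounded and coercive with \(C_2=\lambda _{\min }(S)\). The Python/NumPy snippet below is a minimal sketch under these assumptions; the dimensions, matrices, and random seed are arbitrary.

import numpy as np

# Hypothetical finite-dimensional illustration of the Lax-Milgram bound.
# A = S + K with S symmetric positive definite and K skew-symmetric, so
# (Au, u) = (Su, u) >= lambda_min(S) * ||u||^2, i.e. a(.,.) is coercive.
rng = np.random.default_rng(0)
n = 6
B = rng.standard_normal((n, n))
S = B @ B.T + np.eye(n)            # symmetric positive definite part
K = rng.standard_normal((n, n))
K = 0.5 * (K - K.T)                # skew-symmetric part: (Ku, u) = 0
A = S + K

C2 = np.linalg.eigvalsh(S).min()                  # coercivity constant
inv_norm = np.linalg.norm(np.linalg.inv(A), 2)    # operator norm of A^{-1}
print(inv_norm <= 1.0 / C2 + 1e-12)               # Lax-Milgram: ||A^{-1}|| <= 1/C2, prints True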
Next, we discuss how error bounds are derived for Galerkin approximations \(u_N\) of the solution u of the operator equation above. Let \(U_N\subseteq U\) be a finite-dimensional subspace of U. By definition, the Galerkin approximation \(u_N\in U_N\) is the unique solution of the equation
\( a(u_N,v_N)=(b,v_N) \quad \text {for all } v_N\in U_N. \qquad (20) \)
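For concreteness, and again only as a hypothetical sketch rather than a construction taken from the paper, Eq. (20) reduces to a small linear system when \(U=\mathbb {R}^{n}\) carries the Euclidean inner product and \(U_N\) is spanned by the orthonormal columns of a matrix Phi; all sizes, operators, and names below are arbitrary choices.

import numpy as np

# Hypothetical Galerkin solve of Eq. (20) on U = R^100 with A symmetric
# positive definite, so a(u, v) = (Au, v) is bounded and coercive.
rng = np.random.default_rng(1)
n, N = 100, 10
B = rng.standard_normal((n, n))
A = B @ B.T + np.eye(n)                   # bounded, coercive operator on U
b = rng.standard_normal(n)
u = np.linalg.solve(A, b)                 # solution of the operator equation Au = b

Phi = np.linalg.qr(rng.standard_normal((n, N)))[0]   # orthonormal basis of U_N

A_N = Phi.T @ A @ Phi                     # Galerkin matrix with entries a(phi_j, phi_i)
b_N = Phi.T @ b                           # load vector with entries (b, phi_i)
u_N = Phi @ np.linalg.solve(A_N, b_N)     # Galerkin approximation, expressed in U

# a-orthogonality of the error: a(u - u_N, v_N) = 0 for every v_N in U_N.
print(np.max(np.abs(Phi.T @ (A @ (u - u_N)))))       # numerically zero

The vanishing residual in the last line is exactly the a-orthogonality property recorded in Theorem 4 below.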
The theorem below summarizes one of the well-known bounds on the error \(u-u_N\) between the Galerkin approximation \(u_N\in U_N\) and the true solution \(u\in U\).
Theorem 4
(Cea’s Lemma, [4]) Suppose that the hypotheses of the Lax–Milgram Theorem 3 hold. There is a unique solution \(u_N\in U_N\) of the Galerkin equation (20). The error \(u-u_N\) is a-orthogonal to the subspace \(U_N\) in the sense that
\( a(u-u_N,v_N)=0 \quad \text {for all } v_N\in U_N. \)
We also have the error bound
\( \Vert u-u_N\Vert _U \le \frac{C_1}{C_2}\Vert u-\varPi _N u\Vert _U, \)
where \(\varPi _N\) is the U-orthogonal projection of U onto \(U_N\).
Proof
First, note that when \(a(\cdot ,\cdot )\) satisfies the boundedness and coercivity conditions, its restriction \(a:U_N\times U_N \rightarrow \mathbb {R}\) to \(U_N\) satisfies the boundedness and coercivity conditions with the same constants relative to \(U_N\). This means that the Galerkin equations have a unique solution \(u_N\in U_N\). Since the variational form of Eq. (19), \(a(u,v)=(b,v)\), holds for all \(v\in U\), it holds in particular for all \(v_N\in U_N\). Subtracting Eq. (20) from it, we obtain
\( a(u-u_N,v_N)=0 \)
for each \(v_N\in U_N\). Using the boundedness and coercivity of the bilinear form, as well as the \(a\)-orthogonality of the error, we have
\( C_2\Vert u-u_N\Vert _U^2 \le a(u-u_N,u-u_N)=a(u-u_N,u-v_N)\le C_1\Vert u-u_N\Vert _U \Vert u-v_N\Vert _U \)
for any \(v_N\in U_N\). Canceling the common factor \(\Vert u-u_N\Vert _U\) on the left and right and choosing \(v_N=\varPi _N u\) yields the stated bound. \(\square \)
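Continuing the same hypothetical finite-dimensional sketch introduced after Eq. (20) (symmetric positive definite A, so one may take \(C_1=\Vert A\Vert \) and \(C_2=\lambda _{\min }(A)\)), Cea's bound can be verified directly; the snippet is illustrative only and repeats the earlier setup so that it runs on its own.

import numpy as np

# Hypothetical numerical check of Cea's lemma: ||u - u_N|| <= (C1/C2) ||u - Pi_N u||.
rng = np.random.default_rng(1)
n, N = 100, 10
B = rng.standard_normal((n, n))
A = B @ B.T + np.eye(n)                   # symmetric positive definite, hence coercive
b = rng.standard_normal(n)
u = np.linalg.solve(A, b)
Phi = np.linalg.qr(rng.standard_normal((n, N)))[0]       # orthonormal basis of U_N
u_N = Phi @ np.linalg.solve(Phi.T @ A @ Phi, Phi.T @ b)  # Galerkin approximation

C1 = np.linalg.norm(A, 2)                      # boundedness constant of a(.,.)
C2 = np.linalg.eigvalsh(A).min()               # coercivity constant of a(.,.)
best = np.linalg.norm(u - Phi @ (Phi.T @ u))   # best-approximation error ||u - Pi_N u||
err = np.linalg.norm(u - u_N)                  # Galerkin error
print(err <= (C1 / C2) * best + 1e-12)         # Cea's bound holds, prints True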
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Burns, J., Estes, B., Guo, J. et al. Kernel methods for regression in continuous time over subsets and manifolds. Nonlinear Dyn 111, 13165–13186 (2023). https://doi.org/10.1007/s11071-023-08567-8