Abstract
A correction regarding [Latz 2021, Stat. Comput. 31, 39].
In Latz (2021), the space of continuous paths \(C^0([0, \infty ); X)\) needs to be equipped with the metric of locally uniform convergence,
\[
\rho (x, x') := \sum _{k=1}^{\infty } 2^{-k} \min \Big \{1, \sup _{t \in [0,k]}\Vert x(t) - x'(t)\Vert \Big \},
\]
rather than the supremum norm \(\Vert x\Vert _\infty = \sup _{t \in [0,\infty )}\Vert x(t)\Vert \); see Kushner (1984). This has the following implications:
(1) The convergence results in Theorem 1 and Lemma 2 are now weaker, as the metric \(\rho \) is weaker than the metric induced by \(\Vert \cdot \Vert _\infty \).
(2) Proposition 4(i) does not hold uniformly in time. The function F in the proof of this proposition can be chosen as \(F((\xi (t), \tau (t), {\varvec{j}}(t))_{t \ge 0},t):= \min \{1, \Vert \xi (t)\Vert \}\,2^{-(t+1)}\); it now depends on t. This F is bounded, Lipschitz continuous in \((\xi (t))_{t \ge 0}\) with respect to \(\rho \), continuous in \(t>0\), and constant with respect to the other inputs. To show Lipschitz continuity, we observe
\[
\big |F\big ((\xi (t), \tau (t), {\varvec{j}}(t))_{t \ge 0},t\big ) - F\big ((\xi '(t), \tau '(t), {\varvec{j}}'(t))_{t \ge 0},t\big )\big | = \big |\min \{1, \Vert \xi (t)\Vert \} - \min \{1, \Vert \xi '(t)\Vert \}\big |\, 2^{-(t+1)} \le 2^{-\lceil t \rceil } \min \big \{1, \Vert \xi (t) - \xi '(t)\Vert \big \} \le \rho \big ((\xi (t))_{t \ge 0}, (\xi '(t))_{t \ge 0}\big )
\]
for \(t > 0\), \((\xi (t))_{t \ge 0}, (\xi '(t))_{t \ge 0}, (\tau (t))_{t \ge 0}, (\tau '(t))_{t \ge 0} \in C^0([0, \infty ); X)\) and \(({\varvec{j}}(t))_{t \ge 0}, ({\varvec{j}}'(t))_{t \ge 0}\) being Markov jump processes on I. Thus, the weak convergence shown in Lemma 2 now implies the convergence of the expectations of these time-dependent functionals, and the inequality in Proposition 4(i) now holds with a rate \(\alpha '(\varepsilon , t)\) that depends on t, where \(\alpha '(0,t) = 0\) and \(\varepsilon \mapsto \alpha '(\varepsilon , t)\) is continuous at 0 for any \(t>0\).
We also note that this \(\alpha '\) can depend on the initial values \(\xi _0\) and \(j_0\), and that the definition of \(t_0\) in this proposition is not necessary.
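The Lipschitz bound for F can be checked numerically on discretized paths. The following sketch is illustrative only and not part of the original paper: it assumes the standard product-type path metric \(\rho (x, x') = \sum _{k \ge 1} 2^{-k} \min \{1, \sup _{t \in [0,k]} \Vert x(t) - x'(t)\Vert \}\) (as in Kushner 1984), real-valued paths sampled on a uniform grid, and a truncation of the infinite sum; the helper names `rho` and `F` are chosen here for the sketch.

```python
import numpy as np

def rho(x, xp, grid, K=20):
    """Truncated path metric: sum_{k=1}^{K} 2^{-k} * min(1, sup_{[0,k]} |x - x'|),
    evaluated on paths sampled at the points in `grid`."""
    d = np.abs(x - xp)
    return sum(2.0 ** (-k) * min(1.0, d[grid <= k].max()) for k in range(1, K + 1))

def F(xi, t, grid):
    """F((xi(s))_{s>=0}, t) = min(1, |xi(t)|) * 2^{-(t+1)} for a sampled path."""
    i = np.argmin(np.abs(grid - t))  # nearest grid point to t
    return min(1.0, abs(xi[i])) * 2.0 ** (-(t + 1))

rng = np.random.default_rng(0)
grid = np.linspace(0.0, 20.0, 2001)

# Two paths that agree on [0, 10] and differ afterwards: their sup-norm
# distance is large, but rho weights the late difference by the 2^{-k} tail.
x = np.sin(grid)
xp = x + np.where(grid > 10.0, 5.0, 0.0)
print(rho(x, xp, grid))      # small (about 1e-3): the paths differ only late
print(np.abs(x - xp).max())  # about 5: sup-norm distance is large

# Verify |F(xi, t) - F(xi', t)| <= rho(xi, xi') on random paths and times,
# matching the displayed inequality (Lipschitz constant 1 w.r.t. rho).
for _ in range(100):
    xi = np.cumsum(rng.normal(scale=0.05, size=grid.size))
    xip = np.cumsum(rng.normal(scale=0.05, size=grid.size))
    t = rng.uniform(0.1, 19.9)
    assert abs(F(xi, t, grid) - F(xip, t, grid)) <= rho(xi, xip, grid) + 1e-12
print("Lipschitz bound verified on all samples")
```

The first two printed values illustrate implication (1): paths can be close in \(\rho \) while far apart in \(\Vert \cdot \Vert _\infty \), so convergence in \(\rho \) is the weaker statement.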
(3) Theorem 4 is still correct as stated, with a refined argument in the proof. In the last inequality of the proof of Theorem 4, we can choose \(\varepsilon _t\) to depend on t. The continuity of \(\alpha '\) and \(\alpha ''\) implies that for any \(\delta > 0\) and any \(t > 0\), we can find \(\varepsilon _t > 0\) small enough such that \(\alpha '(\varepsilon _t,t) + \alpha ''(\varepsilon _t) \le \delta \), which is sufficient to prove convergence as stated in Theorem 4.
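The selection of \(\varepsilon _t\) in this refined argument can be sketched with toy rate functions. The functions `alpha1` and `alpha2` below are hypothetical stand-ins for \(\alpha '\) and \(\alpha ''\), chosen only so that they are continuous in \(\varepsilon \) and vanish at \(\varepsilon = 0\) for each fixed t, as required; they are not the rates of the paper.

```python
import math

# Hypothetical rate functions for illustration only: continuous in eps with
# alpha1(0, t) = 0 for each fixed t, and alpha2(0) = 0.
def alpha1(eps, t):   # stand-in for alpha'(eps, t); deteriorates as t grows
    return eps * (1.0 + t)

def alpha2(eps):      # stand-in for alpha''(eps)
    return math.sqrt(eps)

def find_eps(t, delta):
    """Halve eps until alpha1(eps, t) + alpha2(eps) <= delta.
    Termination is guaranteed because both rates vanish continuously
    at eps = 0 for the fixed t."""
    eps = 1.0
    while alpha1(eps, t) + alpha2(eps) > delta:
        eps /= 2.0
    return eps

delta = 0.1
for t in [0.5, 10.0, 1000.0]:
    eps_t = find_eps(t, delta)
    assert alpha1(eps_t, t) + alpha2(eps_t) <= delta
    print(t, eps_t)
```

The printed values show \(\varepsilon _t\) shrinking as t grows, which is exactly why the bound is no longer uniform in time, yet a valid \(\varepsilon _t\) exists for every fixed t and every \(\delta > 0\).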
Data availability
No datasets were generated or analysed during the current study.
References
Kushner, H.J.: Approximation and Weak Convergence Methods for Random Processes with Applications to Stochastic Systems Theory. MIT Press (1984)
Latz, J.: Analysis of stochastic gradient descent in continuous time. Stat. Comput. 31, 39 (2021)
Acknowledgements
JL thanks Chenguang Liu for helpful discussions supporting these corrections.
Ethics declarations
Conflict of interest
The author declares no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Latz, J. Correction to: analysis of stochastic gradient descent in continuous time. Stat Comput 34, 146 (2024). https://doi.org/10.1007/s11222-024-10450-4