
Simultaneous perturbation stochastic approximation: towards one-measurement per iteration

  • Original Paper
  • Published in: Numerical Algorithms

Abstract

When evaluating the function to be minimized is not only expensive but also subject to noise, the popular simultaneous perturbation stochastic approximation (SPSA) algorithm is attractive because it requires only two function measurements per iteration. In this paper, we present a method that requires only one function measurement per iteration in an average sense. We prove the strong convergence and asymptotic normality of the new algorithm. Limited experimental results demonstrate the effectiveness and potential of our algorithm for solving low-dimensional problems.
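For readers unfamiliar with the baseline, the classical two-measurement SPSA iteration of Spall [14] can be sketched as follows. This is an illustrative sketch only: the function name `spsa_two_measurement`, the gain constants, and the noisy quadratic test problem are our own choices, not the algorithm or tuning proposed in this paper.

```python
import math
import random

def spsa_two_measurement(f, x0, iters=2000, a=0.1, c=0.1,
                         alpha=0.602, gamma=0.101, seed=0):
    """Classical two-measurement SPSA (Spall, 1992) on a noisy oracle f."""
    rng = random.Random(seed)
    x = list(map(float, x0))
    n = len(x)
    for k in range(1, iters + 1):
        a_k = a / k**alpha   # step-size sequence
        c_k = c / k**gamma   # perturbation-size sequence
        # symmetric Bernoulli (+/-1) simultaneous perturbation
        xi = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        x_plus = [x_i + c_k * xi_i for x_i, xi_i in zip(x, xi)]
        x_minus = [x_i - c_k * xi_i for x_i, xi_i in zip(x, xi)]
        # one gradient estimate from exactly two function measurements
        diff = (f(x_plus) - f(x_minus)) / (2.0 * c_k)
        x = [x_i - a_k * diff / xi_i for x_i, xi_i in zip(x, xi)]
    return x

# Noisy quadratic test problem with minimizer at the origin (our choice).
rng_noise = random.Random(1)
def noisy_quadratic(x):
    return sum(t * t for t in x) + 0.01 * rng_noise.gauss(0.0, 1.0)

x_star = spsa_two_measurement(noisy_quadratic, [2.0, -1.5, 0.5])
dist = math.sqrt(sum(t * t for t in x_star))
```

The exponents 0.602 and 0.101 are the practically effective choices recommended in [13]; any choice satisfying the standard gain conditions also works asymptotically.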


Data Availability

Data sharing not applicable to this article as no datasets were generated or analyzed during the current study.

References

  1. Abdulsadda, A.T., Iqbal, K.: An improved algorithm for system identification using fuzzy rules for training neural networks. Int. J. Autom. Comput. 8(3), 333–339 (2011)

  2. Altaf, M.U., Heemink, A.W., Verlaan, M., Hoteit, I.: Simultaneous perturbation stochastic approximation for tidal models. Ocean Dyn. 61(8), 1093–1105 (2011)

  3. Bartkutė, V., Sakalauskas, L.: Simultaneous perturbation stochastic approximation of nonsmooth functions. Eur. J. Oper. Res. 181(3), 397–409 (2007)

  4. Bartkutė, V., Sakalauskas, L.: Statistical inferences for termination of Markov type random search algorithms. J. Optim. Theory Appl. 141(3), 475–493 (2009)

  5. Doob, J.L.: Stochastic processes. John Wiley and Sons, New York (1953)

  6. Fabian, V.: On asymptotic normality in stochastic approximation. Ann. Math. Stat. 39(4), 1327–1332 (1968)

  7. Kiefer, J., Wolfowitz, J.: Stochastic estimation of the maximum of a regression function. Ann. Math. Stat. 23(3), 462–466 (1952)

  8. Kushner, H.J., Clark, D.S.: Stochastic approximation methods for constrained and unconstrained systems. Springer, New York (1978)

  9. Moré, J.J., Garbow, B.S., Hillstrom, K.E.: Testing unconstrained optimization software. ACM Trans. Math. Softw. 7(1), 17–41 (1981)

  10. Spall, J.C.: Accelerated second-order stochastic optimization using only function measurements. In: Proceedings of the 36th IEEE Conference on Decision and Control, vol. 2, pp. 1417–1424 (1997)

  11. Robbins, H., Monro, S.: A stochastic approximation method. Ann. Math. Stat. 22(3), 400–407 (1951)

  12. Spall, J.C.: A one-measurement form of simultaneous perturbation stochastic approximation. Automatica 33(1), 109–112 (1997)

  13. Spall, J.C.: Implementation of the simultaneous perturbation algorithm for stochastic optimization. IEEE Trans. Aerosp. Electron. Syst. 34(3), 817–823 (1998)

  14. Spall, J.C.: Multivariate stochastic approximation using a simultaneous perturbation gradient approximation. IEEE Trans. Autom. Control 37(3), 332–341 (1992)

  15. Sadegh, P., Spall, J.C.: Optimal random perturbations for stochastic approximation using a simultaneous perturbation gradient approximation. IEEE Trans. Autom. Control 43(10), 1480–1484 (1998)

  16. Spall, J.C.: Stochastic version of second-order (Newton-Raphson) optimization using only function measurements. In: Proceedings of the 1995 Winter Simulation Conference, pp. 347–352 (1995)

  17. Xu, Z., Dai, Y.H.: A stochastic approximation frame algorithm with adaptive directions. Numer. Math. Theory Methods Appl. 1(4), 460–474 (2008)

  18. Zhu, X., Spall, J.C.: A modified second-order SPSA optimization algorithm for finite samples. Int. J. Adapt. Control Sig. Process. 16(5), 397–409 (2002)

Funding

This research is supported by the Beijing Natural Science Foundation (grant Z180005), by the National Natural Science Foundation of China (grants 12171021, 12071279, and 11822103), and by the General Project of the Shanghai Natural Science Foundation (grant 20ZR1420600).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Zi Xu.

Ethics declarations

Conflict of Interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Proof of Lemma 1

Proof

We start from the observation

$$\begin{aligned} \mathbb E\left[ \hat{\xi }_k\big \vert \hat{x}_k\right] = \mathbb E \left[ \mathbb E\left[ \hat{\xi }_k\big \vert \hat{x}_k,\hat{g}_k\right] \Big \vert \hat{x}_k\right] . \end{aligned}$$

Notice that each component \(\hat{\xi }_{ki}~\left( i=1,2,\cdots ,n\right) \) of \(\hat{\xi }_k\) obeys the symmetric Bernoulli distribution and satisfies \({\hat{\xi }_k}^T \hat{g}_k\ge 0\). Therefore, at least half of the components of \(\hat{\xi }_k\) have the same sign as the components \(\hat{g}_{ki}~\left( i=1,2,\cdots ,n\right) \) of \(\hat{g}_k\). If n is even, \(\hat{\xi }_k\) has \(C_{n}^{ n/2}+C_{n}^{ n/2 +1}+\cdots +C_{n}^{n}=2^{n-1}+ C_{n}^{n/2}/2\) choices. So for any possible choice \(\zeta _j\), we have

$$\begin{aligned} P\left( \hat{\xi }_k=\zeta _j\right) =\frac{1}{2^{n-1}+ C_{n}^{ n/2}/2}. \end{aligned}$$
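The count \(2^{n-1}+C_{n}^{n/2}/2\) in the even case (and the analogous odd-case count below) can be verified by brute-force enumeration. The following sketch is our own illustration, not part of the proof; an arbitrary fixed sign pattern stands in for \(\text{sgn}(\hat{g}_k)\), matching the sign-agreement reading used in the text.

```python
from itertools import product
from math import comb

def n_admissible(signs):
    """Count +/-1 vectors xi satisfying xi^T sgn(g) >= 0 for a sign pattern."""
    n = len(signs)
    return sum(1 for xi in product((-1, 1), repeat=n)
               if sum(x * s for x, s in zip(xi, signs)) >= 0)

checks = []
for n in (2, 3, 4, 5, 6):
    signs = [(-1) ** i for i in range(n)]  # any fixed sign pattern works
    if n % 2 == 0:
        # even n: 2^{n-1} + C(n, n/2)/2 admissible choices
        predicted = 2 ** (n - 1) + comb(n, n // 2) // 2
    else:
        # odd n: 2^{n-1} admissible choices
        predicted = 2 ** (n - 1)
    checks.append(n_admissible(signs) == predicted)
```

For instance, \(n=4\) gives \(2^{3}+C_{4}^{2}/2=11\) admissible perturbations out of \(16\).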

For a fixed \(i\in \{1,2,\cdots ,n\}\), if the signs of \(\hat{\xi }_{ki}\) and \(\hat{g}_{ki}\) are the same, then at least \( n/2-1\) of the remaining \(n-1\) elements of \(\hat{\xi }_k\) share the signs of the corresponding components of \(\hat{g}_k\). In this case, \(\hat{\xi }_{k}\) has \(C_{n-1}^{\left( n-2\right) /2}+C_{n-1}^{ n/2}+\cdots +C_{n-1}^{n-1}=2^{n-2}+C_{n-1}^{\left( n-2\right) /2}\) choices. Then we can write

$$\begin{aligned} \Sigma \zeta _{j}= & {} \left\{ 2^{n-2}+C_{n-1}^{\left( n-2\right) /2}-\left[ \left( 2^{n-1}+ C_{n}^{n/2}/2\right) -\left( 2^{n-2}+C_{n-1}^{\left( n-2\right) /2}\right) \right] \right\} \text {sgn}\left( \hat{g}_{k}\right) \\= & {} \left( 2C_{n-1}^{\left( n-2\right) /2}- C_{n}^{n/2}/2\right) \text {sgn}\left( \hat{g}_{k}\right) \\= & {} C_{n-1}^{n/2}\text {sgn}\left( \hat{g}_{k}\right) . \end{aligned}$$

If n is odd, \(\{\zeta _j\}\) has \(C_{n}^{\left( n+1\right) /2}+C_{n}^{\left( n+1\right) /2+1}+\cdots +C_{n}^{n}=2^{n-1}\) choices, each with probability \(P\left( \hat{\xi }_k=\zeta _j\right) = 1/2^{n-1}\). The sign of \(\hat{\xi }_{ki}\) is either the same as or opposite to that of \(\hat{g}_{ki}\). If their signs are the same, then at least \(\left( n-1\right) /2\) of the remaining \(n-1\) elements of \(\hat{\xi }_k\) share the signs of the corresponding components of \(\hat{g}_k\). In this case, \(\hat{\xi }_k\) has \(C_{n-1}^{\left( n-1\right) /2}+C_{n-1}^{\left( n+1\right) /2}+\cdots +C_{n-1}^{n-1}=2^{n-2}+ C_{n-1}^{\left( n-1\right) /2}/2\) choices. Then we can write

$$\begin{aligned} \Sigma \zeta _{j}= & {} \left\{ 2^{n-2}+ C_{n-1}^{\left( n-1\right) /2}/2-\left[ 2^{n-1}-\left( 2^{n-2}+ C_{n-1}^{\left( n-1\right) /2}/2\right) \right] \right\} \text {sgn} \left( \hat{g}_{k}\right) \\= & {} C_{n-1}^{\left( n-1\right) /2}\text {sgn}\left( \hat{g}_{k}\right) . \end{aligned}$$

It then holds that

$$\begin{aligned} \Sigma \zeta _{j}P\left( \hat{\xi }_k=\zeta _j\right) = \left\{ \begin{array}{cc} \frac{C_{n-1}^{n/2}}{2^{n-1}+ C_{n}^{n/2}/2}\textrm{sgn}\left( \hat{g}_{k}\right) ,~ &{}\textrm{if}~n~\text {is even}, \\ \frac{C_{n-1}^{\left( n-1\right) /2}}{2^{n-1}}\text {sgn}\left( \hat{g}_{k}\right) ,~ &{}\textrm{otherwise}.\\ \end{array}\right. \end{aligned}$$
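The closed form above can likewise be checked by enumerating the admissible perturbations and averaging componentwise; exact rational arithmetic removes any floating-point doubt. This is again our own verification sketch, not part of the paper.

```python
from fractions import Fraction
from itertools import product
from math import comb

def mean_admissible(signs):
    """Componentwise mean of xi over {xi in {-1,1}^n : xi^T sgn(g) >= 0}."""
    n = len(signs)
    admissible = [xi for xi in product((-1, 1), repeat=n)
                  if sum(x * s for x, s in zip(xi, signs)) >= 0]
    m = len(admissible)
    return [Fraction(sum(xi[i] for xi in admissible), m) for i in range(n)]

ok = []
for n in (2, 3, 4, 5):
    signs = [1 if i % 3 else -1 for i in range(n)]  # arbitrary sign pattern
    if n % 2 == 0:
        # even n: C(n-1, n/2) / (2^{n-1} + C(n, n/2)/2)
        rho = Fraction(comb(n - 1, n // 2),
                       2 ** (n - 1) + comb(n, n // 2) // 2)
    else:
        # odd n: C(n-1, (n-1)/2) / 2^{n-1}
        rho = Fraction(comb(n - 1, (n - 1) // 2), 2 ** (n - 1))
    ok.append(mean_admissible(signs) == [rho * s for s in signs])
```

For example, \(n=2\) yields the mean \(\frac{1}{3}\,\text{sgn}(\hat{g}_k)\), matching \(C_{1}^{1}/(2^{1}+C_{2}^{1}/2)=1/3\).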

For \(\rho \) defined in Step 1 of Algorithm SPSA1-A, we have

$$\begin{aligned} \mathbb E \left[ \mathbb E\left[ \hat{\xi }_k\big \vert \hat{x}_k,\hat{g}_k\right] \Big \vert \hat{x}_k\right] = \mathbb E \left[ \rho \,\text {sgn}\left( \hat{g}_k\right) \big \vert \hat{x}_k \right] =\rho \,\mathbb E\left[ \text {sgn}\left( \hat{g}_k\right) \big \vert \hat{x}_k\right] . \end{aligned}$$

Letting \(\rho _k=\rho /\Vert \hat{g}_k\Vert _{\infty }\), we complete the proof:

$$\begin{aligned} \mathbb E\left[ \hat{\xi }_k\big \vert \hat{x}_k\right] =\mathbb E\left[ \rho _k\hat{g}_k\vert \hat{x}_k\right] . \end{aligned}$$

\(\square \)

Appendix B: Proof of Lemma 2

Proof

By a proof similar to that of Lemma 1, we have

$$\begin{aligned} \mathbb E\left[ \frac{\hat{\xi }_k}{1+\rho _k}\bigg \vert \hat{x}_k\right] =\mathbb E\left[ \frac{\rho _k}{1+\rho _k}\hat{g}_k\bigg \vert \hat{x}_k\right] . \end{aligned}$$

Then, we can obtain

$$\begin{aligned} b_k\left( \hat{x}_k\right)= & {} {\mathbb E\left[ \hat{g}_k\left( \hat{x}_k\right) - g\left( \hat{x}_k\right) \vert \hat{x}_k\right] }. \end{aligned}$$

In the sequel, the lemma can be proved in a manner similar to that in [14]. \(\square \)

Appendix C: Proof of Proposition 1

Proof

According to [8], based on Lemma 2 and Assumption 1, we have

i)

$$\begin{aligned} \Vert b_k\left( \hat{x}_k\right) \Vert <\infty ~\forall ~k~\textrm{and}~b_k\left( \hat{x}_k\right) \rightarrow 0,~a.s. \end{aligned}$$

Next, we prove that

ii)

$$\begin{aligned} \lim \limits _{k\rightarrow \infty }P\left( \sup \limits _{m\ge k}\left\| \sum \limits _{i=k}^{m}a_ie_i\left( \hat{x}_i\right) \right\| \ge \eta \right) =0,~ \mathrm {for \ any}~\eta >0. \end{aligned}$$

First, for any \(l\in \{1,2,\cdots ,n\}\), we have

$$\begin{aligned} \mathbb E e_{il}^{2}\le & {} Var\left( \frac{\hat{g}_{il}+\hat{\xi }_{il}}{1+\rho _i}\right) \\\le & {} \mathbb E\left[ \frac{\hat{g}_{il}+\hat{\xi }_{il}}{1+\rho _i}\right] ^2\\= & {} \mathbb E\frac{\hat{g}_{il}^2}{\left( 1+\rho _i\right) ^2}+2\mathbb E\frac{\hat{g}_{il}\hat{\xi }_{il}}{\left( 1+\rho _i\right) ^2} +\mathbb E\frac{\hat{\xi }_{il}^2}{\left( 1+\rho _i\right) ^2}. \end{aligned}$$

Then we obtain

$$\begin{aligned} \mathbb E\left[ \hat{g}_{il}\hat{\xi }_{il}\right]\le & {} \mathbb E\left[ \left| \hat{g}_{il}\right| \cdot \left| \hat{\xi }_{il}\right| \right] \\\le & {} \mathbb E \left| \hat{g}_{il}\right| \\\le & {} \frac{1}{2c_i}\mathbb E\left[ \left| \xi _{il}^{-1}\right| \cdot \left( \left| L\left( \hat{x}_i+c_i\xi _i\right) \right| +\left| L\left( \hat{x}_i-c_i\xi _i\right) \right| +\left| \varepsilon _i^+\right| +\left| \varepsilon _i^-\right| \right) \right] \\\le & {} \left( \sqrt{\alpha _0}+\sqrt{\alpha _1}\right) c_i^{-1}. \end{aligned}$$

It follows from [14] that

$$\begin{aligned} \mathbb E\hat{g}_{il}^2\le 2\left( \alpha _1+\alpha _0\right) c_i^{-2}. \end{aligned}$$

Since \(\frac{1}{\left( 1+\rho _i\right) ^2}=\frac{1}{\left( 1+\rho /\vert \hat{g}_{il}\vert \right) ^2}\le 1\) and \(\mathbb E\hat{\xi }_{il}^2= 1\), we have

$$\begin{aligned} \mathbb E e_{il}^2\le & {} \mathbb E\hat{g}_{il}^2+2\mathbb E\hat{g}_{il}\hat{\xi }_{il}+\mathbb E\hat{\xi }_{il}^2 \le 2\left( \alpha _1+\alpha _0\right) c_i^{-2}+2\left( \sqrt{\alpha _0}+\sqrt{\alpha _1}\right) c_i^{-1}+1. \end{aligned}$$

Therefore, it holds that

$$\begin{aligned} \mathbb E\Vert e_k\Vert ^2\le p\left[ 2\left( \alpha _1+\alpha _0\right) c_k^{-2}+2\left( \sqrt{\alpha _0}+\sqrt{\alpha _1}\right) c_k^{-1}+1\right] . \end{aligned}$$

Next, since \(\{\sum _{i=k}^{m}a_ie_i\}_{m\ge k}\) is a martingale sequence, it follows from the inequality in [5, p. 315] (see also [8, p. 27]) that

$$\begin{aligned} P\left( \sup \limits _{m\ge k}\Vert \sum \limits _{i=k}^{m}a_ie_i\Vert \ge \eta \right)\le & {} \eta ^{-2}\mathbb E\Vert \sum \limits _{i=k}^{\infty }a_ie_i\Vert ^2 = \eta ^{-2}\sum \limits _{i=k}^{\infty }a_{i}^{2}\mathbb E\Vert e_i\Vert ^2, \end{aligned}$$
(C1)

where the equality holds as \(\mathbb E\left[ e_{i}^Te_j\right] =\mathbb E\left[ e_{i}^T\mathbb E\left[ e_j\vert \hat{x}_j\right] \right] =0,~\forall ~i<j\).
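The martingale maximal inequality behind (C1) can be illustrated numerically. In the following sketch the weights \(a_i=1/i\), the Gaussian noise, and all constants are illustrative choices of ours, not quantities from the paper.

```python
import random

# Martingale S_m = sum_{i=k}^m a_i e_i with a_i = 1/i and e_i i.i.d. N(0,1).
# The L2 maximal inequality gives
#   P(sup_m |S_m| >= eta) <= E[S_M^2] / eta^2 = (sum_i a_i^2) / eta^2.
rng = random.Random(42)
k, M, eta, trials = 1, 400, 2.0, 2000
weights = [1.0 / i for i in range(k, M + 1)]
bound = sum(w * w for w in weights) / eta**2

exceed = 0
for _ in range(trials):
    s, sup_abs = 0.0, 0.0
    for w in weights:
        s += w * rng.gauss(0.0, 1.0)      # one martingale increment
        sup_abs = max(sup_abs, abs(s))     # track the running supremum
    if sup_abs >= eta:
        exceed += 1
empirical = exceed / trials  # Monte Carlo estimate of the tail probability
```

With square-summable weights the bound stays finite even as \(M\) grows, which is exactly why Assumption 1 makes the right-hand side of (C1) vanish as \(k\rightarrow \infty \).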

Then, by (C1) and Assumption 1, we complete the proof of ii). \(\square \)

Appendix D: Proof of Proposition 2

Proof

In order to complete the proof, we need to verify conditions (2.2.1), (2.2.2), and (2.2.3) in Fabian [6]. Here we assume that all assumptions on \(\theta _k\) or \(\mathscr {F}_k\) hold. Following the notation in [6], we have

$$\begin{aligned} \hat{x}_{k+1}- x^*=\left( I-k^{-\alpha }\Gamma _k\right) \left( \hat{x}_k-x^*\right) +k^{-\left( \alpha +\beta \right) /2}\Phi _k V_k+k^{-\alpha -\beta /2}T_k, \end{aligned}$$

where \(\Gamma _k=aH\left( \overline{x}_k\right) \), \(V_{k}=k^{-{\gamma }}\left\{ \frac{\hat{g}_k (\hat{x}_k)+\hat{\xi }_k}{1+\rho _k}-{\mathbb E}\left[ \frac{\hat{g}_k(\hat{x}_k)+\hat{\xi }_k}{1+\rho _k}\bigg \vert \hat{x}_k\right] \right\} \), \(\Phi _k=-aI\), and \(T_k=-ak^{\beta /2}b_k\left( \hat{x}_k\right) \). In fact, there is an open neighborhood of \( x^*\) containing \(\hat{x}_k\) (for k sufficiently large) in which \(H\left( \cdot \right) \) is continuous. Then

$$\begin{aligned} \mathbb E\left[ \frac{\hat{g}_k\left( \hat{x}_k\right) +\hat{\xi }_k}{1+\rho _k}\bigg \vert \hat{x}_k\right]= & {} H\left( \overline{x}_k\right) \left( \hat{x}_k-x^*\right) +b_k\left( \hat{x}_k\right) , \end{aligned}$$

where \(\overline{x}_k\) lies on the line segment between \(\hat{x}_k\) and \(x^*\).

Based on the continuity of \(H\left( \cdot \right) \) and a.s. convergence of \(\hat{x}_k\), we have \(\Gamma _k=aH\left( \overline{x}_k\right) \rightarrow aH\left( x^*\right) \) a.s.

Now we prove the convergence of \(T_k\) for \(3\gamma -\alpha /2\ge 0\). When \(3\gamma -\alpha /2>0\), as \(b_k\left( \hat{x}_k\right) =O\left( k^{-2\gamma }\right) \) a.s., we have \(T_k\rightarrow 0\) a.s. When \(3\gamma -\alpha /2=0\), by the facts that \(\hat{x}_k\rightarrow x^*\) a.s. and the uniform boundedness of \(L^{\left( 3\right) }\) near \(x^*\), we have

$$\begin{aligned} k^{2\gamma }b_{kl}\left( \hat{x}_k\right) -\frac{1}{6}c^2L^{\left( 3\right) }\left( x^*\right) \mathbb E\left( \xi _k\otimes \xi _k\otimes \xi _k\right) \rightarrow 0~a.s. \end{aligned}$$

Since the \({\xi _{ki}}\) are i.i.d. and symmetrically distributed for each k, the l-th element of \(T_k\) satisfies

$$\begin{aligned} T_{kl}\rightarrow -\frac{1}{6}ac^2\left[ L_{lll}^{\left( 3\right) }\left( x^*\right) +\sum \limits _{\begin{array}{c} i=1\\ i\ne l \end{array}}^{p}\left( L^{\left( 3\right) }_{lii}\left( x^*\right) +L_{ili}^{\left( 3\right) }\left( x^*\right) +L_{iil}^{\left( 3\right) }\left( x^*\right) \right) \right] ~a.s. \end{aligned}$$

Therefore, \(T_k\) converges for \(3\gamma -\alpha /2\ge 0\).

We can write

$$\begin{aligned} {\mathbb E}\left[ V_kV_k^T\bigg \vert \mathscr {F}_k\right]= & {} k^{-2\gamma }\left\{ {\mathbb E}\left[ \frac{1}{(1+\rho _k)^2}(\hat{g}_k+\hat{\xi }_k)(\hat{g}_k+\hat{\xi }_k)^T\bigg \vert \hat{x}_k\right] \right. \\{} & {} ~-\left. {\mathbb E}\left[ \frac{1}{1+\rho _k}(\hat{g}_k+\hat{\xi }_k)\bigg \vert \hat{x}_k\right] {\mathbb E}\left[ \frac{1}{1+\rho _k}(\hat{g}_k+\hat{\xi }_k)^T\bigg \vert \hat{x}_k\right] \right\} , \end{aligned}$$

where

$$\begin{aligned}{} & {} k^{-2\gamma }\mathbb E\left[ \frac{1}{\left( 1+\rho _k\right) ^2}\left( \hat{g}_k+\hat{\xi }_k\right) \left( \hat{g}_k+\hat{\xi }_k\right) ^T\bigg \vert \hat{x}_k\right] \\{} & {} =k^{-2\gamma }\mathbb E\left[ \frac{1}{\left( 1+\rho _k\right) ^2}\left( \hat{g}_k\hat{g}_k^T+\hat{g}_k\hat{\xi }_k^T+\hat{\xi }_k\hat{g}_k^T+\hat{\xi }_k\hat{\xi }_k^T\right) \bigg \vert \hat{x}_k\right] .\\ \end{aligned}$$

Define \(\xi ^{-1}_k:=\left( \xi ^{-1}_{k1},\cdots ,\xi ^{-1}_{kp}\right) ^T\). Then we have

$$\begin{aligned} \mathbb E\left[ \frac{1}{\left( 1+\rho _k\right) ^2}\hat{g}_k\hat{g}_k^T\bigg \vert \mathscr {F}_k\right] =\mathbb E\left\{ \frac{1}{\left( 1+\rho _k\right) ^2}\xi _k^{-1}\left( \xi _k^{-1}\right) ^T\left[ \frac{\varepsilon _k^{\left( +\right) }-\varepsilon _k^{\left( -\right) }}{2ck^{-\gamma }}\right] ^2\bigg \vert \mathscr {F}_k\right\} , \end{aligned}$$
(D2)

where

$$\begin{aligned} \frac{1}{\left( 1+\rho _k\right) ^2} = \frac{1}{\left( 1+\frac{\rho }{\vert \hat{g}_{kl}\vert }\right) ^2} = \frac{\left[ \frac{y^{\left( +\right) }-y^{\left( -\right) }}{2ck^{-\gamma }}\right] ^2}{\left[ \frac{y^{\left( +\right) }-y^{\left( -\right) }}{2ck^{-\gamma }}+\rho \right] ^2} = \frac{\left[ y^{\left( +\right) }-y^{\left( -\right) }\right] ^2}{\left[ y^{\left( +\right) }-y^{\left( -\right) }+2ck^{-\gamma }\rho \right] ^2}, \end{aligned}$$

and the last expression tends to 1 as \(k\rightarrow \infty \). Therefore, (D2) is the same as the third term in (3.5) of [14]. As the elements of \(\hat{g}_k\hat{\xi }_k^T+\hat{\xi }_k\hat{g}_k^T+\hat{\xi }_k\hat{\xi }_k^T\) are bounded, we have

$$\begin{aligned} k^{-2\gamma }\mathbb E\left[ \frac{1}{\left( 1+\rho _k\right) ^2}\left( \hat{g}_k\hat{\xi }_k^T+\hat{\xi }_k\hat{g}_k^T+\hat{\xi }_k\hat{\xi }_k^T\right) \bigg \vert \hat{x}_k\right] \rightarrow 0, \end{aligned}$$

and

$$\begin{aligned} {\mathbb E}\left[ \frac{1}{1+\rho _k}\left( \hat{g}_k+\hat{\xi }_k\right) \bigg \vert \hat{x}_k\right] {\mathbb E}\left[ \frac{1}{1+\rho _k}\left( \hat{g}_k+\hat{\xi }_k\right) ^T\bigg \vert \hat{x}_k\right] = {\mathbb E}\left[ \hat{g}_k\bigg \vert \hat{x}_k\right] {\mathbb E}\left[ \hat{g}_k^T\bigg \vert \hat{x}_k\right] . \end{aligned}$$

According to [14], we obtain

$$\begin{aligned} {\mathbb E}\left[ V_kV_k^T\bigg \vert \mathscr {F}_k\right] \rightarrow \frac{1}{4}c^{-2}\sigma ^2I~~~~a.s. \end{aligned}$$

Thus we have verified conditions (2.2.1) and (2.2.2) of [6]. Next, we prove condition (2.2.3), i.e.,

$$\begin{aligned} \lim \limits _{k\rightarrow \infty }\mathbb E\left[ \mathscr {I}_{\{\left\| V_k\right\| ^2\ge rk^\alpha \}}\left\| V_k\right\| ^2\right] =0~~~\forall r>0. \end{aligned}$$

By Hölder's inequality and \(0<\delta '<\delta /2\), the above limit can be bounded from above as

$$\begin{aligned}{} & {} \limsup \limits _{k\rightarrow \infty } P\left( \left\| V_k\right\| ^2\ge rk^\alpha \right) ^{\delta '/\left( 1+\delta '\right) }\left( \mathbb E\left\| V_k\right\| ^{2\left( 1+\delta '\right) }\right) ^{1/\left( 1+\delta '\right) }\\{} & {} ~\le \limsup \limits _{k\rightarrow \infty }\left( \frac{\mathbb E\left\| V_k\right\| ^2}{rk^\alpha }\right) ^{\delta '/\left( 1+\delta '\right) }\left( \mathbb E\left\| V_k\right\| ^{2\left( 1+\delta '\right) }\right) ^{1/\left( 1+\delta '\right) }. \end{aligned}$$

Notice that

$$\begin{aligned} \left\| V_k\right\| ^{2\left( 1+\delta '\right) }\le 2^{2\left( 1+\delta '\right) }k^{-2\left( 1+\delta '\right) \gamma }\left[ \left\| \frac{\hat{g}_k+\hat{\xi }_k}{1+\rho _k}\right\| ^2+\left\| b_k\right\| ^2+\left\| g_k\right\| ^2\right] . \end{aligned}$$

The proof is then completed following the proof of Proposition 1 in [14]. \(\square \)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Li, S., Xia, Y. & Xu, Z. Simultaneous perturbation stochastic approximation: towards one-measurement per iteration. Numer Algor 94, 1085–1101 (2023). https://doi.org/10.1007/s11075-023-01528-7
