
A Method of Inertial Regularized ADMM for Separable Nonconvex Optimization Problems

Soft Computing

Abstract

The alternating direction method of multipliers (ADMM) is an effective algorithm for solving optimization problems with separable structure. Recently, inertial techniques have been widely used in various algorithms to accelerate convergence and enhance numerical performance. There are many convergence analyses combining inertial techniques with ADMM for convex optimization problems, while research on the nonconvex case is still in its infancy. In this paper, we propose an algorithmic framework of inertial regularized ADMM (iRADMM) for a class of two-block nonconvex optimization problems. Under suitable assumptions, we establish the subsequential and global convergence of the proposed method. Furthermore, we apply iRADMM to signal recovery, image reconstruction, and SCAD-penalized problems. The numerical results demonstrate the efficiency of the iRADMM algorithm and illustrate the effectiveness of the introduced inertial term.
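To make the update structure concrete, below is a minimal Python sketch of one possible iRADMM-style loop, reconstructed only from the update rules visible in the appendix: an \(x\)-subproblem with proximal term \(\frac{1}{2}\Vert x-\bar{x}^{k}\Vert ^{2}_{S}\), a \(y\)-subproblem, the dual step \(p^{k+1}=p^{k}-\beta (Ax^{k+1}+By^{k+1}-b)\), and the extrapolation points \(\bar{x}^{k+1}=x^{k+1}+\theta (x^{k+1}-\bar{x}^{k})\), \(\bar{y}^{k+1}=y^{k+1}+\eta (y^{k+1}-\bar{y}^{k})\). The generic subproblem solver, the choice \(S=sI\), and all parameter values are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.optimize import minimize

def iradmm(f, g, A, B, b, x0, y0, beta=1.0, theta=0.3, eta=0.3,
           s=1.0, max_iter=200, tol=1e-8):
    """Illustrative iRADMM-style loop for min f(x)+g(y) s.t. Ax+By=b.

    S is taken as s*I, and both subproblems are handed to a generic
    smooth solver; these are assumptions for this sketch only.
    """
    x, y = x0.astype(float).copy(), y0.astype(float).copy()
    p = np.zeros_like(b, dtype=float)
    xbar, ybar = x.copy(), y.copy()          # extrapolation points

    for _ in range(max_iter):
        # x-subproblem: augmented Lagrangian at (x, ybar, p)
        # plus the proximal regularization (s/2)||x - xbar||^2
        def Lx(xv):
            r = A @ xv + B @ ybar - b
            return (f(xv) - p @ r + 0.5 * beta * r @ r
                    + 0.5 * s * np.sum((xv - xbar) ** 2))
        x_new = minimize(Lx, x).x

        # y-subproblem: augmented Lagrangian at (x_new, y, p)
        def Ly(yv):
            r = A @ x_new + B @ yv - b
            return g(yv) - p @ r + 0.5 * beta * r @ r
        y_new = minimize(Ly, y).x

        # dual update, as in the appendix
        p = p - beta * (A @ x_new + B @ y_new - b)

        # inertial extrapolation for the next iteration
        xbar = x_new + theta * (x_new - xbar)
        ybar = y_new + eta * (y_new - ybar)

        if np.linalg.norm(x_new - x) + np.linalg.norm(y_new - y) < tol:
            x, y = x_new, y_new
            break
        x, y = x_new, y_new
    return x, y, p
```

For a quick smoke test, one can take a smooth toy instance such as \(f(x)=\frac{1}{2}\Vert x\Vert ^{2}\) and \(g(y)=\frac{1}{2}\Vert y-c\Vert ^{2}\) with random \(A\), \(B\), \(b\); nonsmooth choices of \(f\) (e.g., the SCAD penalty from the experiments) would instead call for a proximal step in place of the generic solver.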



Data availability

Enquiries about data availability should be directed to the authors.



Acknowledgements

This work was supported by the Natural Science Foundation of China (12061013, 11601095, 71861002), Natural Science Foundation of Guangxi Province (2016GXNSFBA380185), Training Plan of Thousands of Young and Middle-aged Backbone Teachers in Colleges and Universities of Guangxi, Special Foundation for Guangxi Ba Gui Scholars, and Guangxi Middle and Young University Teachers’ Basic Research Ability Improvement Project (2022KY1135).

Funding

The authors have not disclosed any funding.

Author information


Contributions

MC, YG, and YZ conceptualized the study; MC and YZ contributed to formal analysis and investigation, writing (review and editing), and supervision.

Corresponding author

Correspondence to Yongxin Zhao.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Informed consent

Informed consent was not required as no humans or animals were involved.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A

A.1 Proof of Lemma 5

Proof

From the definition of the augmented Lagrangian function \({\mathcal {L}}_{\beta }(\cdot )\), it follows that

$$\begin{aligned} {\mathcal {L}}_{\beta }(x^{k+1},y^{k+1},p^{k+1})-{\mathcal {L}}_{\beta }(x^{k+1},y^{k+1},p^{k}) &= \langle p^{k}-p^{k+1},Ax^{k+1}+By^{k+1}-b\rangle \\ &= \frac{1}{\beta }\Vert p^{k}-p^{k+1}\Vert ^{2}. \end{aligned}$$
(23)

It is obvious that

$$\begin{aligned} {\mathcal {L}}_{\beta }(x^{k+1},y^{k+1},p^{k})-{\mathcal {L}}_{\beta }(x^{k+1},\bar{y}^{k},p^{k}) &= g(y^{k+1})-g(\bar{y}^{k})+\langle p^{k},B(\bar{y}^{k}-y^{k+1})\rangle \\ &\quad +\frac{\beta }{2}\Vert Ax^{k+1}+By^{k+1}-b\Vert ^{2}-\frac{\beta }{2}\Vert Ax^{k+1}+B\bar{y}^{k}-b\Vert ^{2}. \end{aligned}$$
(24)

The optimality condition of the \(y\)-subproblem implies that

$$\begin{aligned} \nabla g(y^{k+1})=B^{\top }p^{k+1}. \end{aligned}$$
(25)

Since \(\nabla g\) is Lipschitz continuous with modulus \(L>0\), it follows from Lemma 2 and (25) that

$$\begin{aligned} g(y^{k+1})-g(\bar{y}^{k}) &\le \langle \nabla g(y^{k+1}),y^{k+1}-\bar{y}^{k}\rangle +\frac{L}{2}\Vert y^{k+1}-\bar{y}^{k}\Vert ^{2} \\ &= \langle p^{k+1},B(y^{k+1}-\bar{y}^{k})\rangle +\frac{L}{2}\Vert y^{k+1}-\bar{y}^{k}\Vert ^{2}. \end{aligned}$$
(26)

By substituting (26) into (24), we obtain

$$\begin{aligned} {\mathcal {L}}_{\beta }(x^{k+1},y^{k+1},p^{k})-{\mathcal {L}}_{\beta }(x^{k+1},\bar{y}^{k},p^{k}) &\le \langle p^{k}-p^{k+1},B(\bar{y}^{k}-y^{k+1})\rangle +\frac{\beta }{2}\Vert Ax^{k+1}+By^{k+1}-b\Vert ^{2} \\ &\quad -\frac{\beta }{2}\Vert Ax^{k+1}+B\bar{y}^{k}-b\Vert ^{2}+\frac{L}{2}\Vert y^{k+1}-\bar{y}^{k}\Vert ^{2}. \end{aligned}$$
(27)

Since \(p^{k+1}=p^{k}-\beta (Ax^{k+1}+By^{k+1}-b)\), we have

$$\begin{aligned} \frac{\beta }{2}\Vert Ax^{k+1}+By^{k+1}-b\Vert ^{2}=\frac{1}{2\beta }\Vert p^{k}-p^{k+1}\Vert ^{2}. \end{aligned}$$
(28)
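Identity (28) is just the dual update \(p^{k+1}=p^{k}-\beta (Ax^{k+1}+By^{k+1}-b)\) rearranged, so it can be verified numerically; a small sanity check with random data (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)
beta = 2.5
A = rng.standard_normal((4, 3))
B = rng.standard_normal((4, 3))
x1, y1 = rng.standard_normal(3), rng.standard_normal(3)
b, p0 = rng.standard_normal(4), rng.standard_normal(4)

r = A @ x1 + B @ y1 - b                       # residual A x^{k+1} + B y^{k+1} - b
p1 = p0 - beta * r                            # dual update
lhs = 0.5 * beta * (r @ r)                    # (beta/2) ||r||^2
rhs = ((p0 - p1) @ (p0 - p1)) / (2 * beta)    # (1/(2 beta)) ||p^k - p^{k+1}||^2
assert np.isclose(lhs, rhs)
```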

Furthermore, \(Ax^{k+1}+B\bar{y}^{k}-b=Ax^{k+1}+By^{k+1}-b+B(\bar{y}^{k}-y^{k+1})=\frac{1}{\beta }(p^{k}-p^{k+1})+B(\bar{y}^{k}-y^{k+1})\). Expanding the square in \(-\frac{\beta }{2}\Vert Ax^{k+1}+B\bar{y}^{k}-b\Vert ^{2}\), the cross terms cancel, and using \(\Vert Bv\Vert ^{2}\ge \mu _{B^{\top }}\Vert v\Vert ^{2}\) we obtain

$$\begin{aligned} \langle p^{k}-p^{k+1},B(\bar{y}^{k}-y^{k+1})\rangle -\frac{\beta }{2}\Vert Ax^{k+1}+B\bar{y}^{k}-b\Vert ^{2} \le -\frac{1}{2\beta }\Vert p^{k}-p^{k+1}\Vert ^{2}-\frac{\beta \mu _{B^{\top }}}{2}\Vert \bar{y}^{k}-y^{k+1}\Vert ^{2}. \end{aligned}$$
(29)

By substituting (28) and (29) into (27), we have

$$\begin{aligned} {\mathcal {L}}_{\beta }(x^{k+1},y^{k+1},p^{k})-{\mathcal {L}}_{\beta }(x^{k+1},\bar{y}^{k},p^{k})\le -\frac{\beta \mu _{B^{\top }}-L}{2}\Vert y^{k+1}-\bar{y}^{k}\Vert ^{2}. \end{aligned}$$
(30)

Since \(x^{k+1}\) is a minimizer of the \(x\)-subproblem in the iterative scheme (7), we have

$$\begin{aligned} {\mathcal {L}}_{\beta }(x^{k+1},\bar{y}^{k},p^{k})\le {\mathcal {L}}_{\beta }(x^{k},\bar{y}^{k},p^{k})+\frac{1}{2}\Vert x^{k}-\bar{x}^{k}\Vert ^{2}_{S}-\frac{1}{2}\Vert x^{k+1}-\bar{x}^{k}\Vert ^{2}_{S}. \end{aligned}$$
(31)

On the other hand,

$$\begin{aligned} {\mathcal {L}}_{\beta }(x^{k},\bar{y}^{k},p^{k})-{\mathcal {L}}_{\beta }(x^{k},y^{k},p^{k}) &= g(\bar{y}^{k})-g(y^{k})-\langle p^{k},B(\bar{y}^{k}-y^{k})\rangle \\ &\quad +\frac{\beta }{2}\Vert Ax^{k}+B\bar{y}^{k}-b\Vert ^{2}-\frac{\beta }{2}\Vert Ax^{k}+By^{k}-b\Vert ^{2}. \end{aligned}$$
(32)

By the same reasoning, we obtain

$$\begin{aligned} g(\bar{y}^{k})-g(y^{k}) &\le \langle \nabla g(y^{k}),\bar{y}^{k}-y^{k}\rangle +\frac{L}{2}\Vert \bar{y}^{k}-y^{k}\Vert ^{2} \\ &= \langle p^{k},B(\bar{y}^{k}-y^{k})\rangle +\frac{L}{2}\Vert \bar{y}^{k}-y^{k}\Vert ^{2}. \end{aligned}$$
(33)

It is easy to verify that

$$\begin{aligned} \frac{\beta }{2}\Vert Ax^{k}+By^{k}-b\Vert ^{2}=\frac{1}{2\beta }\Vert p^{k-1}-p^{k}\Vert ^{2}. \end{aligned}$$
(34)

Furthermore, we know

$$\begin{aligned} \frac{\beta }{2}\Vert Ax^{k}+B\bar{y}^{k}-b\Vert ^{2} &= \frac{\beta }{2}\Big \Vert \frac{1}{\beta }(p^{k-1}-p^{k})+B(\bar{y}^{k}-y^{k})\Big \Vert ^{2} \\ &\le \frac{1}{\beta }\Vert p^{k-1}-p^{k}\Vert ^{2}+\beta \lambda _{B^{\top }}\Vert \bar{y}^{k}-y^{k}\Vert ^{2}, \end{aligned}$$
(35)

where the inequality uses \((a+b)^{2}\le 2(a^{2}+b^{2})\) together with \(\Vert B(\bar{y}^{k}-y^{k})\Vert ^{2}\le \lambda _{B^{\top }}\Vert \bar{y}^{k}-y^{k}\Vert ^{2}\). Substituting (33)–(35) into (32) gives

$$\begin{aligned} {\mathcal {L}}_{\beta }(x^{k},\bar{y}^{k},p^{k})-{\mathcal {L}}_{\beta }(x^{k},y^{k},p^{k})\le \Big (\beta \lambda _{B^{\top }}+\frac{L}{2}\Big )\Vert y^{k}-\bar{y}^{k}\Vert ^{2}+\frac{1}{2\beta }\Vert p^{k-1}-p^{k}\Vert ^{2}. \end{aligned}$$
(36)

Summing (23), (30), (31), and (36), we obtain

$$\begin{aligned} {\mathcal {L}}_{\beta }(w^{k+1})-{\mathcal {L}}_{\beta }(w^{k}) &\le \frac{1}{\beta }\Vert p^{k}-p^{k+1}\Vert ^{2}+\frac{1}{2\beta }\Vert p^{k-1}-p^{k}\Vert ^{2}-\frac{\beta \mu _{B^{\top }}-L}{2}\Vert y^{k+1}-\bar{y}^{k}\Vert ^{2} \\ &\quad +\Big (\beta \lambda _{B^{\top }}+\frac{L}{2}\Big )\Vert \bar{y}^{k}-y^{k}\Vert ^{2}+\frac{1}{2}\Vert x^{k}-\bar{x}^{k}\Vert ^{2}_{S}-\frac{1}{2}\Vert x^{k+1}-\bar{x}^{k}\Vert ^{2}_{S}. \end{aligned}$$
(37)

Assumption 1(v) and (7) imply that \(p^{k+1}-p^{k}\in \mathrm{Im}\,B\). It then follows from Lemma 3 and (25) that

$$\begin{aligned} \Vert p^{k+1}-p^{k}\Vert ^{2} &\le \frac{1}{\mu _{B^{\top }}}\Vert B^{\top }(p^{k+1}-p^{k})\Vert ^{2}=\frac{1}{\mu _{B^{\top }}}\Vert \nabla g(y^{k+1})-\nabla g(y^{k})\Vert ^{2} \\ &\le \frac{L^{2}}{\mu _{B^{\top }}}\Vert y^{k+1}-y^{k}\Vert ^{2}. \end{aligned}$$
(38)
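The first inequality in (38) is where Lemma 3 enters. Reading \(\mu _{B^{\top }}\) as the smallest positive eigenvalue of \(BB^{\top }\) (an assumption on notation here), the bound \(\Vert v\Vert ^{2}\le \frac{1}{\mu _{B^{\top }}}\Vert B^{\top }v\Vert ^{2}\) for \(v\in \mathrm{Im}\,B\) can be checked numerically:

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((5, 3))
# mu: smallest positive eigenvalue of B B^T (assumed reading of mu_{B^T})
eigs = np.linalg.eigvalsh(B @ B.T)
mu = min(e for e in eigs if e > 1e-10)
v = B @ rng.standard_normal(3)       # any v in Im B
assert v @ v <= (B.T @ v) @ (B.T @ v) / mu + 1e-9
```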

Since \(\Vert y^{k+1}-y^{k}\Vert ^{2}\le 2(\Vert y^{k+1}-\bar{y}^{k}\Vert ^{2}+\Vert \bar{y}^{k}-y^{k}\Vert ^{2}),\) we have

$$\begin{aligned} \frac{3}{2\beta }\Vert p^{k+1}-p^{k}\Vert ^{2}\le \frac{3L^{2}}{\beta \mu _{B^{\top }}}\Vert y^{k+1}-\bar{y}^{k}\Vert ^{2}+\frac{3L^{2}}{\beta \mu _{B^{\top }}}\Vert \bar{y}^{k}-y^{k}\Vert ^{2}. \end{aligned}$$
(39)

Combining \(\bar{x}^{k+1}=x^{k+1}+\theta (x^{k+1}-\bar{x}^{k})\) and \(\bar{y}^{k+1}=y^{k+1}+\eta (y^{k+1}-\bar{y}^{k})\) with (37) and (39), and simplifying and rearranging, we obtain

$$\begin{aligned} &{\mathcal {L}}_{\beta }(w^{k+1})+\frac{1}{2\beta }\Vert p^{k}-p^{k+1}\Vert ^{2}+\delta \Vert y^{k+1}-\bar{y}^{k+1}\Vert ^{2}+\frac{1}{2}\Vert x^{k+1}-\bar{x}^{k+1}\Vert ^{2}_{S} \\ &\quad \le {\mathcal {L}}_{\beta }(w^{k})+\frac{1}{2\beta }\Vert p^{k-1}-p^{k}\Vert ^{2}+\delta \Vert y^{k}-\bar{y}^{k}\Vert ^{2}+\frac{1}{2}\Vert x^{k}-\bar{x}^{k}\Vert ^{2}_{S} \\ &\qquad -(M_{0}-\eta ^{2}\delta )\Vert y^{k+1}-\bar{y}^{k}\Vert ^{2}-\frac{1-\theta ^{2}}{2}\Vert x^{k+1}-\bar{x}^{k}\Vert ^{2}_{S}, \end{aligned}$$

where \(\delta =\beta \lambda _{B^{\top }}+\frac{L}{2}+\frac{3L^{2}}{\beta \mu _{B^{\top }}}\) and \(M_{0}=\frac{\beta \mu _{B^{\top }}-L}{2}-\frac{3L^{2}}{\beta \mu _{B^{\top }}}\). Let \(\hat{{\mathcal {L}}}_{\beta }(\hat{w}^{k})={\mathcal {L}}_{\beta }(w^{k})+\frac{1}{2\beta }\Vert p^{k-1}-p^{k}\Vert ^{2}+\delta \Vert y^{k}-\bar{y}^{k}\Vert ^{2}+\frac{1}{2}\Vert x^{k}-\bar{x}^{k}\Vert ^{2}_{S}\). Thus, we have

$$\begin{aligned} \hat{{\mathcal {L}}}_{\beta }(\hat{w}^{k+1})-\hat{{\mathcal {L}}}_{\beta }(\hat{w}^{k})\le -\frac{1-\theta ^{2}}{2}\Vert x^{k+1}-\bar{x}^{k}\Vert ^{2}_{S}-M\Vert y^{k+1}-\bar{y}^{k}\Vert ^{2}, \end{aligned}$$

where \(M=M_{0}-\eta ^{2}\delta \). From Assumption 1(iv), we know \(M>0\). This completes the proof. \(\square \)

A.2 Proof of Lemma 6

Proof

Since \(\{w^{k}\}\) is bounded, \(\{\hat{w}^{k}\}\) is also bounded and hence has at least one cluster point. Let \(\hat{w}^{*}\) be a cluster point of \(\{\hat{w}^{k}\}\) and let \(\{\hat{w}^{k_{j}}\}\) be a subsequence converging to it, i.e., \(\lim \limits _{j\rightarrow +\infty }\hat{w}^{k_{j}}=\hat{w}^{*}\). Since \(f\) is lower semicontinuous and \(g\) is continuous, \(\hat{{\mathcal {L}}}_{\beta }(\cdot )\) is lower semicontinuous, and hence

$$\begin{aligned} \liminf \limits _{j\rightarrow +\infty }\hat{{\mathcal {L}}}_{\beta }(\hat{w}^{k_{j}})\ge \hat{{\mathcal {L}}}_{\beta }(\hat{w}^{*}). \end{aligned}$$
(40)

Consequently, \(\{\hat{{\mathcal {L}}}_{\beta }(\hat{w}^{k_{j}})\}\) is bounded from below, which, together with the fact that \(\{\hat{{\mathcal {L}}}_{\beta }(\hat{w}^{k})\}\) is nonincreasing, implies that \(\{\hat{{\mathcal {L}}}_{\beta }(\hat{w}^{k_{j}})\}\) is convergent. Hence \(\{\hat{{\mathcal {L}}}_{\beta }(\hat{w}^{k})\}\) is convergent and \(\hat{{\mathcal {L}}}_{\beta }(\hat{w}^{k})\ge \hat{{\mathcal {L}}}_{\beta }(\hat{w}^{*})\). Rearranging the terms of (9) and summing over \(k=0,\ldots ,n\), we have

$$\begin{aligned} \sum \limits _{k=0}^{n}\Big (\frac{1-\theta ^{2}}{2}\Vert x^{k+1}-\bar{x}^{k}\Vert ^{2}_{S}+M\Vert y^{k+1}-\bar{y}^{k}\Vert ^{2}\Big ) \le \hat{{\mathcal {L}}}_{\beta }(\hat{w}^{0})-\hat{{\mathcal {L}}}_{\beta }(\hat{w}^{n+1})\le \hat{{\mathcal {L}}}_{\beta }(\hat{w}^{0})-\hat{{\mathcal {L}}}_{\beta }(\hat{w}^{*})<+\infty . \end{aligned}$$

Since \(\theta \in [0,1)\) and \(M>0\), we have \(\sum \limits _{k=0}^{+\infty }\Vert x^{k+1}-\bar{x}^{k}\Vert ^{2}_{S}<+\infty \) and \(\sum \limits _{k=0}^{+\infty }\Vert y^{k+1}-\bar{y}^{k}\Vert ^{2}<+\infty \). It then follows from the inequality \((a+b)^{2}\le 2(a^{2}+b^{2})\) that

$$\begin{aligned} \Vert x^{k+1}-x^{k}\Vert ^{2}_{S} &\le 2\left( \Vert x^{k+1}-\bar{x}^{k}\Vert ^{2}_{S}+\Vert \bar{x}^{k}-x^{k}\Vert ^{2}_{S}\right) \\ &\le 2\left( \Vert x^{k+1}-\bar{x}^{k}\Vert ^{2}_{S}+\theta ^{2}\Vert x^{k}-\bar{x}^{k-1}\Vert ^{2}_{S}\right) , \end{aligned}$$
(41)
$$\begin{aligned} \Vert y^{k+1}-y^{k}\Vert ^{2} &\le 2\left( \Vert y^{k+1}-\bar{y}^{k}\Vert ^{2}+\Vert \bar{y}^{k}-y^{k}\Vert ^{2}\right) \\ &\le 2\left( \Vert y^{k+1}-\bar{y}^{k}\Vert ^{2}+\eta ^{2}\Vert y^{k}-\bar{y}^{k-1}\Vert ^{2}\right) . \end{aligned}$$
(42)

From (41) and (42), we have \(\sum \limits _{k=0}^{+\infty }\Vert x^{k+1}-x^{k}\Vert ^{2}<+\infty \) and \(\sum \limits _{k=0}^{+\infty }\Vert y^{k+1}-y^{k}\Vert ^{2}<+\infty \); it then follows from (38) that \(\sum \limits _{k=0}^{+\infty }\Vert p^{k+1}-p^{k}\Vert ^{2}<+\infty \). Therefore, \(\sum \limits _{k=0}^{+\infty }\Vert w^{k+1}-w^{k}\Vert ^{2}<+\infty \). \(\square \)

A.3 Proof of Lemma 7

Proof

From the definitions of the augmented Lagrangian function \({\mathcal {L}}_{\beta }(\cdot )\) and of \(\hat{{\mathcal {L}}}_{\beta }(\cdot )\), it follows that

$$\begin{aligned} \left\{ \begin{array}{l} \partial _{x}\hat{{\mathcal {L}}}_{\beta }(\hat{w}^{k+1})=\partial f(x^{k+1})-A^{\top }p^{k+1}+\beta A^{\top }(Ax^{k+1}+By^{k+1}-b)+S(x^{k+1}-\bar{x}^{k+1}),\\ \partial _{y}\hat{{\mathcal {L}}}_{\beta }(\hat{w}^{k+1})=\nabla g(y^{k+1})-B^{\top }p^{k+1}+\beta B^{\top }(Ax^{k+1}+By^{k+1}-b)+2\delta (y^{k+1}-\bar{y}^{k+1}),\\ \partial _{p}\hat{{\mathcal {L}}}_{\beta }(\hat{w}^{k+1})=\frac{2}{\beta }(p^{k+1}-p^{k}),\\ \partial _{\bar{x}}\hat{{\mathcal {L}}}_{\beta }(\hat{w}^{k+1})=S(\bar{x}^{k+1}-x^{k+1}),\\ \partial _{\bar{y}}\hat{{\mathcal {L}}}_{\beta }(\hat{w}^{k+1})=2\delta (\bar{y}^{k+1}-y^{k+1}),\\ \partial _{\bar{p}}\hat{{\mathcal {L}}}_{\beta }(\hat{w}^{k+1})=\frac{1}{\beta }(p^{k}-p^{k+1}). \end{array} \right. \end{aligned}$$

This, together with the optimality condition (8), yields

$$\begin{aligned} \left\{ \begin{array}{l} A^{\top }(p^{k}-p^{k+1})+\beta A^{\top }B(y^{k+1}-\bar{y}^{k})-(1+\theta )S(x^{k+1}-\bar{x}^{k})\in \partial _{x}\hat{{\mathcal {L}}}_{\beta }(\hat{w}^{k+1}),\\ B^{\top }(p^{k}-p^{k+1})+2\delta (y^{k+1}-\bar{y}^{k+1})\in \partial _{y}\hat{{\mathcal {L}}}_{\beta }(\hat{w}^{k+1}),\\ \frac{2}{\beta }(p^{k+1}-p^{k})=\partial _{p}\hat{{\mathcal {L}}}_{\beta }(\hat{w}^{k+1}),\\ \theta S(x^{k+1}-\bar{x}^{k})=\partial _{\bar{x}}\hat{{\mathcal {L}}}_{\beta }(\hat{w}^{k+1}),\\ 2\delta \eta (y^{k+1}-\bar{y}^{k})=\partial _{\bar{y}}\hat{{\mathcal {L}}}_{\beta }(\hat{w}^{k+1}),\\ \frac{1}{\beta }(p^{k}-p^{k+1})\in \partial _{\bar{p}}\hat{{\mathcal {L}}}_{\beta }(\hat{w}^{k+1}). \end{array} \right. \end{aligned}$$
(43)

From Lemma 1, we obtain \((\varepsilon _{1}^{k+1},\varepsilon _{2}^{k+1},\varepsilon _{3}^{k+1},\varepsilon _{4}^{k+1},\varepsilon _{5}^{k+1},\varepsilon _{6}^{k+1})^{\top }\in \partial \hat{{\mathcal {L}}}_{\beta }(\hat{w}^{k+1})\). Furthermore, from (43), there exists a real number \(\zeta _{0}>0\) such that

$$\begin{aligned} \Vert (\varepsilon _{1}^{k+1},\varepsilon _{2}^{k+1},\varepsilon _{3}^{k+1},\varepsilon _{4}^{k+1},\varepsilon _{5}^{k+1},\varepsilon _{6}^{k+1})\Vert \le \zeta _{0}(\Vert x^{k+1}-\bar{x}^{k}\Vert +\Vert y^{k+1}-\bar{y}^{k}\Vert +\Vert p^{k}-p^{k+1}\Vert ). \end{aligned}$$
(44)

It follows from (39) that there exists \(\zeta _{1}>0\) such that

$$\begin{aligned} \Vert p^{k+1}-p^{k}\Vert \le \zeta _{1}(\Vert y^{k+1}-\bar{y}^{k}\Vert +\Vert y^{k}-\bar{y}^{k-1}\Vert ),\quad k\ge 1. \end{aligned}$$
(45)

Thus, combining (44) and (45), there exists \(\zeta >0\) such that

$$\begin{aligned} d(0,\partial \hat{{\mathcal {L}}}_{\beta }(\hat{w}^{k+1})) &\le \Vert (\varepsilon _{1}^{k+1},\varepsilon _{2}^{k+1},\varepsilon _{3}^{k+1},\varepsilon _{4}^{k+1},\varepsilon _{5}^{k+1},\varepsilon _{6}^{k+1})\Vert \\ &\le \zeta (\Vert x^{k+1}-\bar{x}^{k}\Vert +\Vert y^{k+1}-\bar{y}^{k}\Vert +\Vert y^{k}-\bar{y}^{k-1}\Vert ),\quad k\ge 1. \end{aligned}$$

This completes the proof. \(\square \)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Chao, M., Geng, Y. & Zhao, Y. A Method of Inertial Regularized ADMM for Separable Nonconvex Optimization Problems. Soft Comput 27, 16741–16757 (2023). https://doi.org/10.1007/s00500-023-09017-8

