A conjugate gradient algorithm and its application in large-scale optimization problems and image restoration
Abstract
To solve large-scale unconstrained optimization problems, a modified PRP conjugate gradient algorithm is proposed. It is of interest because it combines the steepest descent method with the conjugate gradient method and fully exploits the excellent properties of both. For smooth functions, the algorithm uses information about the gradient function and the previous direction to determine the next search direction. For nonsmooth functions, a Moreau–Yosida regularization is introduced into the proposed algorithm, which simplifies the treatment of complex problems. The proposed algorithm has the following characteristics: (i) a sufficient descent property as well as a trust region trait; (ii) global convergence; (iii) numerical results for large-scale smooth/nonsmooth functions showing that the proposed algorithm compares favorably with similar optimization methods; (iv) successful performance on image restoration problems.
Keywords
Conjugate gradient; Nonconvex and nonsmooth; Descent property; Global convergence
MSC: 90C26
1 Introduction

The search direction possesses a sufficient descent property and a trust region property.

For general functions, the proposed algorithm possesses global convergence under mild assumptions.

The new algorithm combines the steepest descent method with the conjugate gradient algorithm through the size of the coefficients, and the numerical results demonstrate the method’s good performance compared with established algorithms.

The corresponding numerical results show that the discussed method is efficient and successful at solving general problems.

The paper combines mathematical theory with real-world application. On the one hand, the proposed algorithm performs well on large-scale optimization problems; on the other hand, it is applied to image restoration, which has wide application in biological engineering, medical sciences, and other areas of science and engineering.
The remainder of this paper is organized as follows: Sect. 2 presents the motivation and the algorithm for large-scale smooth problems, including its important mathematical properties; Sect. 3 presents a similar algorithm for large-scale nonsmooth optimization problems; Sect. 4 applies the algorithm of Sect. 3 to image restoration; the conclusions and the algorithm’s characteristics are given in Sect. 5. Without loss of generality, \(f(x_{k})\) and \(f(x_{k+1})\) are written as \(f_{k}\) and \(f_{k+1}\), and \(\|\cdot \|\) denotes the Euclidean norm.
2 New three-term conjugate gradient algorithm for smooth problems
2.1 Algorithm steps
Algorithm 2.1
 Step 1:

(Initiation) Choose an initial point \(x_{0}\), \(\gamma \in (0,1)\), \(\xi _{2}\), \(\xi _{3}\), \(\xi _{4}> 0\), and a positive constant \(\varepsilon \in (0,1)\). Let \(k=0\), \(d_{0}=-g_{0}\).
 Step 2:

If \(\|g_{k}\| \leq \varepsilon \), then stop.
 Step 3:

Compute the step length \(\alpha _{k}\) as the largest element of \(\{\gamma ^{j}\mid j=0, 1, 2, \ldots \}\) satisfying (1.9).
 Step 4:

Set the new iteration point \(x_{k+1}=x_{k}+\alpha _{k}d_{k}\).
 Step 5:

Update the search direction by (2.4).
 Step 6:

If \(\|g_{k+1}\|\leq \varepsilon \) holds, the algorithm stops. Otherwise, go to the next step.
 Step 7:

Let \(k:=k+1\) and go to Step 3.
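The steps above can be sketched in code. Since formulas (2.4) and (1.9) are not reproduced in this excerpt, the sketch below is hedged: it substitutes the classical three-term PRP direction of Zhang, Zhou, and Li [51] (the comparison algorithm of Sect. 2.3) for (2.4) and a standard Armijo backtracking rule for (1.9); the test function `f_test` is an illustrative convex quadratic, not one of the paper’s benchmark problems.

```python
import numpy as np

def three_term_prp(f, grad, x0, gamma=0.5, eps=1e-6, max_iter=5000):
    """Sketch of Algorithm 2.1 with assumed stand-ins: the three-term PRP
    direction of Zhang-Zhou-Li [51] in place of (2.4), and Armijo
    backtracking in place of (1.9)."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g                                   # Step 1: d_0 = -g_0
    for k in range(max_iter):
        if np.linalg.norm(g) <= eps:         # Steps 2/6: stopping test
            break
        alpha, sigma = 1.0, 1e-4             # Step 3: backtracking line search
        while f(x + alpha * d) > f(x) + sigma * alpha * (g @ d):
            alpha *= gamma
        x_new = x + alpha * d                # Step 4: new iterate
        g_new = grad(x_new)
        y = g_new - g
        gg = max(g @ g, 1e-16)               # guard against division by zero
        beta = (g_new @ y) / gg              # PRP coefficient
        theta = (g_new @ d) / gg
        d = -g_new + beta * d - theta * y    # Step 5: three-term direction
        x, g = x_new, g_new
    return x, np.linalg.norm(g)

# Illustrative convex quadratic f(x) = 0.5 * x^T A x with A = diag(a)
a = 1.0 + np.arange(50) / 10.0
f_test = lambda x: 0.5 * x @ (a * x)
grad_test = lambda x: a * x
```

The loop structure mirrors Steps 1–7; only the two formulas marked as stand-ins differ from the paper.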
2.2 Algorithm characteristics
This section establishes the sufficient descent property, the trust region property, and the global convergence of Algorithm 2.1.
Lemma 2.1
Proof
On the one hand, it is true that (2.5) and (2.6) are correct if \(k=0\).
Inequalities (2.5) and (2.6) show that the search direction has a sufficient descent property and a trust region property, respectively. □
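Inequality (2.5) is not reproduced in this excerpt, but for the stand-in three-term PRP direction of [51] the sufficient descent property holds with equality, \(g_{k+1}^{T}d_{k+1}=-\|g_{k+1}\|^{2}\), independently of the line search. The identity can be verified numerically; this is a sanity check under that assumed direction formula, not a proof of the paper’s (2.5).

```python
import numpy as np

def descent_residual(n_trials=100, dim=20, seed=0):
    """Max deviation of g_{k+1}^T d_{k+1} from -||g_{k+1}||^2 over random
    data, for the three-term PRP direction of [51] (an assumed stand-in
    for the paper's (2.4))."""
    rng = np.random.default_rng(seed)
    worst = 0.0
    for _ in range(n_trials):
        g_old = rng.standard_normal(dim)
        g_new = rng.standard_normal(dim)
        d_old = rng.standard_normal(dim)
        y = g_new - g_old
        beta = (g_new @ y) / (g_old @ g_old)      # PRP coefficient
        theta = (g_new @ d_old) / (g_old @ g_old)
        d_new = -g_new + beta * d_old - theta * y
        # the beta*d and theta*y contributions cancel in g_new^T d_new
        worst = max(worst, abs(g_new @ d_new + g_new @ g_new))
    return worst
```

The cancellation holds because \(\beta_{k}\,g_{k+1}^{T}d_{k}\) and \(\theta_{k}\,g_{k+1}^{T}y_{k}\) share the same product of inner products.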
Aiming at achieving global convergence, we propose the following mild assumptions.
Assumption (i)
The level set \(\varOmega =\{x\mid f(x) \leq f(x _{0})\}\) is bounded.
Assumption (ii)
Based on the above discussion and the established conclusion that the modified Armijo line search is reasonable and necessary (see [48]), the global convergence of the algorithm is established as follows.
Theorem 2.1
Proof
This means that \(\{\alpha _{k}\} \rightarrow 0\) as \(k \rightarrow \infty \) or \(\{\|g_{k}\|\} \rightarrow 0\) as \(k \rightarrow \infty \). We consider two cases:
(ii) Clearly, \(\{g_{k}\} \rightarrow 0\) if \(\alpha _{k}\) is a positive finite constant for all sufficiently large k, by formula (2.14). This contradicts assumption (2.10); this completes the proof. □
2.3 Numerical results
Related content is presented in this section in two parts: test problems and the corresponding numerical results. To measure the algorithm’s efficiency, we compare Algorithm 2.1 with Algorithm 1 in [51] in terms of NI, NFG, and CPU on the test problems listed in Table 2 of Appendix 1, which are taken from [3]. Here NI, NFG, and CPU denote the number of iterations, the total number of objective and gradient function evaluations, and the computation time needed to solve the test problems (in seconds), respectively. Algorithm 1 differs from the proposed algorithm only in the formula for \(d_{k+1}\), which is determined by (2.1); the remainder of Algorithm 1 is identical to Algorithm 2.1.
Stopping rule: If \(|f(x_{k})| > e_{1}\), let \(\mathit{stop}1=\frac{|f(x_{k})-f(x_{k+1})|}{|f(x_{k})|}\); otherwise let \(\mathit{stop}1=|f(x_{k})-f(x_{k+1})|\). If the condition \(\|g(x)\|< \epsilon \) or \(\mathit{stop}1 < e_{2}\) is satisfied, the algorithm stops, where \(e_{1}=e_{2}=10^{-4}\) and \(\epsilon =10^{-4}\). In practice, the algorithm also stops if the number of iterations exceeds 10,000 or the number of inner iterations for \(\alpha _{k}\) exceeds 5. In Table 2, ‘NO’ and ‘problem’ denote the number and the name of the tested problem, respectively.
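The stopping rule can be written as a small predicate. This is a sketch using the tolerances stated above, combining the relative/absolute decrease test with the gradient-norm test; the sign reconstruction of the decrease measure (absolute values and a minus sign lost in extraction) is an assumption.

```python
def should_stop(f_k, f_k1, g_norm, e1=1e-4, e2=1e-4, eps=1e-4):
    """Stopping rule sketch: relative decrease |f_k - f_{k+1}| / |f_k|
    when |f_k| > e1, absolute decrease otherwise; stop when either the
    gradient norm or the decrease measure falls below its tolerance."""
    if abs(f_k) > e1:
        stop1 = abs(f_k - f_k1) / abs(f_k)   # relative decrease
    else:
        stop1 = abs(f_k - f_k1)              # absolute decrease
    return g_norm < eps or stop1 < e2
```

The iteration-count caps (10,000 outer iterations, 5 inner line-search iterations) would be enforced in the driver loop rather than in this predicate.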
Initiation: \(\lambda =0.9\), \(\lambda _{1}=0.4\), \(\xi _{3}= 300\), \(\xi _{2}=\xi _{4}=0.01\), \(\gamma =0.01\).
Dimension: \(30\text{,}000\), \(90\text{,}000\), \(150\text{,}000\), \(210\text{,}000\).
Calculation environment: a computer with 2 GB of memory, a Pentium(R) Dual-Core E5800 CPU @ 3.20 GHz, and the 64-bit Windows 7 operating system.
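The NI/NFG/CPU comparisons in this section are reported problem by problem; a standard way to aggregate such metrics across solvers is the performance profile of Dolan and Moré [15]. A minimal sketch follows; the cost matrix `T` is a hypothetical input for illustration, not the paper’s data.

```python
import numpy as np

def performance_profile(T, taus):
    """Dolan-More performance profile. T[i, s] is the cost (e.g. CPU time,
    NI, or NFG) of solver s on problem i; returns, for each tau in taus,
    the fraction of problems each solver solves within a factor tau of
    the best solver on that problem."""
    ratios = T / T.min(axis=1, keepdims=True)   # per-problem ratio to best
    return np.array([[np.mean(ratios[:, s] <= tau) for s in range(T.shape[1])]
                     for tau in taus])
```

A solver whose curve is highest at \(\tau = 1\) wins most often; the curve’s height for large \(\tau\) measures robustness.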
3 Algorithm for nonsmooth problems
From the previous section, the proposed algorithm is trustworthy and shows good potential in the basic numerical results. Thus, this section applies the proposed method to nonsmooth problems. Notably, the vast majority of practical problems are harsh, and Newton-type methods are often unsatisfactory for them because they require gradient information [37, 39, 46]. Currently, most experts and scholars focus on bundle methods, which are successful for small-scale problems (see [18, 19, 24, 36]) but fail to solve large-scale practical problems. With the development of science and technology, it has become urgent to design simple but effective algorithms for large-scale nonsmooth problems. Exploiting the simplicity of the conjugate gradient method, some experts and scholars have proposed relevant algorithms and achieved numerous fruitful theoretical results (see [20, 28]).
3.1 New algorithm and its necessary properties
Algorithm 3.1
 Step 1:

(Initiation) Choose an initial point \(x_{0}\), \(\gamma \in (0,1)\), \(\xi _{2}\), \(\xi _{3}\), \(\xi _{4}> 0\), and a positive constant \(\varepsilon \in (0,1)\). Let \(k=0\), \(d_{0}=-\nabla \theta ^{M}(x _{0})\).
 Step 2:

If \(\|\nabla \theta ^{M}(x_{k})\| \leq \varepsilon \), then stop.
 Step 3:

Compute the step length \(\alpha _{k}\) as the largest element of \(\{\gamma ^{j}\mid j=0, 1, 2, \ldots \}\) satisfying (3.6).
 Step 4:

Set the new iteration point \(x_{k+1}=x_{k}+\alpha _{k}d_{k}\).
 Step 5:

Update the search direction by (3.5).
 Step 6:

If \(\|\nabla \theta ^{M}(x_{k+1})\|\leq \varepsilon \) holds, the algorithm stops; otherwise, go to the next step.
 Step 7:

Let \(k:=k+1\) and go to Step 3.
To establish the validity of the step length \(\alpha _{k}\) in (3.6) and the global convergence of Algorithm 3.1, the following assumptions are needed.
Assumption
 (i)
The level set \(\pi =\{x\mid \theta ^{M}(x) \leq \theta ^{M}(x_{0})\}\) is bounded.
 (ii)
The function \(\theta ^{M}(x) \in C^{2}\) is bounded from below.
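The Moreau–Yosida regularization \(\theta ^{M}\) replaces a nonsmooth objective by a smooth envelope whose gradient Algorithm 3.1 can use. As a concrete illustration (the choice \(f(z)=\|z\|_{1}\) and the parameter \(\lambda \) are assumptions for this example, not the paper’s test functions), the envelope \(\theta ^{M}(x)=\min_{z}\{f(z)+\frac{1}{2\lambda }\|z-x\|^{2}\}\) and its gradient \(\nabla \theta ^{M}(x)=(x-\operatorname{prox}(x))/\lambda \) have closed forms via soft thresholding:

```python
import numpy as np

def moreau_envelope_l1(x, lam=1.0):
    """Moreau-Yosida regularization of f(z) = ||z||_1 (an illustrative
    nonsmooth function): the minimizer of f(z) + ||z - x||^2 / (2*lam)
    is the soft-thresholding proximal point, and the envelope's gradient
    (x - prox) / lam is Lipschitz even though f itself is nonsmooth."""
    prox = np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)  # prox of lam*||.||_1
    env = np.abs(prox).sum() + ((prox - x) ** 2).sum() / (2.0 * lam)
    grad = (x - prox) / lam
    return env, grad
```

For \(|x_{i}|\leq \lambda \) the envelope reduces to the quadratic \(x_{i}^{2}/(2\lambda )\), i.e., the Huber-type smoothing that makes gradient methods applicable.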
Theorem 3.1
If Assumptions (i)–(ii) are true, then there exists a constant \(\alpha _{k}\) that satisfies the requirements of (3.6).
Proof
Theorem 3.2
If the above assumptions are satisfied and the sequences \(\{x_{k}\}\), \(\{\alpha _{k}\}\), \(\{d_{k}\}\), \(\{\theta ^{M}(x_{k})\}\) are generated by Algorithm 3.1, then \(\lim_{k \rightarrow \infty }\|\nabla \theta ^{M}(x_{k})\|=0\).
We omit the proof because it is similar to that of Theorem 2.1.
3.2 Nonsmooth numerical experiment
Dimension: 150,000, 180,000, 192,000, 210,000, 222,000, 231,000, 240,000, 252,000, 270,000.
Initiation: \(\lambda =0.9\), \(\lambda _{1}=0.4\), \(\xi _{3}= 100\), \(\xi _{2}=\xi _{4}=0.01\), \(\gamma =0.5\).
Stopping rule: the algorithm stops if \(|f(x_{k+1})-f(x _{k})| < 10^{-7}\), if NI exceeds 10,000, or if the number of inner iterations for \(\alpha _{k}\) exceeds 5.
Calculation environment: a computer with 2 GB of memory, a Pentium(R) Dual-Core E5800 CPU @ 3.20 GHz, and the 64-bit Windows 7 operating system.
4 Applications of Algorithm 3.1 in image restoration
4.1 Image restoration problem
The CPU time of the PRP algorithm and Algorithm 3.1 in seconds
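The restoration experiments in this section concern impulse (salt-and-pepper) noise, as in [7, 8]. A minimal sketch of the experimental scaffolding follows, corrupting an image and scoring a restoration by PSNR; the noise ratio and the constant test image are illustrative placeholders, not the paper’s experimental setup.

```python
import numpy as np

def add_salt_pepper(img, ratio, rng):
    """Corrupt a grayscale image (values in [0, 255]) with salt-and-pepper
    impulse noise: 'ratio' of the pixels are set to 0 or 255 at random."""
    noisy = img.copy()
    mask = rng.random(img.shape) < ratio
    noisy[mask] = rng.choice([0.0, 255.0], size=int(mask.sum()))
    return noisy

def psnr(clean, restored):
    """Peak signal-to-noise ratio in dB for 8-bit images; higher is better."""
    mse = np.mean((clean - restored) ** 2)
    return np.inf if mse == 0 else 10.0 * np.log10(255.0 ** 2 / mse)
```

A restoration method (here, the nonsmooth minimization of Algorithm 3.1 applied to an edge-preserving functional) would take `noisy` as input, and its quality would be judged by how far the PSNR recovers toward that of the clean image.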
4.2 Results and discussion
5 Conclusion
This paper proposes a new PRP algorithm that combines an innovative formula for the search direction \(d_{k+1}\) with a modified Armijo line search technique: (i) In the design of the proposed algorithm, key information about the objective function, the gradient function, and the current direction is collected and applied to complex problems, and the numerical results show that the proposed algorithm is efficient. (ii) For nonsmooth problems, the Moreau–Yosida regularization technique succeeds in enhancing the proposed algorithm, and the numerical results demonstrate the validity and simplicity of the discussed algorithm. (iii) Image restoration problems are solved by Algorithm 3.1, and the test results show that the given algorithm performs better than the standard PRP algorithm. However, some aspects of the optimization method still need study, such as how to better leverage the benefits of the steepest descent method while overcoming its shortcomings.
Acknowledgements
The authors would like to thank the funding agencies for their support.
Authors’ contributions
GY mainly analyzed the theoretical results and organized this paper, TL performed the numerical experiments on smooth problems, and WH focused on the nonsmooth problems and image problems. All authors read and approved the final manuscript.
Funding
This work was supported by the National Natural Science Foundation of China (Grant No. 11661009), the Guangxi Natural Science Fund for Distinguished Young Scholars (No. 2015GXNSFGA139001), and the Guangxi Natural Science Key Fund (No. 2017GXNSFDA198046).
Competing interests
The authors declare that they have no competing interests.
References
1. Al-Baali, M.: Descent property and global convergence of the Fletcher–Reeves method with inexact line search. IMA J. Numer. Anal. 5(1), 121–124 (1985)
2. Al-Baali, M., Narushima, Y., Yabe, H.: A family of three-term conjugate gradient methods with sufficient descent property for unconstrained optimization. Comput. Optim. Appl. 60(1), 89–110 (2015)
3. Andrei, N.: An unconstrained optimization test functions collection. Adv. Model. Optim. 10(1), 147–161 (2008)
4. Andrei, N.: On three-term conjugate gradient algorithms for unconstrained optimization. Appl. Math. Comput. 241(11), 19–29 (2008)
5. Argyros, I.K., George, S.: Local convergence analysis of Jarratt-type schemes for solving equations. Appl. Set-Valued Anal. Optim. 1, 53–62 (2019)
6. Banham, M.R., Katsaggelos, A.K.: Digital image restoration. IEEE Signal Process. Mag. 14, 24–41 (1997)
7. Cai, J.F., Chan, R.H., Fiore, C.D.: Minimization of a detail-preserving regularization functional for impulse noise removal. J. Math. Imaging Vis. 29, 79–91 (2007)
8. Cai, J.F., Chan, R.H., Morini, B.: Minimization of an edge-preserving regularization functional by conjugate gradient type methods. In: Image Processing Based on Partial Differential Equations: Proceedings of the International Conference on PDE-Based Image Processing and Related Inverse Problems, CMA, Oslo, August 8–12, 2005, pp. 109–122. Springer, Berlin (2007)
9. Liu, Y., Storey, C.: Efficient generalized conjugate gradient algorithms. Part 1: Theory. J. Optim. Theory Appl. 69(1), 129–137 (1991)
10. Chan, C.L., Katsaggelos, A.K., Sahakian, A.V.: Image sequence filtering in quantum-limited noise with applications to low-dose fluoroscopy. IEEE Trans. Med. Imaging 12, 610–621 (1993)
11. Dai, Z.: A mixed HS–DY conjugate gradient method. Math. Numer. Sin. (2005)
12. Dai, Z., Wen, F.: A generalized approach to sparse and stable portfolio optimization problem. J. Ind. Manag. Optim. 14, 1651–1666 (2018)
13. Dai, Z.F.: Two modified HS type conjugate gradient methods for unconstrained optimization problems. Nonlinear Anal., Theory Methods Appl. 74(3), 927–936 (2011)
14. Deng, S., Wan, Z.: A three-term conjugate gradient algorithm for large-scale unconstrained optimization problems. Appl. Numer. Math. 92, 70–81 (2015)
15. Dolan, E.D., Moré, J.J.: Benchmarking optimization software with performance profiles. Math. Program. 91(2), 201–213 (2002)
16. Du, X., Liu, J.: Global convergence of a spectral HS conjugate gradient method. Proc. Eng. 15, 1487–1492 (2011)
17. Gilbert, J.C., Nocedal, J.: Global convergence properties of conjugate gradient methods for optimization. SIAM J. Optim. 2(1), 21–42 (1992)
18. Haarala, M., Miettinen, K., Mäkelä, M.M.: New limited memory bundle method for large-scale nonsmooth optimization. Optim. Methods Softw. 19(6), 673–692 (2004)
19. Haarala, N., Miettinen, K., Mäkelä, M.M.: Globally convergent limited memory bundle method for large-scale nonsmooth optimization. Math. Program. 109(1), 181–205 (2007)
20. Lin, C.J., Weng, R.C., Keerthi, S.S.: Trust region Newton method for logistic regression. J. Mach. Learn. Res. 9, 627–650 (2008)
21. Meng, F., Zhao, G.: On second-order properties of the Moreau–Yosida regularization for constrained nonsmooth convex programs. Numer. Funct. Anal. Optim. 25(5–6), 515–529 (2004)
22. Narushima, Y., Yabe, H., Ford, J.A.: A three-term conjugate gradient method with sufficient descent property for unconstrained optimization. SIAM J. Optim. 21(1), 212–230 (2011)
23. Nazareth, L.: A conjugate direction algorithm without line searches. J. Optim. Theory Appl. 23(3), 373–387 (1977)
24. Oustry, F.: A second-order bundle method to minimize the maximum eigenvalue function. Math. Program. 89(1), 1–33 (2000)
25. Pearson, J.W., Stoll, M., Wathen, A.J.: Preconditioners for state-constrained optimal control problems with Moreau–Yosida penalty function. Numer. Linear Algebra Appl. 21(1), 81–97 (2014)
26. Polyak, B.T.: The conjugate gradient method in extremal problems. USSR Comput. Math. Math. Phys. 9, 94–112 (1969)
27. Polak, E., Ribière, G.: Note sur la convergence de méthodes de directions conjuguées. Rev. Fr. Inform. Rech. Opér. 3, 35–43 (1969)
28. Schramm, H., Zowe, J.: A version of the bundle method for minimizing a nonsmooth function: conceptual idea, convergence analysis, numerical results. SIAM J. Optim. 2(1), 121–152 (1992)
29. Sheng, Z., Yuan, G.: An effective adaptive trust region algorithm for nonsmooth minimization. Comput. Optim. Appl. 71, 251–271 (2018)
30. Sheng, Z., Yuan, G., et al.: An adaptive trust region algorithm for large-residual nonsmooth least squares problems. J. Ind. Manag. Optim. 14, 707–718 (2018)
31. Sheng, Z., Yuan, G., Cui, Z.: A new adaptive trust region algorithm for optimization problems. Acta Math. Sci. 38(2), 479–496 (2018)
32. Slump, C.H.: Real-time image restoration in diagnostic X-ray imaging, the effects on quantum noise. In: Proceedings of the 11th IAPR International Conference on Pattern Recognition, Vol. II, Conference B: Pattern Recognition Methodology and Systems, pp. 693–696 (1992)
33. Sun, Q., Liu, Q.: Global convergence of modified HS conjugate gradient method. J. Appl. Math. Comput. 22(3), 289–297 (2009)
34. Touati-Ahmed, D., Storey, C.: Efficient hybrid conjugate gradient techniques. J. Optim. Theory Appl. 64(2), 379–397 (1990)
35. Wan, Z., Yang, Z.L., Wang, Y.L.: New spectral PRP conjugate gradient method for unconstrained optimization. Appl. Math. Lett. 24(1), 16–22 (2011)
36. Wang, W., Qiao, X., Han, Y.: A proximal bundle method for nonsmooth and nonconvex constrained optimization. Comput. Stat. Data Anal. 34, 3464–3485 (2011)
37. Wei, Z., Li, G., Qi, L.: New quasi-Newton methods for unconstrained optimization problems. Appl. Math. Comput. 175(2), 1156–1188 (2006)
38. Wei, Z., Yao, S., Liu, L.: The convergence properties of some new conjugate gradient methods. Appl. Math. Comput. 183(2), 1341–1350 (2006)
39. Wei, Z., Yu, G., Yuan, G., et al.: The superlinear convergence of a modified BFGS-type method for unconstrained optimization. Comput. Optim. Appl. 29(3), 315–332 (2004)
40. Ying, L., Hai, Z., Fei, L.: A modified proximal gradient method for a family of nonsmooth convex optimization problems. J. Oper. Res. Soc. China 5, 391–403 (2017)
41. Yuan, G., Hu, W., Wang, B.: A modified Armijo line search technique for large-scale nonconvex smooth and convex nonsmooth optimization problems. Preprint (2017)
42. Yuan, G., Li, Y., Li, Y.: A modified Hestenes and Stiefel conjugate gradient algorithm for large-scale nonsmooth minimizations and nonlinear equations. J. Optim. Theory Appl. 168(1), 129–152 (2016)
43. Yuan, G., Lu, X.: A modified PRP conjugate gradient method. Ann. Oper. Res. 166(1), 73–90 (2009)
44. Yuan, G., Lu, X., Wei, Z.: A conjugate gradient method with descent direction for unconstrained optimization. J. Comput. Appl. Math. 233(2), 519–530 (2009)
45. Yuan, G., Sheng, Z., Wang, P., Hu, W., Li, C.: The global convergence of a modified BFGS method for nonconvex functions. J. Comput. Appl. Math. 327, 274–294 (2018)
46. Yuan, G., Wei, Z.: Convergence analysis of a modified BFGS method on convex minimizations. Comput. Optim. Appl. 47(2), 237–255 (2010)
47. Yuan, G., Wei, Z., Li, G.: A modified Polak–Ribière–Polyak conjugate gradient algorithm for nonsmooth convex programs. J. Comput. Appl. Math. 255, 86–96 (2014)
48. Yuan, G., Wei, Z., Lu, X.: Global convergence of BFGS and PRP methods under a modified weak Wolfe–Powell line search. Appl. Math. Model. 47, 811–825 (2017)
49. Yuan, G., Zhang, M.: A three-terms Polak–Ribière–Polyak conjugate gradient algorithm for large-scale nonlinear equations. J. Comput. Appl. Math. 286, 186–195 (2015)
50. Zaslavski, A.J.: Three convergence results for continuous descent methods with a convex objective function. J. Appl. Numer. Optim. 1, 53–61 (2019)
51. Zhang, L., Zhou, W., Li, D.H.: A descent modified Polak–Ribière–Polyak conjugate gradient method and its global convergence. IMA J. Numer. Anal. 26(4), 629–640 (2006)
52. Zhao, X., Ng, K.F., Li, C., Yao, J.C.: Linear regularity and linear convergence of projection-based methods for solving convex feasibility problems. Appl. Math. Optim. 78, 613–641 (2018)
53. Zhou, W.: Some descent three-term conjugate gradient methods and their global convergence. Optim. Methods Softw. 22(4), 697–711 (2007)
Copyright information
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.