Abstract
Considering a general linear ill-posed equation, we explore the duality arising from the requirement that the discrepancy should take a given value based on the estimation of the noise level, as is notably the case when using the Morozov principle. We show that, under reasonable assumptions, the dual function is smooth, and that its maximization points out the appropriate value of Tikhonov’s regularization parameter. The numerical relevance of our approach is established by means of an illustrative example from nonparametric instrumental regression, a standard problem in statistics.
References
Borwein, J., Lewis, A.: Convex Analysis and Nonlinear Optimization, CMS Books in Mathematics, 2nd edn. Springer, Berlin (2005)
Engl, H.W., Hanke, M., Neubauer, A.: Regularization of Inverse Problems. Springer, Berlin (1996)
Fletcher, R.: Practical Methods of Optimization: Unconstrained Optimization. Wiley, New York (1980)
Frick, K., Grasmair, M.: Regularization of linear ill-posed problems by the augmented Lagrangian method and variational inequalities. Inverse Prob. 28, 1–16 (2012)
Hall, P., Horowitz, J.: Nonparametric methods for inference in the presence of instrumental variables. Ann. Stat. 33(6), 2904–2929 (2005)
Hantoute, A., López, M.A., Zălinescu, C.: Subdifferential calculus rules in convex analysis: a unifying approach via pointwise supremum functions. SIAM J. Optim. 19(2), 863–882 (2008)
Hiriart-Urruty, J.-B., Lemaréchal, C.: Convex Analysis and Minimization Algorithms I. A Series of Comprehensive Studies in Mathematics. Springer, Berlin (1993)
Hiriart-Urruty, J.-B., Lemaréchal, C.: Convex Analysis and Minimization Algorithms II. A Series of Comprehensive Studies in Mathematics. Springer, Berlin (1993)
Hnětynková, I., Plešinger, M., Strakoš, Z.: The regularizing effect of the Golub-Kahan iterative bidiagonalization and revealing the noise level in the data. BIT Numer. Math. 49, 669–696 (2009)
Kirsch, A.: An Introduction to the Mathematical Theory of Inverse Problems. Springer, Berlin (2011)
Lemaréchal, C.: A view of line-searches. In: Auslender, A., Oettli, W., Stoer, J. (eds.) Optimization and Optimal Control, Lecture Notes in Control and Information Sciences, vol. 30. Springer, Berlin (1981)
Morozov, V.A.: Choice of parameter for the solution of functional equations by the regularization method. Sov. Math. Dokl. 8, 1000–1003 (1967)
Rockafellar, R.T.: Convex Analysis. Princeton University Press, Princeton (1970)
Tikhonov, A.N., Arsenin, V.Y.: Solutions of Ill-Posed Problems. Wiley, New York (1977)
Wolfe, P.: Convergence conditions for ascent methods. SIAM Rev. 11, 226–235 (1969)
Zălinescu, C.: Convex Analysis in General Vector Spaces. World Scientific, Singapore (2002)
Acknowledgements
We thank the anonymous referees for their helpful comments and suggestions which have enabled us to improve the manuscript.
Appendix: Algorithmic Details
In order to maximize D, which we reformulate here as the minimization of \(\bar {D}:=-D\), we combine a quasi-Newton method with a Wolfe-Lemaréchal line-search (see Algorithms 1 and 2 below).
Recall that at each iteration k, the update takes the form \(\lambda _{k+1} = \lambda _{k} + \alpha _{k} d_{k}\), where \(d_{k}\) denotes the descent direction computed from \(\bar {D}^{\prime }\) and an approximation of \(\bar {D}^{\prime \prime }\). The stepsize \(\alpha _{k}\) is chosen so as to satisfy the two Wolfe conditions (C1) and (C2) (see [11]), which guarantee the monotonicity of the sequence \(\bar {D}(\lambda _{k})\):
(C1) \(\bar {D}(\lambda _{k} + \alpha _{k} d_{k}) \leq \bar {D}(\lambda _{k}) + \beta _{1}\, \alpha _{k}\, \bar {D}^{\prime }(\lambda _{k})\, d_{k}\);
(C2) \(\bar {D}^{\prime }(\lambda _{k} + \alpha _{k} d_{k})\, d_{k} \geq \beta _{2}\, \bar {D}^{\prime }(\lambda _{k})\, d_{k}\).
In (C1) and (C2), the parameters \(\beta _{1}\) and \(\beta _{2}\) are taken in (0,1) (see [3, 15]). In Algorithm 1, the stopping criterion is \(| \bar {D}^{\prime }(\lambda _{k})| < \epsilon \), where \(\epsilon > 0\) is the tolerance. For computing \(\alpha _{k}\) at line 6, we propose Algorithm 2, which is based on the line-search scheme of [11].
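To fix ideas, here is a minimal Python sketch of the outer iteration just described. This is not the authors' code: the function names are ours, the curvature of \(\bar {D}\) is approximated by a secant update (the one-dimensional analogue of BFGS, since the regularization parameter is scalar), and, for brevity, the step is computed by Armijo backtracking enforcing only (C1); a complete implementation would use the Wolfe search of Algorithm 2.

```python
def armijo_step(f, lam, d, f0, slope, beta1=0.25):
    """Backtracking step enforcing only condition (C1); assumes d is a
    descent direction, i.e. slope = f'(lam) * d < 0."""
    alpha = 1.0
    while f(lam + alpha * d) > f0 + beta1 * alpha * slope:
        alpha *= 0.5
    return alpha

def minimize_Dbar(f, fprime, lam0, eps=1e-8, max_iter=100):
    """Sketch of Algorithm 1 for a scalar parameter: quasi-Newton
    iteration lam_{k+1} = lam_k + alpha_k * d_k, stopping as soon as
    |Dbar'(lam_k)| < eps."""
    lam = lam0
    g = fprime(lam)
    H = 1.0                               # approximation of 1 / Dbar''
    for _ in range(max_iter):
        if abs(g) < eps:                  # stopping criterion
            break
        d = -H * g                        # descent direction
        alpha = armijo_step(f, lam, d, f(lam), g * d)
        lam_new = lam + alpha * d
        g_new = fprime(lam_new)
        if g_new != g:
            H = (lam_new - lam) / (g_new - g)   # secant (1-D BFGS) update
        lam, g = lam_new, g_new
    return lam
```

On a smooth convex objective, the secant update makes H converge to the true inverse curvature, so the iteration is locally superlinear, as expected from a quasi-Newton scheme.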
In Algorithm 2, \((\alpha _{g}, \alpha _{d})\) is the interval in which the step \(\alpha _{k}\) will be chosen. Here, M is a large number which emulates ∞; for example, one may set M = 1010. Recall that the failure of Condition (C1) means that \(\alpha _{k}\) is too large, while the failure of Condition (C2) means that \(\alpha _{k}\) is too small. If it happens that \(|\alpha _{d} - \alpha _{g}| \approx 0\), indicating that \(\beta _{1}\) is too big or \(\beta _{2}\) is too small, one may simply adjust these parameters. This situation occurs rarely. For instance, \(\beta _{1} = 0.25\) and \(\beta _{2} = 0.75\) worked perfectly well for all our simulations.
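The bracketing mechanism of Algorithm 2 can be sketched as follows. This is our own reconstruction, not the published pseudocode: the trial step is halved into \((\alpha _{g}, \alpha _{d})\) when (C1) fails, the lower end is raised when (C2) fails, and the step is extrapolated as long as the upper end still equals M.

```python
def wolfe_linesearch(f, fprime, lam, d, beta1=0.25, beta2=0.75,
                     M=1e10, max_iter=100):
    """Bracketing Wolfe line-search in the spirit of Algorithm 2.
    f, fprime: the objective Dbar and its derivative (scalar argument);
    lam, d: current point and descent direction (fprime(lam) * d < 0)."""
    f0 = f(lam)
    slope = fprime(lam) * d             # directional derivative, negative
    alpha_g, alpha_d = 0.0, M           # bracketing interval (alpha_g, alpha_d)
    alpha = 1.0
    for _ in range(max_iter):
        if f(lam + alpha * d) > f0 + beta1 * alpha * slope:
            alpha_d = alpha             # (C1) fails: step too large
        elif fprime(lam + alpha * d) * d < beta2 * slope:
            alpha_g = alpha             # (C2) fails: step too small
        else:
            return alpha                # both Wolfe conditions hold
        # bisect once the interval is finite; extrapolate otherwise
        alpha = 0.5 * (alpha_g + alpha_d) if alpha_d < M else 2.0 * alpha_g
    return alpha
```

Each rejected trial strictly shrinks (or, during extrapolation, doubles) the bracket, so for a smooth objective bounded below a step satisfying both conditions is found after finitely many trials, unless the interval collapses as discussed above.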
Cite this article
Bonnefond, X., Maréchal, P. & Lee, W.C.S.T. A Note on the Morozov Principle via Lagrange Duality. Set-Valued Var. Anal 26, 265–275 (2018). https://doi.org/10.1007/s11228-018-0470-y