Condition, pp 101-117

Condition Numbers and Iterative Algorithms

  • Peter Bürgisser
  • Felipe Cucker
Part of the Grundlehren der mathematischen Wissenschaften book series (GL, volume 349)

Abstract

Consider a full-rank rectangular matrix \(R\in \mathbb {R}^{q\times n}\) with q>n, a vector \(c\in \mathbb {R}^{q}\), and the least-squares problem
$$\min_{v\in \mathbb {R}^n}\|Rv-c\|. $$
The solution \(x\in \mathbb {R}^{n}\) of this problem is given by
$$x=R^\dagger c=\bigl(R^{\mathrm {T}}R\bigr)^{-1}R^{\mathrm {T}}c. $$
It follows that we can find x as the solution of the system Ax=b with \(A:=R^{\mathrm {T}}R\), \(A\in \mathbb {R}^{n\times n}\), and \(b:=R^{\mathrm {T}}c\).
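The reduction to a square system can be sketched numerically. The snippet below is only an illustration under assumed data: the helper name `solve_normal_equations`, the sizes q=8 and n=3, and the random seed are hypothetical choices, not taken from the chapter. It forms \(A=R^{\mathrm {T}}R\) and \(b=R^{\mathrm {T}}c\), solves Ax=b, and compares the result with NumPy's own least-squares routine.

```python
import numpy as np

def solve_normal_equations(R, c):
    """Solve min ||R v - c|| through the square system A x = b,
    where A = R^T R and b = R^T c (hypothetical helper name)."""
    A = R.T @ R          # n x n
    b = R.T @ c          # length n
    return np.linalg.solve(A, b)

# Assumed test data: a Gaussian R with q > n has full rank with probability 1.
rng = np.random.default_rng(0)
q, n = 8, 3
R = rng.standard_normal((q, n))
c = rng.standard_normal(q)

x = solve_normal_equations(R, c)
x_ref, *_ = np.linalg.lstsq(R, c, rcond=None)  # reference least-squares solution
```

For a well-conditioned R, as in this small example, `x` and `x_ref` should agree up to rounding.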

A key remark at this stage is that by construction, A is symmetric and positive definite. One may therefore consider algorithms exploiting symmetry and positive definiteness. We do so in this chapter.
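Positive definiteness follows because \(x^{\mathrm {T}}Ax=\|Rx\|^{2}>0\) for every nonzero x when R has full column rank. A minimal numerical check, assuming an arbitrary Gaussian R of size 10×4 (an illustrative choice, not from the chapter), might look as follows. The Cholesky factorization is used here as one standard example of a computation that relies on both properties.

```python
import numpy as np

# Assumed example data: any full-column-rank R would do.
rng = np.random.default_rng(1)
R = rng.standard_normal((10, 4))
A = R.T @ R

# Symmetry: A equals its transpose.
is_symmetric = np.allclose(A, A.T)

# Positive definiteness: all eigenvalues of the symmetric matrix A are positive.
eigenvalues = np.linalg.eigvalsh(A)
is_positive_definite = bool(np.all(eigenvalues > 0))

# The Cholesky factor L (lower triangular, A = L L^T) exists exactly when A is
# symmetric positive definite; NumPy raises LinAlgError otherwise.
L = np.linalg.cholesky(A)
```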

The algorithms we describe, steepest descent and conjugate gradient, serve to deepen our understanding of the only facet of conditioning that we have not dealt with up to now: the relationship between condition and complexity. To better focus on this issue, we disregard all issues concerning finite precision and assume, instead, infinite precision in all computations. Remarkably, the condition number κ(A) of A will naturally occur in the analysis of the running time for these algorithms. And this occurrence leads us to the last issue we discuss in this introduction.
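To make the link between condition and iteration count tangible, here is a rough sketch of the two methods in their textbook form. It is not the chapter's own presentation: the function names, the relative-residual stopping rule, the tolerance, and the test matrix are assumptions made for illustration, and the code runs in floating-point arithmetic rather than the infinite precision assumed in the text.

```python
import numpy as np

def steepest_descent(A, b, tol=1e-10, max_iter=100_000):
    """Steepest descent for A x = b, A symmetric positive definite.
    Returns the approximate solution and the number of iterations used."""
    x = np.zeros_like(b)
    r = b - A @ x                      # residual = negative gradient direction
    b_norm = np.linalg.norm(b)
    for k in range(max_iter):
        if np.linalg.norm(r) <= tol * b_norm:
            return x, k
        Ar = A @ r
        alpha = (r @ r) / (r @ Ar)     # exact line search along r
        x = x + alpha * r
        r = r - alpha * Ar
    return x, max_iter

def conjugate_gradient(A, b, tol=1e-10, max_iter=100_000):
    """Conjugate gradient for A x = b, A symmetric positive definite."""
    x = np.zeros_like(b)
    r = b - A @ x
    p = r.copy()                       # first search direction
    rs = r @ r
    b_norm = np.linalg.norm(b)
    for k in range(max_iter):
        if np.sqrt(rs) <= tol * b_norm:
            return x, k
        Ap = A @ p
        alpha = rs / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p      # next direction, A-conjugate to the previous ones
        rs = rs_new
    return x, max_iter

# Assumed test problem built as in the text: A = R^T R, b = R^T c.
rng = np.random.default_rng(2)
R = rng.standard_normal((60, 20))
A = R.T @ R
b = R.T @ rng.standard_normal(60)

x_sd, it_sd = steepest_descent(A, b)
x_cg, it_cg = conjugate_gradient(A, b)
print("cond(A):", np.linalg.cond(A), "SD iterations:", it_sd, "CG iterations:", it_cg)
```

Rerunning the experiment with a less well-conditioned A (for example, R closer to square) should increase both iteration counts, with steepest descent typically affected more strongly.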

Complexity bounds in terms of κ(A) are not directly applicable, since κ(A) is not known a priori. We have already argued that one can remove κ(A) from these bounds by trading worst-case for, say, average-case complexity. This passes through an average analysis of κ(A), and in turn, such an analysis assumes that the set of matrices A is endowed with a probability distribution. When A is arbitrary in \(\mathbb {R}^{n\times n}\), we endow this space with a standard Gaussian. In our case, when A is positive definite, this choice is no longer available. A look at our original computational problem may, however, shed some light. Matrix A is obtained as \(A=R^{\mathrm {T}}R\). It then makes sense to consider R as our primary random data (for R we can assume Gaussianity) and endow A with the distribution inherited from that of R. Furthermore, one has \(\kappa(A)=\kappa^{2}(R)\). Therefore, the analysis of κ(A) for this inherited distribution reduces to the analysis of κ(R) when R is Gaussian.
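The identity between κ(A) and the square of κ(R) can be checked empirically on Gaussian samples. The sketch below assumes κ denotes the 2-norm condition number (largest over smallest singular value). The sizes, the number of trials, and the seed are arbitrary illustrative choices. It also records a sample mean of κ(A) under the inherited distribution, purely as a crude Monte Carlo stand-in for the average analysis mentioned above.

```python
import numpy as np

rng = np.random.default_rng(3)
q, n, trials = 30, 10, 200            # assumed sizes and sample count

kappa_R = np.empty(trials)
kappa_A = np.empty(trials)
for t in range(trials):
    R = rng.standard_normal((q, n))   # standard Gaussian R
    A = R.T @ R                       # A inherits its distribution from R
    kappa_R[t] = np.linalg.cond(R)    # 2-norm condition number via singular values
    kappa_A[t] = np.linalg.cond(A)

# Largest relative deviation between kappa(A) and kappa(R)^2 over all samples.
max_rel_gap = np.max(np.abs(kappa_A - kappa_R**2) / kappa_R**2)

# Empirical average of kappa(A) under the inherited distribution.
mean_kappa_A = kappa_A.mean()
```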

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Peter Bürgisser (1)
  • Felipe Cucker (2)
  1. Institut für Mathematik, Technische Universität Berlin, Berlin, Germany
  2. Department of Mathematics, City University of Hong Kong, Hong Kong, Hong Kong SAR
