AWOC 1986: VLSI Algorithms and Architectures pp 283-295

# Fast and efficient parallel linear programming and linear least squares computations

• Victor Pan
• John Reif
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 227)

## Abstract

We present a new parallel algorithm for computing a least squares solution to a sparse overdetermined system of linear equations Ax = b, where the m×n matrix A is sparse and the graph G = (V, E) of the matrix $$H = \left[ \begin{array}{cc} I & A^T \\ A & O \end{array} \right]$$ has an s(m+n)-separator family; that is, either |V| < n0 for a fixed constant n0, or, by deleting a separator subset S of vertices of size ≤ s(m+n), G can be partitioned into two disconnected subgraphs with vertex sets V1, V2 of sizes ≤ (2/3)(m+n), and each of the two subgraphs induced by the vertex sets S ∪ Vi, i = 1, 2, can be recursively s(|S ∪ Vi|)-separated in the same way. Our algorithm uses O(log(m+n) log^2 s(m+n)) steps and ≤ s^3(m+n) processors; it relies on our recent parallel algorithm for solving sparse linear systems and has several immediate applications of interest, in particular to mathematical programming, to sparse nonsymmetric systems of linear equations, and to path algebra computations. We examine most closely the impact on the linear programming problem (l.p.p.), which requires maximizing c^T y subject to A^T y ≤ b, y ≥ 0, where A is an m×n matrix; hereafter it is assumed that m ≥ n. The recent algorithm by N. Karmarkar gives the best known upper estimate (O(m^3.5 L) arithmetic operations, where L is the input size) for the worst-case cost of solving this problem. We prove an asymptotic improvement of that result in the case where the graph of the associated matrix H has an s(m+n)-separator family; our algorithm can then be implemented using O(m L log m log^2 s(m+n)) parallel arithmetic steps, s^3(m+n) processors, and a total of O(m L s^3(m+n) log m log^2 s(m+n)) arithmetic operations.
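For intuition, the least squares role of a matrix like H can be sketched through the classical symmetric augmented system for min ‖Ax − b‖₂. The sketch below uses the block arrangement [[I, A], [A^T, 0]], whose nonzero graph agrees with that of H up to relabelling of vertices; it is a dense, sequential illustration in NumPy, not the paper's sparse parallel algorithm.

```python
import numpy as np

def least_squares_via_augmented_system(A, b):
    """Solve min ||Ax - b||_2 through the symmetric augmented system

        [ I    A ] [r]   [b]
        [ A^T  0 ] [x] = [0],

    where r = b - Ax is the residual.  Eliminating r recovers the
    normal equations A^T A x = A^T b.  Dense solve for illustration;
    the paper exploits the separator structure of this matrix's graph.
    """
    m, n = A.shape
    K = np.block([[np.eye(m), A],
                  [A.T, np.zeros((n, n))]])
    rhs = np.concatenate([b, np.zeros(n)])
    sol = np.linalg.solve(K, rhs)     # requires A to have full column rank
    return sol[m:]                    # x; sol[:m] is the residual r

# 3x2 overdetermined example.
A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
b = np.array([0.0, 1.0, 2.0])
x = least_squares_via_augmented_system(A, b)
print(np.allclose(x, np.linalg.lstsq(A, b, rcond=None)[0]))  # True
```

The augmented system is of size (m+n)×(m+n), matching the dimension of H in the abstract; this is why separator families for the graph of H translate directly into fast parallel least squares solvers.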
In many cases of practical importance this is a considerable improvement over the known estimates: for example, s(m+n) = √(8(m+n)) if G is planar (as occurs in many operations research applications, for instance in the problem of computing the maximum multicommodity flow with a bounded number of commodities in a network having an s(m+n)-separator family), so that the processor bound is only 8√8 (m+n)^1.5 and the total number of arithmetic steps is O(m^2.5 L) in that case. Similarly, Karmarkar's algorithm and the known algorithms for the solution of overdetermined linear systems are accelerated in the case of dense input matrices via our recent parallel algorithms for the inversion of dense k×k matrices using O(log^2 k) steps and k^3 processors. Combined with a modification of Karmarkar's algorithm, this yields a solution of the l.p.p. using O(L m log^2 m) steps and m^2.5 processors. The stated results promise some important practical applications. Theoretically, the above processor bounds can be reduced to o(k^2.5) for dense matrix inversion and, for the l.p.p., to o(m^2.165) in the dense case and o(s^2.5(m+n)) in the sparse case (preserving the same number of parallel steps); this also decreases the sequential time bound for the l.p.p. by a factor of m^0.335, that is, to O(L m^3.165).
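The l.p.p. form used throughout the abstract (maximize c^T y subject to A^T y ≤ b, y ≥ 0) can be made concrete with a tiny brute-force sketch that enumerates basic feasible points of the polyhedron. This is exponential-cost and purely illustrative of the problem statement, with no relation to Karmarkar-type or separator-based methods; the function name and the toy instance are our own.

```python
import itertools
import numpy as np

def lp_by_vertex_enumeration(A, b, c):
    """Maximize c^T y subject to A^T y <= b, y >= 0 (the l.p.p. form above,
    with A an m x n matrix, so y has m components and A^T y <= b gives n
    inequalities).  At a vertex of the feasible region, m of the n + m
    inequalities hold with equality; we try every such active set.
    """
    m = A.shape[0]
    G = np.vstack([A.T, -np.eye(m)])          # all constraints as G y <= h
    h = np.concatenate([b, np.zeros(m)])
    best_y, best_val = None, -np.inf
    for rows in itertools.combinations(range(G.shape[0]), m):
        Gs, hs = G[list(rows)], h[list(rows)]
        if np.linalg.matrix_rank(Gs) < m:     # active set does not fix a point
            continue
        y = np.linalg.solve(Gs, hs)
        if np.all(G @ y <= h + 1e-9):         # candidate vertex is feasible
            val = c @ y
            if val > best_val:
                best_y, best_val = y, val
    return best_y, best_val

# Tiny instance (m = 2, n = 1): maximize y1 + y2 s.t. y1 + y2 <= 1, y >= 0.
A = np.array([[1.0], [1.0]])
y, val = lp_by_vertex_enumeration(A, np.array([1.0]), np.array([1.0, 1.0]))
print(val)  # 1.0
```

The point of the paper, by contrast, is that when the graph of the associated matrix H has a good separator family, each linear-algebra step inside an interior-point-style iteration can be parallelized far below the dense cost.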

## Key words

Linear programming, least squares, parallel algorithms

## References

1. Å. Björck 1976, Methods for Sparse Linear Least Squares Problems, 177–200, in Sparse Matrix Computations (J.R. Bunch and D.J. Rose, eds.), Academic Press, N.Y.
2. V. Chvátal 1983, Linear Programming, W.H. Freeman, San Francisco.
3. P.A. Gartenberg 1985, Fast Rectangular Matrix Multiplication, Ph.D. Thesis, Dept. of Math., University of California, Los Angeles.
4. J.A. George 1973, Nested Dissection of a Regular Finite Element Mesh, SIAM J. on Numerical Analysis 10, 2, 345–367.
5. G.H. Golub and C.F. Van Loan 1983, Matrix Computations, The Johns Hopkins Univ. Press, Baltimore, Maryland.
6. M. Gondran and M. Minoux 1984, Graphs and Algorithms, Wiley-Interscience, New York.
7. N.K. Karmarkar 1984, A New Polynomial Time Algorithm for Linear Programming, Combinatorica 4, 4, 373–395.
8. R. Lipton, D. Rose and R.E. Tarjan 1979, Generalized Nested Dissection, SIAM J. on Numerical Analysis 16, 2, 346–358.
9. R.J. Lipton and R.E. Tarjan 1979, A Separator Theorem for Planar Graphs, SIAM J. on Applied Math. 36, 177–189.
10. G. Lotti and F. Romani 1983, On the Asymptotic Complexity of Rectangular Matrix Multiplication, Theoretical Computer Science 23, 171–185.
11. K.G. Murty 1976, Linear and Combinatorial Programming, Wiley, New York.
12. K.G. Murty 1983, Linear Programming, Wiley, New York.
13. V. Pan 1984, How to Multiply Matrices Faster, Lecture Notes in Computer Science 179, Springer-Verlag, Berlin.
14. V. Pan 1985, Fast Finite Methods for a System of Linear Inequalities, Computers and Mathematics (with Applics.) 11, 4, 355–394.
15. V. Pan 1985, Fast and Efficient Algorithms for the Exact Inversion of Integer Matrices, Proc. Fifth Conference on Foundations of Software Technology and Theoretical Computer Science, Indian Inst. of Techn. and Tata Inst. of Fundam. Research, New Delhi, India (Dec. 1985).
16. V. Pan 1985, On the Complexity of a Pivot Step of the Revised Simplex Algorithm, Computers and Mathematics (with Applics.) 11, 11, 1127–1140.
17. V. Pan and J. Reif 1985, Efficient Parallel Solution of Linear Systems, Proc. 17-th Ann. ACM STOC, 143–152, Providence, R.I.
18. V. Pan and J. Reif 1985, Extension of the Parallel Nested Dissection Algorithm to the Path Algebra Problems, Tech. Report 85-9, Computer Science Dept., SUNY Albany (June 1985).
19.
20. N.Z. Shor 1977, New Development Trend in Nondifferentiable Optimization, Kibernetika 13, 6, 87–91 (transl. in Cybernetics 13, 6, 881–886 (1977)).