I/O Efficient Algorithms for Block Hessenberg Reduction Using Panel Approach

Mohanty, Sraban Kumar; Sajith, Gopalan

doi:10.1007/978-3-642-35542-4_12

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7678)

Included in the following conference series:

International Conference on Big Data Analytics

Abstract

Reduction to Hessenberg form, which takes \(O(N^3)\) flops, is a major performance bottleneck in computing the eigenvalues of a nonsymmetric matrix. All known blocked and unblocked direct Hessenberg reduction algorithms have an I/O complexity of \(O(N^3/B)\). To improve performance by incorporating matrix-matrix operations, the Hessenberg reduction is usually computed in two steps: the first reduces the matrix to a banded Hessenberg form, and the second reduces that further to Hessenberg form. We propose and analyse algorithms for the first step, i.e., the reduction of a nonsymmetric matrix to banded Hessenberg form of bandwidth t, for varying values of N and M (the size of the internal memory), on the external memory model introduced by Aggarwal and Vitter for I/O complexity, and show that the reduction can be performed in \(O(N^3/(\min\{t,\sqrt{M}\}B))\) I/Os.
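
To make the two bounds in the abstract concrete, the following is a minimal Python sketch that evaluates only their leading terms, with all constant factors dropped. It assumes N is the matrix order and B the block size of the Aggarwal-Vitter model, both measured in matrix elements. The function names, the sample sizes, and the convention that "bandwidth t" means zero entries below the t-th subdiagonal are illustrative assumptions, not taken from the paper.

```python
import math


def direct_io_bound(n: int, b: int) -> float:
    """Leading term N^3 / B quoted for direct reductions (constants dropped)."""
    return n**3 / b


def banded_io_bound(n: int, m: int, b: int, t: int) -> float:
    """Leading term N^3 / (min{t, sqrt(M)} * B) of the stated bound (constants dropped)."""
    return n**3 / (min(t, math.sqrt(m)) * b)


def is_banded_hessenberg(a, t: int) -> bool:
    """Check a[i][j] == 0 whenever i > j + t.

    Assumed convention: bandwidth t allows t nonzero subdiagonals,
    so t = 1 is the ordinary upper Hessenberg form.
    """
    n = len(a)
    return all(a[i][j] == 0 for i in range(n) for j in range(n) if i > j + t)


if __name__ == "__main__":
    # Hypothetical problem sizes, chosen only for illustration.
    n, m, b = 2**15, 2**24, 2**10
    for t in (16, 256, 4096, 65536):
        gain = direct_io_bound(n, b) / banded_io_bound(n, m, b, t)
        print(f"t={t:6d}  ratio of leading terms: {gain:.0f}")
```

Under these assumptions the ratio of the two leading terms equals min{t, √M}, so widening the band beyond √M (4096 for the sample M) yields no further asymptotic saving in I/Os.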

References

Aggarwal, A., Vitter, J.S.: The input/output complexity of sorting and related problems. Comm. ACM 31(9), 1116–1127 (1988)
Vitter, J.S.: External memory algorithms. In: Handbook of Massive Data Sets. Massive Comput., vol. 4, pp. 359–416. Kluwer Acad. Publ., Dordrecht (2002)
Mohanty, S.K.: I/O Efficient Algorithms for Matrix Computations. PhD thesis, Indian Institute of Technology Guwahati, Guwahati, India (2010)
Mohanty, S.K., Sajith, G.: I/O efficient QR and QZ algorithms. In: 19th IEEE Annual International Conference on High Performance Computing (HiPC 2012), Pune, India (accepted, December 2012)
Roh, K., Crochemore, M., Iliopoulos, C.S., Park, K.: External memory algorithms for string problems. Fund. Inform. 84(1), 17–32 (2008)
Chiang, Y.J., Goodrich, M.T., Grove, E.F., Tamassia, R., Vengroff, D.E., Vitter, J.S.: External-memory graph algorithms. In: Proceedings of the Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 139–149. ACM, Philadelphia (1995)
Chiang, Y.J.: Dynamic and I/O-Efficient Algorithms for Computational Geometry and Graph Problems: Theoretical and Experimental Results. PhD thesis, Brown University, Providence, RI, USA (1996)
Goodrich, M.T., Tsay, J.J., Vengroff, D.E., Vitter, J.S.: External-memory computational geometry. In: Proceedings of the 34th Annual IEEE Symposium on Foundations of Computer Science, pp. 714–723. IEEE Computer Society Press, Palo Alto (1993)
Arge, L.: The buffer tree: a technique for designing batched external data structures. Algorithmica 37(1), 1–24 (2003)
Vitter, J.S.: External memory algorithms and data structures: dealing with massive data. ACM Comput. Surv. 33(2), 209–271 (2001)
Demaine, E.D.: Cache-oblivious algorithms and data structures. Lecture Notes from the EEF Summer School on Massive Data Sets, BRICS, University of Aarhus, Denmark (2002)
Vitter, J.S., Shriver, E.A.M.: Algorithms for parallel memory. I. Two-level memories. Algorithmica 12(2-3), 110–147 (1994)
Toledo, S., Gustavson, F.G.: The design and implementation of SOLAR, a portable library for scalable out-of-core linear algebra computations. In: Fourth Workshop on Input/Output in Parallel and Distributed Systems, pp. 28–40. ACM Press (1996)
Reiley, W.C., Van de Geijn, R.A.: POOCLAPACK: parallel out-of-core linear algebra package. Technical Report CS-TR-99-33, Department of Computer Science, The University of Texas at Austin (November 1999)
Alpatov, P., Baker, G., Edwards, H.C., Gunnels, J., Morrow, G., Overfelt, J., van de Geijn, R.A.: PLAPACK: Parallel linear algebra package design overview. In: Supercomputing 1997: Proceedings of the ACM/IEEE Conference on Supercomputing, pp. 1–16. ACM, New York (1997)
Van de Geijn, R.A., Alpatov, P., Baker, G., Edwards, C., Gunnels, J., Morrow, G., Overfelt, J.: Using PLAPACK: Parallel Linear Algebra Package. MIT Press, Cambridge (1997)
Choi, J., Dongarra, J.J., Pozo, R., Walker, D.W.: ScaLAPACK: A scalable linear algebra library for distributed memory concurrent computers. In: Proceedings of the Fourth Symposium on the Frontiers of Massively Parallel Computation, pp. 120–127. IEEE Computer Society Press (1992)
Anderson, E., Bai, Z., Bischof, C.H., Demmel, J., Dongarra, J.J., Croz, J.D., Greenbaum, A., Hammarling, S., McKenney, A., Ostrouchov, S., Sorensen, D.C.: LAPACK Users’ Guide, 2nd edn. SIAM, Philadelphia (1995)
Basic Linear Algebra Subprograms (BLAS), http://www.netlib.org/blas/
Toledo, S.: A survey of out-of-core algorithms in numerical linear algebra. In: External Memory Algorithms. DIMACS Ser. Discrete Math. Theoret. Comput. Sci. Amer. Math. Soc., vol. 50, pp. 161–179, Piscataway, NJ, Providence, RI (1999)
Elmroth, E., Gustavson, F.G., Jonsson, I., Kågström, B.: Recursive blocked algorithms and hybrid data structures for dense matrix library software. SIAM Rev. 46(1), 3–45 (2004)
Haveliwala, T., Kamvar, S.D.: The second eigenvalue of the Google matrix. Technical Report 2003-20, Stanford InfoLab (2003)
Danforth, C.M., Kalnay, E., Miyoshi, T.: Estimating and correcting global weather model error. Monthly Weather Review 135(2), 281–299 (2007)
Alter, O., Brown, P.O., Botstein, D.: Processing and modeling genome-wide expression data using singular value decomposition. In: Bittner, M.L., Chen, Y., Dorsel, A.N., Dougherty, E.R. (eds.) Microarrays: Optical Technologies and Informatics, vol. 4266, pp. 171–186. SPIE (2001)
Xu, S., Bai, Z., Yang, Q., Kwak, K.S.: Singular value decomposition-based algorithm for IEEE 802.11a interference suppression in DS-UWB systems. IEICE Trans. Fundam. Electron. Commun. Comput. Sci. E89-A(7), 1913–1918 (2006)
Golub, G.H., Van Loan, C.F.: Matrix Computations, 3rd edn. Johns Hopkins Studies in the Mathematical Sciences. Johns Hopkins University Press, Baltimore (1996)
Watkins, D.S.: Fundamentals of Matrix Computations, 2nd edn. Pure and Applied Mathematics. Wiley-Interscience. John Wiley & Sons, New York (2002)
Dongarra, J.J., Duff, I.S., Sorensen, D.C., Van der Vorst, H.A.: Numerical Linear Algebra for High Performance Computers. Software, Environments and Tools, vol. 7. SIAM, Philadelphia (1998)
Dongarra, J.J., Croz, J.D., Hammarling, S., Duff, I.S.: A set of level 3 basic linear algebra subprograms. ACM Trans. Math. Softw. 16(1), 1–17 (1990)
Elmroth, E., Gustavson, F.G.: New Serial and Parallel Recursive QR Factorization Algorithms for SMP Systems. In: Kågström, B., Elmroth, E., Waśniewski, J., Dongarra, J. (eds.) PARA 1998. LNCS, vol. 1541, pp. 120–128. Springer, Heidelberg (1998)
Gunter, B.C., Reiley, W.C., Van de Geijn, R.A.: Implementation of out-of-core Cholesky and QR factorizations with POOCLAPACK. Technical Report CS-TR-00-21, Austin, TX, USA (2000)
Gunter, B.C., Reiley, W.C., Van De Geijn, R.A.: Parallel out-of-core Cholesky and QR factorization with POOCLAPACK. In: IPDPS 2001: Proceedings of the 15th International Parallel & Distributed Processing Symposium. IEEE Computer Society, Washington, DC (2001)
Gunter, B.C., Van de Geijn, R.A.: Parallel out-of-core computation and updating of the QR factorization. ACM Trans. Math. Software 31(1), 60–78 (2005)
Buttari, A., Langou, J., Kurzak, J., Dongarra, J.J.: A class of parallel tiled linear algebra algorithms for multicore architectures. Parallel Comput. 35(1), 38–53 (2009)
Bischof, C.H., Lang, B., Sun, X.: A framework for symmetric band reduction. ACM Trans. Math. Software 26(4), 581–601 (2000)
Quintana-Ortí, G., van de Geijn, R.A.: Improving the performance of reduction to Hessenberg form. ACM Trans. Math. Software 32(2), 180–194 (2006)
Dongarra, J.J., Sorensen, D.C., Hammarling, S.J.: Block reduction of matrices to condensed forms for eigenvalue computations. J. Comput. Appl. Math. 27(1-2), 215–227 (1989)
Dongarra, J.J., van de Geijn, R.A.: Reduction to condensed form for the eigenvalue problem on distributed memory architectures. Parallel Comput. 18(9), 973–982 (1992)
Bischof, C.H., Lang, B., Sun, X.: Parallel tridiagonalization through two-step band reduction. In: Proceedings of the Scalable High-Performance Computing Conference, pp. 23–27. IEEE Computer Society Press (May 1994)
Lang, B.: Using level 3 BLAS in rotation-based algorithms. SIAM J. Sci. Comput. 19(2), 626–634 (1998)
Lang, B.: A parallel algorithm for reducing symmetric banded matrices to tridiagonal form. SIAM J. Sci. Comput. 14(6), 1320–1338 (1993)
Berry, M.W., Dongarra, J.J., Kim, Y.: A parallel algorithm for the reduction of a nonsymmetric matrix to block upper-Hessenberg form. Parallel Comput. 21(8), 1189–1211 (1995)
Ltaief, H., Kurzak, J., Dongarra, J.J.: Parallel block Hessenberg reduction using algorithms-by-tiles for multicore architectures revisited. LAPACK Working Note #208, University of Tennessee, Knoxville (2008)
Bai, Y., Ward, R.C.: Parallel block tridiagonalization of real symmetric matrices. J. Parallel Distrib. Comput. 68(5), 703–715 (2008)
Großer, B., Lang, B.: Efficient parallel reduction to bidiagonal form. Parallel Comput. 25(8), 969–986 (1999)
Lang, B.: Parallel reduction of banded matrices to bidiagonal form. Parallel Comput. 22(1), 1–18 (1996)
Trefethen, L.N., Bau III, D.: Numerical Linear Algebra. SIAM (1997)
Ltaief, H., Kurzak, J., Dongarra, J.J.: Scheduling two-sided transformations using algorithms-by-tiles on multicore architectures. LAPACK Working Note #214, University of Tennessee, Knoxville (2009)
Bischof, C.H., Van Loan, C.F.: The WY representation for products of Householder matrices. SIAM J. Sci. Statist. Comput. 8(1), S2–S13 (1987)
Wu, Y.J.J., Alpatov, P., Bischof, C.H., van de Geijn, R.A.: A parallel implementation of symmetric band reduction using PLAPACK. In: Proceedings of Scalable Parallel Library Conference. PRISM Working Note 35, Mississippi State University (1996)
Bai, Y.: High performance parallel approximate eigensolver for real symmetric matrices. PhD thesis, University of Tennessee, Knoxville (2005)

Author information

Authors and Affiliations

Computer Science & Engineering Discipline, PDPM Indian Institute of Information Technology, Design and Manufacturing Jabalpur, Jabalpur, 482005, MP, India
Sraban Kumar Mohanty
Computer Science & Engineering Department, Indian Institute of Technology Guwahati, Guwahati, 781039, Assam, India
Gopalan Sajith

Editor information

Editors and Affiliations

International Institute of Information Technology Bangalore (IIIT Bangalore), 26/C, Electronics City, Hosur Road, 560100, Bangalore, India
Srinath Srinivasa
Faculty of Mathematical Sciences, Department of Computer Science, University of Delhi, Delhi, India
Vasudha Bhatnagar

About this paper

Cite this paper

Mohanty, S.K., Sajith, G. (2012). I/O Efficient Algorithms for Block Hessenberg Reduction Using Panel Approach. In: Srinivasa, S., Bhatnagar, V. (eds) Big Data Analytics. BDA 2012. Lecture Notes in Computer Science, vol 7678. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35542-4_12

DOI: https://doi.org/10.1007/978-3-642-35542-4_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35541-7
Online ISBN: 978-3-642-35542-4
eBook Packages: Computer Science, Computer Science (R0)
