Parallel Numerical Computing from Illiac IV to Exascale—The Contributions of Ahmed H. Sameh

  • Kyle A. Gallivan
  • Efstratios Gallopoulos
  • Ananth Grama
  • Bernard Philippe
  • Eric Polizzi
  • Yousef Saad
  • Faisal Saied
  • Danny Sorensen

Abstract

With exascale computing on the horizon and multicore processors and GPUs in routine use, we survey the achievements of Ahmed H. Sameh, a pioneer in parallel matrix algorithms. His contributions since the days of Illiac IV, the work he directed and inspired in building the Cedar multiprocessor, and his recent research together offer a useful historical perspective on the field of parallel scientific computing.

Copyright information

© Springer-Verlag London Limited 2012

Authors and Affiliations

  • Kyle A. Gallivan (1)
  • Efstratios Gallopoulos (2)
  • Ananth Grama (3)
  • Bernard Philippe (4)
  • Eric Polizzi (5)
  • Yousef Saad (6)
  • Faisal Saied (3)
  • Danny Sorensen (7)

  1. Department of Mathematics, Florida State University, Tallahassee, USA
  2. CEID, University of Patras, Rio, Greece
  3. Computer Science Department, Purdue University, West Lafayette, USA
  4. INRIA Research Center Rennes Bretagne Atlantique, Rennes, France
  5. Department of Electrical and Computer Engineering, University of Massachusetts, Amherst, USA
  6. Department of Computer Science and Engineering, University of Minnesota, Minneapolis, USA
  7. Computational and Applied Mathematics, Rice University, Houston, USA
