Large-Scale Linear Computations with Dedicated Real-Time Architectures

  • Patrick Dewilde
  • Klaus Diepold


Understanding the connection between algorithms and solvers for large scale systems on the one hand, and appropriate architectures that execute them efficiently on the other is key to the effective design of modern signal processing applications. This trend has started with the emergence of digital media, large scale signal processing for image coding and analysis, digital mobile telephony and digital processing in medical imaging. In many cases dedicated, hardware or software dominated methods on a single processor have been used, only in recent times more generic methods based on general purpose array architectures, either dedicated to media processing or for general computing have become realizable. Massive use of parallelism becomes attractive and should, in the future, allow us to tackle large scale problems in a streamlined fashion.


Graphical Processing Unit Memory Bandwidth Sparse Matrice Processor Array Sparsity Pattern 
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


  1. 1.
    Ahmed HM, Delosme J-M, Morf M (1982) Highly concurrent computing structures for matrix arithmetic and signal processing. IEEE Comp 15(1):65–86CrossRefGoogle Scholar
  2. 2.
    Björck A (1996) Numerical methods for least squares problems. SIAM, Philadelphia, PAzbMATHCrossRefGoogle Scholar
  3. 3.
    Bolz J, Farmer I, Grinspun E, Schröder P (2003) Sparse matrix solvers on the gpu: Conjugate gradients and multigrid. In: SIGGRAPH 2003, ACM, New York, pp 917–924Google Scholar
  4. 4.
    Chandrasekaran S, Dewilde P, Gu M, Pals T, van der Veen A-J, Xia J (2002) A fast backward stable solver for sequentially semi-separable matrices, volume HiPC202 of Lecture notes in computer science, Springer, Berlin, pp 545–554Google Scholar
  5. 5.
    Dewilde P (1988) New algebraic methods for modelling large-scale integrated circuits. Int J Circ Theory Appl 16(4):473–503zbMATHCrossRefGoogle Scholar
  6. 6.
    Dewilde P, van der Veen A-J (1998) Time-varying systems and computations. Kluwer, DordrechtzbMATHGoogle Scholar
  7. 7.
    Dewilde P, van der Veen A-J (2000) Inner-outer factorization and the inversion of locally finite systems of equations. Linear Algebra Appl 313:53–100MathSciNetzbMATHCrossRefGoogle Scholar
  8. 8.
    Dewilde P, Diepold K, Bamberger W (2004a) Optic flow computation and time-varying system theory. In: Proceedings of the international symposium on mathematical theory of networks and systems (MTNS). Katholieke Universiteit Leuven, BelgiumGoogle Scholar
  9. 9.
    Dewilde P, Diepold K, Bamberger W (2004b) A semi-separable approach to a tridiagonal hierarchy of matrices with applications to image analysis. In: Proceedings of the international symposium on mathematical theory of networks and systems (MTNS). Katholieke Universiteit Leuven, BelgiumGoogle Scholar
  10. 10.
    Durkovic M, Zwick M, Obermeier F, Diepold K (2006) Performance of optical flow techniques on graphics hardware. In: IEEE international conference on multimedia and expo (ICME)Google Scholar
  11. 11.
    Fadeeva VN (1959) Computational methods of linear algebra. Dover, New YorkGoogle Scholar
  12. 12.
    Gastona FMF, Irwina GW (1989) Systolic approach to square root information kalman filtering. Int J Control 50(1):225–248CrossRefGoogle Scholar
  13. 13.
    Gohberg I, Kailath T, Koltracht I (1985) Linear complexity algorithms for semiseparable matrices. Integral Equations Operator Theory 8:780–804MathSciNetzbMATHCrossRefGoogle Scholar
  14. 14.
    Golub G, van Loan Ch (1989) Matrix computations. John Hopkins University Press, Baltimore, MDzbMATHGoogle Scholar
  15. 15.
    Götze J, Schwiegelshohn U (1988) Sparse matrix-vector multiplication on a systolic array. In: International Conference on Acoustics, Speech, and Signal Processing (ICASSP), vol 4. IEEE, pp 2061–2064Google Scholar
  16. 16.
    Hartley RI (1997) In defence of the 8-point algorithm. IEEE Trans Pattern Anal Machine Intell 19(6):580–593CrossRefGoogle Scholar
  17. 17.
    Jainandunsing K, Deprettere EF (1989a) A new class of parallel algorithms for solving systems of linear equations. SIAM J Scientific Comput 10(5):880–912MathSciNetzbMATHCrossRefGoogle Scholar
  18. 18.
    Jainandunsing K, Deprettere EdF (1989b) A new class of parallel algorithms for solving, systems of linear equations. SIAM J Sct Stat Comput 10(5):880–912zbMATHCrossRefGoogle Scholar
  19. 19.
    Jean-Marc D, Ipsen Ilse C.F. (1986) Parallel solution of symmetric positive definite systems with hyperbolic rotations. Linear Algebra Appl 77:75–111.MathSciNetzbMATHCrossRefGoogle Scholar
  20. 20.
    Jichun Bu, Deprettere F, Dewilde P (1990) A design methodology for fixed-size systolic arrays. In: Proceedings of the international conference on application specific array processorsGoogle Scholar
  21. 21.
    Kailath T (1981) Lectures on Wiener and Kalman Filtering. CISM Courses and Lectures No. 140, Springer, Wien, New YorkGoogle Scholar
  22. 22.
    Kailath T, Sayed A (1999) Fast reliable algorithms for matrices with structure. SIAM, Philadelphia, PAzbMATHCrossRefGoogle Scholar
  23. 23.
    Kailath T, Sayed A, Hasibi B (2000) Linear esimtation. Prentice Hall, Upper Saddle River, NJGoogle Scholar
  24. 24.
    Kienhuis B e.a. (2010) Hotspot parallelizer. Compaan Design., visitied on Nov.14.2011.
  25. 25.
    Kronecker L (1890) Algebraische Reduction der schaaren bilinearer Formen. S.B. Akad. Berlin, pp 663–776Google Scholar
  26. 26.
    Krüger J, Westermann R (2005) Linear algebra operators for gpu implementation of numerical algorithms. In: SIGGRAPH 2005. ACM, New YorkGoogle Scholar
  27. 27.
    Kung SY (1988) VLSI array processors. Prentice Hall, Englewood Cliffs, NJGoogle Scholar
  28. 28.
    Larsson D, Schinner e.a. P (1986) The CADMUS 9230 ICD Graphic Workstation 1, volume The integrated design handbook, chapter 8, Delft University Press, Delft, pp 8.1–8.29Google Scholar
  29. 29.
    Lucas BD, Kanade T (1981) An iterative image registration technique with an application to stereo vision. In: Proceedings of imaging understanding workshop, pp 121–130Google Scholar
  30. 30.
    Misraa M, Nassimib D, Prasannaa VK (1993) Efficient vlsi implementation of iterative solutions to sparse linear systems. Parallel Comput 19(5):525–544CrossRefGoogle Scholar
  31. 31.
    Nash JG, Hansen S (1988) Modified faddeeva algorithm for concurrent execution of linear algebraic operations. IEEE Trans Comp 37(2):129–137zbMATHCrossRefGoogle Scholar
  32. 32.
    Paige ChC, Saunders MA (1982) LSQR: An algorithm for sparse linear equations and sparse least squares. ACM Trans Math Softw 8:43–71MathSciNetzbMATHCrossRefGoogle Scholar
  33. 33.
    Polka LA, Kalyanam H, Hu G, Krishnamoorthy S (2007) Package technology to address the memory bandwidth challenge for tera-scale computing. Intel Technol J 11(3):197–206Google Scholar
  34. 34.
    Proudler IK, McWhirter JG, Shepherd TJ (1991) Computationally efficient qr decomposition approach to least squares adaptive filtering. IEE Proc F, Radar Signal Processing 138(4): 341–353CrossRefGoogle Scholar
  35. 35.
    Schunck BG, Horn BKP (1981) Determining optical flow. Artif Intell 17(1-3):185–203Google Scholar
  36. 36.
    Saad Y (2003) Iterative methods for sparse linear systems. SIAM, Philadelphia, PAzbMATHCrossRefGoogle Scholar
  37. 37.
    Strang G (2007) Computational science and engineering. Wellesley-Cambridge Press, Wellesley, MAzbMATHGoogle Scholar
  38. 38.
    Tong L, van der Veen A-J, Dewilde P (2002) A new decorrelating rake receiver for long-code wcdma. In: Proceedings 2002 conference on information sciences systems. Princeton University, NJGoogle Scholar
  39. 39.
    Vanderbril R, van Barel M, Mastronardi N (2008) Matrix computations and semi-separable matrices. John Hopkins University Press, Baltimore, MDGoogle Scholar

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  1. 1.Department of Electrical Engineering and Information TechnologyTechnische Universität MünchenMünchenGermany
  2. 2.Institute for Advanced StudyTechnische Universität MünchenMünchenGermany

Personalised recommendations