Parallel Implementation of Cholesky LL^T-Algorithm in FPGA-Based Processor

  • Oleg Maslennikow
  • Volodymyr Lepekha
  • Anatoli Sergiyenko
  • Adam Tomas
  • Roman Wyrzykowski
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4967)


A fixed-size processor array architecture is proposed for computing the LL^T decomposition of a matrix by the Cholesky algorithm. To implement this architecture in modern FPGA devices, an arithmetic unit (AU) operating in rational fraction arithmetic is designed. The AU is intended for configuration in Xilinx Virtex-4 FPGAs, and its hardware complexity is much lower than that of comparable AUs operating on floating-point numbers.
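As a rough illustration of the computation summarized above, the following Python sketch runs the standard column-by-column Cholesky recurrence on rational fractions. It is not the authors' processor array or AU design: Python's `fractions.Fraction` stands in for rational fraction arithmetic (and reduces fractions automatically, which a hardware unit need not do), and `frac_sqrt`, `cholesky_llt`, and the `scale` parameter are hypothetical names introduced only for this example.

```python
from fractions import Fraction
from math import isqrt


def frac_sqrt(x: Fraction, scale: int = 10**6) -> Fraction:
    """Rational square root: exact for ratios of perfect squares, else approximate.

    Hypothetical helper for this sketch; the source does not describe how
    the AU handles square roots.
    """
    n, d = x.numerator, x.denominator
    rn, rd = isqrt(n), isqrt(d)
    if rn * rn == n and rd * rd == d:
        return Fraction(rn, rd)
    # sqrt(n/d) = sqrt(n*d)/d; scale up before taking the integer root
    # so the truncated result keeps some fractional precision.
    return Fraction(isqrt(n * d * scale * scale), d * scale)


def cholesky_llt(a):
    """Return lower-triangular L with A = L * L^T (A symmetric positive definite)."""
    n = len(a)
    L = [[Fraction(0)] * n for _ in range(n)]
    for j in range(n):
        # Diagonal element: subtract squares of the entries already computed in row j.
        s = a[j][j] - sum(L[j][k] * L[j][k] for k in range(j))
        if s <= 0:
            raise ValueError("matrix is not positive definite")
        L[j][j] = frac_sqrt(Fraction(s))
        # Elements below the diagonal in column j.
        for i in range(j + 1, n):
            t = a[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = t / L[j][j]
    return L


A = [[4, 12, -16], [12, 37, -43], [-16, -43, 98]]
L = cholesky_llt(A)
for row in L:
    print([str(v) for v in row])
```

For this test matrix every intermediate value is a perfect square, so the factor is exact: the rows of L are (2, 0, 0), (6, 1, 0), and (-8, 5, 3). Apart from the square root, each step uses only integer multiplications and additions on numerators and denominators, which hints at why a rational fraction AU can be simpler than a floating-point one.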


Keywords: Rational Fraction, Dependence Graph, Parallel Implementation, Processor Utilization, Hardware Complexity
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Oleg Maslennikow (1)
  • Volodymyr Lepekha (2)
  • Anatoli Sergiyenko (2)
  • Adam Tomas (3)
  • Roman Wyrzykowski (3)

  1. Technical University of Koszalin, Koszalin, Poland
  2. National Technical University of Ukraine, Kiev, Ukraine
  3. Czestochowa University of Technology, Czestochowa, Poland
