Efficient Implementation of Tree Accumulations on Distributed-Memory Parallel Computers

  • Kiminori Matsuzaki
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4488)


In this paper, we develop an efficient implementation of two tree accumulations. In this implementation, we divide a binary tree based on the idea of m-bridges to obtain high locality, and represent local segments as serialized arrays to obtain high sequential performance. We furthermore develop a cost model for our implementation. The experimental results show good performance.
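To make the idea of accumulating over a serialized array concrete, the following is a minimal sequential sketch, not the paper's implementation. It assumes that the two accumulations are the upwards and downwards accumulations, that a segment is stored as a preorder array of `(is_leaf, value)` pairs, and that every internal node has exactly two children. The function names and operator parameters are hypothetical, and the m-bridge partitioning and inter-processor communication are not shown.

```python
from typing import Callable, List, Tuple

# Assumed layout: a full binary tree serialized in preorder,
# one (is_leaf, value) pair per node.
Node = Tuple[bool, int]


def upwards_accumulate(tree: List[Node],
                       leaf: Callable[[int], int],
                       combine: Callable[[int, int, int], int]) -> List[int]:
    """Label every node with the reduction of its subtree.

    Scanning the preorder array right to left visits a node after both of
    its subtrees, so a stack of partial results suffices.
    """
    result = [0] * len(tree)
    stack: List[int] = []
    for i in range(len(tree) - 1, -1, -1):
        is_leaf, value = tree[i]
        if is_leaf:
            acc = leaf(value)
        else:
            left = stack.pop()   # left subtree was scanned last, so it is on top
            right = stack.pop()
            acc = combine(left, value, right)
        result[i] = acc
        stack.append(acc)
    return result


def downwards_accumulate(tree: List[Node],
                         init: int,
                         go_left: Callable[[int, int], int],
                         go_right: Callable[[int, int], int]) -> List[int]:
    """Label every node with a value accumulated along the path from the root.

    Scanning left to right, each internal node pushes the value for its
    right child first and for its left child second, so the left child
    (the next array entry) pops its own value.
    """
    result = [0] * len(tree)
    stack: List[int] = [init]
    for i, (is_leaf, value) in enumerate(tree):
        acc = stack.pop()
        result[i] = acc
        if not is_leaf:
            stack.append(go_right(acc, value))
            stack.append(go_left(acc, value))
    return result


# Example tree: root 1 with left child 2 (leaves 3 and 4) and right leaf 5.
example = [(False, 1), (False, 2), (True, 3), (True, 4), (True, 5)]
subtree_sums = upwards_accumulate(example, lambda v: v, lambda l, v, r: l + v + r)
ancestor_sums = downwards_accumulate(example, 0, lambda a, v: a + v, lambda a, v: a + v)
```

Both passes are single linear scans over a contiguous array with a small auxiliary stack. That access pattern is one plausible reason a serialized representation can perform well sequentially.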


Keywords: Binary Tree, Cost Model, Critical Node, Tree Accumulation, Local Segment



Copyright information

© Springer Berlin Heidelberg 2007

Authors and Affiliations

  • Kiminori Matsuzaki (1)
  1. Graduate School of Information Science and Technology, University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, Japan
