Performance of Multicore Systems on Parallel Data Clustering with Deterministic Annealing

  • Xiaohong Qiu
  • Geoffrey C. Fox
  • Huapeng Yuan
  • Seung-Hee Bae
  • George Chrysanthakopoulos
  • Henrik Frystyk Nielsen
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5101)


Abstract

We present a performance analysis of a scalable parallel data-clustering algorithm with deterministic annealing on multicore systems, comparing MPI with CCR (Concurrency and Coordination Runtime), a new C# messaging runtime library, on Windows and Linux and using both threads and processes. We investigate the effects of memory bandwidth and of fluctuations in the run times of loosely synchronized threads. We give results on message latency and bandwidth for two-processor multicore systems based on AMD and Intel architectures with a total of four and eight cores. We compare our C# results with C using MPICH2 and Nemesis, and with Java using both mpiJava and MPJ Express. We show initial speedup results for clustering problems from Geographical Information Systems and Cheminformatics, and we abstract the key features of the algorithm and of multicore systems that lead to the observed scalable parallel performance.


Keywords: Data mining, MPI, Multicore, Parallel Computing, Performance, Threads, Windows
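The clustering kernel the abstract refers to is deterministic-annealing clustering in the style of Rose (1998): points are softly assigned to centres at a temperature that is gradually lowered, so assignments harden from a single global cluster into the final partition. A minimal single-threaded sketch in Python, assuming 1-D points and squared-Euclidean distance; the function name, cooling schedule, and parameter defaults are illustrative choices, not taken from the paper:

```python
import math
import random

def da_cluster(points, k, t_start=1.0, t_min=0.01, cool=0.9, iters=20):
    """Deterministic-annealing clustering sketch (after Rose, 1998).

    At temperature t each point x is softly assigned to centre c with
    probability proportional to exp(-(x - c)^2 / t); centres are then
    recomputed as probability-weighted means, and t is lowered.
    Names and defaults here are illustrative, not from the paper.
    """
    centres = random.sample(points, k)  # start from k distinct data points
    t = t_start
    while t > t_min:
        for _ in range(iters):
            # E-step: soft assignment probabilities at temperature t,
            # shifted by the minimum distance for numerical stability
            weights = []
            for x in points:
                d = [(x - c) ** 2 for c in centres]
                m = min(d)
                e = [math.exp(-(di - m) / t) for di in d]
                z = sum(e)
                weights.append([w / z for w in e])
            # M-step: each centre becomes the probability-weighted mean
            for j in range(k):
                den = sum(w[j] for w in weights)
                if den > 0:
                    centres[j] = sum(w[j] * x
                                     for w, x in zip(weights, points)) / den
        # cool, with a tiny perturbation so coincident centres can split
        t *= cool
        centres = [c + random.uniform(-1e-4, 1e-4) for c in centres]
    return sorted(centres)
```

As t falls the soft assignments harden into ordinary k-means-style clusters. A data-parallel version would split the E-step loop over points across cores, which is the kind of decomposition whose CCR and MPI messaging costs the paper measures.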



Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Xiaohong Qiu (1)
  • Geoffrey C. Fox (2)
  • Huapeng Yuan (2)
  • Seung-Hee Bae (2)
  • George Chrysanthakopoulos (3)
  • Henrik Frystyk Nielsen (3)

  1. Research Computing UITS, Indiana University, Bloomington
  2. Community Grids Lab, Indiana University, Bloomington
  3. Microsoft Research, Redmond, WA