A Knowledge-Based Operator for a Genetic Algorithm which Optimizes the Distribution of Sparse Matrix Data

  • Una-May O’Reilly
  • Nadya Bliss
  • Sanjeev Mohindra
  • Julie Mullen
  • Eric Robinson
Part of the Studies in Computational Intelligence book series (SCI, volume 415)


We present the Hogs and Slackers genetic algorithm (GA), which addresses the problem of improving the parallelization efficiency of sparse matrix computations by optimally distributing blocks of matrix data. The performance of a distribution is sensitive to the non-zero patterns in the data, the algorithm, and the hardware architecture. In a candidate distribution, the Hogs and Slackers GA identifies processors with many operations (hogs) and processors with fewer operations (slackers). Its intelligent operation-balancing mutation operator then swaps data blocks between hogs and slackers to explore a new data distribution. We show that the Hogs and Slackers GA performs better than a baseline GA. We demonstrate the Hogs and Slackers GA's optimization capability with an architecture study of varied network and memory bandwidth and latency.
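The operation-balancing mutation described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a candidate distribution is a block-to-processor assignment and that each block carries a precomputed operation count; all names (`op_counts`, `hogs_and_slackers_mutation`) are hypothetical.

```python
import random

def op_counts(distribution, block_ops):
    """Total operation count per processor for a candidate distribution.

    distribution: dict mapping block id -> processor id (hypothetical encoding)
    block_ops:    dict mapping block id -> number of operations on that block
    """
    totals = {}
    for block, proc in distribution.items():
        totals[proc] = totals.get(proc, 0) + block_ops[block]
    return totals

def hogs_and_slackers_mutation(distribution, block_ops, rng=random):
    """Swap one block between the busiest (hog) and idlest (slacker) processor."""
    totals = op_counts(distribution, block_ops)
    hog = max(totals, key=totals.get)      # processor with the most operations
    slacker = min(totals, key=totals.get)  # processor with the fewest operations
    if hog == slacker:
        return dict(distribution)          # single processor: nothing to balance

    hog_blocks = [b for b, p in distribution.items() if p == hog]
    slacker_blocks = [b for b, p in distribution.items() if p == slacker]

    child = dict(distribution)
    # Move a random block off the hog, and one from the slacker back,
    # producing a new candidate distribution for the GA to evaluate.
    child[rng.choice(hog_blocks)] = slacker
    if slacker_blocks:
        child[rng.choice(slacker_blocks)] = hog
    return child
```

In a GA loop this mutation would replace (or supplement) blind random reassignment, so that each mutated child is biased toward a more even operation load across processors.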


Keywords: Genetic Algorithm · IEEE Computer Society · Dependency Graph · Betweenness Centrality · Sparse Matrix





Copyright information

© Springer Berlin Heidelberg 2012

Authors and Affiliations

  • Una-May O’Reilly (1)
  • Nadya Bliss (2)
  • Sanjeev Mohindra (2)
  • Julie Mullen (2)
  • Eric Robinson (2)
  1. Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, USA
  2. Lincoln Laboratory, Massachusetts Institute of Technology, Cambridge, USA
