Knowledge and Information Systems, Volume 3, Issue 1, pp 1–29

Parallel Data Mining for Association Rules on Shared-Memory Systems

  • S. Parthasarathy
  • M. J. Zaki
  • M. Ogihara
  • W. Li
Regular Paper

Abstract.

In this paper we present a new parallel algorithm for mining association rules on shared-memory multiprocessors. We study the degree of parallelism, synchronization, and data locality issues, and present optimizations for fast frequency computation. Experiments show that the proposed optimizations improve performance significantly, and that the parallel algorithm achieves good speed-up.
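As a rough illustration of what frequency computation involves, the sketch below counts how many transactions contain each candidate itemset, with the database split across workers. It is not the algorithm from the paper: it swaps in private per-worker counters that are merged at the end, and the names `count_partition` and `parallel_support`, the round-robin split, and the toy database are all assumptions made for this example. In CPython, threads will not give a real speed-up for this workload, so the code shows only the structure of the computation.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def count_partition(transactions, candidates):
    """Count, for one partition, how many transactions contain each candidate."""
    local = Counter()  # private to this worker, so no synchronization is needed
    for t in transactions:
        items = set(t)
        for c in candidates:
            if items.issuperset(c):
                local[c] += 1
    return local


def parallel_support(transactions, candidates, workers=2):
    """Split the database round-robin, count privately, then merge the counts."""
    chunks = [transactions[i::workers] for i in range(workers)]
    total = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for local in pool.map(count_partition, chunks, [candidates] * workers):
            total.update(local)  # Counter.update adds counts together
    return total


# Hypothetical toy database of four transactions.
db = [("a", "b", "c"), ("a", "c"), ("b", "c"), ("a", "b", "c", "d")]
cands = [frozenset(p) for p in combinations("abc", 2)]
support = parallel_support(db, cands)
```

Here `support[frozenset("ab")]` is the number of transactions containing both `a` and `b`. Keeping counters private avoids contention on shared counts at the cost of a final merge step.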

Many data-mining tasks (e.g., association rules, sequential patterns) use complex pointer-based data structures (e.g., hash trees) that typically suffer from suboptimal data locality. In the multiprocessor case, shared access to these data structures may also result in false sharing. For these tasks it is commonly observed that the recursive data structure is built once and accessed multiple times during each iteration. Furthermore, the access patterns after the build phase are highly ordered. In such cases, memory placement of these structures that is sensitive to locality and false sharing can enhance performance significantly. We evaluate a set of placement policies for parallel association discovery, and show that simple placement schemes can improve execution time by more than a factor of two. More complex schemes yield additional gains.
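The idea of placing a built-once, read-many pointer structure according to its access order can be sketched as follows. This is a minimal illustration under stated assumptions, not one of the placement policies evaluated in the paper: it assumes the structure is traversed depth-first, and the `Node` class and `place_depth_first` function are hypothetical names. Python lists do not control physical memory layout, so the sketch shows only the ordering that such a placement would impose.

```python
class Node:
    """A pointer-based tree node, standing in for a hash-tree node."""

    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)


def place_depth_first(root):
    """Copy the tree into flat lists, numbering nodes in depth-first preorder.

    Returns (labels, child_slots): labels[i] is the label of the node in
    slot i, and child_slots[i] lists the slots of that node's children.
    """
    labels, child_slots = [], []

    def visit(node):
        slot = len(labels)          # next free slot in the flat layout
        labels.append(node.label)
        child_slots.append([])
        for child in node.children:
            child_slots[slot].append(visit(child))
        return slot

    visit(root)
    return labels, child_slots


# Hypothetical small tree built once, then laid out for repeated traversal.
tree = Node("root", [Node("A", [Node("A1"), Node("A2")]), Node("B", [Node("B1")])])
labels, child_slots = place_depth_first(tree)
```

After placement, a depth-first traversal visits slots 0, 1, 2, ... in increasing order, which is the property a locality-sensitive layout aims for. In a language with explicit memory control, the same numbering could drive allocation from one contiguous region.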

Keywords: Association rules; Improving locality; Memory placement; Parallel data mining; Reducing false sharing 

Copyright information

© Springer-Verlag London Limited 2001

Authors and Affiliations

  • S. Parthasarathy (1)
  • M. J. Zaki (2)
  • M. Ogihara (3)
  • W. Li (4)
  1. Department of Computer and Information Sciences, Ohio State University, Columbus, OH, USA
  2. Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY, USA
  3. Department of Computer Science, University of Rochester, Rochester, NY, USA
  4. Intel Corporation, Santa Clara, CA, USA
