Soft Computing, Volume 21, Issue 9, pp 2237–2249

A parallel algorithm for mining constrained frequent patterns using MapReduce

Methodologies and Application

Abstract

A constrained frequent pattern is a frequent pattern generated under constraints specified by the user. Such patterns are more pertinent and more practical, and they can be mined more efficiently. As datasets grow, however, constructing the constrained frequent pattern tree becomes problematic, which makes the tree difficult to apply to massive datasets. This paper proposes PACFP, a parallel algorithm for mining constrained frequent patterns based on the MapReduce programming model. First, the key steps of the algorithm are implemented by three pairs of Map and Reduce functions: mapping the transactions in a dataset to frequent item support counts, constructing the constrained frequent pattern tree, generating the constrained frequent patterns, and aggregating the frequent patterns. Second, data records are migrated using a data grouping strategy based on frequent item support, which effectively balances the load while the constrained frequent patterns are generated. Finally, experiments on celestial spectrum datasets validate the availability, scalability, and expandability of the algorithm.
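
As a rough illustration of the first Map and Reduce pair and of a support-based grouping step, the following Python sketch counts item support across transactions, keeps the items that meet a minimum support, and then assigns frequent items to groups. It is not the PACFP implementation: the function names, the `min_support` and `num_groups` parameters, the toy transactions, and the greedy "lightest group first" assignment are all assumptions made for illustration, and the sketch runs in a single process rather than on a MapReduce cluster.

```python
from collections import Counter
from itertools import chain
import heapq


def map_item_counts(transaction):
    """Map step (hypothetical): emit (item, 1) for each distinct item."""
    return [(item, 1) for item in set(transaction)]


def reduce_item_counts(pairs, min_support):
    """Reduce step (hypothetical): sum counts per item, keep frequent items."""
    counts = Counter()
    for item, n in pairs:
        counts[item] += n
    return {item: c for item, c in counts.items() if c >= min_support}


def group_by_support(frequent, num_groups):
    """Assumed balancing heuristic: visit items in descending support order
    and place each one in the group with the smallest total support so far."""
    heap = [(0, g) for g in range(num_groups)]
    heapq.heapify(heap)
    assignment = {}
    for item, support in sorted(frequent.items(), key=lambda kv: (-kv[1], kv[0])):
        load, g = heapq.heappop(heap)
        assignment[item] = g
        heapq.heappush(heap, (load + support, g))
    return assignment


# Toy transactions, invented for this example only.
transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "d"], ["a"]]
pairs = chain.from_iterable(map_item_counts(t) for t in transactions)
frequent = reduce_item_counts(pairs, min_support=2)
groups = group_by_support(frequent, num_groups=2)
print(frequent)
print(groups)
```

In a real deployment, each group would be processed by a separate reducer that builds its own constrained frequent pattern tree; balancing groups by total support is one plausible way to keep those reducers evenly loaded.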

Keywords

Association rule; Constrained frequent pattern; MapReduce; Frequent item support; Load balance

Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  1. School of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan, China
  2. Department of Science and Software Engineering, Auburn University, Auburn, USA
