
Progressive Visual Analytics in Big Data Using MapReduce FPM

Conference paper
Part of the Lecture Notes in Networks and Systems book series (LNNS, volume 9)

Abstract

Visual analytics uses interactive visualizations to incorporate users’ knowledge and cognitive capabilities into data analytics processes. The progressive visual analytics paradigm simplifies the analytic process for large datasets. It relies on an interactive sequential pattern mining algorithm that reports patterns as it finds them. However, sequential pattern mining algorithms such as SPAM, SPADE and PrefixSpan are suited only to a single-node environment, and they are constrained by the available memory and computational power when handling very large quantities of data. To overcome these challenges, the proposed MapReduce frequent pattern mining (MR-FPM) algorithm distributes data across the nodes of a Hadoop cluster, finds the candidate itemsets and counts their support using the MapReduce paradigm. Patterns with support less than the user-defined minimum support (minsup) are discarded. Experimental results show that MR-FPM consistently outperforms SPAM as minsup is decreased.
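The map, count and prune steps described in the abstract can be illustrated with a minimal single-process sketch. This is not the authors' MR-FPM implementation and does not use Hadoop. It assumes that transactions are lists of items, that candidates are all k-item subsets of a transaction (a brute-force choice made only for illustration), and that minsup is an absolute count. The function names `map_phase`, `shuffle`, `reduce_phase` and `frequent_itemsets` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def map_phase(transaction, k):
    """Emit (candidate itemset, 1) for every k-item subset of one transaction."""
    items = sorted(set(transaction))  # sort so equal itemsets share one key
    for candidate in combinations(items, k):
        yield candidate, 1


def shuffle(pairs):
    """Group emitted values by key, as the framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped, minsup):
    """Sum the counts per candidate and discard those with support below minsup."""
    supports = {candidate: sum(values) for candidate, values in grouped.items()}
    return {c: s for c, s in supports.items() if s >= minsup}


def frequent_itemsets(partitions, k, minsup):
    """Run map over every transaction of every partition, then shuffle and reduce."""
    pairs = (
        pair
        for partition in partitions      # each partition stands in for one node
        for transaction in partition
        for pair in map_phase(transaction, k)
    )
    return reduce_phase(shuffle(pairs), minsup)


# Toy data split into two partitions (assumed, not taken from the paper).
partitions = [
    [["a", "b", "c"], ["a", "c"]],
    [["a", "b"], ["b", "c"], ["a", "c"]],
]
result = frequent_itemsets(partitions, k=2, minsup=3)
print(result)  # {('a', 'c'): 3}
```

In a real cluster the map calls would run in parallel on separate nodes and the grouping would be done by the framework. Lowering `minsup` keeps more candidates, which is the regime in which the abstract reports the largest advantage over SPAM.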

Keywords

MapReduce FPM; Progressive visual analytics; Sequential pattern mining (SPAM) algorithm

Copyright information

© Springer Nature Singapore Pte Ltd. 2018

Authors and Affiliations

  1. Central University of South Bihar, Patna, India
