Mining the Associated Patterns in Big Data Using Hadoop Cluster

  • P. Asha
  • T. Prem Jacob
  • A. Pravin
  • A. Asbern
Conference paper
Part of the Lecture Notes on Data Engineering and Communications Technologies book series (LNDECT, volume 26)

Abstract

Data usage keeps increasing day by day in every aspect of life. The notion of data mining and big data rests on how related (associated) patterns or information are maintained and reused. This does not mean that association mining applies only to huge volumes of data: it can be applied to every field of data collection, provided that associations exist between the data sets. Many methods exist for finding associations in data, but adapting them to scale up to big data is challenging. This paper aims at retrieving recurrent (frequent) patterns from big datasets. The Apriori algorithm is used to fetch the associated patterns, and the resulting performance enhancements are evaluated over various data sets.
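As a rough illustration of the Apriori idea named in the abstract, the following minimal Python sketch mines frequent itemsets from a small in-memory list of transactions. It is not the authors' Hadoop implementation: it runs on a single machine, and the function name `apriori`, the parameter `min_support` (an absolute count), and the sample transactions are assumptions made only for this example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support_count} for all frequent itemsets.

    Illustrative single-machine sketch; `min_support` is an absolute count.
    """
    transactions = [frozenset(t) for t in transactions]

    # Pass 1: count individual items and keep the frequent ones.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: n for s, n in counts.items() if n >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: combine frequent (k-1)-itemsets into k-item candidates.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset of a candidate must be frequent.
            if all(frozenset(sub) in current
                   for sub in combinations(union, k - 1)):
                candidates.add(union)

        # Count step: one scan over the transactions per level.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1

        current = {s: n for s, n in counts.items() if n >= min_support}
        frequent.update(current)
        k += 1

    return frequent


# Hypothetical sample data, not taken from the paper.
sample = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
frequent_itemsets = apriori(sample, min_support=2)
```

In a cluster setting the per-level counting scan is the part that would typically be distributed, for example as a map phase emitting candidate counts and a reduce phase summing them; that mapping is a general observation and not a description of the paper's design.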

Keywords

Big data · Hadoop · Data mining · Frequent item sets · Cluster · Associations

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Department of Computer Science and Engineering, Sathyabama Institute of Science and Technology, Chennai, India
  2. Global Knowledge Network India Private Limited, Chennai, India
