Abstract
In the last few years, organizations have become much more interested in using data to create value. Big Data, however, presents new challenges to the extraction of knowledge using traditional Data Mining methods. In this paper we focus on a concrete implementation of association rule generation. The proposed algorithm is specialized for four datasets, and its performance is measured for different support thresholds. This is done for two Database Management Systems (DBMSs): a traditional row-oriented DBMS, represented by Oracle, and a column-oriented DBMS, represented by Vertica. The results indicate the suitability of these DBMSs as tools for association rule generation.
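To make the role of the support threshold concrete, the following is a minimal, generic Apriori-style sketch in Python. It is not the authors' algorithm (their implementation runs inside the DBMS and is not shown in this abstract); the function name `frequent_itemsets` and the toy transactions are assumptions introduced purely for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support.

    Support is the fraction of transactions that contain the itemset.
    Hypothetical helper for illustration; not taken from the paper.
    """
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Level 1: every distinct item is a candidate 1-itemset.
    candidates = {frozenset([item]) for t in transactions for item in t}
    result = {}
    k = 1
    while candidates:
        # Count how many transactions contain each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        result.update(frequent)
        k += 1
        # Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        joined = {a | b for a, b in combinations(list(frequent), 2) if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in frequent for s in combinations(c, k - 1))
        }
    return result


# Toy data (assumed): four market-basket transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
result = frequent_itemsets(transactions, min_support=0.5)
```

Raising `min_support` shrinks the set of frequent itemsets and the number of candidates that must be counted, which is why running time is typically reported per support threshold.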
Acknowledgments
This work is partially supported by Sofia University SRF/2016 under contract “Big data: analysis and management of data and projects” and FMI – Sofia University.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Kyurkchiev, H., Kaloyanova, K. (2016). Oracle and Vertica for Frequent Itemset Mining. In: Tan, Y., Shi, Y. (eds) Data Mining and Big Data. DMBD 2016. Lecture Notes in Computer Science, vol 9714. Springer, Cham. https://doi.org/10.1007/978-3-319-40973-3_8
DOI: https://doi.org/10.1007/978-3-319-40973-3_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-40972-6
Online ISBN: 978-3-319-40973-3
eBook Packages: Computer Science, Computer Science (R0)