Oracle and Vertica for Frequent Itemset Mining

Kyurkchiev, Hristo; Kaloyanova, Kalinka

doi:10.1007/978-3-319-40973-3_8

Hristo Kyurkchiev &
Kalinka Kaloyanova

Conference paper
First Online: 14 June 2016

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9714)

Abstract

In the last few years, organizations have become much more interested in using data to create value. Big Data, however, presents new challenges to the extraction of knowledge using traditional Data Mining methods. In this paper we focus on a concrete implementation of association rules generation. The proposed algorithm is specialized for four datasets, and its performance is measured at different support thresholds. This is done for two Database Management Systems (DBMS): a traditional row-oriented DBMS, represented by Oracle, and a column-oriented DBMS, represented by Vertica. The results indicate the suitability of these DBMSs as tools for association rules generation.
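
As a rough illustration of what mining frequent itemsets inside a DBMS at a given support threshold can look like, the sketch below counts item pairs with a SQL self-join. It is not the algorithm proposed in the paper, which this preview does not describe. SQLite (through Python's built-in sqlite3 module) stands in for Oracle or Vertica, and the table name `baskets`, the function `frequent_pairs`, and the toy data are all hypothetical.

```python
import sqlite3

# Hypothetical toy data: one row per (transaction id, item).
# Assumes each item appears at most once per transaction.
ROWS = [
    (1, "bread"), (1, "milk"),
    (2, "bread"), (2, "butter"), (2, "milk"),
    (3, "butter"), (3, "milk"),
    (4, "bread"), (4, "milk"),
]


def frequent_pairs(rows, min_support):
    """Return {(item_a, item_b): support} for item pairs whose support,
    as a fraction of all transactions, is at least min_support."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE baskets (tid INTEGER, item TEXT)")
    conn.executemany("INSERT INTO baskets VALUES (?, ?)", rows)

    # Total number of transactions, used to turn counts into support.
    n = conn.execute("SELECT COUNT(DISTINCT tid) FROM baskets").fetchone()[0]

    # Self-join on the transaction id; a.item < b.item keeps each
    # unordered pair once and excludes an item paired with itself.
    query = """
        SELECT a.item, b.item, COUNT(*) AS cnt
        FROM baskets a
        JOIN baskets b ON a.tid = b.tid AND a.item < b.item
        GROUP BY a.item, b.item
        HAVING COUNT(*) >= ?
        ORDER BY a.item, b.item
    """
    min_count = min_support * n
    result = {(x, y): cnt / n for x, y, cnt in conn.execute(query, (min_count,))}
    conn.close()
    return result


if __name__ == "__main__":
    for pair, support in frequent_pairs(ROWS, 0.5).items():
        print(pair, support)
```

Raising `min_support` shrinks the result set, which is why runtime in experiments of this kind is typically reported per support threshold. Larger itemsets would need further joins or a candidate-generation step, which this sketch omits.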

Acknowledgments

This work is partially supported by Sofia University SRF/2016 under contract “Big data: analysis and management of data and projects” and FMI – Sofia University.

Author information

Authors and Affiliations

Faculty of Mathematics and Informatics, Sofia University, 5 James Bourchier Blvd, 1164, Sofia, Bulgaria
Hristo Kyurkchiev & Kalinka Kaloyanova

Corresponding author

Correspondence to Kalinka Kaloyanova.

Editor information

Editors and Affiliations

Peking University, Beijing, China
Ying Tan
Xi'an Jiaotong-Liverpool University, Suzhou, China
Yuhui Shi

About this paper

Cite this paper

Kyurkchiev, H., Kaloyanova, K. (2016). Oracle and Vertica for Frequent Itemset Mining. In: Tan, Y., Shi, Y. (eds) Data Mining and Big Data. DMBD 2016. Lecture Notes in Computer Science, vol 9714. Springer, Cham. https://doi.org/10.1007/978-3-319-40973-3_8

DOI: https://doi.org/10.1007/978-3-319-40973-3_8
Published: 14 June 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-40972-6
Online ISBN: 978-3-319-40973-3
eBook Packages: Computer Science, Computer Science (R0)
