Oracle and Vertica for Frequent Itemset Mining

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9714)

Abstract

In the last few years, organizations have become much more interested in using data to create value. Big Data, however, presents new challenges for extracting knowledge with traditional Data Mining methods. In this paper we focus on a concrete implementation of association rule generation. The proposed algorithm is specialized for four datasets, and its performance is measured for different support thresholds. The measurements are made on two Database Management Systems (DBMSs): a traditional row-oriented DBMS, represented by Oracle, and a column-oriented DBMS, represented by Vertica. The results indicate the suitability of these DBMSs as tools for association rule generation.
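To make the terms "frequent itemset" and "support threshold" concrete, the following is a minimal, generic Apriori-style sketch in Python. It is not the algorithm proposed in the paper, which runs inside Oracle and Vertica and is not reproduced on this page. The function name `frequent_itemsets`, the toy transactions, and the threshold values are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support.

    Support is the fraction of transactions that contain the itemset.
    Hypothetical helper for illustration only; not the paper's implementation.
    """
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    # Level 1: count how many transactions contain each single item.
    counts = Counter(frozenset([item]) for t in transactions for item in t)
    current = {s: c / n for s, c in counts.items() if c / n >= min_support}
    result = dict(current)

    k = 2
    while current:
        # Build k-item candidates from the items that are still frequent.
        items = sorted({item for s in current for item in s})
        candidates = [
            frozenset(c)
            for c in combinations(items, k)
            # Apriori pruning: every (k-1)-subset must itself be frequent.
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        ]
        # Count the surviving candidates against every transaction.
        counts = Counter(c for t in transactions for c in candidates if c <= t)
        current = {s: c / n for s, c in counts.items() if c / n >= min_support}
        result.update(current)
        k += 1
    return result


# Toy data (assumed for illustration): four market-basket transactions.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
result = frequent_itemsets(baskets, min_support=0.5)
strict = frequent_itemsets(baskets, min_support=0.75)
```

Raising the support threshold shrinks the set of frequent itemsets and therefore the number of candidates that must be counted, which is why running time is usually reported per threshold.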

Acknowledgments

This work is partially supported by Sofia University SRF/2016 under contract “Big data: analysis and management of data and projects” and FMI – Sofia University.

Author information

Corresponding author

Correspondence to Kalinka Kaloyanova.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Kyurkchiev, H., Kaloyanova, K. (2016). Oracle and Vertica for Frequent Itemset Mining. In: Tan, Y., Shi, Y. (eds) Data Mining and Big Data. DMBD 2016. Lecture Notes in Computer Science, vol 9714. Springer, Cham. https://doi.org/10.1007/978-3-319-40973-3_8

  • DOI: https://doi.org/10.1007/978-3-319-40973-3_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-40972-6

  • Online ISBN: 978-3-319-40973-3

  • eBook Packages: Computer Science (R0)
