A Multi-Tier Architecture for High-Performance Data Mining
Data mining has been recognised as an essential element of decision support, which has increasingly become a focus of the database industry. As with all computationally expensive data analysis applications, such as Online Analytical Processing (OLAP), performance is a key factor in its usefulness and acceptance in business. In the course of the CRITIKAL project (Client-Server Rule Induction Technology for Industrial Knowledge Acquisition from Large Databases), which is funded by the European Commission, several kinds of architectures for data mining were evaluated with a strong focus on high performance. Specifically, two data mining techniques, association rule discovery and decision tree induction, were implemented in a prototype. We present the architecture developed by the CRITIKAL consortium and compare it to alternative architectures.