Abstract
Data mining deals with the extraction of hidden knowledge from large amounts of data. Today, data mining is typically delivered as coarse-grained modules. This traditional black-box approach focuses on improving specific algorithms and is not flexible enough to support more general optimization or beneficial component reuse. The work presented in this paper decomposes data mining tasks into data mining execution process plans, which are composed of finer-grained data mining operators. The cost of each operator can be analyzed, which provides a means for more holistic optimization. This process-based data mining concept is evaluated via OGSA-DAI based implementations of association rule mining, which show the feasibility of our approach as well as the reusability of some of the data mining operators.
Keywords
- decomposition
- data mining operators
- data mining execution process plan
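To make the idea in the abstract more concrete, the following is a minimal sketch of how frequent itemset discovery (the first stage of association rule mining) might be broken into small operators chained by a process plan. It is not the paper's OGSA-DAI implementation: the operator names (`count_support`, `filter_frequent`, `generate_candidates`), the `run_plan` driver, and the cost measure (a simple count of the work each operator handles) are all assumptions made for illustration, and the level-wise strategy is the familiar Apriori-style one.

```python
from collections import Counter
from itertools import combinations


def count_support(transactions, candidates):
    """Operator: count the transactions that contain each candidate itemset."""
    counts = Counter()
    for t in transactions:
        for c in candidates:
            if c <= t:  # candidate is a subset of the transaction
                counts[c] += 1
    return counts


def filter_frequent(counts, min_count):
    """Operator: keep only itemsets that reach the minimum support count."""
    return {c: n for c, n in counts.items() if n >= min_count}


def generate_candidates(frequent, k):
    """Operator: build size-k candidates whose (k-1)-subsets are all frequent."""
    items = sorted(set().union(*frequent))
    return [
        frozenset(c)
        for c in combinations(items, k)
        if all(frozenset(s) in frequent for s in combinations(c, k - 1))
    ]


def run_plan(transactions, min_count):
    """Hypothetical process plan: chain the operators level by level.

    Returns the frequent itemsets and a per-operator cost log. The cost
    figures are an assumed, illustrative measure, not the paper's cost model.
    """
    transactions = [frozenset(t) for t in transactions]
    candidates = [frozenset([i]) for i in sorted(set().union(*transactions))]
    frequent_all, costs = {}, []
    k = 1
    while candidates:
        counts = count_support(transactions, candidates)
        costs.append(("count_support", len(transactions) * len(candidates)))

        frequent = filter_frequent(counts, min_count)
        costs.append(("filter_frequent", len(counts)))
        frequent_all.update(frequent)

        k += 1
        candidates = generate_candidates(frequent, k)
        costs.append(("generate_candidates", len(candidates)))
    return frequent_all, costs
```

Calling `run_plan(transactions, min_count)` on a list of item sets returns both the frequent itemsets and the cost log. Because each operator is a separate function, the log shows where the work concentrates, and an operator such as `count_support` could in principle be reused in a plan for a different mining task.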
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhang, Y., Li, H., Wöhrer, A., Brezany, P., Dai, G. (2010). Decomposing Data Mining by a Process-Oriented Execution Plan. In: Wang, F.L., Deng, H., Gao, Y., Lei, J. (eds) Artificial Intelligence and Computational Intelligence. AICI 2010. Lecture Notes in Computer Science, vol 6319. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16530-6_13
DOI: https://doi.org/10.1007/978-3-642-16530-6_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16529-0
Online ISBN: 978-3-642-16530-6
eBook Packages: Computer Science (R0)