Integrating Quantitative Attributes in Hierarchical Clustering of Transactional Data
Appropriate data mining exploration methods can reveal valuable but hidden information in today’s large volumes of transactional data. While association rule generation is commonly used to analyze transactional data, clustering is rarely applied to data of this type. In this paper we adapt parameters related to association rule generation so that they can be used to represent distance. Furthermore, we integrate goal-oriented quantitative attributes into the formulation of the distance measure to increase the quality of the results and streamline the decision-making process. As a proof of concept, the newly developed measures are tested, and the results discussed, on both a reference dataset and a large real-life retail dataset.
Keywords: Transactional Data, Hierarchical Clustering, Quantitative Attributes, Distance Measures, Retail Data
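The abstract does not state the concrete distance measures, so the following is only a minimal sketch of the general idea and not the authors' formulation. It assumes that the distance between two items is 1 minus the larger of the two rule confidences, and that the quantitative attribute is a hypothetical per-transaction monetary value used to weight support. The function names, the weighting scheme, the single-linkage merge step and the toy data are all illustrative assumptions.

```python
from itertools import combinations


def weighted_support(transactions, weights, items):
    """Share of total weight carried by transactions containing all `items`.

    With all weights equal this is the ordinary association-rule support.
    """
    items = set(items)
    total = sum(weights)
    return sum(w for t, w in zip(transactions, weights) if items <= t) / total


def confidence_distance(transactions, weights, a, b):
    """Assumed measure: 1 - max(conf(a -> b), conf(b -> a))."""
    s_ab = weighted_support(transactions, weights, {a, b})
    if s_ab == 0:
        return 1.0  # items never co-occur: maximal distance
    s_a = weighted_support(transactions, weights, {a})
    s_b = weighted_support(transactions, weights, {b})
    return 1.0 - max(s_ab / s_a, s_ab / s_b)


def agglomerate(items, dist):
    """Naive single-linkage agglomerative clustering; returns the merge list."""
    clusters = [frozenset([i]) for i in items]
    merges = []
    while len(clusters) > 1:
        # Find the pair of clusters whose closest members are nearest.
        (c1, c2), d = min(
            (((x, y), min(dist(a, b) for a in x for b in y))
             for x, y in combinations(clusters, 2)),
            key=lambda pair: pair[1],
        )
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
        merges.append((sorted(c1), sorted(c2), d))
    return merges


# Toy data: four baskets and a hypothetical monetary value for each basket.
transactions = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"milk", "beer"},
    {"beer"},
]
values = [10.0, 20.0, 5.0, 5.0]
items = ["bread", "butter", "milk", "beer"]

merges = agglomerate(
    items, lambda a, b: confidence_distance(transactions, values, a, b)
)
for left, right, d in merges:
    print(left, "+", right, "at distance", round(d, 3))
```

Passing equal weights instead of `values` reduces the measure to plain confidence-based distance. On this toy data, bread and milk are closer under the value weighting than under equal weights, which is one way a goal-oriented attribute can change the resulting hierarchy.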