Abstract
We propose DepMiner, a method implementing a simple but effective model for evaluating high-order dependencies in a set S of observations. S can be either ordered, thus forming a sequence of events, or unordered. DepMiner is based on \(\Updelta\), a measure of the degree of surprise of S, defined through the departure of the probability of S from a reference probability estimated under the condition of maximum entropy. The method detects significant positive dependencies and, at the same time, negative ones, which are suitable for identifying rare events. The system returns the patterns ranked by \(\Updelta\); they are guaranteed to be statistically significant, and there are fewer of them than other methods return.
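To make the idea of "departure from a maximum-entropy reference" concrete, the following is a minimal sketch and not the chapter's definition of \(\Updelta\). It assumes the simplest possible reference: when only the single-item frequencies are constrained, the maximum-entropy distribution is the one in which the items are independent, so the reference probability is the product of the item marginals. The function names `support` and `delta_independence` and the toy data are hypothetical; the reference used by DepMiner may be conditioned on more than single-item frequencies.

```python
from math import prod


def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = frozenset(items)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def delta_independence(transactions, items):
    """Observed probability of the itemset minus an independence reference.

    ASSUMPTION: the reference is the product of single-item marginals,
    i.e. the maximum-entropy estimate when only item frequencies are fixed.
    This is a simplification, not the measure defined in the chapter.
    """
    observed = support(transactions, items)
    reference = prod(support(transactions, [item]) for item in items)
    return observed - reference


# Toy, hand-made observations (hypothetical data, not from the chapter).
transactions = [frozenset(t) for t in (
    {"a", "b"}, {"a", "b"}, {"a", "b"}, {"c"},
    {"a", "c"}, {"b"}, {"c"}, {"c"},
)]

# Rank a few candidate itemsets by the absolute departure from the reference.
candidates = [{"a", "b"}, {"a", "c"}, {"b", "c"}]
ranked = sorted(
    candidates,
    key=lambda s: abs(delta_independence(transactions, s)),
    reverse=True,
)
for itemset in ranked:
    print(sorted(itemset), delta_independence(transactions, itemset))
```

In this toy data each item occurs in half of the observations, so the reference for every pair is 0.25. The pair {a, b} occurs more often than that (a positive dependency), while {b, c} never occurs together (a negative dependency), which illustrates how a single signed measure can surface both kinds of pattern.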
Copyright information
© 2011 Springer-Verlag London Limited
About this paper
Cite this paper
Meo, R., D’Ambrosi, L. (2011). Finding High Order Dependencies in Data. In: Gelenbe, E., Lent, R., Sakellari, G. (eds) Computer and Information Sciences II. Springer, London. https://doi.org/10.1007/978-1-4471-2155-8_4
DOI: https://doi.org/10.1007/978-1-4471-2155-8_4
Publisher Name: Springer, London
Print ISBN: 978-1-4471-2154-1
Online ISBN: 978-1-4471-2155-8
eBook Packages: Engineering, Engineering (R0)