Finding High Order Dependencies in Data

Meo, Rosa; D’Ambrosi, Leonardo

doi:10.1007/978-1-4471-2155-8_4

Abstract

We propose DepMiner, a method implementing a simple but effective model for evaluating high-order dependencies in a set S of observations. S can be either ordered, thus forming a sequence of events, or unordered. DepMiner is based on \(\Delta\), a measure of the degree of surprise of S, defined by the departure of the probability of S from a reference probability estimated under the condition of maximum entropy. The method is powerful: it simultaneously detects significant positive dependencies and negative ones, the latter being suitable for identifying rare events. The system returns the patterns ranked by \(\Delta\); they are guaranteed to be statistically significant, and there are fewer of them than other methods return.
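
To make the idea of \(\Delta\) concrete, the following is a minimal sketch under simplifying assumptions, not DepMiner itself. It assumes that \(\Delta\) is the plain difference between the observed probability of an itemset and a reference probability, and that the only constraints on the maximum-entropy estimate are the single-item marginals, in which case the reference is the product of those marginals. The function name `delta_scores` and its parameters are hypothetical; the chapter's actual estimate and its significance test are not reproduced here.

```python
from itertools import combinations
from math import prod


def delta_scores(transactions, size=2):
    """Score each itemset of the given size by observed minus reference probability."""
    n = len(transactions)
    rows = [frozenset(t) for t in transactions]
    items = sorted(set().union(*rows))
    # Marginal probability of each single item.
    marginal = {i: sum(i in r for r in rows) / n for i in items}
    scores = {}
    for itemset in combinations(items, size):
        # Observed probability that all items of the itemset co-occur.
        observed = sum(all(i in r for i in itemset) for r in rows) / n
        # Assumption: with only single-item marginals as constraints, the
        # maximum-entropy joint is the product of the marginals.
        reference = prod(marginal[i] for i in itemset)
        scores[itemset] = observed - reference
    # Rank by magnitude so strong negative dependencies surface as well.
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)
```

On the four observations {a, b}, {a, b}, {c}, {c}, the pair (a, b) scores +0.25 (a positive dependency), while (a, c) and (b, c) each score -0.25 (negative dependencies, the kind the abstract associates with rare events).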

Author information

Authors and Affiliations

University of Torino, Torino, Italy
Rosa Meo
Regional Agency for Health Care Services—A.Re.S.S, Piemonte, Italy
Leonardo D’Ambrosi

Corresponding author

Correspondence to Rosa Meo.

Editor information

Editors and Affiliations

Dept of Electrical and Electronics Eng'g, Imperial College, London, SW7 2BT, United Kingdom
Erol Gelenbe
Imperial College, London, United Kingdom
Ricardo Lent
University of East London, London, United Kingdom
Georgia Sakellari

About this paper

Cite this paper

Meo, R., D’Ambrosi, L. (2011). Finding High Order Dependencies in Data. In: Gelenbe, E., Lent, R., Sakellari, G. (eds) Computer and Information Sciences II. Springer, London. https://doi.org/10.1007/978-1-4471-2155-8_4

DOI: https://doi.org/10.1007/978-1-4471-2155-8_4
Published: 29 September 2011
Publisher Name: Springer, London
Print ISBN: 978-1-4471-2154-1
Online ISBN: 978-1-4471-2155-8
eBook Packages: Engineering, Engineering (R0)
