Finding High Order Dependencies in Data

  • Conference paper
Computer and Information Sciences II

Abstract

We propose DepMiner, a method that implements a simple but effective model for evaluating the high-order dependencies in a set S of observations. S may be ordered, in which case it forms a sequence of events, or unordered. DepMiner is based on \(\Delta\), a measure of the degree of surprise of S based on the departure of the probability of S from a reference probability estimated under the condition of maximum entropy. The method detects significant positive dependencies and, at the same time, negative ones, which are suitable for identifying rare events. The system returns the patterns ranked by \(\Delta\); they are guaranteed to be statistically significant, and they are fewer in number than those returned by other methods.
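
As a rough illustration of the idea behind \(\Delta\), the sketch below computes the gap between the observed probability of an itemset and a maximum-entropy reference. It is not the DepMiner algorithm. It rests on a simplifying assumption not stated in the abstract: the reference is constrained only by single-item frequencies, in which case the maximum-entropy distribution is the independence model and the reference probability is the product of the item marginals. The names `delta` and `rank_by_delta` are hypothetical, and the statistical significance test mentioned in the abstract is not covered.

```python
from math import prod


def delta(transactions, itemset):
    """Observed probability of `itemset` minus a maximum-entropy reference.

    Assumption (not from the paper): only single-item frequencies are
    constrained, so the maximum-entropy reference is the product of the
    item marginals (the independence model).
    """
    n = len(transactions)
    if n == 0:
        raise ValueError("need at least one transaction")
    items = set(itemset)
    # Fraction of transactions that contain every item of the itemset.
    observed = sum(1 for t in transactions if items <= set(t)) / n
    # Product of the single-item frequencies.
    reference = prod(sum(1 for t in transactions if i in t) / n for i in items)
    return observed - reference


def rank_by_delta(transactions, candidates):
    """Return (itemset, delta) pairs sorted by delta, largest first."""
    scored = [(frozenset(c), delta(transactions, c)) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    data = [{"a", "b"}, {"a", "b"}, {"a"}, {"c"}]
    for itemset, score in rank_by_delta(data, [("a", "b"), ("a", "c")]):
        print(sorted(itemset), score)
```

Under this assumption a positive value marks items that co-occur more often than independence predicts, and a negative value marks items that co-occur less often, which is the kind of pattern the abstract associates with rare events.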

Author information

Corresponding author

Correspondence to Rosa Meo.

Copyright information

© 2011 Springer-Verlag London Limited

About this paper

Cite this paper

Meo, R., D’Ambrosi, L. (2011). Finding High Order Dependencies in Data. In: Gelenbe, E., Lent, R., Sakellari, G. (eds) Computer and Information Sciences II. Springer, London. https://doi.org/10.1007/978-1-4471-2155-8_4

  • DOI: https://doi.org/10.1007/978-1-4471-2155-8_4

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-4471-2154-1

  • Online ISBN: 978-1-4471-2155-8

  • eBook Packages: Engineering, Engineering (R0)
