Formalizing Complex Prior Information to Quantify Subjective Interestingness of Frequent Pattern Sets

Kontonasios, Kleanthis-Nikolaos; DeBie, Tijl

doi:10.1007/978-3-642-34156-4_16


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7619)

Included in the following conference series:

International Symposium on Intelligent Data Analysis

Abstract

In this paper, we are concerned with the problem of modelling prior information of a data miner about the data, with the purpose of quantifying subjective interestingness of patterns. Recent results have achieved this for the specific case of prior expectations on the row and column marginals, based on the Maximum Entropy principle [2,9]. In the current paper, we extend these ideas to make them applicable to more general prior information, such as knowledge of frequencies of itemsets, a cluster structure in the data, or the presence of dense areas in the database. As in [2,9], we show how information theory can be used to quantify subjective interestingness against this model, in particular the subjective interestingness of tile patterns [3]. Our method presents an efficient, flexible, and rigorous alternative to the randomization approach presented in [5]. We demonstrate our method by searching for interesting patterns in real-life data with respect to various realistic types of prior information.
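
For illustration only, the row-and-column-marginal case cited as [2,9] can be sketched as below. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`fit_maxent_marginals`, `tile_information`), the toy database `D`, and the plain gradient fitting loop are all hypothetical choices made here. It relies on the standard result that the Maximum Entropy distribution over binary databases, constrained on expected row and column sums, factorizes into independent Bernoulli cells with success probability sigmoid(a_i + b_j). The information content of a tile is then the sum of -log2 p_ij over the cells it covers.

```python
import math


def _sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def fit_maxent_marginals(row_sums, col_sums, iters=10000):
    """Fit p[i][j] = sigmoid(a[i] + b[j]) so expected marginals match the targets.

    Assumption: plain gradient descent on the convex dual; targets must lie
    strictly between 0 and the row/column length, otherwise a, b diverge.
    """
    m, n = len(row_sums), len(col_sums)
    a = [0.0] * m  # one dual parameter per row
    b = [0.0] * n  # one dual parameter per column
    step = 1.0 / max(m, n)  # conservative step size for this sketch
    for _ in range(iters):
        p = [[_sigmoid(a[i] + b[j]) for j in range(n)] for i in range(m)]
        # Dual gradient: expected marginal minus target marginal.
        for i in range(m):
            a[i] -= step * (sum(p[i]) - row_sums[i])
        for j in range(n):
            b[j] -= step * (sum(p[i][j] for i in range(m)) - col_sums[j])
    return [[_sigmoid(a[i] + b[j]) for j in range(n)] for i in range(m)]


def tile_information(p, rows, cols):
    """Self-information (in bits) of observing all ones in the tile rows x cols."""
    return -sum(math.log2(p[i][j]) for i in rows for j in cols)


# Hypothetical toy binary database (rows = transactions, columns = items).
D = [
    [1, 1, 1, 0, 0],
    [1, 1, 0, 0, 1],
    [1, 1, 1, 1, 0],
    [0, 0, 0, 1, 0],
]
row_sums = [sum(r) for r in D]
col_sums = [sum(r[j] for r in D) for j in range(len(D[0]))]

P = fit_maxent_marginals(row_sums, col_sums)
info_small = tile_information(P, rows=[0, 1], cols=[0, 1])
info_large = tile_information(P, rows=[0, 1, 2], cols=[0, 1])
print(f"tile {{0,1}}x{{0,1}}: {info_small:.3f} bits")
print(f"tile {{0,1,2}}x{{0,1}}: {info_large:.3f} bits")
```

A tile that the marginal-only model already makes probable carries few bits, while one the model finds unlikely carries many. The more general prior information discussed in the abstract (itemset frequencies, cluster structure, dense areas) would add further constraints to the model and is not covered by this sketch.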

References

1. Cover, T.M., Thomas, J.A.: Elements of Information Theory. Wiley-Interscience (2005)
2. DeBie, T.: Maximum entropy models and subjective interestingness: an application to tiles in binary databases. In: Data Mining and Knowledge Discovery (2010)
3. Geerts, F., Goethals, B., Mielikainen, T.: Tiling databases. In: Discovery Science (2004)
4. Gionis, A., Mannila, H., Mielikainen, T., Tsaparas, P.: Assessing data mining results via swap randomization. ACM Transactions on Knowledge Discovery from Data (TKDD) 1(3) (2007)
5. Hanhijarvi, S., Ojala, M., Vuokko, N., Puolamaki, K., Tatti, N., Mannila, H.: Tell me something I don’t know: Randomization strategies for iterative data mining. In: Proc. of the 15th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2009 (2009)
6. Jaynes, E.T.: On the rationale of maximum-entropy methods. Proceedings of the IEEE 70 (1982)
7. Khuller, S., Moss, A., Naor, J.: The budgeted maximum coverage problem. Information Processing Letters 70 (1999)
8. Koller, D., Friedman, N.: Probabilistic Graphical Models, Principles and Techniques. MIT Press (2009)
9. Kontonasios, K.-N., DeBie, T.: An information-theoretic approach to finding informative noisy tiles in binary databases. In: SDM, pp. 153–164. SIAM (2010)
10. Kontonasios, K.-N., DeBie, T.: Formalizing complex prior information to quantify subjective interestingness of frequent pattern sets (supplementary document). Technical report, University of Bristol (2011), https://patterns.enm.bris.ac.uk/projects/Mining , subjectively interesting patterns using prior knowledge
11. Silberschatz, A., Tuzhilin, A.: What makes patterns interesting in knowledge discovery systems. IEEE Trans. on Knowl. and Data Eng. 8(6), 970–974 (1996)
12. Tatti, N., Mampaey, M.: Using background knowledge to rank itemsets. Data Min. Knowl. Discov. 21, 293–309 (2010)
13. Wang, C., Han, J., Jia, Y., Tang, J., Zhang, D., Yu, Y.: Mining advisor-advisee relationships from research publication networks. In: KDD 2010 (2010)

Author information

Authors and Affiliations

Intelligent Systems Laboratory, University of Bristol, Merchant Venturers Building, Bristol, BS8 1UB, UK
Kleanthis-Nikolaos Kontonasios & Tijl DeBie

Editor information

Editors and Affiliations

Department of Information and Computer Science, Aalto University School of Science, P.O. Box 15400, 00076, Aalto, Finland
Jaakko Hollmén
Department of Computer Science, Ostfalia University of Applied Sciences, Salzdahlumer Straße 46/48, 38302, Wolfenbüttel, Germany
Frank Klawonn
School of Information Systems, Computing and Mathematics, Brunel University, UB8 3PH, Uxbridge, Middlesex, UK
Allan Tucker

About this paper

Cite this paper

Kontonasios, KN., DeBie, T. (2012). Formalizing Complex Prior Information to Quantify Subjective Interestingness of Frequent Pattern Sets. In: Hollmén, J., Klawonn, F., Tucker, A. (eds) Advances in Intelligent Data Analysis XI. IDA 2012. Lecture Notes in Computer Science, vol 7619. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34156-4_16

DOI: https://doi.org/10.1007/978-3-642-34156-4_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-34155-7
Online ISBN: 978-3-642-34156-4
eBook Packages: Computer Science, Computer Science (R0)
