Abstract
We study a discovery framework in which background knowledge on variables and their relations within a discourse area is available in the form of a graphical model. Starting from an initial, hand-crafted or possibly empty graphical model, the network evolves in an interactive process of discovery. We focus on the central step of this process: given a graphical model and a database, we address the problem of finding the most interesting attribute sets. We formalize the interestingness of an attribute set as the divergence between its behavior as observed in the data and the behavior that can be explained by the current model. We derive an exact algorithm that finds all attribute sets whose interestingness exceeds a given threshold. We then consider the case of a very large network that renders exact inference infeasible, combined with a very large database or data stream. We devise an algorithm that efficiently finds the most interesting attribute sets with a prescribed approximation bound and confidence probability, even for very large networks and infinite streams. We study the scalability of the methods in controlled experiments; a case study sheds light on the practical usefulness of the approach.
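To make the notion of interestingness concrete, the following is a minimal sketch under stated assumptions, not the algorithm derived in the article. It assumes the divergence is the largest absolute difference between the observed and the model-predicted probability of any value combination, and it stands in for the "possibly empty" graphical model with a full-independence model whose marginals are estimated from the data. All function names are hypothetical, and the search is a brute-force enumeration rather than the exact or approximate methods the abstract refers to.

```python
from collections import Counter
from itertools import combinations, product
from math import prod


def empirical_marginal(rows, attrs):
    """Relative frequency of each value combination of `attrs` in `rows`."""
    counts = Counter(tuple(row[a] for a in attrs) for row in rows)
    return {values: c / len(rows) for values, c in counts.items()}


def independence_marginal(rows, attrs):
    """Joint distribution predicted by an edge-free model (assumption:
    attributes are independent, marginals are estimated from the data)."""
    singles = [empirical_marginal(rows, (a,)) for a in attrs]
    joint = {}
    for combo in product(*(s.items() for s in singles)):
        values = tuple(key[0] for key, _ in combo)
        joint[values] = prod(p for _, p in combo)
    return joint


def interestingness(p_data, p_model):
    """Assumed divergence: largest absolute gap over all value combinations."""
    keys = set(p_data) | set(p_model)
    return max(abs(p_data.get(k, 0.0) - p_model.get(k, 0.0)) for k in keys)


def find_interesting(rows, attributes, threshold, max_size=2):
    """Brute-force search for attribute sets whose score exceeds `threshold`."""
    found = []
    for size in range(2, max_size + 1):
        for attrs in combinations(attributes, size):
            score = interestingness(
                empirical_marginal(rows, attrs),
                independence_marginal(rows, attrs),
            )
            if score > threshold:
                found.append((attrs, score))
    return sorted(found, key=lambda item: item[1], reverse=True)


# Toy database: A and B always agree, C is unrelated to both.
rows = [
    {"A": 0, "B": 0, "C": 0},
    {"A": 0, "B": 0, "C": 1},
    {"A": 1, "B": 1, "C": 0},
    {"A": 1, "B": 1, "C": 1},
]
print(find_interesting(rows, ["A", "B", "C"], threshold=0.1))
```

In this toy database only the pair (A, B) is reported: the independence model cannot explain why the two attributes always agree. In the interactive process described above, such a finding would be a candidate for revising the model, after which the same pair would no longer stand out.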
Acknowledgements
The authors would like to thank Dr. Ram Dessau from Næstved Hospital, Næstved, Denmark, for providing the Borreliosis network and data. T.S. is supported by the German Science Foundation.
Open Access
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
Additional information
Responsible editor: M. J. Zaki.
Rights and permissions
Open Access This is an open access article distributed under the terms of the Creative Commons Attribution Noncommercial License (https://creativecommons.org/licenses/by-nc/2.0), which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
About this article
Cite this article
Jaroszewicz, S., Scheffer, T. & Simovici, D.A. Scalable pattern mining with Bayesian networks as background knowledge. Data Min Knowl Disc 18, 56–100 (2009). https://doi.org/10.1007/s10618-008-0102-5
DOI: https://doi.org/10.1007/s10618-008-0102-5