Abstract
Because data mining algorithms can produce enormous numbers of rules, knowledge post-processing is a difficult stage of the association rule discovery process. To find knowledge relevant to decision making, the user (a decision maker specialized in the data studied) must rummage through the rules. To assist the user in this task, we propose the rule-focusing methodology, an interactive methodology for the visual post-processing of association rules. It allows the user to explore large sets of rules freely by focusing his or her attention on limited subsets. This approach relies on rule interestingness measures, on a visual representation, and on interactive navigation among the rules. We have implemented the rule-focusing methodology in a prototype system called ARVis, which exploits the user's focus to guide the generation of the rules by means of a specific constraint-based rule-mining algorithm.
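The idea of focusing on a limited subset of rules can be illustrated with a minimal Python sketch. It is not the ARVis algorithm, which the abstract does not detail. It assumes the two classic interestingness measures, support and confidence, and a hypothetical constraint that the antecedent must contain the items the user is focusing on. The names `Rule`, `support`, `confidence`, and `focus`, as well as the toy transactions, are illustrative assumptions.

```python
from typing import FrozenSet, List, NamedTuple, Sequence, Tuple


class Rule(NamedTuple):
    """An association rule antecedent -> consequent (hypothetical structure)."""
    antecedent: FrozenSet[str]
    consequent: FrozenSet[str]


def support(itemset: FrozenSet[str], transactions: Sequence[FrozenSet[str]]) -> float:
    """Fraction of transactions that contain every item of `itemset`."""
    if not transactions:
        return 0.0
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(rule: Rule, transactions: Sequence[FrozenSet[str]]) -> float:
    """Support of the whole rule divided by the support of its antecedent."""
    antecedent_support = support(rule.antecedent, transactions)
    if antecedent_support == 0:
        return 0.0
    return support(rule.antecedent | rule.consequent, transactions) / antecedent_support


def focus(
    rules: Sequence[Rule],
    transactions: Sequence[FrozenSet[str]],
    must_contain: FrozenSet[str],
    min_support: float,
    min_confidence: float,
) -> List[Tuple[Rule, float, float]]:
    """Keep rules whose antecedent contains the focused items (assumed
    constraint) and whose measures reach both thresholds, best first."""
    selected = []
    for rule in rules:
        if not must_contain <= rule.antecedent:
            continue
        s = support(rule.antecedent | rule.consequent, transactions)
        c = confidence(rule, transactions)
        if s >= min_support and c >= min_confidence:
            selected.append((rule, s, c))
    return sorted(selected, key=lambda entry: entry[2], reverse=True)


# Toy data, invented for illustration only.
transactions = [
    frozenset({"bread", "butter"}),
    frozenset({"bread", "butter", "milk"}),
    frozenset({"bread", "milk"}),
    frozenset({"milk"}),
]
rules = [
    Rule(frozenset({"bread"}), frozenset({"butter"})),
    Rule(frozenset({"bread"}), frozenset({"milk"})),
    Rule(frozenset({"milk"}), frozenset({"bread"})),
    Rule(frozenset({"butter"}), frozenset({"bread"})),
]

focused = focus(rules, transactions, frozenset({"bread"}), 0.5, 0.6)
for rule, s, c in focused:
    print(sorted(rule.antecedent), "->", sorted(rule.consequent), round(s, 2), round(c, 2))
```

In an interactive setting, the user would change `must_contain` and the thresholds repeatedly, each time viewing only the small subset returned rather than the full rule set.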
Author information
Julien Blanchard earned his Ph.D. in 2005 from Nantes University (France) and is currently an assistant professor at the Polytechnic School of Nantes University. He is the author of a book chapter and seven journal and international conference papers in the field of visualization and interestingness measures for data mining.
Fabrice Guillet is currently a member of the LINA laboratory (CNRS 2729) at the Polytechnic Graduate School of Nantes University (France). He received the Ph.D. degree in computer science in 1995 from the Ecole Nationale Supérieure des Télécommunications de Bretagne. He is the author of 35 international publications in data mining and knowledge management. He is a founder and a permanent member of the Steering Committee of the annual EGC French-speaking conference.
Henri Briand received the Ph.D. degree in 1983 from Paul Sabatier University in Toulouse (France) and has published over 100 works on database systems and database mining. He was the head of the Computer Engineering Department at the Polytechnic School of Nantes University, where he was in charge of a research team in the data mining domain. He is responsible for the organization of the Data Mining Master at Nantes University.
Cite this article
Blanchard, J., Guillet, F. & Briand, H. Interactive visual exploration of association rules with rule-focusing methodology. Knowl Inf Syst 13, 43–75 (2007). https://doi.org/10.1007/s10115-006-0046-2
DOI: https://doi.org/10.1007/s10115-006-0046-2