FIRE: a two-level interactive visualization for deep exploration of association rules

Mukherji, Abhishek; Lin, Xika; Toto, Ermal; Botaish, Christopher R.; Whitehouse, Jason; Rundensteiner, Elke A.; Ward, Matthew O.

doi:10.1007/s41060-018-0133-y

Regular Paper
Published: 16 June 2018

International Journal of Data Science and Analytics, Volume 7, pages 201–226 (2019)

Abhishek Mukherji¹, Xika Lin², Ermal Toto², Christopher R. Botaish², Jason Whitehouse², Elke A. Rundensteiner² & Matthew O. Ward²

Abstract

While rule mining is critical for decision-making applications, rule mining systems still lack support for interactive exploration of the multitude of generated rules and for understanding the relationships among rule results produced with various parameter settings. Based on a novel parameter space-driven approach, our proposed Framework for Interactive Rule Exploration (FIRE; PARAS/FIRE homepage: http://paras.cs.wpi.edu/) addresses this usability shortcoming. FIRE features innovative visual displays and interactions to enable interactive rule exploration. We propose two linked interactive displays, namely the parameter space view (PSpace) and the rule space view (RSpace), that together enable enhanced sense-making of rule relationships. The PSpace view visualizes the distribution of rules produced for diverse parameter settings. This not only facilitates user parameter selection for rule mining but also enhances an analyst’s understanding of rule relationships in the parameter space context. The RSpace view provides a detailed display of the rules using a novel rule glyph visualization to facilitate interactive visual rule comparisons. We evaluate the usability and effectiveness of our FIRE framework with two studies. First, in a case study, a researcher explored a dataset of interest using the FIRE paradigm as well as state-of-the-art rule visualization techniques from the ARulesViz R package. Second, our user study with 22 subjects establishes the usability and effectiveness of the proposed visual displays and interactions of FIRE using several benchmark datasets. Overall, this research encompasses significant contributions at the intersection of data mining and visual analytics.

Notes

The FIRE tool is available at [11] as a web interface for researchers to upload their own datasets, generate association rules on the datasets and visualize the rules.
This case study was performed by an avid bike user with an interest in data mining.

References

1. Aggarwal, C.C., Yu, P.S.: A new approach to online generation of association rules. IEEE Trans. Knowl. Data Eng. 13(4), 527–540 (2001)
2. Agrawal, R., Srikant, R.: Fast algorithms for mining association rules in large databases. In: VLDB ’94 Proceedings of the 20th International Conference on Very Large Data Bases, pp. 487–499. Morgan Kaufmann Publishers Inc., San Francisco, CA (1994)
3. Borgelt, C.: Efficient implementations of Apriori, Eclat and FP-growth. http://www.borgelt.net (2017). Accessed Dec 2017
4. Boulicaut, J.F., Jeudy, B.: Constraint-based data mining. In: Maimon, O., Rokach, L. (eds.) Data Mining and Knowledge Discovery Handbook, pp. 339–354. Springer, Berlin (2010)
5. Cao, L., Li, J., Wang, C., Yu, P.S.: Efficient selection of globally optimal rules on large imbalanced data based on rule coverage relationship analysis. In: SIAM International Conference on Data Mining, pp. 216–224 (2013)
6. Cao, L.: Combined mining: analyzing object and pattern relations for discovering and constructing complex yet actionable patterns. WIREs Data Min. Knowl. Discov. 3(2), 140–155 (2013)
7. Chaudhuri, S., Lee, H., Narasayya, V.R.: Variance aware optimization of parameterized queries. In: SIGMOD Conference, pp. 531–542 (2010)
8. Cleveland, R.B., Cleveland, W.S., Mcrae, J.E., Terpenning, I.: STL: a seasonal-trend decomposition procedure based on loess. J. Off. Stat. 6(1), 3–73 (1990)
9. Couturier, O., Hamrouni, T., Yahia, S.B., Nguifo, E.M.: A scalable association rule visualization towards displaying large amounts of knowledge. In: International Conference Information Visualisation, pp. 657–663 (2007)
10. Duan, S., Thummala, V., Babu, S.: Tuning database configuration parameters with iTuned. PVLDB 2(1), 1246–1257 (2009)
11. PARAS/FIRE Home Page. http://paras.cs.wpi.edu/ (2018). Accessed March 2018
12. Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., Witten, I.H.: The WEKA data mining software: an update. SIGKDD Explor. Newsl. 11(1), 10–18 (2009)
13. Han, J., Pei, J., Yin, Y.: Mining frequent patterns without candidate generation. In: SIGMOD ’00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data, vol. 29, pp. 1–12. ACM, New York, NY (2000)
14. Hahsler, M., Chelluboina, S.: ARulesViz R package. http://cran.r-project.org/web/packages/arulesViz/vignettes/arulesViz-1.jpg (2017). Accessed Dec 2017
15. Jeudy, B., Boulicaut, J.-F.: Using condensed representations for interactive association rule mining. In: PKDD, pp. 225–236 (2002)
16. Kaya, M., Reda, A.: Online mining of fuzzy multidimensional weighted association rules. Appl. Intell. 29(1), 13–34 (2008)
17. Kubat, M., Hafez, A., Raghavan, V.V., Lekkala, J.R., Chen, W.K.: Itemset trees for targeted association querying. IEEE Trans. Knowl. Data Eng. 15(6), 1522–1534 (2003)
18. Leung, C.K.-S.: Constraint-based association rule mining. In: Wang, J. (ed.) Encyclopedia of Data Warehousing and Mining, pp. 307–312. Hershey, Information Science Reference (2009)
19. Lin, X., Mukherji, A., Rundensteiner, E.A., Ruiz, C., Ward, M.O.: PARAS: a parameter space framework for online association mining. PVLDB 6(3), 193–204 (2013)
20. Lin, X., Mukherji, A., Rundensteiner, E.A., Ward, M.O.: SPIRE: supporting parameter-driven interactive rule mining and exploration. PVLDB 7(13), 1653–1656 (2014)
21. Liu, G., Suchitra, A., Zhang, H., Feng, M., Ng, S.-K., Wong, L.: AssocExplorer: an association rule visualization system for exploratory data analysis. In: ACM SIGKDD Demo, pp. 1536–1539 (2012)
22. Lucchese, C., Orlando, S., Perego, R., Silvestri, F.: WebDocs: a real-life huge transac. dataset, FIMI (2004)
23. Mukherji, A., Lin, X., Botaish, C.R., Whitehouse, J., Rundensteiner, E.A., Ward, M.O., Ruiz, C.: PARAS: interactive parameter space exploration for association rule mining. In: ACM SIGMOD, pp. 1017–1020 (2013)
24. Mukherji, A., Lin, X., Whitehouse, J., Botaish, C.R., Rundensteiner, E.A., Ward, M.O.: FIRE: interactive visual support for parameter space-driven rule mining. In: CIKM, pp. 2447–2452 (2013)
25. Qin, X., Ahsan, R., Lin, X., Rundensteiner, E.A., Ward, M.O.: iPARAS: incremental construction of parameter space for online association mining. In: BigMine, pp. 149–165 (2014)
26. Qin, X., Ahsan, R., Lin, X., Rundensteiner, E.A., Ward, M.O.: Interactive temporal association analytics. In: EDBT, pp. 197–208 (2016)
27. Qin, X., Kakar, T., Wunnava, S., Rundensteiner, E.A., Cao, L.: MARAS: signaling multi-drug adverse reactions. In: ACM SIGKDD, pp. 1615–1623 (2017)
28. Shao, J., Yin, J., Liu, W., Cao, L.: Actionable combined high utility itemset mining. In: AAAI, pp. 4206–4207 (2015)
29. Tork, H.F.: Bike sharing dataset. https://archive.ics.uci.edu/ml/datasets/Bike+Sharing+Dataset (2017). Accessed Dec 2017
30. UCI Machine Learning Repository. http://www.ics.uci.edu/~mlearn/MLRepository.html (2017). Accessed 17 March 2017
31. Wang, S., Cao, L.: Inferring implicit rules by learning explicit and hidden item dependency. In: IEEE TSMC, PP(99), pp. 1–12 (2017)
32. Ward, M.O.: A taxonomy of glyph placement strategies for multidimensional data visualization. Inf. Vis. 1(3–4), 194–210 (2002)
33. Wong, P.-Y., Chan, T.-M., Wong, M.-H., Leung, K.-S.: Predicting approximate protein-DNA binding cores using association rule mining. In: IEEE ICDE, pp. 965–976 (2012)
34. Wu, T., Chen, Y., Han, J.: Association mining in large databases: a re-examination of its measures. In: PKDD, pp. 621–628 (2007)
35. XmdvTool Home Page. http://davis.wpi.edu/~xmdv/ (2018). Accessed March 2018
36. Yang, D., Rundensteiner, E.A., Ward, M.O.: A shared execution strategy for multiple pattern mining requests over streaming data. Proc. VLDB Endow. 2(1), 874–885 (2009)
37. Zaki, M.J., Hsiao, C.-J.: CHARM: an efficient algorithm for closed itemset mining. In: SIAM SDM (2002)
38. Zaki, M.J., Parthasarathy, S., Ogihara, M., Li, W.: New algorithms for fast discovery of association rules. In: SIGKDD, pp. 283–286 (1997)

Acknowledgements

This work was supported by NSF under Grants IIS-0812027, CCF-0811510 and IIS-1117139.

Author information

Authors and Affiliations

Cisco Systems Inc., San Jose, CA, USA
Abhishek Mukherji
Department of Computer Science, Worcester Polytechnic Institute, Worcester, MA, USA
Xika Lin, Ermal Toto, Christopher R. Botaish, Jason Whitehouse, Elke A. Rundensteiner & Matthew O. Ward

Corresponding author

Correspondence to Abhishek Mukherji.

Appendix

1.1 Preprocessing of the bike sharing dataset

The Bike Sharing dataset [29] was preprocessed before loading into FIRE [11] and R ARulesViz [14], as described below.

Table 3 Discretized attributes: bike sharing dataset

1. Data corresponding to three attributes were eliminated: instant (unique identifier), dteday (date) and yr (two values: year 1 and year 2).
2. The numbers of casual and registered users increased over time: casual users at a rate of 0.895 users per day and registered users at a rate of 4.874 users per day. To cancel the effect of this overall growth, the data was rotated to negate the slope of the trend lines. Figs. 45 and 46 show the original user counts, and Figs. 47 and 48 show the adjusted user counts for the casual and registered user categories. This processing is similar in flavor to the seasonal-trend decomposition in [8].
3. Further, the attributes were discretized as shown in Table 3.
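
The trend adjustment in step 2 can be illustrated with a minimal sketch. It assumes that "rotating the data to negate the slope" can be approximated by fitting a least-squares line to the daily counts and subtracting slope times the day index; the appendix does not spell out the exact procedure, so this is one plausible reading. The function names and the synthetic series are hypothetical and are not taken from FIRE.

```python
def fit_slope(values):
    """Least-squares slope of values against the day index 0..n-1."""
    n = len(values)
    mean_t = (n - 1) / 2
    mean_v = sum(values) / n
    cov = sum((t - mean_t) * (v - mean_v) for t, v in enumerate(values))
    var = sum((t - mean_t) ** 2 for t in range(n))
    return cov / var


def remove_trend(values, slope=None):
    """Subtract the linear growth so the adjusted series has zero slope.

    Assumption: subtracting slope * day stands in for the "rotation"
    described in the text.
    """
    if slope is None:
        slope = fit_slope(values)
    return [v - slope * t for t, v in enumerate(values)]


# Synthetic series (not the real data): a base of 100 users growing at the
# 4.874 users/day rate reported for registered users.
counts = [100 + 4.874 * t for t in range(10)]
adjusted = remove_trend(counts)
```

On the synthetic series the fitted slope recovers the growth rate and the adjusted counts are flat; on real data the adjusted series would still contain seasonal and irregular variation.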

1.2 Association rules and redundancy

An adjacency lattice (Fig. 49) is built over items such as X, Y and Z. The support value of each item (say, X) or itemset (say, XY) is the number of records in the dataset that contain the item or itemset. For example, in a set of 100 records, X occurs in 80 records, Y occurs in 60 records and the itemset XY has a support of 40 records. For a rule \(R = (X \Rightarrow Y)\), its confidence is defined as \(\hbox {confidence}(R) = \frac{\hbox {support}(X \cup Y)}{\hbox {support}(X)}\). In this example, \(\hbox {confidence}(X \Rightarrow Y) = 40/80 = 0.5\).
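
As a small worked sketch of these definitions, the code below builds a toy dataset whose counts match the example above and computes support and confidence directly from the records. The record layout (40 records containing both X and Y, 40 containing only X, 20 containing only Y) is an assumption chosen to reproduce those counts, and the helper names are hypothetical.

```python
# Toy dataset: 100 records with support(X) = 80, support(Y) = 60,
# support(XY) = 40. The split below is an assumed layout, not real data.
transactions = (
    [frozenset("XY")] * 40 + [frozenset("X")] * 40 + [frozenset("Y")] * 20
)


def support(itemset, records):
    """Number of records that contain every item of the itemset."""
    items = frozenset(itemset)
    return sum(1 for record in records if items <= record)


def confidence(antecedent, consequent, records):
    """support(antecedent ∪ consequent) / support(antecedent)."""
    both = frozenset(antecedent) | frozenset(consequent)
    return support(both, records) / support(antecedent, records)


conf_x_y = confidence("X", "Y", transactions)  # 40 / 80
conf_y_x = confidence("Y", "X", transactions)  # 40 / 60
```

The two rules share the same itemset XY and hence the same support, but their confidences differ because the antecedent supports differ.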

Table 4 Redundancy in generated association rules

Aggarwal and Yu [1] define rule redundancy relationships so that redundant rules may be filtered out to present succinct results to the user. The redundant rules can always be derived on demand, if so desired. We examine how these redundancy relationships can be identified in the parameter space model. In particular, redundancy can be of two types [1], as defined below.

Definition 1

Simple redundancy. Let \(A \Rightarrow B\) and \(C \Rightarrow D\) be two rules such that the itemsets A, B, C and D satisfy the condition \(A \cup B = C \cup D\). The rule \(C \Rightarrow D\) is simply redundant with respect to the rule \(A \Rightarrow B\) if \(C \supset A\).

Definition 2

Strict redundancy. We consider two rules generated from itemsets \(X_{i}\) and \(X_{j}\), respectively, such that \(X_{i} \supset X_{j}\). Let \(A \Rightarrow B\) and \(C \Rightarrow D\) be rules satisfying \(A \cup B = X_{i}\), \(C \cup D = X_{j}\), and \(C \supseteq A\). Then the rule \(C \Rightarrow D\) is strictly redundant with respect to the rule \(A \Rightarrow B\).

The concept of redundancy can be illustrated using the rules generated from the lattice (Fig. 49), as listed in Table 4. Based on Definitions 1 and 2, if a rule \(\mathcal {R}_{1}\) is simply or strictly redundant with respect to another rule \(\mathcal {R}_{2}\), then \(\mathcal {R}_{2}\) is said to simple or strict dominate \(\mathcal {R}_{1}\), respectively. In Table 4, the rule (\(X \Rightarrow YZ\)) simple dominates the rules (\(XY \Rightarrow Z\)) and (\(XZ \Rightarrow Y\)) (Def. 1), and it strict dominates the rules (\(X \Rightarrow Y\)) and (\(X \Rightarrow Z\)) (Def. 2). In general, a rule may be dominated by several rules and may in turn dominate several other rules.
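
Definitions 1 and 2 translate directly into set comparisons. The following is a minimal sketch, assuming that a rule is represented as a pair of frozensets (antecedent, consequent); the helper names are hypothetical and do not reflect how FIRE or [1] implement the checks. It verifies the dominance relationships stated above for the rules over X, Y and Z.

```python
def rule(lhs, rhs):
    """Build a rule lhs => rhs from two strings of single-letter items."""
    return frozenset(lhs), frozenset(rhs)


def is_simply_redundant(candidate, dominator):
    """Def. 1: candidate C => D versus dominator A => B.

    Requires A ∪ B = C ∪ D and C a proper superset of A.
    """
    (c, d), (a, b) = candidate, dominator
    return (a | b) == (c | d) and c > a


def is_strictly_redundant(candidate, dominator):
    """Def. 2: candidate C => D versus dominator A => B.

    Requires X_i = A ∪ B to be a proper superset of X_j = C ∪ D, and C ⊇ A.
    """
    (c, d), (a, b) = candidate, dominator
    return (a | b) > (c | d) and c >= a


x_yz = rule("X", "YZ")

# X => YZ simple dominates XY => Z and XZ => Y.
simple_checks = [
    is_simply_redundant(rule("XY", "Z"), x_yz),
    is_simply_redundant(rule("XZ", "Y"), x_yz),
]

# X => YZ strict dominates X => Y and X => Z.
strict_checks = [
    is_strictly_redundant(rule("X", "Y"), x_yz),
    is_strictly_redundant(rule("X", "Z"), x_yz),
]
```

The comparison operators on frozensets express the subset relations: `>` is a proper superset test and `>=` is a superset-or-equal test, which mirror \(\supset\) and \(\supseteq\) in the definitions.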

About this article

Cite this article

Mukherji, A., Lin, X., Toto, E. et al. FIRE: a two-level interactive visualization for deep exploration of association rules. Int J Data Sci Anal 7, 201–226 (2019). https://doi.org/10.1007/s41060-018-0133-y

Received: 14 May 2017
Accepted: 19 May 2018
Published: 16 June 2018
Issue Date: 01 April 2019
DOI: https://doi.org/10.1007/s41060-018-0133-y
