Abstract
The emerging field of Visual Analytics combines several fields, with Data Mining and Visualization playing leading roles. Visual analytics departs fundamentally from other approaches in its extensive use of visual analytical tools to discover patterns, not only to visualize patterns that have already been discovered by traditional data mining methods. Highly complex data mining tasks often require a multi-level top-down approach, in which a qualitative analysis of the complex situation is conducted first and top-level patterns are discovered. This paper presents the concept of Monotone Boolean Function Visual Analytics (MBFVA) for such top-level pattern discovery. The approach employs binarization and monotonization of quantitative attributes to obtain a top-level data representation. The top-level discoveries form a foundation for subsequent, more detailed data mining levels where patterns are refined. The approach is illustrated with applications to the medical, law enforcement, and security domains. The medical application is concerned with discovering breast cancer diagnostic rules (i) interactively with a radiologist, (ii) analytically with data mining algorithms, and (iii) visually. The coordinated visualization of these rules makes it possible to reconcile rules from multiple sources and to arrive at rules that are meaningful to the expert in the field and confirmed by the database. Experts and data mining algorithms often operate at very different and incomparable levels of detail and produce incomparable patterns. The proposed MBFVA approach addresses this problem. This paper shows how to represent and visualize binary multivariate data in 2-D and 3-D. This representation preserves the structural relations that exist in multivariate data and creates a new opportunity to guide the visual discovery of unknown patterns in the data.
In particular, the structural representation allows us to convert a complex border between the patterns in multidimensional space into visual 2-D and 3-D forms. This decreases the information overload on the user. The visualization shows not only the border between classes, but also the location of the case of interest relative to that border. A user does not need to see the thousands of previous cases that were used to build the border between the patterns. If an abnormal case lies deep inside the abnormal area, far from the border between the “normal” and “abnormal” classes, then the case is very abnormal and needs immediate attention. The paper concludes with an outline of how to scale the algorithm to large data sets and how to extend the approach to non-monotone data.
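The ideas in the abstract (binarization, monotonicity, a border between classes, and the position of a case relative to that border) can be illustrated with a small sketch. This is not the authors' MBFVA algorithm: the thresholds, the three-attribute rule `rule`, and the helper names below are all hypothetical, and "depth" is taken here, as an assumption, to be the number of bits separating a case from the nearest minimal "abnormal" vector beneath it.

```python
from itertools import product

def binarize(values, thresholds):
    """Map quantitative attributes to 0/1: 1 when a value reaches its threshold."""
    return tuple(int(v >= t) for v, t in zip(values, thresholds))

def leq(a, b):
    """Componentwise order on binary vectors: a <= b in every position."""
    return all(x <= y for x, y in zip(a, b))

def is_monotone(f, n):
    """Check that a <= b implies f(a) <= f(b) over all n-bit vectors."""
    cube = list(product((0, 1), repeat=n))
    return all(f(a) <= f(b) for a in cube for b in cube if leq(a, b))

def minimal_true_vectors(f, n):
    """Lower border of the class f = 1: true vectors with no true vector strictly below."""
    true_vecs = [v for v in product((0, 1), repeat=n) if f(v)]
    return [v for v in true_vecs
            if not any(u != v and leq(u, v) for u in true_vecs)]

def depth_inside(v, border):
    """Fewest 1-bits to clear before v lands on a border vector lying below it.

    Assumes v is a true vector of a monotone function, so at least one
    border vector lies below it.
    """
    return min(sum(v) - sum(m) for m in border if leq(m, v))

# Hypothetical top-level rule: abnormal if attribute 0 is high
# and at least one of attributes 1 and 2 is high.
def rule(x):
    return int(x[0] and (x[1] or x[2]))

border = minimal_true_vectors(rule, 3)

# Hypothetical measurements and thresholds for one case.
case = binarize((7.2, 0.3, 15), (5.0, 0.5, 10))

print(is_monotone(rule, 3))                # True
print(border)                              # [(1, 0, 1), (1, 1, 0)]
print(case, depth_inside(case, border))    # (1, 0, 1) 0
print(depth_inside((1, 1, 1), border))     # 1
```

In this toy setting a case with depth 0 sits on the border itself, while larger depths correspond to cases further inside the abnormal class. Enumerating the full cube is exponential in the number of attributes, so the sketch is only suitable for a handful of binary attributes.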
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Kovalerchuk, B., Delizy, F., Riggs, L., Vityaev, E. (2012). Visual Data Mining and Discovery with Binarized Vectors. In: Holmes, D.E., Jain, L.C. (eds) Data Mining: Foundations and Intelligent Paradigms. Intelligent Systems Reference Library, vol 24. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23241-1_7
DOI: https://doi.org/10.1007/978-3-642-23241-1_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23240-4
Online ISBN: 978-3-642-23241-1
eBook Packages: Engineering, Engineering (R0)