Visual Data Mining and Discovery with Binarized Vectors

Part of the book series: Intelligent Systems Reference Library (ISRL, volume 24)

Abstract

The emerging field of Visual Analytics combines several fields in which Data Mining and Visualization play leading roles. Visual analytics departs fundamentally from other approaches in its extensive use of visual analytical tools to discover patterns, not only to visualize patterns that have already been discovered by traditional data mining methods. Highly complex data mining tasks often require a multi-level top-down approach, in which a qualitative analysis of the complex situation is first conducted at the top levels and top-level patterns are discovered. This paper presents the concept of Monotone Boolean Function Visual Analytics (MBFVA) for such top-level pattern discovery. The approach employs binarization and monotonization of quantitative attributes to obtain a top-level data representation. The top-level discoveries form a foundation for the subsequent, more detailed data mining levels, where the patterns are refined. The approach is illustrated with applications to the medical, law enforcement, and security domains. The medical application is concerned with discovering breast cancer diagnostic rules (i) interactively with a radiologist, (ii) analytically with data mining algorithms, and (iii) visually. The coordinated visualization of these rules makes it possible to reconcile the multi-source rules and to arrive at rules that are meaningful to the expert in the field and are confirmed by the database. Experts and data mining algorithms often operate at very different, incomparable levels of detail and produce incomparable patterns. The proposed MBFVA approach addresses this problem. This paper shows how to represent and visualize binary multivariate data in 2-D and 3-D. This representation preserves the structural relations that exist in multivariate data and creates a new opportunity to guide the visual discovery of unknown patterns in the data.
In particular, the structural representation allows us to convert a complex border between the patterns in multidimensional space into visual 2-D and 3-D forms. This decreases the information overload on the user. The visualization shows not only the border between classes but also the location of the case of interest relative to that border. A user does not need to see the thousands of previous cases that were used to build the border between the patterns. If an abnormal case lies deep inside the abnormal area, far from the border between the “normal” and “abnormal” classes, then the case is very abnormal and needs immediate attention. The paper concludes with an outline of how to scale the algorithm to large data sets and how to extend the approach to non-monotone data.
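The ideas named in the abstract (binarization, monotonicity, and the position of a case relative to the class border) can be illustrated with a small sketch. It is not the chapter's algorithm, which the abstract does not specify. The function names, the threshold rule `value >= threshold`, the example thresholds, and the use of Hamming distance as a measure of "depth" are all assumptions made for illustration. The brute-force search is only practical for a handful of attributes.

```python
from itertools import product

def binarize(values, thresholds):
    """Assumed rule: an attribute becomes 1 when it reaches its threshold."""
    return tuple(int(v >= t) for v, t in zip(values, thresholds))

def leq(a, b):
    """Component-wise partial order on binary vectors: a <= b."""
    return all(x <= y for x, y in zip(a, b))

def monotonicity_violations(labeled):
    """Pairs (a, b) with a <= b whose labels decrease, i.e. f(a) > f(b)."""
    items = list(labeled.items())
    return [(a, b) for a, la in items for b, lb in items
            if leq(a, b) and la > lb]

def classify(x, lower_units):
    """Monotone Boolean function given by its minimal class-1 vectors."""
    return int(any(leq(u, x) for u in lower_units))

def distance_to_border(x, lower_units):
    """Hamming distance from x to the nearest vector of the other class.

    Enumerates all 2^n vectors, so it is a sketch for small n only.
    Raises ValueError if the function is constant (no other class).
    """
    own = classify(x, lower_units)
    return min(
        sum(p != q for p, q in zip(x, y))
        for y in product((0, 1), repeat=len(x))
        if classify(y, lower_units) != own
    )

if __name__ == "__main__":
    # Hypothetical minimal "abnormal" vectors over four binarized attributes.
    units = [(1, 1, 0, 0), (0, 0, 1, 1)]
    case = binarize((7.0, 3.1, 0.2, 0.9), thresholds=(5.0, 2.0, 1.0, 1.0))
    print(case, distance_to_border(case, units))
    print(distance_to_border((1, 1, 1, 1), units))
```

With these hypothetical units, the case `(1, 1, 0, 0)` sits on the border (distance 1), while `(1, 1, 1, 1)` lies deeper inside the abnormal class (distance 2). In this simplified sense, the second case is the one that is "far away from the border".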

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Kovalerchuk, B., Delizy, F., Riggs, L., Vityaev, E. (2012). Visual Data Mining and Discovery with Binarized Vectors. In: Holmes, D.E., Jain, L.C. (eds) Data Mining: Foundations and Intelligent Paradigms. Intelligent Systems Reference Library, vol 24. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23241-1_7

  • DOI: https://doi.org/10.1007/978-3-642-23241-1_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-23240-4

  • Online ISBN: 978-3-642-23241-1

  • eBook Packages: Engineering, Engineering (R0)
