HOV3: An Approach to Visual Cluster Analysis

  • Ke-Bing Zhang
  • Mehmet A. Orgun
  • Kang Zhang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4093)

Abstract

Clustering is a major technique in data mining. However, the numerical feedback of clustering algorithms makes it difficult for users to gain an intuitive overview of the dataset they are dealing with. Visualization has proven to be very helpful for high-dimensional data analysis. It is therefore desirable to introduce visualization techniques, together with the user's domain knowledge, into the clustering process. However, most existing visualization techniques used in clustering are exploration oriented and, inevitably, mainly stochastic and subjective in nature. In this paper, we introduce an approach called HOV3 (Hypothesis Oriented Verification and Validation by Visualization), which projects high-dimensional data onto 2D space and reflects the data distribution based on user hypotheses. In addition, HOV3 enables the user to adjust hypotheses iteratively in order to obtain an optimized view. As a result, HOV3 provides the user with an efficient and effective visualization method for exploring cluster information.
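The abstract does not give the HOV3 projection itself, so the following is only a minimal sketch of the general idea of a hypothesis-weighted 2D projection, written in the style of a star-coordinates layout. It should not be read as the authors' formulation. The function name `project_2d` and the `weights` parameter (standing in for a user hypothesis about how much each dimension matters) are hypothetical, and the input values are assumed to be already normalized.

```python
import math


def project_2d(points, weights):
    """Project d-dimensional points to 2D with a star-coordinates-style layout.

    Assumption (not from the paper): axis k points at angle 2*pi*k/d, and
    weights[k] scales that axis to express a user hypothesis about dimension k.
    """
    d = len(weights)
    # One unit vector per dimension, spread evenly around the circle.
    axes = [
        (math.cos(2 * math.pi * k / d), math.sin(2 * math.pi * k / d))
        for k in range(d)
    ]
    projected = []
    for p in points:
        if len(p) != d:
            raise ValueError("point and weight dimensions differ")
        # Each coordinate pulls the 2D point along its own weighted axis.
        x = sum(w * v * ax for w, v, (ax, _) in zip(weights, p, axes))
        y = sum(w * v * ay for w, v, (_, ay) in zip(weights, p, axes))
        projected.append((x, y))
    return projected


if __name__ == "__main__":
    data = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
    # Changing the weights and re-projecting mimics iterative hypothesis adjustment.
    print(project_2d(data, [1.0, 1.0, 1.0, 1.0]))
    print(project_2d(data, [0.5, 1.0, 1.0, 1.0]))
```

Re-running the projection with a different weight vector changes how far each dimension spreads the points, which is one simple way to picture adjusting a hypothesis and checking the resulting view.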

Keywords

Data Mining, Information Visualization, Exploration Discovery, User Hypothesis, Large Spatial Database
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Ke-Bing Zhang (1)
  • Mehmet A. Orgun (1)
  • Kang Zhang (2)
  1. Department of Computing, Macquarie University, Sydney, Australia
  2. Department of Computer Science, University of Texas at Dallas, Richardson, USA
