Evolutionary Approaches to Visualisation and Knowledge Discovery

  • Russell Beale
  • Andy Pryke
  • Robert J. Hendley
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3101)

Abstract

Haiku is a data mining system which combines the best properties of human and machine discovery. A self-organising visualisation system is coupled with a genetic algorithm to provide an interactive, flexible system. Visualisation of data allows the human visual system to identify areas of interest, such as clusters, outliers or trends. A genetic-algorithm-based machine learning algorithm can then be used to explain the patterns identified visually. The explanations (in rule form) can be biased to be short or long; contain all the characteristics of a cluster or just those needed to predict membership; or concentrate on accuracy or on coverage of the data.

This paper describes both the visualisation system and the machine learning component, with a focus on the interactive nature of the data mining process, and provides case studies to demonstrate the capabilities of the system.
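The abstract's idea of biasing rule explanations towards accuracy, coverage, or brevity can be illustrated with a small fitness function of the kind a genetic algorithm might optimise. This is a minimal sketch under stated assumptions, not the Haiku implementation: the rule representation (attribute ranges), the function names `matches` and `rule_fitness`, the weights, and the toy data are all hypothetical.

```python
from typing import Dict, List, Tuple

Row = Dict[str, float]
# Hypothetical rule form: attribute name -> inclusive (low, high) range.
Rule = Dict[str, Tuple[float, float]]


def matches(rule: Rule, row: Row) -> bool:
    """Return True when every condition in the rule holds for the row."""
    return all(low <= row[attr] <= high for attr, (low, high) in rule.items())


def rule_fitness(
    rule: Rule,
    rows: List[Row],
    in_cluster: List[bool],
    w_accuracy: float = 1.0,
    w_coverage: float = 1.0,
    w_length: float = 0.0,
) -> float:
    """Score a rule that tries to describe a visually selected cluster.

    accuracy: fraction of rows matched by the rule that are in the cluster.
    coverage: fraction of the cluster's rows that the rule matches.
    w_length: penalty per condition, which biases towards shorter rules.
    """
    matched = [flag for row, flag in zip(rows, in_cluster) if matches(rule, row)]
    cluster_size = sum(in_cluster)
    if not matched or cluster_size == 0:
        return 0.0
    hits = sum(matched)
    accuracy = hits / len(matched)
    coverage = hits / cluster_size
    return w_accuracy * accuracy + w_coverage * coverage - w_length * len(rule)


# Toy data (assumed): the first three rows form the selected cluster.
rows = [
    {"x": 1, "y": 1},
    {"x": 2, "y": 1},
    {"x": 3, "y": 5},
    {"x": 8, "y": 9},
]
in_cluster = [True, True, True, False]

narrow = {"x": (1, 2), "y": (0, 2)}  # precise, but misses one cluster member
broad = {"x": (0, 10)}               # covers the cluster, but also an outsider

# Accuracy-only weighting favours the narrow rule.
accuracy_biased = (
    rule_fitness(narrow, rows, in_cluster, w_accuracy=1.0, w_coverage=0.0),
    rule_fitness(broad, rows, in_cluster, w_accuracy=1.0, w_coverage=0.0),
)
# Coverage-only weighting favours the broad rule.
coverage_biased = (
    rule_fitness(narrow, rows, in_cluster, w_accuracy=0.0, w_coverage=1.0),
    rule_fitness(broad, rows, in_cluster, w_accuracy=0.0, w_coverage=1.0),
)
```

In a full system, a genetic algorithm would evolve a population of such rules and use a score like this for selection; changing the weights is one plausible way to steer the search towards the kinds of explanation the abstract lists.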

Keywords

Genetic Algorithm; Data Mining; Human Visual System; Projection Pursuit; Dimension Reduction Method
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Russell Beale (1)
  • Andy Pryke (1)
  • Robert J. Hendley (1)
  1. School of Computer Science, The University of Birmingham, Birmingham, UK
