DataJewel: Integrating Visualization with Temporal Data Mining

  • Mihael Ankerst
  • Anne Kao
  • Rodney Tjoelker
  • Changzhou Wang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4404)

Abstract

In this chapter we describe DataJewel, a new temporal data mining architecture that tightly integrates a visualization component, an algorithmic component, and a database component. We introduce a new visualization technique, CalendarView, as an implementation of the visualization component, and a data structure that supports temporal mining of large databases. In our architecture, algorithms can be tightly integrated with the visualization component, and most existing temporal data mining algorithms can be leveraged by embedding them into DataJewel. This integration is achieved through an interface that both the user and the algorithms use to assign colors to events. The user interactively assigns colors to incorporate domain knowledge or to formulate hypotheses; the algorithm assigns colors based on discovered patterns. The same visualization technique displays both data and patterns, which makes it more intuitive for the user to identify useful patterns, whether exploring data interactively or using algorithms to search for patterns. Our experiments in analyzing several large datasets from the airplane maintenance domain demonstrate the usefulness of our approach, and we discuss its applicability to domains such as homeland security, market basket analysis, and web mining.
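
The color-assignment interface described in the abstract can be illustrated with a minimal sketch. It is not DataJewel's actual API: the names `Event`, `ColorMap`, `highlight_most_frequent`, and `calendar_cells`, the event-type labels, and the dates are all hypothetical, and the "algorithm" is a deliberately trivial frequency count standing in for a real temporal mining algorithm. The point is only that the user and an algorithm write to the same color table, and a calendar-style view reads from it.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Event:
    """A timestamped event (hypothetical schema, for illustration only)."""
    day: date
    kind: str  # event-type label, e.g. a maintenance finding


class ColorMap:
    """Shared event-type -> color table used by both user and algorithms."""

    def __init__(self, default: str = "gray") -> None:
        self._colors: Dict[str, str] = {}
        self._default = default

    def assign(self, kind: str, color: str) -> None:
        self._colors[kind] = color

    def color_of(self, kind: str) -> str:
        # Event types nobody has colored yet fall back to the default.
        return self._colors.get(kind, self._default)


def highlight_most_frequent(events: Iterable[Event], colors: ColorMap,
                            color: str = "red") -> str:
    """Toy stand-in for a mining algorithm: color the most frequent type."""
    counts = Counter(e.kind for e in events)
    kind, _ = counts.most_common(1)[0]
    colors.assign(kind, color)
    return kind


def calendar_cells(events: Iterable[Event],
                   colors: ColorMap) -> Dict[date, List[str]]:
    """Group events by day and return the colors to draw in each day cell."""
    cells: Dict[date, List[str]] = {}
    for e in sorted(events, key=lambda ev: ev.day):
        cells.setdefault(e.day, []).append(colors.color_of(e.kind))
    return cells


# Hypothetical sample data.
events = [
    Event(date(2004, 1, 5), "hydraulic_leak"),
    Event(date(2004, 1, 5), "sensor_fault"),
    Event(date(2004, 1, 6), "hydraulic_leak"),
    Event(date(2004, 1, 7), "tire_wear"),
]
colors = ColorMap()
colors.assign("sensor_fault", "blue")    # user encodes domain knowledge
highlight_most_frequent(events, colors)  # algorithm marks its finding
cells = calendar_cells(events, colors)   # one view shows both
```

Because both parties write to one table, the view needs no separate rendering path for raw data and discovered patterns, which is the property the abstract emphasizes.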

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Mihael Ankerst (1)
  • Anne Kao (2)
  • Rodney Tjoelker (2)
  • Changzhou Wang (2)
  1. München, Germany
  2. The Boeing Company, Seattle
