Using visualization to support data mining of large existing databases

  • Papers: Interaction, User Interfaces and Presentation
  • Conference paper
Database Issues for Data Visualization (DBVIS 1993)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 871)

Abstract

In this paper, we present ideas on how visualization technology can be used to improve the difficult process of querying very large databases. With our VisDB system, we try to provide visual support not only for the query specification process, but also for evaluating query results and, thereafter, refining the query accordingly. The main idea of our system is to represent as many data items as possible by the pixels of the display device. By arranging and coloring the pixels according to their relevance for the query, the user gets a visual impression of the resulting data set and of its relevance for the query. Using an interactive query interface, the user may change the query dynamically and receives immediate feedback through the visual representation of the resulting data set. By using multiple windows for different parts of the query, the user gets visual feedback for each part of the query and, therefore, may more easily understand the overall result. To support complex queries, we introduce the notion of ‘approximate joins’, which allow the user to find data items that only approximately fulfill join conditions. We also present ideas on how our technique may be extended to support the interoperation of heterogeneous databases. Finally, we discuss the performance problems that are caused by interfacing to existing database systems and present ideas for solving these problems by using data structures that support a multidimensional search of the database.
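
The two central ideas of the abstract, ordering data items by their relevance for a query and finding items that only approximately fulfill a join condition, can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not taken from the paper: relevance is modeled as the numeric distance of a value from a query range, the pixels are laid out in a spiral from the center of the window outward, and an approximate join is modeled as an absolute difference within a tolerance. All function names are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def distance_to_range(value: float, low: float, high: float) -> float:
    """Assumed relevance measure: 0 inside [low, high], else the gap to the nearest bound."""
    if value < low:
        return low - value
    if value > high:
        return value - high
    return 0.0


def rank_by_relevance(values: Sequence[float], low: float, high: float) -> List[int]:
    """Return item indices ordered from most to least relevant (smallest distance first)."""
    return sorted(range(len(values)),
                  key=lambda i: distance_to_range(values[i], low, high))


def spiral_positions(side: int) -> List[Tuple[int, int]]:
    """Return all (row, col) cells of a side x side grid in a spiral from the center outward."""
    row = col = side // 2
    cells = [(row, col)]
    d_row, d_col = 0, 1
    step = 1
    while len(cells) < side * side:
        for _ in range(2):                  # two legs share each step length
            for _ in range(step):
                row, col = row + d_row, col + d_col
                if 0 <= row < side and 0 <= col < side:
                    cells.append((row, col))
            d_row, d_col = d_col, -d_row    # turn by 90 degrees
        step += 1
    return cells


def layout(values: Sequence[float], low: float, high: float,
           side: int) -> List[List[Optional[int]]]:
    """Place item indices on the grid so that the most relevant items sit at the center."""
    grid: List[List[Optional[int]]] = [[None] * side for _ in range(side)]
    order = rank_by_relevance(values, low, high)
    for item, (r, c) in zip(order, spiral_positions(side)):
        grid[r][c] = item
    return grid


def approximate_join(left: Sequence[float], right: Sequence[float],
                     tolerance: float) -> List[Tuple[float, float, float]]:
    """Return (a, b, |a - b|) for pairs within the tolerance, closest pairs first."""
    pairs = [(a, b, abs(a - b)) for a in left for b in right
             if abs(a - b) <= tolerance]
    return sorted(pairs, key=lambda p: p[2])
```

In a display, the distance computed by `distance_to_range` could additionally be mapped to a color scale, so that position and color both encode relevance; the choice of scale is left open here.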

Editor information

John P. Lee, Georges G. Grinstein

Copyright information

© 1994 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Keim, D.A., Kriegel, H.P. (1994). Using visualization to support data mining of large existing databases. In: Lee, J.P., Grinstein, G.G. (eds) Database Issues for Data Visualization. DBVIS 1993. Lecture Notes in Computer Science, vol 871. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0021156

  • DOI: https://doi.org/10.1007/BFb0021156

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-58519-0

  • Online ISBN: 978-3-540-49016-6

  • eBook Packages: Springer Book Archive