Abstract
In this paper, we present ideas on how visualization technology can be used to improve the difficult process of querying very large databases. With our VisDB system, we try to provide visual support not only for the query specification process, but also for evaluating query results and, thereafter, refining the query accordingly. The main idea of our system is to represent as many data items as possible by the pixels of the display device. By arranging and coloring the pixels according to their relevance to the query, the user gets a visual impression of the resulting data set and of its relevance to the query. Using an interactive query interface, the user may change the query dynamically and receives immediate feedback through the visual representation of the resulting data set. By using multiple windows for different parts of the query, the user gets visual feedback for each part of the query and can therefore understand the overall result more easily. To support complex queries, we introduce the notion of ‘approximate joins’, which allow the user to find data items that only approximately fulfill join conditions. We also present ideas on how our technique may be extended to support the interoperation of heterogeneous databases. Finally, we discuss the performance problems that are caused by interfacing to existing database systems and present ideas for solving these problems by using data structures that support a multidimensional search of the database.
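The pixel-per-item idea can be illustrated with a minimal sketch. It is not the VisDB implementation: the range-based distance function, the square-spiral layout, and the names `relevance_distance`, `spiral_positions`, and `arrange` are all assumptions made for illustration. The sketch ranks data values by how far they fall outside a query range and assigns the most relevant items to grid cells closest to the center, returning a normalized distance that a display layer could map to a color scale.

```python
from typing import List, Tuple


def relevance_distance(value: float, low: float, high: float) -> float:
    """Distance of a value from the query range [low, high]; 0 means it fulfills the query."""
    if value < low:
        return low - value
    if value > high:
        return value - high
    return 0.0


def spiral_positions(n: int) -> List[Tuple[int, int]]:
    """Return the first n grid cells of a square spiral starting at (0, 0)."""
    positions: List[Tuple[int, int]] = []
    x = y = 0
    dx, dy = 1, 0
    step_len = 1
    while len(positions) < n:
        for _ in range(2):  # two legs share each step length
            for _ in range(step_len):
                if len(positions) == n:
                    return positions
                positions.append((x, y))
                x, y = x + dx, y + dy
            dx, dy = -dy, dx  # turn by 90 degrees
        step_len += 1
    return positions


def arrange(values: List[float], low: float, high: float):
    """Pair each value with a pixel position and a normalized distance in [0, 1].

    Hypothetical helper: items are sorted by distance from the query range,
    so the best matches land at the center of the spiral.
    """
    ranked = sorted(values, key=lambda v: relevance_distance(v, low, high))
    dists = [relevance_distance(v, low, high) for v in ranked]
    max_d = max(dists, default=0.0) or 1.0  # avoid division by zero
    cells = spiral_positions(len(ranked))
    return [(pos, v, d / max_d) for pos, v, d in zip(cells, ranked, dists)]


if __name__ == "__main__":
    for pos, value, norm in arrange([50, 10, 30, 22], low=20, high=30):
        print(pos, value, round(norm, 2))
```

In this sketch a normalized distance of 0 marks an item that fulfills the query exactly, and 1 marks the least relevant item in the set; any monotone color scale could be applied to that value.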
Copyright information
© 1994 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Keim, D.A., Kriegel, H.P. (1994). Using visualization to support data mining of large existing databases. In: Lee, J.P., Grinstein, G.G. (eds) Database Issues for Data Visualization. DBVIS 1993. Lecture Notes in Computer Science, vol 871. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0021156
DOI: https://doi.org/10.1007/BFb0021156
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-58519-0
Online ISBN: 978-3-540-49016-6
eBook Packages: Springer Book Archive