Abstract
This paper explores the use of entropy for visualizing database structure. In particular, we show how visualizing the entropy of a relation provides a global perspective on the distribution of values and helps identify areas within the relation where interesting relationships may be discovered. The structure we aim to discover is related to functional dependencies. Our approach does not depend on the underlying domain of the data and provides a view of the dependency landscape within a relation. We report comparative results from applying these techniques to a wide variety of synthetic and real data.
Both authors were supported by NSF Grant IIS-0082407.
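To make the abstract's link between entropy and functional dependencies concrete, the following is a minimal sketch and not the authors' implementation. It computes the Shannon entropy of a projection of a relation and the conditional entropy H(Y | X) = H(XY) - H(X), which is zero exactly when the dependency X -> Y holds in the given rows. The function names (`entropy`, `conditional_entropy`) and the toy relation are assumptions introduced here for illustration only.

```python
from collections import Counter
from math import log2


def entropy(rows, attrs):
    """Shannon entropy (in bits) of the projection of `rows` onto `attrs`.

    `rows` is a list of dicts; `attrs` is a list of attribute names.
    Probabilities are the relative frequencies of the projected tuples.
    """
    counts = Counter(tuple(row[a] for a in attrs) for row in rows)
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())


def conditional_entropy(rows, lhs, rhs):
    """H(rhs | lhs) = H(lhs + rhs) - H(lhs).

    The result is 0 when every lhs value determines a single rhs value,
    i.e. when the functional dependency lhs -> rhs holds in `rows`.
    """
    return entropy(rows, list(lhs) + list(rhs)) - entropy(rows, list(lhs))


# Hypothetical toy relation (not data from the paper).
rows = [
    {"dept": "A", "city": "X", "emp": 1},
    {"dept": "A", "city": "X", "emp": 2},
    {"dept": "B", "city": "Y", "emp": 3},
    {"dept": "B", "city": "Y", "emp": 4},
]

h_dept = entropy(rows, ["dept"])                              # two equally likely values
h_city_given_dept = conditional_entropy(rows, ["dept"], ["city"])  # dept -> city holds
h_emp_given_dept = conditional_entropy(rows, ["dept"], ["emp"])    # dept -> emp does not hold
```

In this toy relation `dept` determines `city`, so the conditional entropy is zero, while `emp` still carries one bit of uncertainty once `dept` is known. Values between these extremes give a graded measure of how close a relation is to satisfying a dependency, which is the kind of quantity a visualization could plot.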
Copyright information
© 2002 Springer Science+Business Media New York
About this chapter
Cite this chapter
Groth, D.P., Robertson, E.L. (2002). An Entropy-Based Approach to Visualizing Database Structure. In: Zhou, X., Pu, P. (eds) Visual and Multimedia Information Management. VDB 2002. IFIP — The International Federation for Information Processing, vol 88. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-35592-4_12
DOI: https://doi.org/10.1007/978-0-387-35592-4_12
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4757-6935-7
Online ISBN: 978-0-387-35592-4
eBook Packages: Springer Book Archive