Abstract
We propose a new representation for high-dimensional data that can be effective for visualization, nearest-neighbor (NN) search, and range search. It has been demonstrated that existing index structures cannot support efficient search in high-dimensional spaces. We show that transforming points into sequences can diminish the negative effects of the dimensionality curse, permitting efficient NN search. The transformed sequences are optimally reordered, segmented, and stored in a low-dimensional index. Experimental results indicate that the proposed representation can be a useful tool for the fast analysis and visualization of high-dimensional databases.
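The pipeline outlined in the abstract (points become sequences, dimensions are reordered, sequences are segmented) can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify how the reordering or segmentation is computed. The greedy column-similarity heuristic, the equal-length segment means, and all function names below are assumptions chosen for illustration. The sketch also assumes that the number of dimensions is divisible by the number of segments. The segment-mean distance it computes is a standard lower bound on the Euclidean distance, which is what lets a low-dimensional index prune candidates safely.

```python
import math


def reorder_dimensions(points):
    """Assumed heuristic: greedily chain dimensions so that neighbouring
    dimensions have similar values across the data set."""
    d = len(points[0])
    cols = [[p[j] for p in points] for j in range(d)]

    def col_dist(a, b):
        # Squared distance between two dimensions (columns).
        return sum((x - y) ** 2 for x, y in zip(cols[a], cols[b]))

    order = [0]
    remaining = set(range(1, d))
    while remaining:
        last = order[-1]
        # Ties are broken by the smaller dimension index.
        nxt = min(remaining, key=lambda j: (col_dist(last, j), j))
        order.append(nxt)
        remaining.remove(nxt)
    return order


def apply_order(point, order):
    """View a point as a sequence over the reordered dimensions."""
    return [point[j] for j in order]


def segment_means(seq, n_segments):
    """Summarise a sequence by the mean of each equal-length segment.
    Assumes len(seq) is divisible by n_segments."""
    seg_len = len(seq) // n_segments
    return [
        sum(seq[i * seg_len:(i + 1) * seg_len]) / seg_len
        for i in range(n_segments)
    ]


def lower_bound(a_means, b_means, seg_len):
    """Lower bound on the Euclidean distance between the full sequences."""
    return math.sqrt(seg_len * sum((x - y) ** 2 for x, y in zip(a_means, b_means)))


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Toy data, invented for this example.
points = [[1, 9, 2, 8], [2, 8, 1, 9], [5, 5, 5, 5]]
order = reorder_dimensions(points)
sequences = [apply_order(p, order) for p in points]
summaries = [segment_means(s, 2) for s in sequences]
```

Because the bound never exceeds the true distance, a search can discard any candidate whose bound is already larger than the best distance found so far, and compute exact distances only for the remaining candidates.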
© 2006 Springer-Verlag Berlin Heidelberg
Cite this paper
Vlachos, M., Papadimitriou, S., Vagena, Z., Yu, P.S. (2006). RIVA: Indexing and Visualization of High-Dimensional Data Via Dimension Reorderings. In: Fürnkranz, J., Scheffer, T., Spiliopoulou, M. (eds) Knowledge Discovery in Databases: PKDD 2006. Lecture Notes in Computer Science, vol 4213. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11871637_39
DOI: https://doi.org/10.1007/11871637_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-45374-1
Online ISBN: 978-3-540-46048-0
eBook Packages: Computer Science, Computer Science (R0)