RIVA: Indexing and Visualization of High-Dimensional Data Via Dimension Reorderings

Vlachos, Michail; Papadimitriou, Spiros; Vagena, Zografoula; Yu, Philip S.

doi:10.1007/11871637_39

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4213)

Abstract

We propose a new representation for high-dimensional data that can prove very effective for visualization, nearest neighbor (NN) and range searches. It has been unequivocally demonstrated that existing index structures cannot facilitate efficient search in high-dimensional spaces. We show that a transformation from points to sequences can potentially diminish the negative effects of the dimensionality curse, permitting an efficient NN-search. The transformed sequences are optimally reordered, segmented and stored in a low-dimensional index. The experimental results validate that the proposed representation can be a useful tool for the fast analysis and visualization of high-dimensional databases.
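
The pipeline outlined in the abstract (treat each point as a sequence, reorder its dimensions, segment it, and keep a low-dimensional summary) can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' method: the function names are hypothetical, the reordering simply sorts dimensions by their average value, and each segment is summarized by its mean. Segment means are used because they yield a distance that never exceeds the true Euclidean distance, which is what a filter-and-refine NN search needs.

```python
import math


def order_by_mean(points):
    """Hypothetical reordering: sort dimension indices by average value."""
    dims = range(len(points[0]))
    return sorted(dims, key=lambda d: sum(p[d] for p in points) / len(points))


def to_sequence(point, order):
    """View a point as a sequence by reading its dimensions in `order`."""
    return [point[d] for d in order]


def segment_means(seq, n_segments):
    """Summarize a sequence by the mean of each equal-length segment."""
    if len(seq) % n_segments != 0:
        raise ValueError("sequence length must be divisible by n_segments")
    seg_len = len(seq) // n_segments
    return [sum(seq[i:i + seg_len]) / seg_len
            for i in range(0, len(seq), seg_len)]


def lower_bound(means_a, means_b, seg_len):
    """Distance on the summaries; never larger than the Euclidean distance."""
    return math.sqrt(seg_len * sum((x - y) ** 2
                                   for x, y in zip(means_a, means_b)))


def euclidean(a, b):
    """Exact Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Toy data, invented for illustration only.
points = [
    [0.9, 0.1, 0.8, 0.2, 0.7, 0.3, 0.6, 0.4],
    [0.8, 0.2, 0.9, 0.1, 0.5, 0.3, 0.7, 0.2],
    [0.1, 0.9, 0.2, 0.8, 0.3, 0.7, 0.4, 0.6],
]
order = order_by_mean(points)
seqs = [to_sequence(p, order) for p in points]
n_segments = 2
seg_len = len(order) // n_segments
summaries = [segment_means(s, n_segments) for s in seqs]
```

In a search, a candidate whose `lower_bound` to the query already exceeds the best exact distance found so far could be discarded without computing `euclidean`. Because Euclidean distance is unchanged by permuting dimensions, any reordering keeps the bound valid; the choice of ordering only affects how tight the bound is.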

Author information

Authors and Affiliations

IBM T.J. Watson Research Center, Hawthorne, NY, USA
Michail Vlachos, Spiros Papadimitriou & Philip S. Yu
IBM Almaden Research Center, San Jose, CA, USA
Zografoula Vagena

Editor information

Editors and Affiliations

Knowledge Engineering Group, Technische Universität Darmstadt
Johannes Fürnkranz
Max Planck Institute for Computer Science, Saarbrücken, Germany
Tobias Scheffer
Faculty of Computer Science, Otto-von-Guericke-University Magdeburg, Germany
Myra Spiliopoulou

About this paper

Cite this paper

Vlachos, M., Papadimitriou, S., Vagena, Z., Yu, P.S. (2006). RIVA: Indexing and Visualization of High-Dimensional Data Via Dimension Reorderings. In: Fürnkranz, J., Scheffer, T., Spiliopoulou, M. (eds) Knowledge Discovery in Databases: PKDD 2006. PKDD 2006. Lecture Notes in Computer Science, vol 4213. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11871637_39

DOI: https://doi.org/10.1007/11871637_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-45374-1
Online ISBN: 978-3-540-46048-0
eBook Packages: Computer Science, Computer Science (R0)