RIVA: Indexing and Visualization of High-Dimensional Data Via Dimension Reorderings

  • Michail Vlachos
  • Spiros Papadimitriou
  • Zografoula Vagena
  • Philip S. Yu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4213)

Abstract

We propose a new representation for high-dimensional data that can prove very effective for visualization, nearest neighbor (NN) and range searches. It has been unequivocally demonstrated that existing index structures cannot facilitate efficient search in high-dimensional spaces. We show that a transformation from points to sequences can potentially diminish the negative effects of the dimensionality curse, permitting an efficient NN-search. The transformed sequences are optimally reordered, segmented and stored in a low-dimensional index. The experimental results validate that the proposed representation can be a useful tool for the fast analysis and visualization of high-dimensional databases.
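The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea (treat each point as a sequence, reorder its dimensions so that similar dimensions become adjacent, then compress the sequence into a few segment means that a low-dimensional index could store). The dimension distance, the greedy nearest-neighbor tour (a simple heuristic for the traveling salesman problem), the equal-width segmentation, and all function names are assumptions made for illustration; they are not taken from the paper.

```python
from math import sqrt

def dimension_distance(data, i, j):
    """Euclidean distance between columns i and j over all points (assumed measure)."""
    return sqrt(sum((row[i] - row[j]) ** 2 for row in data))

def greedy_dimension_order(data):
    """Greedy nearest-neighbor tour over the dimensions, starting at dimension 0.

    This is a simple TSP heuristic, not an optimal ordering.
    """
    num_dims = len(data[0])
    order = [0]
    remaining = set(range(1, num_dims))
    while remaining:
        last = order[-1]
        # Pick the closest unvisited dimension; ties go to the lower index.
        nxt = min(remaining, key=lambda j: (dimension_distance(data, last, j), j))
        order.append(nxt)
        remaining.remove(nxt)
    return order

def reorder(point, order):
    """Rearrange one point's coordinates according to the dimension order."""
    return [point[i] for i in order]

def segment_means(seq, k):
    """Split seq into k contiguous, near-equal segments and return their means.

    Assumes 1 <= k <= len(seq), so that no segment is empty.
    """
    n = len(seq)
    bounds = [s * n // k for s in range(k + 1)]
    return [sum(seq[a:b]) / (b - a) for a, b in zip(bounds, bounds[1:])]

# Hypothetical toy data: dimensions 0 and 2 behave alike, as do 1 and 3.
data = [
    [1.0, 9.0, 1.1, 9.2],
    [2.0, 8.0, 2.1, 8.1],
    [3.0, 7.0, 3.2, 7.1],
]

order = greedy_dimension_order(data)
sequences = [reorder(p, order) for p in data]
summaries = [segment_means(s, 2) for s in sequences]
```

On this toy data the similar dimensions end up next to each other, so each two-value summary stays close to the coordinates it replaces. In the original dimension order every segment would mix a small and a large value, and the segment mean would describe neither well.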

Keywords

Travel Salesman Problem; Near Neighbor; Query Point; Projected Dimension; Dimension Distance
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Michail Vlachos (1)
  • Spiros Papadimitriou (1)
  • Zografoula Vagena (2)
  • Philip S. Yu (1)

  1. IBM T.J. Watson Research Center, Hawthorne, USA
  2. IBM Almaden Research Center, San Jose, USA
