Interactive-Time Similarity Search for Large Image Collections Using Parallel VA-Files

  • Roger Weber
  • Klemens Böhm
  • Hans-J. Schek
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1923)


In digital libraries, nearest-neighbor search (NN-search) plays a key role in content-based retrieval over multimedia objects. However, the performance of existing NN-search techniques is not satisfactory for large collections and high-dimensional representations of the objects. To obtain interactive response times, we pursue the following approach: we use a linear algorithm that works with approximations of the vectors, and we parallelize it. In more detail, we parallelize NN-search based on the VA-File in a Network of Workstations (NOW). This approach reduces search time to a reasonable level for large collections. The best speedup we have observed is almost 30 for a NOW with only three components and 900 MB of feature data. However, this requires a number of design decisions, in particular when load dynamism and heterogeneity of components are taken into account. Our contribution is to address these design issues.
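The approach summarized in the abstract, a linear scan over vector approximations that is then parallelized, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes feature values in [0, 1), a uniform grid with a fixed number of bits per dimension, Euclidean distance, and round-robin partitioning, and it uses threads as stand-ins for the workstations of a NOW. All function names and parameters are hypothetical, and load dynamism and heterogeneity are ignored.

```python
import heapq
import math
import random
from concurrent.futures import ThreadPoolExecutor

BITS = 3            # assumed bits per dimension (illustrative choice)
CELLS = 1 << BITS   # number of grid cells per dimension

def approximate(vec):
    """Quantize each coordinate (assumed in [0, 1)) to a grid cell index."""
    return [min(int(x * CELLS), CELLS - 1) for x in vec]

def lower_bound(query, approx):
    """Smallest possible Euclidean distance from query to any point in the cell."""
    total = 0.0
    for q, c in zip(query, approx):
        lo, hi = c / CELLS, (c + 1) / CELLS
        if q < lo:
            total += (lo - q) ** 2
        elif q > hi:
            total += (q - hi) ** 2
    return math.sqrt(total)

def search_partition(query, vectors, approxs, k):
    """Scan all approximations, then visit exact vectors in lower-bound order."""
    order = sorted((lower_bound(query, a), i) for i, a in enumerate(approxs))
    best = []  # max-heap of the k best so far, stored as (-distance, local index)
    for lb, i in order:
        if len(best) == k and lb > -best[0][0]:
            break  # every remaining vector is at least lb away, so none can improve
        d = math.dist(query, vectors[i])
        if len(best) < k:
            heapq.heappush(best, (-d, i))
        elif d < -best[0][0]:
            heapq.heapreplace(best, (-d, i))
    return sorted((-nd, i) for nd, i in best)

def build_partitions(vectors, n_parts):
    """Round-robin split; each part holds (global ids, vectors, approximations)."""
    parts = []
    for p in range(n_parts):
        ids = list(range(p, len(vectors), n_parts))
        vecs = [vectors[i] for i in ids]
        parts.append((ids, vecs, [approximate(v) for v in vecs]))
    return parts

def parallel_knn(query, partitions, k):
    """Search every partition concurrently and merge the local k-NN results."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        local_results = list(pool.map(
            lambda part: search_partition(query, part[1], part[2], k), partitions))
    merged = [(d, part[0][i])
              for part, local in zip(partitions, local_results)
              for d, i in local]
    return heapq.nsmallest(k, merged)

if __name__ == "__main__":
    random.seed(0)
    data = [[random.random() for _ in range(8)] for _ in range(500)]
    query = [random.random() for _ in range(8)]
    for dist, idx in parallel_knn(query, build_partitions(data, 3), k=5):
        print(idx, round(dist, 4))
```

Because each partition returns its own k nearest vectors, the union of the local results always contains the global k nearest neighbors, so the merge step only needs to pick the k smallest distances. In CPython, threads will not speed up this CPU-bound scan; they only illustrate the coordinator and worker structure.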


Digital Library, Main Memory, Search Time, Feature Data, Search Cost
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2000

Authors and Affiliations

  • Roger Weber (1)
  • Klemens Böhm (1)
  • Hans-J. Schek (1)
  1. Institute of Information Systems, ETH Zentrum, Zurich, Switzerland
