Processing All k-Nearest Neighbor Queries in Hadoop

  • Conference paper
Web-Age Information Management (WAIM 2012)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7418)

Abstract

A k-nearest neighbor (k-NN) query, which retrieves the k nearest points from a database, is one of the fundamental query types in spatial databases. An all k-nearest neighbor (AkNN) query, a variation of the k-NN query, determines the k nearest neighbors of every point in the dataset in a single query process. In this paper, we propose a method for processing AkNN queries in Hadoop. We decompose the given space into cells and execute the query using the MapReduce framework in a distributed and parallel manner. By using the distribution statistics of the target data points, our method can process the given queries efficiently.
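
The abstract names the ingredients (cell decomposition, then per-cell processing) but gives no algorithmic detail, so the following is only a minimal single-process sketch of the general idea, not the authors' Hadoop implementation. It assumes 2-D points, a uniform grid with a fixed `cell_size`, and Euclidean distance; the function names `assign_cells`, `knn_for_point`, and `all_knn` are hypothetical. The grouping in `assign_cells` plays the role of a map step, and the per-point search plays the role of the reduce-side work. The sketch does not use the distribution statistics mentioned in the abstract.

```python
import math
from collections import defaultdict


def assign_cells(points, cell_size):
    """Map-like step: key each point index by the grid cell that contains it."""
    cells = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        key = (math.floor(x / cell_size), math.floor(y / cell_size))
        cells[key].append(idx)
    return cells


def knn_for_point(idx, points, cells, cell_size, k):
    """Reduce-like step for one point: search growing square rings of cells."""
    px, py = points[idx]
    cx, cy = math.floor(px / cell_size), math.floor(py / cell_size)
    # No occupied cell lies beyond this ring, so the search can always stop there.
    max_ring = max(max(abs(gx - cx), abs(gy - cy)) for gx, gy in cells)
    candidates = []
    ring = 0
    while True:
        # Visit only the cells on the border of the current square (the new ring).
        for gx in range(cx - ring, cx + ring + 1):
            for gy in range(cy - ring, cy + ring + 1):
                if max(abs(gx - cx), abs(gy - cy)) != ring:
                    continue
                for j in cells.get((gx, gy), ()):
                    if j != idx:
                        qx, qy = points[j]
                        candidates.append((math.hypot(qx - px, qy - py), j))
        candidates.sort()
        # Distance from the point to the edge of the square searched so far.
        # Every point not yet seen is at least this far away.
        covered = min(
            px - (cx - ring) * cell_size,
            (cx + ring + 1) * cell_size - px,
            py - (cy - ring) * cell_size,
            (cy + ring + 1) * cell_size - py,
        )
        enough = len(candidates) >= k and candidates[k - 1][0] <= covered
        if enough or ring >= max_ring:
            return [j for _, j in candidates[:k]]
        ring += 1


def all_knn(points, k, cell_size):
    """AkNN: the k nearest neighbors (as indices) of every point in the dataset."""
    cells = assign_cells(points, cell_size)
    return {
        i: knn_for_point(i, points, cells, cell_size, k)
        for i in range(len(points))
    }
```

For example, `all_knn([(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0)], k=1, cell_size=2.0)` pairs the first two points with each other and the last two with each other. The stopping test is what keeps the result exact: a point's search ends only once its k-th candidate is no farther than the edge of the region already examined, so neighbors in adjacent cells are never missed. In a distributed setting each cell's points would be handled by a separate task, and how boundary points are shared between tasks is a design decision this sketch does not address.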

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Yokoyama, T., Ishikawa, Y., Suzuki, Y. (2012). Processing All k-Nearest Neighbor Queries in Hadoop. In: Gao, H., Lim, L., Wang, W., Li, C., Chen, L. (eds) Web-Age Information Management. WAIM 2012. Lecture Notes in Computer Science, vol 7418. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32281-5_34

  • DOI: https://doi.org/10.1007/978-3-642-32281-5_34

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-32280-8

  • Online ISBN: 978-3-642-32281-5

  • eBook Packages: Computer Science, Computer Science (R0)
