Processing All k-Nearest Neighbor Queries in Hadoop

Yokoyama, Takuya; Ishikawa, Yoshiharu; Suzuki, Yu

doi:10.1007/978-3-642-32281-5_34

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7418)

Included in the following conference series:

International Conference on Web-Age Information Management

Abstract

A k-nearest neighbor (k-NN) query, which retrieves the k nearest points from a database, is one of the fundamental query types in spatial databases. An all k-nearest neighbor (AkNN) query, a variation of the k-NN query, determines the k nearest neighbors of every point in the dataset in a single query process. In this paper, we propose a method for processing AkNN queries in Hadoop. We decompose the given space into cells and execute the query using the MapReduce framework in a distributed and parallel manner. By using the distribution statistics of the target data points, our method can process the given queries efficiently.
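The abstract does not give algorithmic details, so the following is only a minimal single-machine sketch of the general idea of cell-based AkNN processing, not the authors' Hadoop implementation. It assumes 2-D points, a uniform grid with a hypothetical `cell_size` parameter, and hypothetical function names (`map_to_cells`, `grid_aknn`, `brute_force_aknn`). The "map" step assigns each point to a cell; the per-cell step then searches outward ring by ring until the k-th candidate is provably closer than any point in an unsearched cell. The paper's use of distribution statistics to choose the decomposition is not modeled here.

```python
import math
from collections import defaultdict


def brute_force_aknn(points, k):
    """Reference AkNN: compare every point with every other point."""
    result = {}
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j)
                       for j, q in enumerate(points) if j != i)
        result[i] = [j for _, j in dists[:k]]
    return result


def map_to_cells(points, cell_size):
    """'Map' step: key each point index by the grid cell that contains it."""
    cells = defaultdict(list)
    for i, (x, y) in enumerate(points):
        key = (math.floor(x / cell_size), math.floor(y / cell_size))
        cells[key].append(i)
    return cells


def grid_aknn(points, k, cell_size):
    """'Reduce'-like step: for each cell, answer k-NN for its points by
    searching the cell and then successively larger rings of cells."""
    cells = map_to_cells(points, cell_size)
    xs = [c[0] for c in cells]
    ys = [c[1] for c in cells]
    # A ring this wide covers every occupied cell from any starting cell.
    max_ring = max(max(xs) - min(xs), max(ys) - min(ys))
    result = {}
    for (cx, cy), members in cells.items():
        for i in members:
            p = points[i]
            candidates = []
            ring = 0
            while True:
                # Visit only the cells at Chebyshev distance `ring`.
                for dx in range(-ring, ring + 1):
                    for dy in range(-ring, ring + 1):
                        if max(abs(dx), abs(dy)) != ring:
                            continue
                        for j in cells.get((cx + dx, cy + dy), ()):
                            if j != i:
                                candidates.append(
                                    (math.dist(p, points[j]), j))
                candidates.sort()
                # Any point outside the searched block lies farther than
                # ring * cell_size from p, so the current top k is final.
                if (len(candidates) >= k
                        and candidates[k - 1][0] <= ring * cell_size):
                    break
                if ring >= max_ring:
                    break  # every occupied cell has been searched
                ring += 1
            result[i] = [j for _, j in candidates[:k]]
    return result
```

In a real MapReduce job the cell key would serve as the shuffle key, and points near cell borders would have to be replicated to, or exchanged between, neighboring cells; this sketch sidesteps that by keeping all cells in one in-memory dictionary.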

Author information

Authors and Affiliations

Graduate School of Information Science, Nagoya University, Japan
Takuya Yokoyama & Yoshiharu Ishikawa
Information Technology Center, Nagoya University, Japan
Yoshiharu Ishikawa & Yu Suzuki
National Institute of Informatics, Japan
Yoshiharu Ishikawa

Editor information

Editors and Affiliations

School of Computer Science and Technology, Harbin Institute of Technology, No. 92, West Dazhi Street, 150001, Heilongjiang, Harbin, China
Hong Gao
Information and Computer Science Department, University of Hawaii, 1680 East West Road, 96822, Honolulu, HI, USA
Lipyeow Lim
School of Computer Science, Fudan University, No. 220, Handan Road, 200433, Shanghai, China
Wei Wang
School of Computer Science and Technology, Sichuan University, No. 29 Jiuyanqiao Wangjing Road, 610064, Chengdu, Sichuan, China
Chuan Li
Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
Lei Chen

About this paper

Cite this paper

Yokoyama, T., Ishikawa, Y., Suzuki, Y. (2012). Processing All k-Nearest Neighbor Queries in Hadoop. In: Gao, H., Lim, L., Wang, W., Li, C., Chen, L. (eds) Web-Age Information Management. WAIM 2012. Lecture Notes in Computer Science, vol 7418. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32281-5_34

DOI: https://doi.org/10.1007/978-3-642-32281-5_34
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32280-8
Online ISBN: 978-3-642-32281-5
eBook Packages: Computer Science, Computer Science (R0)
