An efficient index structure for distributed k-nearest neighbours query processing

  • Min Yang
  • Kun Ma
  • Xiaohui Yu


Many location-based services rely on the moving k-nearest neighbour (k-NN) query, which continuously returns the k nearest data objects to a query point. Most existing approaches to this problem focus on a centralized setting and scale poorly to massive, distributed data sets. In this paper, we propose an efficient distributed solution for k-NN queries over moving objects to cope with the increasing scale of data. The approach comprises a new grid-based index, the Block Grid Index (BGI), and a distributed k-NN query algorithm built on BGI. It has three advantages: (1) BGI can be easily constructed and maintained in a distributed setting; (2) the algorithm returns the result set in only two iterations; and (3) the efficiency of k-NN queries is improved. The efficiency of our solution is verified by extensive experiments with millions of nodes.
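To make the idea of a grid-based k-NN lookup concrete, the following is a minimal single-machine sketch. It is not the paper's BGI or its distributed algorithm, whose details are not given in this abstract. The class name `GridIndex`, the methods `upsert` and `knn`, and the uniform `cell_size` are all hypothetical. The two passes only loosely echo the "two iterations" mentioned above: the first pass looks at per-cell counts to bound the search radius, and the second pass scans the cells inside that radius.

```python
import heapq
import math
from collections import defaultdict


class GridIndex:
    """Uniform 2-D grid over moving points (illustrative stand-in, not BGI)."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(dict)  # (cx, cy) -> {object_id: (x, y)}
        self.where = {}                 # object_id -> (cx, cy)

    def _cell(self, x, y):
        s = self.cell_size
        return (math.floor(x / s), math.floor(y / s))

    def upsert(self, oid, x, y):
        """Insert an object, or move it if it is already indexed."""
        old, new = self.where.get(oid), self._cell(x, y)
        if old is not None and old != new:
            del self.cells[old][oid]
            if not self.cells[old]:
                del self.cells[old]
        self.cells[new][oid] = (x, y)
        self.where[oid] = new

    def _block_count(self, cx, cy, r):
        """Objects in the (2r+1) x (2r+1) block of cells centred on (cx, cy)."""
        return sum(
            len(self.cells.get((i, j), ()))
            for i in range(cx - r, cx + r + 1)
            for j in range(cy - r, cy + r + 1)
        )

    def knn(self, qx, qy, k):
        """Return up to k (distance, object_id) pairs, nearest first."""
        k = min(k, len(self.where))
        if k == 0:
            return []
        cx, cy = self._cell(qx, qy)

        # Pass 1: grow a block of cells, using counts only, until it holds
        # at least k objects. Every object in that block lies within
        # (r + 1) * cell_size * sqrt(2) of the query point, so the k-th
        # nearest neighbour cannot be farther away than this bound.
        r = 0
        while self._block_count(cx, cy, r) < k:
            r += 1
        bound = (r + 1) * self.cell_size * math.sqrt(2)

        # Pass 2: scan every cell that can contain a point within the
        # bound and keep the k closest objects by exact distance.
        reach = math.ceil(bound / self.cell_size)
        candidates = []
        for i in range(cx - reach, cx + reach + 1):
            for j in range(cy - reach, cy + reach + 1):
                for oid, (x, y) in self.cells.get((i, j), {}).items():
                    candidates.append((math.hypot(x - qx, y - qy), oid))
        return heapq.nsmallest(k, candidates)


idx = GridIndex(cell_size=1.0)
for oid, (x, y) in {"a": (0.5, 0.5), "b": (3.2, 0.1),
                    "c": (9.0, 9.0), "d": (1.6, 1.2)}.items():
    idx.upsert(oid, x, y)
nearest_ids = [oid for _, oid in idx.knn(1.0, 1.0, k=2)]
```

Because objects move, `upsert` is written to relocate an object between cells cheaply. In a distributed setting one might assume that the first pass needs only small per-cell counters while the second pass touches the partitions holding the candidate cells, but that mapping is an assumption here rather than something the abstract states.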


  • k-Nearest neighbour query
  • Distributed query processing
  • Moving objects



This work was supported in part by the 973 Program (2015CB352500), the National Natural Science Foundation of China Grant (61272092), the Shandong Provincial Natural Science Foundation Grant (ZR2012FZ004), the Science and Technology Development Program of Shandong Province (2014GGE27178), the Taishan Scholars Program and NSERC Discovery Grants.

Compliance with ethical standards

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.



Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  1. School of Computer Science and Technology, Shandong University, Jinan, China
  2. School of Information Science and Engineering, University of Jinan, Jinan, China
  3. School of Information Technology, University of York, Toronto, Canada
