Approximate k-Nearest Neighbor Query over Spatial Data Federation

  • Conference paper
Database Systems for Advanced Applications (DASFAA 2023)

Abstract

Approximate nearest neighbor query is a fundamental spatial query with many real-world applications. In the big data era, there is growing demand to scale these queries over a spatial data federation, which consists of multiple data owners, each holding a private, disjoint partition of the entire spatial dataset. Enabling approximate k-nearest neighbor query over such a federation is non-trivial: stringent security constraints are often imposed to protect the sensitive, privately owned data partitions, and naively extending prior secure query processing solutions is highly inefficient (e.g., 100 s per query). In this paper, we propose two novel algorithms for efficient and secure approximate k-nearest neighbor query over a spatial data federation. We theoretically analyze their communication cost and time complexity, and prove their security guarantees and approximation bounds. Extensive experiments show that our algorithms outperform the state-of-the-art solutions in query efficiency and often yield higher accuracy.
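
To make the problem setting concrete, the following minimal sketch runs a k-nearest neighbor query over disjoint partitions held by several owners. It is a plaintext illustration only and is not either of the algorithms proposed in the paper: it offers no security (the coordinator sees raw coordinates) and computes exact rather than approximate answers. The function names, the three owners, and the sample points are all hypothetical.

```python
import heapq
import math


def local_knn(points, query, k):
    """Return the k points of one owner's partition closest to `query`."""
    return heapq.nsmallest(k, points, key=lambda p: math.dist(p, query))


def federated_knn_plaintext(partitions, query, k):
    """Merge every owner's local top-k and keep the global top-k.

    NOT secure: the coordinator sees raw coordinates from every owner.
    This is an assumed baseline for illustration, not the paper's method.
    """
    candidates = [p for part in partitions for p in local_knn(part, query, k)]
    return heapq.nsmallest(k, candidates, key=lambda p: math.dist(p, query))


# Three hypothetical owners holding disjoint partitions of 2-D points.
partitions = [
    [(0.0, 0.0), (5.0, 5.0)],
    [(1.0, 1.0), (9.0, 9.0)],
    [(2.0, 0.0), (8.0, 1.0)],
]
result = federated_knn_plaintext(partitions, query=(0.0, 0.0), k=2)
print(result)
```

Because the partitions are disjoint, the global k nearest neighbors are always contained in the union of the owners' local top-k lists, so the merge step is correct. A secure solution must reach a comparable answer without revealing those local lists.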

Acknowledgements

We are grateful to the anonymous reviewers for their constructive comments. This work is partially supported by the National Science Foundation of China (NSFC) under Grant Nos. U21A20516 and 62076017, the Beihang University Basic Research Funding No. YWF-22-L-531, the Funding No. 22-TQ23-14-ZD-01-001, and the WeBank Scholars Program. Lei Chen’s work is partially supported by the National Science Foundation of China (NSFC) under Grant No. U22B2060, the Hong Kong RGC GRF Project 16213620, RIF Project R6020-19, AOE Project AoE/E-603/18, Theme-based project TRS T41-603/20R, China NSFC No. 61729201, Guangdong Basic and Applied Basic Research Foundation 2019B151530001, Hong Kong ITC ITF grants MHX/078/21 and PRP/004/22FX, the Microsoft Research Asia Collaborative Research Grant, and HKUST-Webank joint research lab grants.

Author information

Corresponding author

Correspondence to Yongxin Tong.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Zhang, K. et al. (2023). Approximate k-Nearest Neighbor Query over Spatial Data Federation. In: Wang, X., et al. Database Systems for Advanced Applications. DASFAA 2023. Lecture Notes in Computer Science, vol 13943. Springer, Cham. https://doi.org/10.1007/978-3-031-30637-2_23

  • DOI: https://doi.org/10.1007/978-3-031-30637-2_23

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-30636-5

  • Online ISBN: 978-3-031-30637-2

  • eBook Packages: Computer Science, Computer Science (R0)