Abstract
In this paper, we consider the following sum query problem: Given a point set P in \({\mathbb {R}}^d\) and a distance-based function f(p, q) (i.e., a function of the distance between p and q) satisfying some general properties, the goal is to develop a data structure and a query algorithm for efficiently computing a \((1+\epsilon )\)-approximate solution to the sum \(\sum _{p \in P} f(p,q)\) for any query point \(q \in {\mathbb {R}}^d\) and any small constant \(\epsilon >0\). Existing techniques for this problem rely mainly on core-set techniques, which often have difficulty handling functions with the local domination property. Building on several new insights into the problem, we develop a novel technique that overcomes these difficulties. Our algorithm answers queries with high success probability in time no more than \({\tilde{O}}_{\epsilon ,d}(n^{0.5 + c})\), and the underlying data structure can be constructed in \({\tilde{O}}_{\epsilon ,d}(n^{1+c})\) time for any \(c>0\), where the hidden constant has only polynomial dependence on \(1/\epsilon\) and d. Our technique is simple and easy to implement in practice.
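To make the problem statement concrete, the following is a minimal brute-force sketch of the quantity being approximated, \(\sum _{p \in P} f(p,q)\), computed in linear time per query. It is only a baseline for illustration, not the data structure developed in the paper. The function names and the inverse-square choice of f are assumptions made for this example; the example also shows how a single point close to q can account for almost the entire sum, which is one informal reading of local domination.

```python
import math


def distance(p, q):
    """Euclidean distance between two points given as equal-length tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def exact_sum(points, q, F):
    """Brute-force evaluation of sum_{p in P} F(||p - q||) in O(n * d) time."""
    return sum(F(distance(p, q)) for p in points)


def largest_share(points, q, F):
    """Fraction of the total sum contributed by the single largest term."""
    terms = [F(distance(p, q)) for p in points]
    return max(terms) / sum(terms)


def inverse_square(x):
    """Hypothetical distance-based function, chosen only for illustration."""
    return 1.0 / (x * x)


# 1000 points at distances 1, 2, ..., 1000 from the query, plus one very close point.
points = [(float(i), 0.0) for i in range(1, 1001)]
points.append((0.0, 0.01))
q = (0.0, 0.0)

total = exact_sum(points, q, inverse_square)
share = largest_share(points, q, inverse_square)
print(f"exact sum = {total:.4f}, largest single share = {share:.6f}")
```

In this configuration the one nearby point contributes nearly all of the sum, so an estimate built from a small uniform sample of P would usually miss it.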
Notes
In fact, this restriction can be relaxed considerably. Our scheme applies as long as \(F(\cdot )\) is “not increasing rapidly”, i.e., \(F(x_1) \le CF(x_2)\) for some constant C whenever \(x_1 > x_2\). The restriction as stated is mainly for ease of presentation.
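As a rough illustration of the relaxed condition, the sketch below estimates the smallest constant C for which \(F(x_1) \le CF(x_2)\) holds over all pairs \(x_1 > x_2\) drawn from a finite grid. The helper name and the sample functions are hypothetical, the check assumes F is strictly positive on the grid, and a finite grid can only suggest, not prove, that the condition holds.

```python
def empirical_growth_constant(F, xs):
    """Largest ratio F(x1) / F(x2) over grid pairs with x1 > x2.

    Assumes F is strictly positive on every grid value.
    """
    xs = sorted(xs)
    smallest_so_far = F(xs[0])  # minimum of F over all smaller grid values
    worst_ratio = 0.0
    for x in xs[1:]:
        value = F(x)
        worst_ratio = max(worst_ratio, value / smallest_so_far)
        smallest_so_far = min(smallest_so_far, value)
    return worst_ratio


grid = [0.5 * k for k in range(1, 201)]

# A decreasing function never exceeds its earlier values, so C = 1 suffices.
print(empirical_growth_constant(lambda x: 1.0 / (1.0 + x), grid))

# A linearly growing function needs a constant that grows with the grid range.
print(empirical_growth_constant(lambda x: x, grid))
```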
Acknowledgements
This research was supported in part by NSF through Grants IIS-1422591, CCF-1422324, CNS-1547167, CCF-1716400, and IIS-1910492. A preliminary version of this paper appeared in the Proceedings of the 28th International Symposium on Algorithms and Computation (ISAAC 2017).
Cite this article
Huang, Z., Xu, J. An Efficient Sum Query Algorithm for Distance-Based Locally Dominating Functions. Algorithmica 82, 2415–2431 (2020). https://doi.org/10.1007/s00453-020-00691-w
DOI: https://doi.org/10.1007/s00453-020-00691-w