Optimal Metric Search Is Equivalent to the Minimum Dominating Set Problem

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12440)

Abstract

In metric search, worst-case analysis is of little value, as the search invariably degenerates to a linear scan for ill-behaved data. Consequently, much effort has been expended on more nuanced descriptions of what performance might in fact be attainable, including heuristic baselines like the AESA family, as well as statistical proxies such as intrinsic dimensionality. This paper gets to the heart of the matter with an exact characterization of the best performance actually achievable for any given data set and query. Specifically, linear-time objective-preserving reductions are established in both directions between optimal metric search and the minimum dominating set problem, whose greedy approximation becomes the equivalent of an oracle-based AESA, repeatedly selecting the pivot that eliminates the most remaining points. As an illustration, the AESA heuristic is adapted to downplay the role of previously eliminated points, yielding modest performance improvements over both the original and its younger relative, iAESA2.
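The greedy approximation mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the paper's own code: it assumes the directed graph is given as a dictionary from each vertex to its set of out-neighbors, and the name `greedy_dominating_set` is hypothetical. In the metric-search reading, a vertex is an object and an edge from p to o means that computing the query distance to p would eliminate o, which is what makes the selection rule oracle-based.

```python
def greedy_dominating_set(out_neighbors):
    """Greedy approximation of a minimum dominating set in a directed graph.

    out_neighbors maps every vertex to the set of vertices it dominates
    (its out-neighbors). Each vertex also dominates itself.
    """
    remaining = set(out_neighbors)  # vertices not yet dominated
    chosen = []
    while remaining:
        # Select the vertex whose closed out-neighborhood covers the most
        # vertices that are still undominated (ties: first in dict order).
        best = max(
            out_neighbors,
            key=lambda v: len(({v} | out_neighbors[v]) & remaining),
        )
        chosen.append(best)
        remaining -= {best} | out_neighbors[best]
    return chosen


if __name__ == "__main__":
    # Toy graph (assumed for illustration): "a" dominates "b" and "c",
    # "d" dominates "a", and so on.
    graph = {"a": {"b", "c"}, "b": {"c"}, "c": set(), "d": {"a"}}
    print(greedy_dominating_set(graph))  # ['a', 'd']
```

The loop always terminates, because any undominated vertex covers at least itself. The number of selected vertices corresponds to the number of distance computations such an oracle-guided search would perform.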

Notes

  1. Other measures include the distance exponent [24] and the ball-overlap factor [21].

  2. Personal communication, July 2012.

  3. Note that the reductions are to and from two different versions of the dominating set problem (the directed and undirected version, respectively). At the price of slightly looser bounds, one could stick with just one of these.

  4. This is the worst case given that the optimal number of distance computations is some value \(\gamma\), not the more general, non-informative worst-case of \(\Omega(n)\).

  5. Optimal kNN with upper bounds does not map as cleanly to dominating sets.

  6. The undirected version is most commonly discussed, with a reduction, e.g., from set covering [13, Th. A.1]. A similar reduction to the directed version is straightforward.

  7. In terms of vertices, not edges.

  8. Note that only the lower bound is relevant, as the upper bound is always greater than the search radius.

  9. If the new distance is allowed to use the original graph as part of its definition, the reduction can be performed in constant time; it is merely a reinterpretation.

  10. The upper bound is easily shown by reinterpreting the minimum dominating set problem for a directed graph \(G = (V, E)\) as the problem of covering \(V\) with the closed out-neighborhoods of \(G\), translating the standard set covering approximation [26].

  11. That is, for any range search instance, there is a directed graph with the objects as its nodes for which the equivalence holds. Reducing in the other direction preserves the objective value, but not necessarily the number of nodes/objects.

  12. Chávez et al. say that such independence is a “reasonable approximation” [5].
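Notes 8 and 11 refer to a pivoting lower bound and to a directed graph with the objects as its nodes. As a rough illustration only, and not necessarily the construction used in the paper, the sketch below builds such a graph for a range query under the assumption that an edge from p to o means the triangle-inequality lower bound \(|d(q,p) - d(p,o)|\) exceeds the search radius. The function name `elimination_graph` and its parameters are hypothetical.

```python
def elimination_graph(objects, query, radius, dist):
    """Build a directed graph over the objects for one range query.

    Assumption (for illustration): an edge p -> o is added when knowing
    d(query, p) rules out o, because the lower bound
    |d(query, p) - d(p, o)| is strictly greater than the radius.
    """
    # Oracle knowledge: the query distance of every object.
    d_query = {o: dist(query, o) for o in objects}
    graph = {p: set() for p in objects}
    for p in objects:
        for o in objects:
            if o != p and abs(d_query[p] - dist(p, o)) > radius:
                graph[p].add(o)
    return graph


if __name__ == "__main__":
    # Points on a line with the absolute difference as the metric.
    points = [0, 1, 5, 10]
    g = elimination_graph(points, query=0, radius=1, dist=lambda a, b: abs(a - b))
    print(g)
```

The construction takes a quadratic number of distance evaluations and requires all query distances up front, so it is an analysis tool for a fixed data set and query rather than a search procedure.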

References

  1. Backurs, A., Indyk, P.: Edit distance cannot be computed in strongly subquadratic time (unless SETH is false). In: Proceedings of the 47th Annual ACM Symposium on Theory of Computing (2015). https://doi.org/10.1145/2746539.2746612

  2. Beecks, C., Uysal, M.S., Seidl, T.: Signature quadratic form distance. In: Proceedings of the ACM International Conference on Image and Video Retrieval. ACM, New York, NY, USA (2010). https://doi.org/10.1145/1816041.1816105

  3. Boyar, J., Eidenbenz, S.J., Favrholdt, L.M., Kotrbčík, M., Larsen, K.S.: Online dominating set. Algorithmica 81(5), 1938–1964 (2018). https://doi.org/10.1007/s00453-018-0519-1

  4. Bustos, B., Navarro, G., Chávez, E.: Pivot selection techniques for proximity searching in metric spaces. Pattern Recogn. Lett. 24(14), 2357–2366 (2003). https://doi.org/10.1016/S0167-8655(03)00065-5

  5. Chávez, E., Navarro, G., Baeza-Yates, R., Marroquín, J.L.: Searching in metric spaces. ACM Comput. Surv. 33(3), 273–321 (2001). https://doi.org/10.1145/502807.502808

  6. Chlebík, M., Chlebíková, J.: Approximation hardness of dominating set problems. In: Albers, S., Radzik, T. (eds.) ESA 2004. LNCS, vol. 3221, pp. 192–203. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-30140-0_19

  7. Das, A.: Partial domination in graphs. Iran. J. Sci. Technol. Trans. A Sci. 43(4), 1713–1718 (2018). https://doi.org/10.1007/s40995-018-0618-5

  8. Edsberg, O., Hetland, M.L.: Indexing inexact proximity search with distance regression in pivot space. In: Proceedings of the 3rd International Conference on Similarity Search and Applications (2010). https://doi.org/10.1145/1862344.1862353

  9. Figueroa, K., Navarro, G., Chávez, E.: Metric spaces library (2007). http://www.sisap.org/Metric_Space_Library.html

  10. Figueroa, K., Chávez, E., Navarro, G., Paredes, R.: Speeding up spatial approximation search in metric spaces. J. Exp. Algorithmics 14, 3–6 (2010). https://doi.org/10.1145/1498698.1564506

  11. Ford Jr., L.R., Johnson, S.M.: A tournament problem. Am. Math. Mon. 66(5), 37–40 (1959). https://doi.org/10.1080/00029890.1959.11989306

  12. Gurobi Optimization, LLC.: Gurobi optimizer reference manual (2020). http://gurobi.com

  13. Kann, V.: On the Approximability of NP-complete Optimization Problems. Ph.D. thesis, Department of Numerical Analysis and Computing Science, Royal Institute of Technology, Stockholm (1992)

  14. Lee, C.: Domination in digraphs. J. Korean Math. Soc. 35(4), 843–853 (1998)

  15. Lü, L., Zhou, T.: Link prediction in complex networks: a survey. Physica A Stat. Mech. Appl. 390(6), 1150–1170 (2011). https://doi.org/10.1016/j.physa.2010.11.027

  16. Mao, R., Liu, X., Tang, H., Luo, Q., Chen, J., Wu, W.: Multivariate regression for pivot selection: a preliminary study. In: 2011 3rd Symposium on Web Society. IEEE (2011). https://doi.org/10.1109/SWS.2011.6101281

  17. Murakami, T., Takahashi, K., Serita, S., Fujii, Y.: Probabilistic enhancement of approximate indexing in metric spaces. Inf. Syst. 38(7), 1007–1018 (2013). https://doi.org/10.1016/j.is.2012.05.012

  18. Naidan, B., Hetland, M.L.: Shrinking data balls in metric indexes. In: DBKDA (2013)

  19. Navarro, G.: Analyzing metric space indexes: what for? In: Proceedings of the 2009 2nd International Workshop on Similarity Search and Applications, SISAP 2009. IEEE Computer Society (2009). https://doi.org/10.1109/SISAP.2009.17

  20. Pestov, V.: Lower bounds on performance of metric tree indexing schemes for exact similarity search in high dimensions. Algorithmica (2013). https://doi.org/10.1007/s00453-012-9638-2

  21. Skopal, T.: Unified framework for exact and approximate search in dissimilarity spaces. ACM Trans. Database Syst. (TODS) 32(4), 1–45 (2007). https://doi.org/10.1145/1292609.1292619

  22. Socorro, R., Micó, L., Oncina, J.: A fast pivot-based indexing algorithm for metric spaces. Pattern Recogn. Lett. 32(11), 1511–1516 (2011). https://doi.org/10.1016/j.patrec.2011.04.016

  23. Telelis, O.A., Zissimopoulos, V.: Absolute \(o(\log m)\) error in approximating random set covering: an average case analysis. Inf. Process. Lett. 94(4), 171–177 (2005). https://doi.org/10.1016/j.ipl.2005.02.009

  24. Traina Jr., C.: Distance exponent: a new concept for selectivity estimation in metric trees. In: Proceedings of the 16th International Conference on Data Engineering (2000). https://doi.org/10.1109/ICDE.2000.839409

  25. Vidal Ruiz, E.: An algorithm for finding nearest neighbours in (approximately) constant average time. Pattern Recogn. Lett. 4(3), 145–157 (1986). https://doi.org/10.1016/0167-8655(86)90013-9

  26. Williamson, D.P., Shmoys, D.B.: The Design of Approximation Algorithms. Cambridge University Press, Cambridge (2011)

Acknowledgements

The author would like to thank Ole Edsberg, both for discussions providing the initial idea for this paper, and for substantial later input. He would also like to thank Jon Marius Venstad and Bilegsaikhan Naidan for reading early drafts of the paper and providing feedback.

Author information

Correspondence to Magnus Lie Hetland.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Hetland, M.L. (2020). Optimal Metric Search Is Equivalent to the Minimum Dominating Set Problem. In: Satoh, S., et al. Similarity Search and Applications. SISAP 2020. Lecture Notes in Computer Science, vol 12440. Springer, Cham. https://doi.org/10.1007/978-3-030-60936-8_9

  • DOI: https://doi.org/10.1007/978-3-030-60936-8_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-60935-1

  • Online ISBN: 978-3-030-60936-8

  • eBook Packages: Computer Science, Computer Science (R0)
