Lightweight Approximate Selection

  • Conference paper

Part of the Lecture Notes in Computer Science book series (LNTCS, volume 8737)

Abstract

Given a relative rank r ∈ (0,1) (e.g., r = 1/2 refers to the median), we show how to efficiently sample with high probability an element with rank very close to r from any probability distribution that supports efficient sampling (e.g., elements stored in an array). A primary feature of our methods is their elegance and ease of implementation – they can be coded in less space than is occupied by this abstract, and their lightweight footprint makes them ideally suited for highly resource-constrained computing environments. We demonstrate through empirical testing that these methods perform well in practice, and provide a complete theoretical analysis for our methods that offers valuable insight into the performance of a natural class of approximate selection algorithms based on hierarchical random sampling.
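As a rough illustration of what "hierarchical random sampling" can look like for the median case (r = 1/2), the sketch below repeatedly takes the middle of three recursively obtained samples. This is an assumed, generic instance of the idea and is not claimed to be the algorithm analyzed in the paper; the names `approx_median`, `sample`, and `depth` are hypothetical and chosen only for this example.

```python
import random


def approx_median(sample, depth):
    """Return an element whose relative rank is likely close to 1/2.

    sample: zero-argument callable that draws one random element
            (assumption: draws are independent and identically distributed).
    depth:  number of median-of-three levels; the call draws 3**depth samples.
    """
    if depth == 0:
        return sample()
    # Recursively obtain three candidates and keep the middle one.
    a = approx_median(sample, depth - 1)
    b = approx_median(sample, depth - 1)
    c = approx_median(sample, depth - 1)
    return sorted((a, b, c))[1]


if __name__ == "__main__":
    data = list(range(1001))  # true median is 500
    print(approx_median(lambda: random.choice(data), depth=6))
```

The recursion keeps only one candidate per active level, so its working memory grows with `depth` rather than with the number of samples drawn, which is the kind of small footprint the abstract emphasizes. Each extra level triples the number of samples and tightens the concentration of the result around the true median.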

Keywords

  • Data Stream
  • Association Rule
  • Relative Rank
  • Approximation Guarantee
  • Time Series Dataset

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

© 2014 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Dean, B.C., Jalasutram, R., Waters, C. (2014). Lightweight Approximate Selection. In: Schulz, A.S., Wagner, D. (eds) Algorithms - ESA 2014. ESA 2014. Lecture Notes in Computer Science, vol 8737. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-44777-2_26

  • DOI: https://doi.org/10.1007/978-3-662-44777-2_26

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-44776-5

  • Online ISBN: 978-3-662-44777-2

  • eBook Packages: Computer Science, Computer Science (R0)