Resizing

  • Chapter
The Joys of Hashing

Abstract

If you know how performance degrades as the load factor of a hash table increases, you can use this to pick a table size where the expected performance matches your needs, presuming that you know how many keys the table will need to store. If you do not know the number of elements you need to store, n, then you cannot choose a table size, m, that ensures that α = n/m is below a desired upper bound. In most applications, you do not know n before you run your program. Therefore, you must adjust m as n increases by resizing the table.
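The idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the chapter's implementation: the class name `ResizingTable`, the chained-bin layout, the doubling policy, and the bound `max_load = 0.5` are all hypothetical choices made for the example.

```python
class ResizingTable:
    """Chained hash set that doubles m whenever alpha = n/m would exceed max_load.

    All names and the doubling policy are assumptions made for this sketch.
    """

    def __init__(self, initial_size=8, max_load=0.5):
        self.m = initial_size          # number of bins
        self.n = 0                     # number of stored keys
        self.max_load = max_load       # desired upper bound on alpha
        self.bins = [[] for _ in range(self.m)]

    def _resize(self, new_m):
        # Collect every stored key, allocate new bins, and re-insert the keys.
        old_keys = [key for chain in self.bins for key in chain]
        self.m = new_m
        self.bins = [[] for _ in range(new_m)]
        for key in old_keys:
            self.bins[hash(key) % new_m].append(key)

    def insert(self, key):
        chain = self.bins[hash(key) % self.m]
        if key in chain:
            return                     # already present, n is unchanged
        if (self.n + 1) / self.m > self.max_load:
            self._resize(2 * self.m)   # grow before alpha exceeds the bound
            chain = self.bins[hash(key) % self.m]
        chain.append(key)
        self.n += 1

    def __contains__(self, key):
        return key in self.bins[hash(key) % self.m]

    def load_factor(self):
        return self.n / self.m
```

With this sketch, the caller never supplies n up front; the table adjusts m on its own as keys arrive, so `load_factor()` stays at or below `max_load` after every insertion.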

Notes

  1.

    Strictly speaking, amortized means that you write off expensive operations over time, and this suggests that cheaper ones follow costly operations. Doing this would not give you the runtime guarantee you are after. If you stop an algorithm right after an expensive operation and do not follow it with a series of cheap operations, you will be in trouble. What you do with amortized running time is that you save up some “computation” when doing cheap operations such that you can guarantee that you have enough computation in your bank account when you need to pay for an expensive operation.

  2.

    Technically, you could compute these primes as needed, but this would be much slower than all the other hash table operations, so tabulating the primes you need is the only practical way. You can go to https://primes.utm.edu/lists/ to get a list of the first 1000, 10,000 or 50 million primes and build a table from these by filtering them according to your choice of β.

  3.

    You do not necessarily need your table size to be prime just because you reduce keys modulo a prime to get your bins. You can first get a random key using the modulus and then use a mask to extract the lower bits. This way, you get a table size that is easier to work with; you can grow it and shrink it by factors of two, but, of course, at the cost of needing two operations to get your bin index. Since getting this index is unlikely to be the most time-critical step in using a hash table, this is a small price to pay.

  4.

    The reason we say that n insertions take (amortized) linear time is that the cost per operation does not depend on n. It does depend on β, however, as you can see from the figure.
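The amortized argument in notes 1 and 4 can be checked with a small counting experiment. The sketch below is a simulation, not code from the chapter. It assumes the table grows by a fixed factor whenever it is full and that each resize moves every stored element once; the function name `copies_per_insertion` and the parameter `growth` are hypothetical, and treating `growth` as playing the role of β is an assumption, since this preview does not define β.

```python
def copies_per_insertion(n, growth=2, initial_size=8):
    """Return the average number of elements moved by resizes per insertion.

    Simulates n insertions into a table that grows by `growth` when full.
    Only the table size is tracked; no keys are actually stored.
    """
    m = initial_size
    moved = 0
    for stored in range(n):
        if stored == m:
            # The table is full: a resize re-inserts all stored elements.
            moved += stored
            m = int(m * growth)
        # The insertion itself costs one unit and is not counted in `moved`.
    return moved / n


# Compare the average resize cost for different n and growth factors.
for n in (1_000, 100_000):
    print(n, copies_per_insertion(n), copies_per_insertion(n, growth=1.5))
```

Under these assumptions the average stays below a constant as n grows, because the sizes at which resizes happen form a geometric series. The constant is larger for a smaller growth factor, which matches the remark that the per-operation cost depends on the growth parameter but not on n.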

Copyright information

© 2019 Thomas Mailund

About this chapter

Cite this chapter

Mailund, T. (2019). Resizing. In: The Joys of Hashing. Apress, Berkeley, CA. https://doi.org/10.1007/978-1-4842-4066-3_4
