
Some average performance measures for the B-tree

Acta Informatica

Summary

Several average performance measures are presented for large B-trees formed from insertions, where large refers to the number of keys. Formulas are first derived for the expected number of nodes of each size on the bottom level of the tree, where the size of a node is the number of keys currently contained in the node. This is followed by formulas for the probability of making an insertion into a node of a given size, the probability of a split during an insertion into a node, and the expected number of splits during an insertion into the tree. It is shown that for large trees of high order m, the expected number of splits per insertion is approximately 1/((ln 2) m). A formula is presented for the average storage utilization, and it is shown that this average approaches ln 2 as m approaches infinity. A simpler formula is derived for the average storage utilization at the bottom level of the tree, and it is shown that this formula is an increasing function of m ranging from 2/3 to ln 2. It is shown that the expected tree height and the expected search path length are approximately logarithmic to the base (ln 2) m. Simulation results are presented to corroborate the theoretical analysis.
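The asymptotic approximations quoted in the summary can be evaluated numerically to get a feel for their size. The following is only a minimal sketch: the function names are hypothetical, the formulas are the large-tree, high-order approximations stated above rather than the exact formulas of the paper, and m is taken as the order in whatever sense the paper defines it.

```python
import math

LN2 = math.log(2)  # limiting average storage utilization, about 0.693


def approx_splits_per_insertion(m: int) -> float:
    """Approximate expected splits per insertion, 1/((ln 2) m).

    Hypothetical helper: valid only as an approximation for large
    trees of high order m, as stated in the summary.
    """
    return 1.0 / (LN2 * m)


def approx_height(n_keys: int, m: int) -> float:
    """Approximate tree height as the logarithm of n_keys to the base (ln 2) m.

    Hypothetical helper: requires (ln 2) * m > 1 so that the base is valid.
    """
    base = LN2 * m
    if base <= 1.0:
        raise ValueError("order m is too small for this approximation")
    return math.log(n_keys) / math.log(base)


if __name__ == "__main__":
    n = 1_000_000  # illustrative number of keys, not taken from the paper
    for m in (16, 64, 256):
        print(
            f"m={m:4d}  splits/insert ~ {approx_splits_per_insertion(m):.4f}  "
            f"height ~ {approx_height(n, m):.2f}"
        )
```

Because the bottom-level utilization is stated to lie between 2/3 and ln 2, the constant `LN2` also serves as an upper reference point when comparing measured utilization from a simulation against the theory.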




Cite this article

Wright, W.E. Some average performance measures for the B-tree. Acta Informatica 21, 541–557 (1985). https://doi.org/10.1007/BF00289710


  • DOI: https://doi.org/10.1007/BF00289710
