Optimal multi-block read schedules for partitioned signature files

  • Paolo Ciaccia
System Issues
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1057)

Abstract

Queries on partitioned signature files, specifically Quick Filter (QF), can require retrieving a large number of blocks from disk, depending on the specific query pattern. To reduce the overall retrieval time, we consider multi-block read schedules that transfer more than one block at a time, provided that the storage system guarantees contiguous allocation of the file's blocks on the disk surface. We show that, for any signature query and buffer size, there always exists an optimal schedule whose reads all have the same size, and that such a constant size (CS) schedule can be determined in time logarithmic in the number of blocks to be retrieved. We then provide analytical results for the expected performance of QF using CS schedules and compare QF with other, sequential-based signature file organizations. Finally, we suggest how our approach can also be of interest for other file organizations based on multi-attribute hashing.
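
To make the idea of a constant size (CS) schedule concrete, the following is a minimal sketch and not the algorithm of the paper. It assumes a simple cost model that the abstract does not state: each read pays a fixed positioning cost `t_pos` plus a transfer cost `t_xfer` per block, and reads of size `s` are aligned on multiples of `s`. All function and parameter names are hypothetical. The search over sizes is a plain brute-force scan up to the buffer size, not the logarithmic-time procedure mentioned above.

```python
def cs_schedule_cost(blocks, size, t_pos, t_xfer):
    """Cost of reading every block in `blocks` with aligned reads of `size` blocks.

    Assumed cost model (illustrative only): each read costs
    t_pos + size * t_xfer, and a read covers the aligned extent
    [i * size, (i + 1) * size) for some integer i.
    """
    # Each distinct aligned extent that contains a wanted block needs one read.
    extents = {b // size for b in blocks}
    return len(extents) * (t_pos + size * t_xfer)


def best_constant_size(blocks, buffer_size, t_pos, t_xfer):
    """Brute-force scan over read sizes 1..buffer_size.

    Returns (size, cost) for the cheapest constant-size schedule
    under the assumed cost model; ties go to the smaller size.
    """
    candidates = (
        (cs_schedule_cost(blocks, s, t_pos, t_xfer), s)
        for s in range(1, buffer_size + 1)
    )
    cost, size = min(candidates)
    return size, cost


if __name__ == "__main__":
    # Two dense runs of qualifying blocks: larger reads pay off.
    print(best_constant_size([0, 1, 2, 3, 8, 9, 10, 11], 4, 10, 1))
    # Two isolated blocks: single-block reads are cheapest.
    print(best_constant_size([0, 100], 4, 10, 1))
```

Under this assumed model, dense runs of qualifying blocks favor larger reads because the positioning cost is amortized, while scattered blocks favor single-block reads because larger reads mostly transfer unwanted blocks.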

Copyright information

© Springer-Verlag 1996

Authors and Affiliations

  • Paolo Ciaccia, DEIS - CIOC-CNR, University of Bologna, Italy
