Optimal multi-block read schedules for partitioned signature files
Queries on partitioned signature files, specifically Quick Filter (QF), can require retrieving a large number of blocks from disk, depending on the specific query pattern. To reduce the overall retrieval time, we consider multi-block read schedules that transfer more than one block at a time, provided that the storage system guarantees contiguous allocation of the file's blocks on disk. We show that, for any signature query and buffer size, there always exists an optimal schedule whose reads all have the same size, and that such a constant size (CS) schedule can be determined in time logarithmic in the number of blocks to be retrieved. We then provide analytical results for the expected performance of QF using CS schedules and compare QF with other, sequential-based, signature file organizations. Finally, we suggest how our approach can also be of interest for other file organizations based on multi-attribute hashing.
Keywords: Optimal Schedule, Buffer Size, Access Pattern, Query Term, Feasible Schedule
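As a rough illustration of what a constant size (CS) schedule looks like, the sketch below covers a set of requested block addresses with reads of one fixed size and picks the cheapest size that fits in the buffer. It is not the procedure from the paper: it scans every candidate size linearly rather than in logarithmic time, it ignores the specific access patterns of QF queries, and the cost model (a fixed positioning cost `t_pos` per read plus a per-block transfer cost `t_xfer`) is an assumption made only for this example. All function and parameter names are hypothetical.

```python
from typing import Iterable, List, Tuple


def cs_schedule(blocks: Iterable[int], size: int) -> List[int]:
    """Return start addresses of reads of `size` contiguous blocks
    that together cover every requested block (greedy, left to right)."""
    starts: List[int] = []
    covered_up_to = float("-inf")  # highest block address covered so far
    for b in sorted(set(blocks)):
        if b > covered_up_to:
            # Block not yet covered: start a new read at this block.
            starts.append(b)
            covered_up_to = b + size - 1
    return starts


def schedule_cost(n_reads: int, size: int, t_pos: int, t_xfer: int) -> int:
    """Assumed cost model: each read pays one positioning cost
    plus the transfer cost of `size` blocks."""
    return n_reads * (t_pos + size * t_xfer)


def best_cs_schedule(
    blocks: Iterable[int], buffer_size: int, t_pos: int, t_xfer: int
) -> Tuple[int, int, List[int]]:
    """Try every read size from 1 to `buffer_size` (linear scan, for
    illustration only) and return (cost, size, start addresses)."""
    block_list = list(blocks)
    best = None
    for size in range(1, buffer_size + 1):
        starts = cs_schedule(block_list, size)
        cost = schedule_cost(len(starts), size, t_pos, t_xfer)
        if best is None or cost < best[0]:
            best = (cost, size, starts)
    return best


if __name__ == "__main__":
    # Two runs of four contiguous blocks; positioning is expensive.
    print(best_cs_schedule([0, 1, 2, 3, 8, 9, 10, 11], 4, t_pos=10, t_xfer=1))
    # Widely scattered blocks; positioning is cheap.
    print(best_cs_schedule([0, 4, 8], 4, t_pos=1, t_xfer=1))
```

Under this assumed model, larger reads pay off when the requested blocks are clustered and positioning is expensive, while single-block reads win when the blocks are scattered and extra transfers would mostly fetch unneeded blocks.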