Abstract
Parallel disks promise to be a cost-effective means of achieving high bandwidth in applications involving massive data sets, but algorithms for parallel disks can be difficult to devise. To combat this problem, we define a useful and natural duality between writing to parallel disks and the seemingly more difficult problem of prefetching. We first explore this duality for applications involving read-once accesses using parallel disks. We obtain a simple linear-time algorithm for computing optimal prefetch schedules and analyze the efficiency of the resulting schedules for randomly placed data and for arbitrary interleaved accesses to striped sequences. Duality also provides an optimal schedule for the integrated caching and prefetching problem, in which blocks can be accessed multiple times. Another application of this duality gives us the first parallel disk sorting algorithms that are provably optimal up to lower-order terms. One of these algorithms is a simple and practical variant of multiway merge sort, addressing a question that has been open for some time.
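The abstract does not spell out how the duality is used, so the following is only a minimal sketch under stated assumptions: a read-once request sequence is reversed, treated as a sequence of blocks to be written through a shared buffer with one FIFO queue per disk, and the resulting output steps are reversed again to give a prefetch schedule. The function names, the greedy "write one block from every nonempty queue when the buffer is full" rule, and the parameters `num_disks` and `buffer_size` are illustrative assumptions, not details confirmed by the text above.

```python
from collections import deque


def greedy_write_schedule(blocks, disk_of, num_disks, buffer_size):
    """Queue blocks per disk; when the buffer is full, write one block per nonempty queue.

    Returns a list of output steps, each a list of blocks written in parallel
    (at most one block per disk). This greedy rule is an assumed stand-in for
    the queued-writing strategy mentioned in the abstract.
    """
    queues = [deque() for _ in range(num_disks)]
    buffered = 0
    steps = []

    def output_step():
        nonlocal buffered
        step = [q.popleft() for q in queues if q]
        buffered -= len(step)
        steps.append(step)

    for block in blocks:
        if buffered == buffer_size:
            output_step()  # free buffer space before accepting the next block
        queues[disk_of[block]].append(block)
        buffered += 1

    while buffered > 0:  # flush whatever is still queued
        output_step()
    return steps


def prefetch_schedule_by_duality(requests, disk_of, num_disks, buffer_size):
    """Reverse the requests, schedule them as writes, then reverse the steps."""
    write_steps = greedy_write_schedule(
        list(reversed(requests)), disk_of, num_disks, buffer_size
    )
    return list(reversed(write_steps))


# Hypothetical example: six blocks on two disks, buffer of three blocks.
requests = ["a", "b", "c", "d", "e", "f"]
disk_of = {"a": 0, "b": 0, "c": 1, "d": 0, "e": 1, "f": 1}
schedule = prefetch_schedule_by_duality(requests, disk_of, num_disks=2, buffer_size=3)
for step_number, step in enumerate(schedule, start=1):
    print(step_number, step)
```

In this small example each disk holds three blocks, so no schedule can finish in fewer than three parallel input steps; the sketch uses exactly three. Each block is appended to and removed from a queue once, which is consistent with the linear running time the abstract claims, although the handling of idle disks here is simplified.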
Supported in part by the NSF through research grant CCR-0082986.
Partially supported by the IST Programme of the EU under contract number IST-1999-14186 (ALCOM-FT).
Supported in part by the NSF through research grants CCR-9877133 and EIA-9870724 and by the ARO through MURI grant DAAH04-96-1-0013.
© 2001 Springer-Verlag Berlin Heidelberg
Hutchinson, D.A., Sanders, P., Vitter, J.S. (2001). Duality between Prefetching and Queued Writing with Parallel Disks. In: auf der Heide, F.M. (eds) Algorithms — ESA 2001. ESA 2001. Lecture Notes in Computer Science, vol 2161. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44676-1_5
DOI: https://doi.org/10.1007/3-540-44676-1_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42493-2
Online ISBN: 978-3-540-44676-7
eBook Packages: Springer Book Archive