Abstract
Query throughput is one of the primary optimization goals in interactive web-based information systems, which must perform well enough to serve large user communities. Queries in this application domain differ significantly from those in traditional database applications: they are of lower complexity and almost exclusively read-only. The architecture we propose is specifically tailored to take advantage of these query characteristics. It is based on a large parallel shared-nothing database cluster in which each node runs a separate server with a fully replicated copy of the database. Each query is assigned to a single node and executed entirely on it, avoiding network contention and synchronization effects. The actual key to enhanced throughput, however, is resource-efficient scheduling of the arriving queries. We develop a simple and robust scheduling scheme that takes into account the data currently resident in memory at each server and trades off memory re-use against execution time, reordering queries as necessary.
Our experimental evaluation demonstrates the effectiveness of the scheme when scaling the system beyond hundreds of nodes, showing super-linear speedup.
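The routing idea summarized above can be illustrated with a small sketch. This is not the authors' algorithm, only a minimal illustration under stated assumptions: every name here (`Node`, `route`, `exec_cost`, `load_cost`) is hypothetical, memory is modeled as a fixed number of table slots with LRU eviction, and the cost estimate is a simple sum of queued work, execution cost, and a penalty for each table that would have to be loaded. Query reordering is not modeled.

```python
from collections import OrderedDict
from dataclasses import dataclass, field


@dataclass
class Node:
    """One server with a full replica; tracks which tables are in memory."""
    name: str
    capacity: int                 # assumed unit: number of tables that fit in memory
    resident: OrderedDict = field(default_factory=OrderedDict)  # LRU order, oldest first
    backlog: float = 0.0          # estimated work already queued on this node

    def missing(self, tables):
        """Tables the query needs that are not currently memory resident."""
        return [t for t in tables if t not in self.resident]

    def touch(self, tables):
        """Mark tables as most recently used, evicting the oldest if over capacity."""
        for t in tables:
            self.resident.pop(t, None)
            self.resident[t] = True
            while len(self.resident) > self.capacity:
                self.resident.popitem(last=False)


def route(nodes, tables, exec_cost=1.0, load_cost=5.0):
    """Assign a query to the node with the lowest estimated completion cost.

    The estimate trades memory re-use (fewer tables to load) against waiting
    time (the node's backlog). Both cost constants are illustrative assumptions.
    """
    def estimate(node):
        return node.backlog + exec_cost + load_cost * len(node.missing(tables))

    best = min(nodes, key=estimate)
    # A real system would reduce the backlog again when queries complete.
    best.backlog = estimate(best)
    best.touch(tables)
    return best


nodes = [Node("n0", capacity=2), Node("n1", capacity=2)]
a = route(nodes, ["orders"])     # both nodes are cold; the first one is chosen
b = route(nodes, ["customers"])  # goes to the idle node
c = route(nodes, ["orders"])     # returns to the node that already holds "orders"
```

In this toy run the third query is routed back to the node that served the first one, because re-using the resident table outweighs that node's slightly longer queue.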
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Waas, F., Kersten, M.L. (2001). Memory Aware Query Routing in Interactive Web-Based Information Systems. In: Read, B. (eds) Advances in Databases. BNCOD 2001. Lecture Notes in Computer Science, vol 2097. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45754-2_11
DOI: https://doi.org/10.1007/3-540-45754-2_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42265-5
Online ISBN: 978-3-540-45754-1
eBook Packages: Springer Book Archive