Abstract
It is becoming increasingly important to seamlessly leverage a large number of distributed cache memories in modern large-scale systems. Several previous studies have shown that traditional scheduling policies often fail to achieve a high cache hit ratio and good system load balance with large-scale distributed caching facilities. To maximize system throughput, distributed caching facilities should balance the workload and leverage cached data at the same time. In this work, we present a distributed job processing framework that yields a high cache hit ratio while achieving balanced system load. Our framework employs a scheduling policy, DEMA, that considers both cache hit ratio and system load, and it supports multiple geographically distributed job schedulers. We show that collaborative task scheduling and data migration can further improve performance by increasing the cache hit ratio while achieving good load balance. Our experiments show that the proposed job scheduling policies outperform a legacy load-based job scheduling policy in terms of job response time, load balancing, and cache hit ratio.
Notes
In our implementation, however, we performed a linear search that takes O(n) time, assuming that n is small enough (for example, up to 40).
Acknowledgments
This research was supported by NRF (National Research Foundation of Korea) grant NRF-2014R1A1A2058843 and MKE/KEIT (No. 10041608, Embedded System Software for New Memory based Smart Devices).
Cite this article
Eom, Y., Kim, J. & Nam, B. Multi-dimensional multiple query scheduling with distributed semantic caching framework. Cluster Comput 18, 1141–1156 (2015). https://doi.org/10.1007/s10586-015-0464-6