
Environment-Sensitive Performance Tuning for Distributed Service Orchestration

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8969)

Abstract

Modern distributed systems are designed to tolerate unreliable environments, i.e., to keep providing services even when failures occur in the underlying hardware or network. However, an unreliable environment can significantly degrade a distributed system's performance, and this impact should be considered when deploying services. In this paper, we present an approach that optimizes the performance of distributed systems deployed in unreliable environments by searching for optimal configuration parameters. To simulate an unreliable environment, we inject failures into the environment of a service application, such as a node crash in the cluster, network failures between nodes, and resource contention within nodes. We then use a search algorithm to automatically find the optimal parameters in a user-selected parameter space under the unreliable environment we created. We have implemented our approach in a testing-based framework and applied it to several well-known distributed service systems.
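The workflow the abstract describes — perturb the environment, then search the configuration space for the best-performing parameters — can be sketched in miniature. This is an illustrative sketch only, not the authors' framework: the parameter names, the cost model inside `run_benchmark`, and the plain random-search strategy (simpler than, e.g., the recursive random search of reference 23) are all assumptions.

```python
import random

# Hypothetical configuration space for a distributed service
# (parameter names are illustrative, not taken from the paper).
PARAM_SPACE = {
    "worker_threads": [2, 4, 8, 16],
    "heartbeat_ms": [100, 500, 1000],
    "replication_factor": [1, 2, 3],
}

def run_benchmark(config, failure_scenario):
    """Stand-in for deploying the service, injecting the failure,
    and measuring execution time. Here we fake a cost surface;
    lower is better."""
    cost = 100.0 / config["worker_threads"]
    cost += config["heartbeat_ms"] * 0.01
    # Under a node-crash scenario, extra replication pays off.
    if failure_scenario == "node_crash":
        cost += 50.0 / config["replication_factor"]
    return cost

def random_search(failure_scenario, trials=200, seed=0):
    """Sample random configurations from the user-selected space
    and keep the one with the lowest measured cost."""
    rng = random.Random(seed)
    best_config, best_cost = None, float("inf")
    for _ in range(trials):
        config = {k: rng.choice(v) for k, v in PARAM_SPACE.items()}
        cost = run_benchmark(config, failure_scenario)
        if cost < best_cost:
            best_config, best_cost = config, cost
    return best_config, best_cost

if __name__ == "__main__":
    config, cost = random_search("node_crash")
    print(config, round(cost, 2))
```

The key point the sketch captures is that the search is run *under* the injected failure scenario, so the "optimal" configuration it returns is environment-sensitive: a different scenario would yield a different cost surface and potentially a different winner.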

F. Ivančić and G. Balakrishnan—Current affiliation: Google, Inc.


Notes

  1. Execution time is not obtained from Ganglia, but from the Linux time command.
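The footnote above distinguishes the monitoring source (Ganglia) from the timing source (the Linux time command). A minimal sketch of the latter, assuming GNU time is installed at /usr/bin/time and using `sleep 1` as a placeholder for the actual workload:

```shell
# GNU /usr/bin/time (assumed available; the BSD/macOS variant has
# different flags) reports timings for the measured command.
# %e = elapsed wall-clock seconds, %U = user CPU, %S = system CPU.
# "sleep 1" stands in for the benchmark run being timed.
/usr/bin/time -f "elapsed=%e user=%U sys=%S" sleep 1
```

Using an external timer like this measures end-to-end execution time independently of the monitoring stack, which is presumably why it is preferred over Ganglia's sampled metrics here.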

References

  1. Ganglia. http://ganglia.sourceforge.net/

  2. Juju. https://juju.ubuntu.com/

  3. Juju Charms. https://jujucharms.com/

  4. LoadRunner. http://www.hp.com/go/LoadRunner

  5. LXC. http://linuxcontainers.org/

  6. Selenium. http://seleniumhq.org/

  7. Allspaw, J.: Fault injection in production. Commun. ACM 55(10), 48–52 (2012)

  8. Babu, S.: Towards automatic optimization of MapReduce programs. In: SOCC 2010, pp. 137–142 (2010)

  9. Banabic, R., Candea, G.: Fast black-box testing of system recovery code. In: Proceedings of the 7th ACM European Conference on Computer Systems, EuroSys 2012, pp. 281–294 (2012)

  10. Broadwell, P., Sastry, N., Traupman, J.: FIG: A prototype tool for online verification of recovery. In: Workshop on Self-Healing, Adaptive and Self-Managed Systems (2002)

  11. Carbone, M., Rizzo, L.: Dummynet revisited. SIGCOMM Comput. Commun. Rev. 40(2), 12–20 (2010)

  12. Dawson, S., Jahanian, F., Mitton, T.: Experiments on six commercial TCP implementations using a software fault injection tool. Softw. Pract. Exper. 27(12), 1385–1410 (1997)

  13. Gunawi, H., Do, T., Joshi, P., Alvaro, P., Hellerstein, J., Arpaci-Dusseau, A., Arpaci-Dusseau, R., Sen, K., Borthakur, D.: FATE and DESTINI: A framework for cloud recovery testing. In: Proceedings of the 8th USENIX Conference on Networked Systems Design and Implementation, NSDI 2011 (2011)

  14. Herodotou, H., Babu, S.: Profiling, what-if analysis, and cost-based optimization of MapReduce programs. In: VLDB 2011, pp. 1111–1122 (2011)

  15. Herodotou, H., Lim, H., Luo, G., Borisov, N., Dong, L., Cetin, F.B., Babu, S.: Starfish: A self-tuning system for big data analytics. In: CIDR 2011, pp. 261–272 (2011)

  16. Hoarau, W., Tixeuil, S., Vauchelles, F.: FAIL-FCI: Versatile fault injection. Future Gener. Comput. Syst. 23(7), 913–919 (2007)

  17. Joshi, P., Ganai, M., Balakrishnan, G., Gupta, A., Papakonstantinou, N.: SETSUDO: Perturbation-based testing framework for scalable distributed systems. In: Proceedings of the Conference on Timely Results in Operating Systems (2013)

  18. Joshi, P., Gunawi, H., Sen, K.: PREFAIL: A programmable tool for multiple-failure injection. In: Proceedings of the 2011 ACM International Conference on Object Oriented Programming Systems Languages and Applications, OOPSLA 2011, pp. 171–188 (2011)

  19. Lübke, R., Lungwitz, R., Schuster, D., Schill, A.: Large-scale tests of distributed systems with integrated emulation of advanced network behavior. WWW/Internet 10(2), 138–151 (2013)

  20. Marinescu, P., Candea, G.: Efficient testing of recovery code using fault injection. ACM Trans. Comput. Syst. 29(4), 11:1–11:38 (2011)

  21. Molyneaux, I.: The Art of Application Performance Testing: Help for Programmers and Quality Assurance. O'Reilly Media (2009)

  22. Tseitlin, A.: The antifragile organization. Commun. ACM 56(8), 40–44 (2013)

  23. Ye, T., Kalyanaraman, S.: A recursive random search algorithm for large-scale network parameter configuration. In: SIGMETRICS 2003, pp. 196–205 (2003)


Author information

Corresponding author

Correspondence to Yu Lin.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Lin, Y., Ivančić, F., Joshi, P., Balakrishnan, G., Ganai, M., Gupta, A. (2015). Environment-Sensitive Performance Tuning for Distributed Service Orchestration. In: Daydé, M., Marques, O., Nakajima, K. (eds) High Performance Computing for Computational Science -- VECPAR 2014. VECPAR 2014. Lecture Notes in Computer Science(), vol 8969. Springer, Cham. https://doi.org/10.1007/978-3-319-17353-5_18


  • DOI: https://doi.org/10.1007/978-3-319-17353-5_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-17352-8

  • Online ISBN: 978-3-319-17353-5

  • eBook Packages: Computer Science, Computer Science (R0)
