
Performance evaluation of counter selection techniques to detect discontinuity in large-scale-systems

  • Original Research
  • Published in: Journal of Ambient Intelligence and Humanized Computing

Abstract

Cloud providers and data centres rely heavily on forecasts to accurately predict future workload. This information helps them in appropriate virtualization and cost-effective provisioning of the infrastructure. The accuracy of a forecast depends greatly on the quality of the performance data fed to the underlying algorithms. One of the fundamental problems analysts face in preparing data for forecasting is the timely identification of data discontinuities. A discontinuity is an abrupt change in the time-series pattern of a performance counter that persists but does not recur. We used a supervised and an unsupervised technique to automatically identify the important performance counters that are likely indicators of discontinuities within performance data. We compared the performance of our approaches by conducting a case study on performance data obtained from a large-scale cloud provider as well as on open-source benchmark systems. The supervised counter selection approach yields superior results compared to the unsupervised approach, but bears the overhead of manually labelling the performance data.
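The abstract does not give enough detail to reproduce either technique, but the following minimal sketch illustrates one way an unsupervised counter selection step could work: ranking performance counters by their PCA loadings, so that the counters driving most of the variance (and therefore the most likely carriers of a persistent level shift) surface first. The counter names, the variance threshold, and the PCA-based ranking itself are assumptions made for illustration, not the authors' exact method.

```python
# Illustrative sketch only: one possible *unsupervised* way to rank performance
# counters by how much they drive overall variance, using PCA loadings.
# This is NOT necessarily the paper's method; names and thresholds are made up.
import numpy as np

def rank_counters_by_pca(data, counter_names, var_threshold=0.9):
    """Rank counters by their contribution to the top principal components.

    data: (n_samples, n_counters) matrix of performance-counter readings.
    Returns counter names sorted from most to least influential.
    """
    # Standardise each counter so scale differences do not dominate.
    X = (data - data.mean(axis=0)) / (data.std(axis=0) + 1e-12)

    # PCA via SVD of the standardised data.
    _, s, vt = np.linalg.svd(X, full_matrices=False)
    explained = (s ** 2) / np.sum(s ** 2)

    # Keep only the components needed to explain `var_threshold` of variance.
    k = int(np.searchsorted(np.cumsum(explained), var_threshold)) + 1

    # A counter's importance: its squared loadings, weighted by explained variance.
    importance = (vt[:k] ** 2 * explained[:k, None]).sum(axis=0)
    order = np.argsort(importance)[::-1]
    return [counter_names[i] for i in order]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    names = ["cpu_util", "mem_used", "disk_io", "net_bytes"]   # hypothetical counters
    readings = rng.normal(size=(500, len(names)))
    readings[250:, 0] += 5.0   # inject a persistent level shift (a "discontinuity")
    print(rank_counters_by_pca(readings, names))
```

In this toy run the counter carrying the injected level shift dominates the leading component and is ranked first; a supervised variant would instead use labelled discontinuity windows to score counters, at the cost of the manual labelling effort noted above.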



Author information


Corresponding author

Correspondence to Haroon Malik.

Additional information

Peer review under responsibility of the Conference Program Chairs.


About this article

Cite this article

Malik, H., Shakshuki, E.M. Performance evaluation of counter selection techniques to detect discontinuity in large-scale-systems. J Ambient Intell Human Comput 9, 43–59 (2018). https://doi.org/10.1007/s12652-017-0525-1

