Abstract
Cloud providers and data centres rely heavily on forecasts to predict future workload accurately. This information helps them virtualize appropriately and provision their infrastructure cost-effectively. The accuracy of a forecast depends greatly on the quality of the performance data fed to the underlying algorithms. One of the fundamental problems analysts face when preparing data for forecasting is the timely identification of data discontinuities. A discontinuity is an abrupt change in the time-series pattern of a performance counter that persists but does not recur. We used a supervised and an unsupervised technique to automatically identify the important performance counters that are likely indicators of discontinuities within performance data. We compared the performance of the two approaches through a case study on performance data obtained from a large-scale cloud provider as well as from open-source benchmark systems. The supervised counter selection approach outperforms the unsupervised approach, but it bears the overhead of manually labelling the performance data.
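To make the definition above concrete, the following is a minimal sketch of what a discontinuity in a performance counter looks like and how one might naively flag it: a level shift whose before/after window means differ sharply and that persists afterwards. This is an illustration only; the window size, threshold, and detection logic are assumptions for exposition, not the counter selection techniques evaluated in the paper.

```python
def find_discontinuity(series, window=5, threshold=3.0):
    """Return the index of the first level shift where the means of the
    windows before and after the point differ by more than `threshold`,
    or None if the series has no such shift."""
    for i in range(window, len(series) - window):
        before = series[i - window:i]
        after = series[i:i + window]
        mean_before = sum(before) / window
        mean_after = sum(after) / window
        if abs(mean_after - mean_before) > threshold:
            return i
    return None

# A hypothetical CPU counter that jumps from ~10 to ~25 and stays there
# (persists, does not recur) -- i.e., a discontinuity rather than a spike:
cpu = [10, 11, 10, 9, 10, 11, 10, 10, 9, 10,
       25, 26, 25, 24, 25, 26, 25, 25, 24, 25]
print(find_discontinuity(cpu))  # flags a point near the level shift
```

A transient spike would fail this check once the after-window slides past it, which is what distinguishes a discontinuity (persistent, non-recurring) from the anomalies targeted by much of the related work.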
Additional information
Peer-review under responsibility of the Conference Program Chairs.
Cite this article
Malik, H., Shakshuki, E.M. Performance evaluation of counter selection techniques to detect discontinuity in large-scale-systems. J Ambient Intell Human Comput 9, 43–59 (2018). https://doi.org/10.1007/s12652-017-0525-1