A Performance Analysis of System S, S4, and Esper via Two Level Benchmarking

  • Miyuru Dayarathna
  • Toyotaro Suzumura
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8054)

Abstract

Data stream processing systems have become popular because they are effective in large-scale data stream processing scenarios. This paper compares and contrasts the performance characteristics of three stream processing systems: System S, S4, and Esper. We study which software aspects shape the characteristics of the workloads handled by these systems. We use a micro benchmark and several real-world stream applications on System S, S4, and Esper to construct 70 different application scenarios. We use the job throughput, CPU and memory consumption, and network utilization of each application scenario as performance metrics. We observed that S4's architectural approach of instantiating a Processing Element (PE) for each keyed attribute is less efficient than the fixed number of PEs used by System S and Esper. Furthermore, all the Esper benchmarks achieved more than 150% higher performance on a single node than the S4 benchmarks. S4 and Esper are more portable than System S and can be fine-tuned easily for different application scenarios. In the future, we hope to widen our understanding of the performance characteristics of these systems through code-level profiling.
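To make the architectural contrast in the abstract concrete, the following is a minimal Python sketch of the two instantiation styles: one PE per distinct key versus a fixed pool of PEs. It is an illustration only and does not use the S4, System S, or Esper APIs; the names `CountingPE`, `run_keyed`, and `run_fixed`, the pool size of 4, and the synthetic event stream are all assumptions introduced here.

```python
from collections import defaultdict


class CountingPE:
    """Hypothetical processing element that only counts the events it sees."""

    def __init__(self):
        self.count = 0

    def process(self, event):
        self.count += 1


def run_keyed(events):
    """Create one PE instance per distinct key (per-key instantiation)."""
    pes = defaultdict(CountingPE)
    for key, value in events:
        pes[key].process(value)
    return pes


def run_fixed(events, num_pes=4):
    """Route every key onto a fixed pool of PEs by hashing the key."""
    pes = [CountingPE() for _ in range(num_pes)]
    for key, value in events:
        pes[hash(key) % num_pes].process(value)
    return pes


# Synthetic stream (an assumption): 10,000 events spread over 1,000 keys.
events = [(f"user{i % 1000}", i) for i in range(10_000)]
keyed = run_keyed(events)
fixed = run_fixed(events)
print(len(keyed), len(fixed))  # prints "1000 4"
```

In this sketch the number of live PE objects grows with the number of distinct keys in the keyed variant and stays constant in the fixed variant. The sketch shows only that difference in object count; it does not reproduce the throughput or resource measurements reported in the paper.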

Keywords

stream processing, data-intensive computing, workload characterization, performance analysis, benchmarking, systems scalability

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Miyuru Dayarathna (1)
  • Toyotaro Suzumura (1, 2)
  1. Department of Computer Science, Tokyo Institute of Technology, Meguro-ku, Japan
  2. IBM Research, Tokyo, Japan
