Information Systems Frontiers, Volume 21, Issue 1, pp 67–85

Quantitative Analysis of Apache Storm Applications: The NewsAsset Case Study

  • José I. Requeno
  • José Merseguer
  • Simona Bernardi
  • Diego Perez-Palacin
  • Giorgos Giotis
  • Vasilis Papanikolaou


The development of Information Systems today faces the era of Big Data. Large volumes of information need to be processed in real time, for example, for Facebook or Twitter analysis. This paper addresses the redesign of NewsAsset, a commercial product that helps journalists by providing services that analyze millions of media items from social networks in real time. Technologies like Apache Storm can help enormously in this context. We have quantitatively analyzed the new design of NewsAsset to assess whether the introduction of Apache Storm can meet the demanding performance requirements of this media product. Our assessment approach, guided by the Unified Modeling Language (UML), reuses for performance analysis the software designs already used for development. In addition, we converted UML into a domain-specific modeling language (DSML) for Apache Storm by creating a profile for Storm. Later, we transformed this DSML into a language suitable for performance evaluation, specifically stochastic Petri nets. The assessment ended with a successful software design that met the scalability requirements of NewsAsset.


Keywords: Apache Storm, UML, Petri nets, Software performance, Software reuse



This work has been supported by the European Commission under the H2020 Research and Innovation program [DICE, Grant Agreement No. 644869], the Spanish Ministry of Economy and Competitiveness [ref. CyCriSec-TIN2014-58457-R], and the Aragon Government [ref. T21_17R, DIStributed COmputation (DISCO)].



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • José I. Requeno (1, email author)
  • José Merseguer (1)
  • Simona Bernardi (1)
  • Diego Perez-Palacin (1)
  • Giorgos Giotis (2)
  • Vasilis Papanikolaou (2)

  1. Departamento de Informática e Ingeniería de Sistemas, Universidad de Zaragoza, Zaragoza, Spain
  2. Athens Technology Center, Athens, Greece
