How Data Volume Affects Spark Based Data Analytics on a Scale-up Server

  • Ahsan Javed Awan
  • Mats Brorsson
  • Vladimir Vlassov
  • Eduard Ayguade
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9495)


The sheer increase in the volume of data over the last decade has triggered research in cluster computing frameworks that enable web enterprises to extract big insights from big data. While Apache Spark is gaining popularity for exhibiting superior scale-out performance on commodity machines, the impact of data volume on the performance of Spark based data analytics in a scale-up configuration is not well understood. We present a deep-dive analysis of Spark based applications on a large scale-up server machine. Our analysis reveals that Spark based data analytics are DRAM bound and do not benefit from using more than 12 cores for an executor. As the input data size grows, application performance degrades significantly because of a substantial increase in wait time during I/O operations and garbage collection, despite a 10% better instruction retirement rate (due to lower L1 cache misses and higher core utilization). We match memory behaviour with the garbage collector to improve the performance of applications by 1.6x to 3x.


Keywords: Scalability, Spark, Micro-architecture



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Ahsan Javed Awan (1)
  • Mats Brorsson (1)
  • Vladimir Vlassov (1)
  • Eduard Ayguade (2)
  1. Software and Computer Systems Department (SCS), KTH Royal Institute of Technology, Stockholm, Sweden
  2. Computer Architecture Department, Technical University of Catalunya (UPC), Barcelona, Spain
