From Application to Disk: Tracing I/O Through the Big Data Stack

  • Robert Schmidtke
  • Florian Schintke
  • Thorsten Schütt
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11203)


Typical applications in data science consume, process, and produce large amounts of data, making disk I/O one of the dominant factors in their overall performance, and thus a worthwhile target for optimization. Distributed processing frameworks such as Hadoop, Flink, and Spark hide a great deal of complexity from the programmer when they parallelize these applications across a compute cluster. This, however, complicates reasoning about the I/O behavior of both the application and the framework, through the distributed file system, such as HDFS, down to the local file systems.

We present SFS (Statistics File System), a modular framework that traces every I/O request issued by the application and by any involved JVM-based big data framework, and maps these requests to the actual disk I/O.

This enables the detection of inefficient I/O patterns, both in the applications and in the underlying frameworks, and forms the basis for improving I/O scheduling in the big data software stack.
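
To illustrate the general idea of tracing I/O requests inside the JVM, the following minimal sketch wraps a standard Java InputStream and records per-call read statistics. The class name, the counters kept, and the reporting on close are illustrative assumptions only; they do not reflect SFS's actual instrumentation.

    // Hypothetical sketch: count read() calls and bytes read on a wrapped
    // InputStream. A full tracer would also record timestamps and forward
    // the statistics to an aggregation component instead of printing them.
    import java.io.FilterInputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.util.concurrent.atomic.AtomicLong;

    public class TracingInputStream extends FilterInputStream {
        private final AtomicLong readCalls = new AtomicLong();
        private final AtomicLong bytesRead = new AtomicLong();

        public TracingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            int b = super.read();              // delegate to the wrapped stream
            readCalls.incrementAndGet();
            if (b >= 0) {
                bytesRead.incrementAndGet();   // one byte consumed unless at EOF
            }
            return b;
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            int n = super.read(buf, off, len);
            readCalls.incrementAndGet();
            if (n > 0) {
                bytesRead.addAndGet(n);
            }
            return n;
        }

        @Override
        public void close() throws IOException {
            super.close();
            // Report on close; a real tracer would emit structured records.
            System.err.printf("reads=%d bytes=%d%n",
                    readCalls.get(), bytesRead.get());
        }
    }

Injected transparently, for example via a Java agent that rewrites the relevant stream classes at load time, such wrappers can capture every I/O request without modifying the application or the framework, and the per-stream statistics can then be correlated with the disk I/O observed at the lower layers.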


Keywords: Distributed file systems · I/O analysis · Big data ecosystems



This work received funding from the BMBF projects Berlin Big Data Center (BBDC) under grant 01IS14013B and GeoMultiSens under grant 01IS14010C.


Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Zuse Institute Berlin, Berlin, Germany
