The Journal of Supercomputing, Volume 72, Issue 5, pp 1946–1972

Experimental analysis of operating system jitter caused by page reclaim

  • Yoshihiro Oyama
  • Shun Ishiguro
  • Jun Murakami
  • Shin Sasaki
  • Ryo Matsumiya
  • Osamu Tatebe

Abstract

Operating system jitter is one of the causes of runtime overhead in high-performance computing applications. Many high-performance computing applications perform burst accesses to I/O, and such accesses consume a large amount of memory. When the Linux kernel runs out of memory, it awakens special kernel threads to reclaim memory pages. If the kernel threads are frequently awakened, application performance is degraded because of the threads’ resource consumption as well as the increase in the application’s page faults and migration between CPU cores. In this study, we empirically analyze the impact of jitter caused by reclaiming memory pages, and we propose a method for reducing it. The proposed method reclaims memory pages in advance of the kernel thread. It reclaims more pages at one time than the kernel threads, thus reducing the frequency of page reclaim and the impact of jitter. We conducted experiments using practical weather forecast software, the results of which showed that the proposed method minimized performance degradation caused by jitter.
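The abstract does not say how the proposed method is implemented, so the following is only a user-space sketch of the general idea: check free memory against a threshold set above the point where the kernel would wake its reclaim threads, and release a larger batch at once. The names `trigger_kb` and `target_kb`, the helper functions, and the use of `posix_fadvise` to drop cached file pages are all assumptions made for illustration, not the authors' mechanism.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> value in kB."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def kb_to_reclaim(free_kb, trigger_kb, target_kb):
    """Return how many kB to release, or 0 if free memory is still above the trigger.

    trigger_kb is a hypothetical threshold meant to sit above the kernel's own
    low watermark, so this check fires before the kernel reclaim threads wake.
    target_kb is the larger free level to restore in a single batch.
    """
    if free_kb >= trigger_kb:
        return 0
    return target_kb - free_kb


def drop_file_cache(path):
    """Hint the kernel to drop clean cached pages of one file (POSIX systems only).

    Dirty pages are not dropped by this hint; they must be written back first.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        # offset 0 and length 0 cover the whole file
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)


# Illustrative sample only; on a real Linux node this text would be read
# from /proc/meminfo.
sample = "MemTotal:       16384000 kB\nMemFree:          120000 kB\n"
free_kb = parse_meminfo(sample)["MemFree"]
to_release = kb_to_reclaim(free_kb, trigger_kb=200000, target_kb=500000)
```

Releasing enough to reach `target_kb` rather than just `trigger_kb` is what makes each pass less frequent, which mirrors the abstract's point that reclaiming more pages at one time reduces how often reclaim runs.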

Keywords

System jitter, System noise, Memory management, Page cache, File I/O, Operating systems

Notes

Acknowledgments

We are grateful to Hiroko Midorikawa of Seikei University for insightful discussions. We also appreciate the insightful feedback from the anonymous reviewers.

Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  1. Department of Informatics, The University of Electro-Communications, Chofu, Japan
  2. Department of Computer Science, University of Tsukuba, Tsukuba, Japan