Empirical Software Engineering, Volume 23, Issue 3, pp 1519–1551

Analyzing a decade of Linux system calls

  • Mojtaba Bagherzadeh
  • Nafiseh Kahani
  • Cor-Paul Bezemer
  • Ahmed E. Hassan
  • Juergen Dingel
  • James R. Cordy
Article

Abstract

Over the past 25 years, thousands of developers have contributed more than 18 million lines of code (LOC) to the Linux kernel. As the Linux kernel forms the central part of various operating systems that are used by millions of users, the kernel must be continuously adapted to the changing demands and expectations of these users. The Linux kernel provides its services to an application through system calls. The combined set of all system calls forms the essential Application Programming Interface (API) through which an application interacts with the kernel. In this paper, we conduct an empirical study of 8,770 changes that were made to Linux system calls during the last decade (i.e., from April 2005 to December 2014). In particular, we study the size of the changes, and we manually identify the type of changes and bug fixes that were made. Our analysis provides an overview of the evolution of the Linux system calls over the last decade. We find that there was a considerable amount of technical debt in the kernel, which was addressed by adding a number of sibling calls (i.e., 26% of all system calls). In addition, we find that the ptrace() and signal-handling system calls are by far the most challenging to maintain. Our study can be used by developers who want to improve the design and ensure the successful evolution of their own kernel APIs.

Keywords

Linux kernel; System calls; Empirical software engineering; API evolution; Software evolution

Copyright information

© Springer Science+Business Media, LLC 2017

Authors and Affiliations

  1. School of Computing, Queen’s University, Kingston, Canada
