Empirical Software Engineering, Volume 22, Issue 3, pp 1372–1404

A repository of Unix history and evolution



The history and evolution of the Unix operating system is made available as a revision management repository, covering the period from its inception in 1972 as a five-thousand-line kernel to 2016 as a widely used 27-million-line system. The 1.1 GB repository contains 496 thousand commits and 2,523 branch merges. The repository uses the Git version control system for its storage and is hosted on GitHub. It was created by using custom software to synthesize 24 snapshots of systems developed at Bell Labs, the University of California at Berkeley, and the 386BSD team; two legacy repositories; and the modern repository of the open source FreeBSD system. In total, 973 individual contributors are identified, the early ones through primary research. The data set can be used for empirical research in software engineering, information systems, and software archaeology.


Software archeology, Unix, Configuration management, Git



The author thanks the many individuals who contributed, directly or indirectly, to the effort. John Cowan, Brian W. Kernighan, Larry McVoy, Doug McIlroy, Jeremy C. Reed, Aharon Robbins, and Marc Rochkind helped with Bell Labs login identifiers. Clem Cole, John Cowan, Era Eriksson, Mary Ann Horton, Warner Losh, Kirk McKusick, Jeremy C. Reed, Ingo Schwarze, Anatole Shaw, and Norman Wilson helped with BSD login identifiers and code authorship information. The historical and current material used in the repository was made available thanks to efforts by the FreeBSD Project, Lynne Greer Jolitz, William F. Jolitz, Kirk McKusick, and the Unix Heritage Society. The early Unix editions were released under a BSD-style license thanks to the efforts of Bill Broderick, Paul Hatch, Dion L. Johnson II, Ransom Love, and Warren Toomey. The BSD SCCS import code is based on work by H. Merijn Brand and Jonathan Gray. The newoldar program is a result of work by Brandon Creighton and Dan Frasnelli. The First Research Edition Unix was restored by Johan Beiser, Tim Bradshaw, Brantley Coile, Christian David, Alex Garbutt, Hellwig Geisse, Cyrille Lefevre, Ralph Logan, James Markevitch, Doug Merritt, Tim Newsham, Brad Parker, and Warren Toomey.



Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  1. Department of Management Science and Technology, Athens University of Economics and Business, Athens, Greece
