Analysis of Linux Evolution Using Aligned Source Code Segments

  • Antti Rasinen
  • Jaakko Hollmén
  • Heikki Mannila
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4265)


The Linux operating system embodies a development history of 15 years and the community effort of hundreds of volunteer developers. We examine the structure and evolution of the Linux kernel by treating its source code as ordinary text, without regard to its semantics. After selecting three functionally central modules to study, we identify code segments using local alignments of source code from a reduced set of file comparisons. The later stages of the analysis build on these alignments. We construct module-specific visualizations, called descendant graphs, that show the overall code migration between versions and files. A more detailed view is provided by chain graphs, which show the time evolution of alignments between selected files. The methods used here may also prove useful in studying large collections of legacy code whose original maintainers are not available.
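
To make the idea of a local alignment between two source files concrete, the following is a minimal sketch using the standard Smith-Waterman dynamic-programming recurrence, applied to lines of text rather than characters. The abstract does not state which scoring scheme or tokenization the authors used, so the function name `local_align`, the line-level granularity, and the `match`, `mismatch`, and `gap` values are all assumptions made for illustration only.

```python
def local_align(a, b, match=2, mismatch=-1, gap=-1):
    """Smith-Waterman local alignment over two lists of lines.

    Returns (score, (a_start, a_end), (b_start, b_end)), where the
    ranges are half-open indices into `a` and `b`.
    The scoring values are illustrative assumptions, not the paper's.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] = best score of an alignment ending at a[i-1], b[j-1]
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(
                0,                    # start a fresh alignment here
                H[i - 1][j - 1] + s,  # align a[i-1] with b[j-1]
                H[i - 1][j] + gap,    # line of a has no counterpart in b
                H[i][j - 1] + gap,    # line of b has no counterpart in a
            )
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until the score drops to zero.
    i, j = best_pos
    end_i, end_j = i, j
    while i > 0 and j > 0 and H[i][j] > 0:
        s = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    return best, (i, end_i), (j, end_j)


# Two hypothetical versions of a file sharing one function body.
old = ["int a;", "void f() {", "  x = 1;", "  y = 2;", "}", "int b;"]
new = ["// header", "void f() {", "  x = 1;", "  y = 2;", "}", "// footer", "int c;"]

score, old_span, new_span = local_align(old, new)
print(score, old[old_span[0]:old_span[1]])
```

Treating each line as one symbol keeps the table small and ignores semantics, in the spirit of the text-only analysis described above. A character-level or token-level alignment would follow the same recurrence with a different alphabet.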


Source Code; Stable Branch; Chain Graph; Source Code Distribution; Rolling Stone
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Antti Rasinen (1)
  • Jaakko Hollmén (1)
  • Heikki Mannila (1)
  1. Laboratory of Computer and Information Science, Helsinki Institute of Information Technology, Basic Research Unit, Helsinki University of Technology, Finland
