Developing Architectural Documentation for the Hadoop Distributed File System

  • Len Bass
  • Rick Kazman
  • Ipek Ozkaya
Part of the IFIP Advances in Information and Communication Technology book series (IFIPAICT, volume 365)


Many open source projects lack architectural documentation that describes the major pieces of the system, how they are structured, and how they interact. We have produced architectural documentation for the Hadoop Distributed File System (HDFS), a major open source project. This paper describes our process and experiences in developing this documentation. We illustrate the documentation we have produced, and how it differs from existing documentation, by describing the redundancy mechanisms that HDFS uses for reliability.



Copyright information

© IFIP International Federation for Information Processing 2011

Authors and Affiliations

All authors: Software Engineering Institute, Carnegie Mellon University, Pittsburgh, USA
