The use of domain knowledge in program understanding

Abstract

Program understanding is an essential part of all software maintenance and enhancement activities. As currently practiced, program understanding consists mainly of code reading. The few automated understanding tools that are actually used in industry provide helpful but relatively shallow information, such as the line numbers on which variable names occur or the possible calling structure among system components. These tools rely on analyses driven by the nature of the programming language used. As such, they are adequate to answer questions concerning implementation details, the so-called "what" questions. They are severely limited, however, when trying to relate a system to its purpose or requirements, the "why" questions. Application programs solve real‐world problems. The part of the world with which a particular application is concerned is that application's domain. A model of an application's domain can serve as a supplement to programming‐language‐based analysis methods and tools. A domain model carries knowledge of domain boundaries, terminology, and possible architectures. This knowledge can help an analyst set expectations for program content. Moreover, a domain model can provide information on how domain concepts are related. This article discusses the role of domain knowledge in program understanding. It presents a method by which domain models, together with the results of programming‐language‐based analyses, can be used to answer both "what" and "why" questions. Representing the results of domain‐based program understanding is also important, and a variety of representation techniques are discussed. Although domain‐based understanding can be performed manually, automated tool support can guide discovery, reduce effort, improve consistency, and provide a repository of knowledge useful for downstream activities such as documentation, reengineering, and reuse.
A tools framework for domain‐based program understanding, called a dowser, is presented, in which a variety of tools work together to make use of domain information to facilitate understanding. Experience with domain‐based program understanding methods and tools is presented in the form of a collection of case studies. After the case studies are described, our work on domain‐based program understanding is compared with that of other researchers working in this area. The paper concludes with a discussion of the issues raised by domain‐based understanding and directions for future work.

Cite this article

Rugaber, S. The use of domain knowledge in program understanding. Annals of Software Engineering 9, 143–192 (2000). https://doi.org/10.1023/A:1018976708691

Keywords

  • Source Code
  • Domain Model
  • Domain Knowledge
  • Domain Analysis
  • Reverse Engineering