Using Structural Similarity for Effective Retrieval of Knowledge from Class Diagrams

  • Markus Wolf
  • Miltos Petridis
  • Jixin Ma
Conference paper


Due to the proliferation of object-oriented software development, UML software designs are ubiquitous. The creation of software designs already enjoys wide software support through CASE (Computer-Aided Software Engineering) tools. However, there has been limited application of computer reasoning to software designs in other areas. Yet there is expert knowledge embedded in software design artefacts which could be useful if it were successfully retrieved. While the semantic tags are an important aspect of a class diagram, the approach formulated here uses only structural information. It is shown that by applying case-based reasoning and graph matching to measure similarity between class diagrams it is possible to identify properties of an implementation not encoded within the actual diagram, such as the domain, programming language, quality and implementation cost. The practical applicability of this research is demonstrated in the area of cost estimation.
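The retrieval idea in the abstract can be illustrated with a small sketch. This is not the authors' algorithm, which the abstract does not specify. It is a brute-force illustration under stated assumptions: a class diagram is encoded as a count of classes plus a set of typed relationship edges (a hypothetical encoding), class names are ignored, similarity is the size of the best common structure divided by the size of the larger diagram, and the effort figures in the case base are invented for illustration only.

```python
from itertools import permutations

# Hypothetical encoding: a diagram is (class_count, edges), where each edge is
# (source_index, target_index, relation_kind). Class names are deliberately
# left out, so only structure is compared.


def structural_similarity(diagram_a, diagram_b):
    """Score in [0, 1]: best node-plus-edge overlap over all injective
    mappings of the smaller diagram into the larger one (brute force,
    so only practical for very small diagrams)."""
    (n_small, e_small), (n_large, e_large) = sorted(
        (diagram_a, diagram_b), key=lambda d: d[0]
    )
    best = 0
    for image in permutations(range(n_large), n_small):
        # Count edges of the smaller diagram that land on an edge of the
        # same kind in the larger diagram under this mapping.
        shared = sum(
            (image[s], image[t], kind) in e_large for s, t, kind in e_small
        )
        best = max(best, shared)
    largest = max(n_small + len(e_small), n_large + len(e_large))
    return (n_small + best) / largest if largest else 1.0


def retrieve(query, case_base):
    """Return the stored case whose diagram is structurally closest to the query."""
    return max(
        case_base, key=lambda case: structural_similarity(query, case["diagram"])
    )


# Illustrative case base; the effort values are made up for this sketch.
case_base = [
    {"name": "A", "effort_hours": 120,
     "diagram": (3, {(2, 0, "inheritance"), (2, 1, "inheritance")})},
    {"name": "B", "effort_hours": 300,
     "diagram": (4, {(0, 1, "association"), (1, 2, "association"),
                     (2, 3, "association")})},
]

# Query: one parent class with two subclasses, numbered differently from case A.
query = (3, {(0, 1, "inheritance"), (0, 2, "inheritance")})

nearest = retrieve(query, case_base)
print(nearest["name"], nearest["effort_hours"])
```

Here the query is structurally identical to case A despite the different class numbering, so A is retrieved and its recorded effort could serve as a starting estimate. Exhaustive mapping grows factorially with diagram size, so a practical system would need a heuristic or approximate graph matcher.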


Keywords (machine-generated, not supplied by the authors): Software Design, Class Diagram, Graph Similarity, Weight Setting, Graph Match



Copyright information

© Springer International Publishing Switzerland 2013

Authors and Affiliations

  1. University of Greenwich, London, UK
  2. University of Brighton, Brighton, UK
