Requirements Engineering, Volume 13, Issue 1, pp 1–18

Generating Natural Language specifications from UML class diagrams

  • Farid Meziane
  • Nikos Athanasakis
  • Sophia Ananiadou
Original Article


The early phases of software development are known to be problematic and difficult to manage, and errors introduced during these phases are expensive to correct. Many systems have been developed to aid the transition from informal Natural Language requirements to semi-structured or formal specifications. Furthermore, many software engineers see consistency checking as a way to reduce the number of errors occurring during the software development life cycle and to allow early verification and validation of software systems. However, consistency checking is confined to the models developed during analysis and design and fails to include the early Natural Language requirements. This excludes proper user involvement and creates a gap between the original requirements and the updated and modified models and implementations of the system. To improve this process, we propose a system that generates Natural Language specifications from UML class diagrams. We first investigate the variation in the input language used to name the components of a class diagram, based on a study of a large number of examples from the literature, and then develop rules for removing ambiguities in the subset of Natural Language used within UML. We use WordNet, a linguistic ontology, to disambiguate the lexical structures of the UML string names and generate semantically sound sentences. Our system is developed in Java and is tested on an independent, though academic, case study.
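As a rough illustration of the idea of turning UML string names into sentences, the following minimal sketch splits class and attribute names into words and fills a fixed sentence template. It is not the authors' implementation: the function names `split_identifier`, `indefinite_article` and `describe_class` are hypothetical, the template and the naive article rule are assumptions, the sketch is in Python rather than the Java of the actual system, and the WordNet-based disambiguation described above is omitted entirely.

```python
import re


def split_identifier(name):
    """Split a UML string name such as 'dateOfBirth' or 'order_item' into lowercase words."""
    # Treat underscores and hyphens as word separators.
    spaced = re.sub(r"[_\-]+", " ", name)
    # Insert a space at every lowercase/digit-to-uppercase boundary (camelCase).
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", spaced)
    return [word.lower() for word in spaced.split()]


def indefinite_article(phrase):
    """Naive article choice based on the first letter only (an assumption, not a full rule)."""
    return "an" if phrase[0] in "aeiou" else "a"


def describe_class(class_name, attributes):
    """Build one template sentence listing the attributes of a class."""
    noun = " ".join(split_identifier(class_name))
    phrases = []
    for attribute in attributes:
        words = " ".join(split_identifier(attribute))
        phrases.append(f"{indefinite_article(words)} {words}")
    if not phrases:
        sentence = f"{indefinite_article(noun)} {noun} has no attributes."
    elif len(phrases) == 1:
        sentence = f"{indefinite_article(noun)} {noun} has {phrases[0]}."
    else:
        listed = ", ".join(phrases[:-1]) + " and " + phrases[-1]
        sentence = f"{indefinite_article(noun)} {noun} has {listed}."
    # Capitalise the first letter of the sentence.
    return sentence[0].upper() + sentence[1:]


print(describe_class("Customer", ["name", "address", "dateOfBirth"]))
# prints: A customer has a name, an address and a date of birth.
```

A name such as `orderItem` or `order_item` yields the words "order item" under this scheme. Deciding whether each word is a noun, verb or adjective, which the abstract attributes to WordNet, is the part a template like this cannot do on its own.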


Unify Modelling Language; Noun Phrase; Class Diagram; Object Constraint Language; Ambiguous Word
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



The authors would like to thank the anonymous referees for their helpful comments, suggestions and insightful questions that helped improve the content and structure of this paper.



Copyright information

© Springer-Verlag London Limited 2007

Authors and Affiliations

  • Farid Meziane (1)
  • Nikos Athanasakis (1)
  • Sophia Ananiadou (2)
  1. Informatics Research Institute, Newton Building, University of Salford, Salford, UK
  2. School of Computer Science, National Centre for Text Mining, University of Manchester, Manchester, UK
