
Improving requirements completeness: automated assistance through large language models

  • Original Article
  • Published in: Requirements Engineering

Abstract

Natural language (NL) is arguably the most prevalent medium for expressing systems and software requirements. Detecting incompleteness in NL requirements is a major challenge. One approach to identify incompleteness is to compare requirements with external sources. Given the rise of large language models (LLMs), an interesting question arises: Are LLMs useful external sources of knowledge for detecting potential incompleteness in NL requirements? This article explores this question by utilizing BERT. Specifically, we employ BERT’s masked language model to generate contextualized predictions for filling masked slots in requirements. To simulate incompleteness, we withhold content from the requirements and assess BERT’s ability to predict terminology that is present in the withheld content but absent in the disclosed content. BERT can produce multiple predictions per mask. Our first contribution is determining the optimal number of predictions per mask, striking a balance between effectively identifying omissions in requirements and mitigating noise present in the predictions. Our second contribution involves designing a machine learning-based filter to post-process BERT’s predictions and further reduce noise. We conduct an empirical evaluation using 40 requirements specifications from the PURE dataset. Our findings indicate that: (1) BERT’s predictions effectively highlight terminology that is missing from requirements, (2) BERT outperforms simpler baselines in identifying relevant yet missing terminology, and (3) our filter reduces noise in the predictions, enhancing BERT’s effectiveness for completeness checking of requirements.
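To give a concrete sense of the masked-prediction step described above, the sketch below shows how a masked language model can be queried for candidate fillers of a withheld slot in a requirement. This is a minimal illustration, not the authors' implementation: it assumes the Hugging Face transformers library, the bert-base-uncased checkpoint, and an invented example requirement; the choice of ten predictions per mask is likewise illustrative.

    # Minimal sketch of masked-slot prediction with BERT (illustrative only).
    # Assumes the Hugging Face "transformers" package; the requirement text,
    # model checkpoint, and top_k value are assumptions, not the paper's setup.
    from transformers import pipeline

    fill_mask = pipeline("fill-mask", model="bert-base-uncased")

    # A requirement with one slot withheld; BERT proposes candidate terms for it.
    requirement = ("The system shall encrypt all [MASK] exchanged between "
                   "the client and the server.")

    for prediction in fill_mask(requirement, top_k=10):
        # Each prediction includes the candidate token and BERT's confidence score.
        print(f"{prediction['token_str']:>15}  {prediction['score']:.3f}")

In the approach described in the article, such predictions are gathered over many masked slots and then passed through a machine learning-based filter that discards likely noise before the remaining terms are flagged as potentially missing terminology.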



Acknowledgements

This work was funded by the Natural Sciences and Engineering Research Council of Canada (NSERC) under the Discovery and Discovery Accelerator programs.

Author information


Corresponding author

Correspondence to Dipeeka Luitel.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

A: Pseudo-code for baselines

See Figs. 10, 11 and 12.

Fig. 10: Baseline 1 pseudocode

Fig. 11: Baseline 2 pseudocode

Fig. 12: Baseline 3 pseudocode

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Luitel, D., Hassani, S. & Sabetzadeh, M. Improving requirements completeness: automated assistance through large language models. Requirements Eng 29, 73–95 (2024). https://doi.org/10.1007/s00766-024-00416-3

