Abstract
Domain models capture the key concepts and relationships of a business domain. Typically, domain models are manually defined by software designers in the initial phases of a software development cycle, based on their interactions with the client and their own domain expertise. Given the key role of domain models in the quality of the final system, it is important that they properly reflect the reality of the business.
To facilitate the definition of domain models and improve their quality, we propose moving towards a more assisted domain modeling process in which an NLP-based assistant provides autocomplete suggestions for the partial model under construction. The suggestions are based on the automatic analysis of the textual information available for the project (contextual knowledge) and/or its related business domain (general knowledge). The process also takes into account the feedback collected from the designer’s interaction with the assistant. We have developed a proof-of-concept tool and performed a preliminary evaluation that shows promising results.
Supported by Spanish project TIN2016-75944-R and CEA’s initiative Modelia.
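The abstract describes an assistant that combines suggestions from contextual and general knowledge and adapts to the designer’s feedback. The following is a minimal sketch of that idea, not the authors’ implementation: the class `Assistant`, the lookup tables of related terms, the similarity scores, and the 0.7 weighting are all hypothetical and chosen only for illustration. In a real system the two tables would be produced by NLP models trained on project documents and on general-domain text.

```python
from dataclasses import dataclass, field


@dataclass
class Assistant:
    """Toy autocomplete assistant; every name and weight here is illustrative."""
    contextual: dict[str, dict[str, float]]  # element -> {related term: score}, from project text
    general: dict[str, dict[str, float]]     # same shape, from general-domain text
    contextual_weight: float = 0.7           # assumed weighting, not taken from the paper
    rejected: set[str] = field(default_factory=set)  # designer feedback

    def suggest(self, partial_model: set[str], top_k: int = 3) -> list[str]:
        """Rank candidate terms related to the elements already in the model."""
        scores: dict[str, float] = {}
        sources = (
            (self.contextual, self.contextual_weight),
            (self.general, 1.0 - self.contextual_weight),
        )
        for source, weight in sources:
            for element in partial_model:
                for candidate, score in source.get(element, {}).items():
                    # Skip terms already modeled or previously rejected.
                    if candidate in partial_model or candidate in self.rejected:
                        continue
                    scores[candidate] = scores.get(candidate, 0.0) + weight * score
        # Highest score first; ties broken alphabetically for determinism.
        ranked = sorted(scores, key=lambda c: (-scores[c], c))
        return ranked[:top_k]

    def reject(self, candidate: str) -> None:
        """Record that the designer discarded a suggestion."""
        self.rejected.add(candidate)


# Hypothetical scores for a partial model that contains only "Order".
contextual = {"Order": {"Invoice": 0.9, "Customer": 0.8}}
general = {"Order": {"Customer": 0.6, "Product": 0.5, "Shipment": 0.4}}

assistant = Assistant(contextual, general)
first = assistant.suggest({"Order"})
assistant.reject("Invoice")
second = assistant.suggest({"Order"})
print(first)
print(second)
```

In this sketch a term supported by both sources accumulates score from each, and a rejected suggestion is never offered again. A weighted sum is only one of several plausible ways to merge the two knowledge sources.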
Notes
- 1.
According to the Cambridge dictionary: “information on many different subjects that you collect gradually, from reading, television, etc., rather than detailed information on subjects that you have studied formally”.
- 2.
Note that “NLP model” and “domain model” do not refer to the same type of model at all. In the NLP field, a model is the result of analyzing a textual corpus (for example, a trained neural network or a statistical model). To avoid confusion, whenever this work refers to an NLP model, it uses the full term “NLP model” and never “model” alone.
- 3.
- 4.
Note that, for each model, there is a finite number of slices.
- 5.
In linguistics, lemmatization is the process of grouping together the inflected forms of a word so they can be analysed as a single item, identified by the word’s lemma, or dictionary form.
- 6.
- 7.
- 8.
These documents are not publicly available due to industrial property rights. Nevertheless, the software artifacts derived from them are available in our Git repository.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Burgueño, L., Clarisó, R., Gérard, S., Li, S., Cabot, J. (2021). An NLP-Based Architecture for the Autocompletion of Partial Domain Models. In: La Rosa, M., Sadiq, S., Teniente, E. (eds) Advanced Information Systems Engineering. CAiSE 2021. Lecture Notes in Computer Science, vol 12751. Springer, Cham. https://doi.org/10.1007/978-3-030-79382-1_6
DOI: https://doi.org/10.1007/978-3-030-79382-1_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-79381-4
Online ISBN: 978-3-030-79382-1
eBook Packages: Computer Science, Computer Science (R0)