
Recommending metamodel concepts during modeling activities with pre-trained language models

  • Theme Section Paper
  • Published in: Software and Systems Modeling

Abstract

The design of conceptually sound metamodels that embody proper semantics in relation to the application domain is particularly tedious in model-driven engineering. As metamodels define complex relationships between domain concepts, it is crucial for a modeler to define these concepts thoroughly and consistently with respect to the application domain. We propose an approach that assists a modeler in the design of metamodels by recommending relevant domain concepts in several modeling scenarios. Our approach requires neither domain knowledge nor hand-designed completion rules. Instead, we design a fully data-driven approach using a deep learning model that abstracts domain concepts by learning from both structural and lexical metamodel properties in a corpus of thousands of independent metamodels. We evaluate our approach on a test set of 166 metamodels, unseen during model training, with more than 5000 test samples. Our preliminary results show that the trained model provides accurate top-5 lists of relevant recommendations in concept renaming scenarios. Although promising, the results are less compelling for the iterative construction of a metamodel, in part because of the conservative strategy we use to evaluate the recommendations.
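To make the recommendation mechanism concrete: the replication package (Appendix A) indicates the approach builds on a BERT-style masked language model fine-tuned on serialized metamodels. The snippet below is a minimal illustrative sketch of masked concept recommendation with the HuggingFace transformers library; the "roberta-base" checkpoint and the Emfatic-like metamodel fragment are placeholders introduced here for illustration, not the authors' released artifacts.

    # Illustrative sketch: "roberta-base" and the metamodel fragment are
    # placeholders, not the checkpoint or data released with the paper.
    import torch
    from transformers import AutoModelForMaskedLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("roberta-base")
    model = AutoModelForMaskedLM.from_pretrained("roberta-base")
    model.eval()

    # A metamodel fragment serialized as text, with the concept name to
    # recommend replaced by the mask token.
    fragment = f"class Library {{ attr String name; val {tokenizer.mask_token}[*] books; }}"

    inputs = tokenizer(fragment, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits

    # Rank the vocabulary at the masked position and keep the top-5 candidates,
    # mirroring the top-5 recommendation lists evaluated in the paper.
    mask_index = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1][0]
    top5_ids = torch.topk(logits[0, mask_index], k=5).indices
    print([tokenizer.decode([int(i)]).strip() for i in top5_ids])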



Notes

  1. In this paper, the term “model” refers to a machine learning model rather than an instance of a metamodel, as is customary in the MDE literature. All MDE artifacts we operate on are metamodels.

  2. The illustration is inspired by the following blog post: http://jalammar.github.io/illustrated-bert/.

  3. http://mar-search.org/experiments/models20/.

  4. https://www.eclipse.org/xtend/.

  5. https://www.eclipse.org/modeling/emf/.

  6. https://wordnet.princeton.edu/.


Author information

Corresponding author

Correspondence to Martin Weyssow.

Additional information

Communicated by L. Burgueño, J. Cabot, M. Wimmer and S. Zschaler.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

A Replication package

We make our code, datasets, and models publicly available to ease the replication of our experiments and to help researchers who are interested in extending our work:

https://github.com/martin-wey/metamodel-concepts-bert.

The data and models are available on Zenodo:

https://doi.org/10.5281/zenodo.5579980.
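As a hypothetical usage sketch, assuming the archive ships a HuggingFace-compatible checkpoint (the local path below is a placeholder; the repository README documents the actual layout and entry points):

    # Hypothetical loading sketch: the checkpoint path is a placeholder.
    from transformers import AutoModelForMaskedLM, AutoTokenizer

    checkpoint = "./metamodel-concepts-bert/model"  # placeholder path
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForMaskedLM.from_pretrained(checkpoint)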

B Model hyperparameters

See Table 3.

Table 3 Hyperparameters (HP) used for training our model


About this article


Cite this article

Weyssow, M., Sahraoui, H. & Syriani, E. Recommending metamodel concepts during modeling activities with pre-trained language models. Softw Syst Model 21, 1071–1089 (2022). https://doi.org/10.1007/s10270-022-00975-5

