Measurement in software development has its own specific challenges. A developer can easily count all the lines of code in a program, yet that number says nothing about the quality of the code, which is far more complex and depends on many aspects beyond code size. To understand the effects of actions taken during software development, and to learn how future development processes can be improved, measurement needs a clearly stated purpose. This can be achieved with the GQM model, which allows goal-focused metrics to be selected from all possible variants. However, even after applying the GQM model, too many metrics may remain; collecting and interpreting all of them would reduce process efficiency without benefiting the project. A recommender system makes it possible to choose the most useful metrics among them without wasting resources.

4.1 Introduction

Software engineering is a unique phenomenon: it stands so close to formal, rigorous mathematics and, at the same time, to art that it resists any single definition or frame. Software tasks have no standard algorithmic solutions, and it is very difficult to formalize the degree of quality of the final product. That is why, as noted earlier in this book, every software project needs properly chosen metrics. They help to evaluate processes, products, and resources in the early stages of development and to set the right direction accordingly.

However, there are several problems to consider when choosing metrics. First, they must be selected carefully and competently: incorrectly defined metrics can lead the project away from its goal and drain the time and resources allocated to it. Second, there is an effectively unlimited number of metrics and ways to categorize them. An overload of metrics can shift priorities and cause a loss of focus on the project.

4.2 Concept of Goal-Question-Metric

Because the problem of metrics selection is so important and complex, the development team usually hires a specialist who works through a long list of requirements and an equally long list of metrics and allocates a basic scope of metrics for further use. However, this is costly both financially and in terms of time. Consequently, several structured, formal approaches to deriving metrics have emerged to simplify this process. Among them, the best known in the scientific community is the Goal-Question-Metric model, introduced by Victor Basili and David Weiss. This technique solves the first problem mentioned in the Introduction: it excludes the possibility of deviating from the project goals through inappropriate metrics.

GQM, according to Basili et al. (1994), stands for "goal, question, metric": it defines the goals to achieve, clarifies them through questions about how each goal can be reached, and answers those questions with collected data. For that purpose, GQM defines a measurement model on three levels:

1. Goal—The goal for which all the work is done and all the artifacts and processes are produced.

2. Question—Several questions, defined based on the goal, that outline a way to achieve it.

3. Metric—A set of metrics that answer a given question in a measurable way.

The GQM model can be represented in the form of a tree, the root of which is the goal. The branches of the tree are represented by questions that allow the goal to be made more specific. And the leaves are the metrics, the measurable outcome of the whole model (Fig. 4.1).

Fig. 4.1 Hierarchical structure of the GQM model

Defining the GQM model—namely, its questions and metrics—helps to determine why and how the goal can be achieved. Consequently, moving from the root of the tree down to the metric leaves makes the abstraction itself more understandable.
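The tree can be captured in a small data model. The following is a minimal Python sketch (class and field names are our own), instantiated with the change-request example discussed later in this chapter:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Question:
    text: str
    metrics: List[str] = field(default_factory=list)  # leaves of the tree

@dataclass
class Goal:  # root of the tree
    purpose: str
    issue: str
    process: str
    viewpoint: str
    questions: List[Question] = field(default_factory=list)  # branches

    def all_metrics(self) -> List[str]:
        """Walk root -> branches -> leaves and collect every metric."""
        return [m for q in self.questions for m in q.metrics]

# Illustrative instance of the model
goal = Goal(
    purpose="Improve", issue="timeliness",
    process="change request processing", viewpoint="project manager",
    questions=[Question(
        "What is the current change request processing speed?",
        ["Average cycle time",
         "Percentage of cases outside of the upper time limit"])])
```

Traversing the tree with `goal.all_metrics()` yields exactly the measurable outcome of the model: the set of metric leaves.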

4.3 The Goal-Question-Metric Process

According to Solingen et al. (1999), a goal-oriented measurement process has emerged around the construction of the GQM model, consisting of four phases:

1. Planning phase—The first step, covering project selection, initial artifact creation, and planning.

2. Definition phase—Construction and documentation of the GQM model.

3. Data collection phase—Collection of data according to the defined GQM model.

4. Interpretation phase—Interpretation of the collected data with regard to the defined metrics, which yields answers to the stated questions. With the gathered results, the achievement of the goal can be evaluated.

In the first phase, the team prepares the ground for further steps by identifying the project area, its clients, and their needs and by creating the initial documentation.

All management, training, and project planning activities take place during this phase. Once the plan is made, the definition phase starts, during which the GQM deliverable is developed; the information for it is acquired from sources such as interviews, analyses, and articles. The data collection phase comprises the measurement itself: the data is defined, gathered, and stored. The interpretation phase then begins, when the measurements are used to answer the stated questions, and the answers are in turn used to evaluate the stated goals (Fig. 4.2).

Fig. 4.2 Four phases of the Goal-Question-Metric method

In more detail, the steps of the GQM method are:

1. Develop business goals—Develop a set of corporate, division, and project business goals, with associated measurement goals for productivity and quality. In other words, the aims to be reached or investigated should be stated at this step in a way that makes it easy to find the questions that define them. A goal usually specifies the purpose of measurement, the object to be measured, the issue to be measured, and the viewpoint from which the measure is taken. For instance, a goal can be stated as "Improve (the purpose) the timeliness (the quality issue) of change request processing (the process) from the project manager's point of view (the viewpoint)."

2. Generate questions—Generate questions (based on models) that define those goals as completely as possible in a quantifiable way. Questions usually break the goal down into its major components. They should also be specific enough to be refined into metrics in the next step. An example question for the goal stated above is "What is the current change request processing speed?"

3. Specify the measures—Specify the measures that need to be collected to answer those questions and to track process and product conformance to the goals. Note that a metric can help answer more than one question, that is, it can be used two or more times. Example metrics for the question above are "Average cycle time" and "Percentage of cases outside of the upper time limit."

4. Define mechanisms for data collection—Choose the mechanisms best suited to collecting the necessary data. They range from invasive methods, in which processes may even be paused to capture some metrics, to noninvasive ones, in which the processes are observed from outside without any intervention. For example, the metrics considered above can be collected by analyzing log files produced by the system.

5. Collect, validate, and analyze the data—Process the data in real time to provide feedback to projects so that the process can be corrected or improved. This step often includes processing the collected data with methods and tools from statistics and probability theory.

All this helps to ensure that the selected metrics are aligned with the goal and will guide the project in the right direction.

4.4 Recommender Systems

A recommender system can become the solution to the second issue. Automation is the main engine of progress in our century, and recommender systems were created to automate our choices. So that users do not have to search through a multitude of options that do not interest them, recommender systems choose the most interesting ones on their behalf. This mechanism is extremely useful in modern online stores such as Amazon, content platforms such as YouTube, and search engines such as Google. All the platforms in these examples are very popular and have a large user base, which points to one of the main prerequisites of recommender systems: with a large number of users, services can collect huge amounts of data, which can then be analyzed to build recommendations. Collected datasets can consist of any information a computer can process—text, numbers, boolean values, and much more.

Because of their widespread use in different fields, many types of recommender algorithms have appeared over time. The first and best known of them is collaborative filtering (Schafer et al. 2007), of which there are several variants. User-based filtering finds the most similar users by analyzing their preferences and makes recommendations based on that comparison (Wang et al. 2021). The item-based method, on the other hand, relies on item analysis: it compares item ratings with one another and produces a result based on similarity (Sarwar et al. 2001). In general, collaborative filtering is effective for problems that lack a detailed list of characteristics for each item. However, it also has disadvantages. For example, it cannot make a good recommendation if there are no similar users or similarly rated items. One should also keep in mind that the more data there is, the better this method works, but also the longer the algorithm takes to run (Isinkaye et al. 2015).
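The item-based variant can be illustrated with a minimal sketch: columns of a rating matrix are compared by cosine similarity, and a missing rating is predicted as a similarity-weighted mean of the user's other ratings. All users, items, and numbers below are invented for illustration.

```python
import math

# Toy user-item rating matrix (0-5 scale); absent key = unrated.
ratings = {
    "alice": {"A": 5, "B": 3, "C": 4},
    "bob":   {"A": 3, "B": 1, "C": 2},
    "carol": {"A": 4, "B": 3, "C": 5},
    "dave":  {"A": 1, "B": 5},          # has not rated item C
}

def item_vector(item):
    """The ratings column for one item, keyed by user."""
    return {u: r[item] for u, r in ratings.items() if item in r}

def cosine(v, w):
    """Cosine similarity over the users who rated both items."""
    shared = set(v) & set(w)
    if not shared:
        return 0.0
    dot = sum(v[u] * w[u] for u in shared)
    return dot / (math.sqrt(sum(v[u] ** 2 for u in shared)) *
                  math.sqrt(sum(w[u] ** 2 for u in shared)))

def predict(user, item):
    """Similarity-weighted mean of the user's ratings for other items."""
    sims = [(cosine(item_vector(item), item_vector(other)), rating)
            for other, rating in ratings[user].items() if other != item]
    total = sum(s for s, _ in sims)
    return sum(s * r for s, r in sims) / total if total else 0.0

score = predict("dave", "C")  # dave's predicted rating for item C
```

Because item C is rated similarly to item B by the other users, dave's high rating for B pulls the prediction for C upward.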

In order to generate recommendations when there are detailed descriptions of items but little collected data, the second type of recommender algorithm—content-based filtering—was invented (Pazzani et al. 2007). As the name suggests, the focus shifts from the user to the recommended product. This method is somewhat similar to filtering by parameters in online stores. Although it does not require a huge dataset of information about users, it does require a broad and detailed description of each product (Lops et al. 2011).
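A content-based recommender can be sketched just as compactly. Here, item descriptions (invented for illustration) are compared by word overlap, standing in for the richer feature vectors a real system would use.

```python
# Toy catalog of items with textual descriptions (illustrative).
items = {
    "lib-a": "fast json parsing library",
    "lib-b": "json serialization with schema validation",
    "lib-c": "terminal user interface toolkit",
}

def words(text):
    """Bag-of-words feature set for an item description."""
    return set(text.lower().split())

def jaccard(a, b):
    """Overlap between two word sets, normalized by their union."""
    return len(a & b) / len(a | b) if a | b else 0.0

def recommend(liked_item, candidates):
    """Return the candidate whose description best matches the liked item."""
    profile = words(items[liked_item])
    scored = [(jaccard(profile, words(items[c])), c)
              for c in candidates if c != liked_item]
    return max(scored)[1]

best = recommend("lib-a", items)
```

The user who liked `lib-a` (a JSON library) is pointed to `lib-b`, the only other item whose description shares content with it; no rating history from other users is needed.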

Since both approaches have their advantages and disadvantages, a combination of the two—hybrid filtering—has appeared. This mixing makes it possible to bypass the problems of both (Çano 2017).

Of particular interest is the use of recommender systems in software engineering. Such systems are used in all phases of development (Gašparič et al. 2015), as well as in various subfields of programming. For example, they can be used to assign statuses to pull requests (Azeem et al. 2020) and tags to questions (Zhang et al. 2018a), for API selection (Cai et al. 2019; Thung et al. 2013; Xie et al. 2020), forum recommendation (Castro-Herrera et al. 2009), package suggestions (McMillan et al. 2012), bug detection (Ashok et al. 2009; Gomez et al. 2015), task selection (Wang et al. 2020), refactoring recommendation (Lin et al. 2016), and many others. However, recommender systems have not yet been applied to define a set of metrics for a software project.

4.5 Metrics Recommender

Since data collectors such as Innometrics gather data and present metrics to the user in the form of graphs, a recommender system was needed to sift unnecessary information out of the application dashboard. The solution works as follows. The user enters the application, opens the tab with the GQM model, fills it in—namely, enters the goals and questions for their project—and then clicks the "Generate Metrics" button. The system processes all the text data it has previously received from other users and passes it to the recommender algorithm, which analyzes the data and displays the answer to the user by assigning the necessary metrics to the questions. For this analysis, the system needs three things: a dataset, a way to process the data, and the recommender algorithm itself. The next sections examine these three components in more detail.
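This flow can be sketched end to end. Everything below is a simplified stand-in for the actual Innometrics implementation (which is written in Java): function names, the trivial tokenizer, and the history data are all illustrative.

```python
def preprocess(text):
    """Stand-in for the real preprocessing pipeline described later."""
    return text.lower().split()

def recommend_metrics(questions, history):
    """For each new GQM question, attach the metrics of the most
    word-similar question previously collected from other users."""
    suggestions = {}
    for q in questions:
        q_tokens = set(preprocess(q))
        best = max(history,
                   key=lambda past: len(q_tokens & set(preprocess(past))))
        suggestions[q] = history[best]
    return suggestions

# Previously collected question -> metrics pairs (illustrative).
history = {
    "what is the change request processing speed": ["Average cycle time"],
    "how many defects are found per release": ["Defect density"],
}

out = recommend_metrics(
    ["What is the current change request processing speed?"], history)
```

The new question is matched against the stored ones, and the metrics of the closest match are suggested; the real system replaces the word-overlap matcher with the trained multi-label classifier described below.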

4.5.1 Dataset

Since the GQM-based metrics recommender relies on the analysis of textual data, this data first needs to be processed—brought to a common form to lower its sparsity. Preprocessing is the data-handling step that solves exactly this problem, but there are many ways to process textual data. To choose the best one for our problem, we used a dataset of English sentences from Kaggle: https://www.kaggle.com/theoviel/improve-your-score-with-some-text-preprocessing/notebook. With this dataset, we conducted several experiments, described in the next section.

Next, to evaluate the effectiveness of different recommender algorithms, we compiled our own second dataset, since no such dataset had previously been published in the public domain (Fitzgerald et al. 2011). To collect it, we interviewed 35 developers from Innopolis, ranging from 21 to 32 years old, with 1.5 to 6 years of work experience in different fields of software development.

4.5.2 Preprocessing

During the research, we found that a limited number of steps are used for text preprocessing, namely (1) TF-IDF, (2) stop words removal, (3) tokenization, (4) stemming, (5) vectorization, (6) PoS tagging and lemmatization, and (7) non-letter symbols removal. To determine the most suitable sequence for our problem, we conducted an experiment: each possible sequence composed of these preprocessing steps was run 1000 times on 500 sentences from the Kaggle dataset. The machine used in the experiment had an Intel Core i5-8250U CPU at 4 GHz and 7862 MiB of RAM. The resulting mean runtimes and standard deviations are shown in Table 4.1.

Table 4.1 Execution time for each combination

Multiple pairwise comparisons using Tukey's method with a family-wise error rate of 0.05 (Lee et al. 2018) show that the most efficient sequence is the one labeled B: (1) non-letter symbols removal, (2) tokenization, (3) stop words removal, (4) PoS definition, (5) lemmatization, and (6) TF-IDF. This is the sequence we chose for our system.
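The chosen sequence can be sketched as follows. Steps 4 and 5 (PoS definition and lemmatization) are stubbed out here, since a real pipeline would delegate them to an NLP library such as NLTK or spaCy, and the stop-word list is a tiny illustrative subset.

```python
import math
import re

STOP_WORDS = {"the", "is", "a", "of", "to", "and"}  # tiny illustrative list

def preprocess(sentence):
    """Steps 1-3 of sequence B (PoS tagging and lemmatization omitted)."""
    letters_only = re.sub(r"[^a-zA-Z ]", " ", sentence)  # 1. non-letter removal
    tokens = letters_only.lower().split()                # 2. tokenization
    return [t for t in tokens if t not in STOP_WORDS]    # 3. stop-word removal

def tf_idf(corpus):
    """Step 6: TF-IDF weight of each term in each document."""
    docs = [preprocess(s) for s in corpus]
    n = len(docs)
    vocab = {t for d in docs for t in d}
    idf = {t: math.log(n / sum(t in d for d in docs)) for t in vocab}
    return [{t: d.count(t) / len(d) * idf[t] for t in set(d)} for d in docs]

weights = tf_idf(["The speed of processing is low.",
                  "Processing time grows with the number of requests."])
```

Terms that occur in every document (here, "processing") receive zero weight, while terms specific to one question (here, "speed") are weighted up, which is exactly what makes the vectors discriminative for the classifier in the next section.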

4.5.3 Recommender Algorithm

The metrics recommender problem we identified earlier is a multi-label classification problem (Zhang et al. 2014), in which multiple metrics can be assigned to each question. We therefore need to consider multi-label classification algorithms when building the recommender. To quantify the effectiveness of each of them, we used the dataset collected from Innopolis employees, described in the "Dataset" subsection.

Multi-label classification algorithms can be divided into two categories: problem transformation and algorithm adaptation (Tsoumakas et al. 2009). To choose among representatives of these two groups, we split the dataset described earlier into a 90% training set and a 10% test set and evaluated all the multi-label algorithms listed in Table 4.2 on it. The results are shown in the same table.

Table 4.2 Multi-label algorithms comparison

The table shows that binary relevance performs best, so it was used for the implementation of the recommender algorithm.
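Binary relevance is a problem-transformation method: it trains one independent binary classifier per label and unions the positive predictions. The following is a minimal sketch, with a toy keyword-presence classifier standing in for the statistical learner the actual system would use; all training texts and labels are illustrative.

```python
class KeywordClassifier:
    """Toy binary learner: predicts positive when any word seen in a
    positive training example appears in the input."""
    def fit(self, texts, flags):
        self.keywords = {w for t, f in zip(texts, flags) if f
                         for w in t.lower().split()}
        return self

    def predict(self, text):
        return bool(self.keywords & set(text.lower().split()))

class BinaryRelevance:
    """One independent binary classifier per label."""
    def fit(self, texts, label_sets, all_labels):
        self.models = {
            label: KeywordClassifier().fit(
                texts, [label in ls for ls in label_sets])
            for label in all_labels}
        return self

    def predict(self, text):
        return {l for l, m in self.models.items() if m.predict(text)}

labels = {"Average cycle time", "Defect density"}
br = BinaryRelevance().fit(
    ["how fast are change requests processed",
     "how many defects per release"],
    [{"Average cycle time"}, {"Defect density"}],
    labels)
pred = br.predict("processing speed of change requests")
```

Each label gets its own yes/no decision, so any subset of metrics can be assigned to a question; this independence is also binary relevance's known limitation, since correlations between labels are ignored.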

4.5.4 Conclusion

Based on our experiments, we developed a recommender algorithm for the Innometrics system. It was written in Java as a REST API, with one endpoint containing all the logic of the recommender system. Data received from users is first processed by the following sequence of preprocessing steps: (1) non-letter symbols removal, (2) tokenization, (3) stop words removal, (4) PoS definition, (5) lemmatization, and (6) TF-IDF. The data is then passed to the recommender algorithm, which analyzes it and generates metric suggestions for a new user. Thus, the user does not need to search for appropriate metrics among all those available in the system: the algorithm automatically generates a goal-focused set of metrics that best fits each individual.