1 Introduction

Software components are prebuilt building blocks with predefined functions that connect through industry-standard interfaces. In contrast to software objects, components are larger units that correspond to a higher task or process level. A component can be used as a black box and has an external description that is separate from its internal workings. Component-based software engineering (CBSE) is software development that uses pre-built or existing software components, based on the specification of those components. Measurements and their metrics are crucial for managing the software engineering process. Software metrics are quantifiable measurements used to assess various characteristics of a software life cycle or of the software system itself. Software metrics are essential for evaluating and predicting a variety of software qualities, including complexity, maintainability, testability, and reusability. Of all these characteristics, complexity affects every other aspect of the software, according to Gill and Balkishan (Weyuker 1988). Software metrics are crucial for forecasting, planning, executing, monitoring, controlling, and evaluating processes and products. Because technologies change over time, it is necessary to leverage ideas such as component reusability, component interaction, and failure rate to create a new product quickly. CBSE is a branch of software engineering that relies heavily on the dependencies, interactions, and reuse of its components. In CBSE, the capacity of a reusable component to produce new output with few errors and to satisfy client needs depends on reliability (Biemen and Kang 1995). When assessing reusability in CBSE, component compatibility and reliability are crucial. Component-based software (CBS) is a modern approach in software engineering that emphasises combining components into sophisticated software systems, driven by the rapid growth of component technology.
This approach has various benefits, including faster delivery, higher quality, lower maintenance costs, reusability, productivity, and speed to market. Reliability may be predicted by considering the reliability of each component and the interconnection mechanism among components (Chen and Zhou 2011). Reusing existing components takes less time than creating a new component. As a consequence, the component-based software engineering technique allows systems to be constructed more quickly. By lowering software development costs and raising software efficiency, businesses may become more competitive. Before the system is modified or developed further, it offers an accurate view of the existing setup. Through the dependency relationships, it exposes software configuration issues and reveals implementation difficulties without forcing us to study every line of code. The failure forecasting approaches used in CBS reliability prediction incorporate a quantitative evaluation of system dependability. The main goals of software metrics are cost reduction, quality improvement, schedule control and monitoring, minimising testing effort, and efficient use of reusable building blocks. The paper is organised as follows. Sect. 2 reviews relevant research on several fundamental cohesion metrics in the literature. Sect. 3 lists their limitations, Sect. 4 describes the problem, and Sect. 5 presents the materials and methods. Results and discussion are presented in Sect. 6, and Sect. 7 concludes with a discussion of future directions.

2 Related work

Cohesion measures the strength of the relationship between the elements of a module. In other words, it refers to how closely each module's instructions and components relate to a single purpose. Chidamber and Kemerer proposed the LCOM (Lack of Cohesion in Methods) metric (Chidamber and Kemerer 1991). It was later revised as LCOM2 (Chidamber and Kemerer 1994). LCOM2 is not used in the present study because it cannot discriminate between two classes to which it assigns a cohesion score of zero. LCOM and LCOM2 do not take method invocation into account. In 2000, Li et al. proposed RLCOM (Li et al. 2001), which extends LCOM by taking the ratio of the number of non-similar method pairs to the total number of method pairs. Hitz and Montazeri (1995) proposed the cohesion measure LCOM3, a more effective form of LCOM. An undirected graph is used to represent how a class's methods are related to one another. The class's methods are the nodes, and an edge connects two methods if they share at least one variable. It should be emphasised that LCOM, LCOM2, RLCOM, and LCOM3 are in fact measures of a lack of cohesion. TCC (Tight Class Cohesion), proposed by Bieman and Kang (1995), measures cohesion rather than the lack of it. These measures take into account the attributes that methods share, as well as method invocation. If method m calls method n, all variables used by n are also considered to be used by m. Two methods are said to be related if they use common attributes, either by referencing them directly or by invoking each other. These cohesion measures treat method similarity as a transitive relation. LCOM3 and TCC take indirect connections between methods into account.
LCOM3 and TCC treat indirect and direct cohesion alike. Gandhi and Kumar (2012) and Gui and Scott (2008) discuss the complexity factors of the component, its methods, and aggregate components (Biemen and Kang 1995; Chen and Zhou 2011; Chidamber and Kemerer 1991, 1994; Weyuker 1988; Gill and Balkishan 2008; Gandhi and Kumar 2012; Gui and Scott 2008; Hitz and Montazeri 1995; Jianguo and Hui 2011). The authors conclude that a complex component requires extra time to execute and, after validation, is difficult to maintain and reuse. Michael et al. (2015) introduced BICM (Bounded Interface Complexity Metrics), an extension of ICM. It is bounded, so it does not necessarily grow with size. According to the analysis of this metric, interface size has no bearing on it (Tabrez et al. 2022; Singha et al. 2018a, 2018b, 2018c, 2018d, 2022; Zubair and Singha 2021, 2020; Sultana et al. 2022; Arvind and Ratan 2020). BICM can be used to assess a component's portability, independence, and self-completeness. For component-based software systems (CBSS), Kartika Yadav and Tomer (2014) (Singha et al. 2018c) introduced two metrics: cohesion in class (CIC) and cohesion between methods (CBM). These metrics help raise the standard of CBSS design. In CBS, these metrics are used to identify classes and components that are badly designed (Jianguo and Hui 2011; Li et al. 2001; Kaur and Singh 2013; Rana and Singh 2016; Singh and Chhillar, and Kajla, P. 2012; Singh et al. 2012; Sengupta and Kanjilal 2011; Sharma et al. 2009; Mittal and Bhatia 2013; Mwangi and Michael 2015; Tiwari and Kumar 2014). Two metrics, cohesion of variables within a component (COVC) and cohesion of methods in a component (COMC), were suggested by Rana and Singh (2016). These metrics capture the relationship between the variables used in the various methods (Arvind and Ratan 2020; Bhat et al. 2022; Al-Taani and Al-Sayadi 2022; Kumar and Rath 2017; Azadeh et al. 2017; Ubaid et al. 2020; Jain and Raj 2018; Sreenivasula Reddy et al. 2022; Gadekar et al. 2022; Faiz and Daniel 2022). According to the authors, the complexity of the component depends on the type and number of the variables (Chhillar and Bhasin 2011; Singha et al. 2018a; Din et al. 2023; Rostami et al. 2022; Zhang et al. 2019; Wechsler 2023; Liu 2021; D’Aniello et al. 2018; Taimoor et al. 2023; Samriya et al. 2023).
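The graph-based view of LCOM3 and the pair-based view of TCC discussed in this section can be illustrated with a minimal Python sketch. It assumes that each method is described by the set of instance variables it uses (including, for TCC, variables reached through invoked methods); the function names and the example class are hypothetical and are not taken from the cited papers.

```python
from itertools import combinations


def lcom3(method_vars):
    """Count connected components of the undirected method graph.

    Nodes are methods; an edge links two methods that share a variable.
    """
    methods = list(method_vars)
    parent = {m: m for m in methods}

    def find(m):
        # Follow parent links to the representative of m's component.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for a, b in combinations(methods, 2):
        if method_vars[a] & method_vars[b]:
            parent[find(a)] = find(b)
    return len({find(m) for m in methods})


def tcc(method_vars):
    """Fraction of method pairs that directly share at least one variable."""
    pairs = list(combinations(method_vars, 2))
    if not pairs:
        return 0.0  # assumed convention for classes with fewer than two methods
    connected = sum(1 for a, b in pairs if method_vars[a] & method_vars[b])
    return connected / len(pairs)


# Hypothetical class: method name -> instance variables it uses.
example = {
    "deposit": {"balance"},
    "withdraw": {"balance", "limit"},
    "set_limit": {"limit"},
    "owner_name": {"owner"},
}
print(lcom3(example))  # 2 components: the three account methods, and owner_name
print(tcc(example))    # 2 connected pairs out of 6
```

In this example `deposit` and `set_limit` share no variable directly, so TCC does not count the pair, while LCOM3 places them in the same component through `withdraw`.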

3 Limitations

  • Most of the existing metrics discussed above take into account direct connections, cohesion among classes, and explicit similarity among methods. One of them, LCOM3, additionally incorporates indirect links between methods. However, it does not quantify direct and indirect cohesion separately and treats both in the same manner.

  • Because ICM grows with the size of the component interface, the measured complexity of a component grows with its size. This means that even if a new, enhanced component offers substantially more functionality, it will be rated poorly because of its increased complexity. The analysis of BICM shows that it is independent of interface size. Still, it needs to be assessed on the basis of the entire system rather than a single component.

  • Using metrics, and tools that compute them, is one way to assess the cohesion and coupling of a code base. LCOM4 (Lack of Cohesion of Methods) and CBO (Coupling Between Objects) are two examples of metrics that may be used for this purpose. However, these metrics are ambiguous as to whether the computation should take into account only outgoing dependencies or both incoming and outgoing dependencies.
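The ambiguity noted in the last item can be made concrete with a small Python sketch. It assumes dependencies are given as (source, target) pairs between classes; the function name, the flag, and the example edges are hypothetical and only illustrate how the two counting conventions can yield different CBO values for the same class.

```python
def coupling_between_objects(dependencies, cls, include_incoming=True):
    """Count the distinct classes coupled to `cls`.

    dependencies: iterable of (source, target) class-name pairs.
    include_incoming: if False, only outgoing dependencies are counted.
    """
    outgoing = {dst for src, dst in dependencies if src == cls and dst != cls}
    if not include_incoming:
        return len(outgoing)
    incoming = {src for src, dst in dependencies if dst == cls and src != cls}
    return len(outgoing | incoming)


# Hypothetical dependency edges between four classes.
edges = [("A", "B"), ("A", "C"), ("D", "A"), ("B", "A")]
outgoing_only = coupling_between_objects(edges, "A", include_incoming=False)
both_directions = coupling_between_objects(edges, "A")
print(outgoing_only, both_directions)  # prints: 2 3
```

The same class receives a value of 2 under one convention and 3 under the other, which is why a tool's convention needs to be stated when values are compared.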

4 Problem description

The purpose of software engineering is to produce high-quality software that is inexpensive to maintain. The quality of the software is evaluated at various phases of software development, and the design level can be assessed as well. In a component-based system, the design of a component has two views: internal and external. The internal view, that is, how the component is built, is of more concern to component developers. If the build quality of a component is poor, its cost automatically rises: more lines of code must be written, and more work must be put into updating the component to make it usable. Good design results in high component reuse and low component dependency. Metrics and their measurement are crucial for managing the software engineering process. Software metrics are quantifiable measurements used to assess various characteristics of a software development cycle or of the software system itself. Software metrics are essential for evaluating and predicting a variety of software qualities, including maintainability, testability, and complexity. Of all these, complexity affects every other software attribute, according to Gill and Balkishan (2008). Software metrics are crucial for forecasting, planning, executing, monitoring, controlling, and evaluating processes and products. Their main goals are to lower costs, improve quality, control and monitor the schedule, minimise testing effort, and support efficient use of reusable building blocks. In this research, cohesion metrics are proposed to evaluate the component's effectiveness. Our primary focus is on the component's internal characteristics, such as its methods and variables.

5 Materials and methods

Cohesion measures the strength of the relationship between the elements inside a component. All of the instructions in a cohesive component are directed towards carrying out a single, common goal. All a cohesive component needs to do is accept the data sent to it, process them, and then send the results to its superordinate component. Cohesion describes the similarity of a component's methods: it measures the degree to which the component's functions are related.

5.1 Cohesion metrics

The cohesion of a component shows how its properties relate to one another and indicates the component's strength. A cohesive component is an autonomous component that can be reused. If a component is highly cohesive, the likelihood of its reuse rises.

5.2 Cohesion of variables (CoV)

A component's cohesion of variables refers to the frequency of use of its variables. The component is cohesive if its set of declared variables is focused on carrying out a particular purpose. The term "cohesion of variables" relates to how frequently variables are used in a component relative to the overall number of variables.

CoV = \(\frac{\sum_{i=1}^{n}f({B}_{i})}{U}\), where

U = total number of variables in the component.

\(f\left({B}_{i}\right)\) = frequency of use of each attribute (variable) in the component.
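As a rough illustration of the CoV formula, the following Python sketch assumes that the sum runs over the variables of one component and that f(B_i) is the number of times the i-th variable is used. This reading, the function name, and the example counts are assumptions made for illustration only.

```python
def cohesion_of_variables(usage_counts):
    """CoV = (sum of per-variable usage frequencies) / (number of variables U)."""
    total_variables = len(usage_counts)  # U in the formula
    if total_variables == 0:
        raise ValueError("component has no variables")
    return sum(usage_counts.values()) / total_variables


# Hypothetical component: variable name -> how often it is used.
usage = {"balance": 3, "limit": 2, "owner": 1}
cov = cohesion_of_variables(usage)
print(cov)  # (3 + 2 + 1) / 3 = 2.0
```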

5.3 Cohesion of methods (CoM)

The term "cohesion of methods" describes how closely the methods of a component are related through the variables they use. This measure considers how the methods interact with one another inside a component to determine the component's strength. It determines the cohesion of methods in a component by counting the methods that use the same kind of variables and dividing by the number of methods.

CoM = \(\frac{\sum_{i=1}^{n}f({A}_{i})}{{n}^{2}+n+1}\), where

\(f({A}_{i})\) = number of methods using the same kind of variables.

\(V={n}^{2}+n+1\) = a component's overall methods count.
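A literal reading of the CoM formula can be sketched in Python as follows. The text does not define n precisely, so the sketch assumes that n is the number of terms in the sum, one per kind of variable; the function name and the example values are hypothetical.

```python
def cohesion_of_methods(same_kind_counts):
    """CoM = sum of f(A_i) divided by V = n^2 + n + 1.

    same_kind_counts: for each kind of variable, the number of methods
    that use variables of that kind (the f(A_i) terms).
    """
    n = len(same_kind_counts)  # assumed: n is the number of summed terms
    if n == 0:
        raise ValueError("no f(A_i) terms supplied")
    denominator = n ** 2 + n + 1  # V in the text
    return sum(same_kind_counts) / denominator


# Hypothetical component with three kinds of variables.
com = cohesion_of_methods([2, 3, 1])
print(com)  # 6 / 13
```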

6 Results and discussion

To validate the proposed complexity measures, an experiment is run on component-based software that is built in Java by Python. The product contains numerous Python components with varying numbers of instance variables and methods.

6.1 Cohesion of variables

The cohesion of variables is the frequency of the variables used in a component divided by the total number of variables. The sample contains ten (10) components, numbered B1 to B10. Every component has a few instance variables and methods. The frequency of the variables is shown in Table 1. Some components have the same variable frequencies, while others differ. These frequencies are used to compute the CoV value.

Table 1 CoV and the frequency of variables

6.2 Cohesion of methods (CoM)

The cohesion of methods refers to the relatedness of a component's methods and instance variables. This metric takes into account the interaction of the methods in a component. The sample contains ten (10) components, B1 to B10. Every component has instance variables and methods. Table 2 displays the number of methods that use the same kind of variables.

Table 2 Demonstrates CoM values

Two cohesion metrics, Cohesion between Methods (CBM) and Cohesion in Class (CIC), were suggested by Yadav and Tomer (Singha et al. 2018c). Cohesion in a class refers to how frequently the class's methods use its attributes (variables) in a component (Singha et al. 2018c). Cohesion between methods refers to the relatedness of the class's methods (Singha et al. 2018c).

6.3 Cohesion in class (CIC)

CIC = \(\frac{\sum_{i=1}^{N}f({c}_{i})}{TM}\), where

N = total number of class attributes.

\(f({c}_{i})\) = frequency of use of each attribute by the class methods.

TM = total number of class methods.
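The CIC definition can be sketched in Python under the assumption that f(c_i) is the number of class methods that use attribute c_i. The function name and the example class are hypothetical.

```python
def cohesion_in_class(method_attrs, attributes):
    """CIC = sum over attributes of f(c_i), divided by TM.

    method_attrs: method name -> set of attributes the method uses.
    attributes: the class attributes c_1 .. c_N.
    """
    total_methods = len(method_attrs)  # TM in the formula
    if total_methods == 0:
        raise ValueError("class has no methods")
    # f(c_i): assumed to be the number of methods that use attribute c_i.
    usage = [sum(1 for used in method_attrs.values() if attr in used)
             for attr in attributes]
    return sum(usage) / total_methods


# Hypothetical class with four methods and three attributes.
methods = {
    "deposit": {"balance"},
    "withdraw": {"balance", "limit"},
    "set_limit": {"limit"},
    "owner_name": {"owner"},
}
cic = cohesion_in_class(methods, ["balance", "limit", "owner"])
print(cic)  # (2 + 2 + 1) / 4 = 1.25
```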

6.4 Cohesion between methods (CM)

CM = \(\frac{\sum_{i=0}^{a}f\left({c}_{i}\right)\cdot{M}_{i}}{a\,m(m-1)}\), where

\(f\left({c}_{i}\right)\cdot{M}_{i}\) = total number of methods that use the same kind of attributes.

m = number of class methods.

a = number of attributes.
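A literal reading of the CM formula can be sketched in Python as follows. The sketch assumes one term per attribute, each term being the number of methods that use the same kind of attribute, and takes m to be the number of class methods; the function name and the example values are hypothetical.

```python
def cohesion_between_methods(shared_method_counts, num_methods):
    """CM = sum of the per-attribute terms divided by a * m * (m - 1)."""
    a = len(shared_method_counts)  # number of attributes (assumed: one term each)
    m = num_methods                # number of class methods
    if a == 0 or m < 2:
        raise ValueError("need at least one attribute and two methods")
    return sum(shared_method_counts) / (a * m * (m - 1))


# Hypothetical class: three attributes used by 2, 2 and 1 of its 4 methods.
cm = cohesion_between_methods([2, 2, 1], num_methods=4)
print(cm)  # 5 / 36
```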

To verify these measures, an empirical investigation based on Python components is carried out. The same Python components are used, and CM and CIC are determined for each of them. Table 3 displays the frequency of attributes and the CIC value for each component. Table 4 displays the total number of methods that use the same type of attribute and the CBM value for each component. Statistical tools are used to determine the statistical significance of the CoV, CoM, CIC, and CM results. These measures are subjected to the t-test, an inferential test used to determine whether the means of two groups differ significantly. Table 5 and Fig. 1 present the mean and standard deviation of CIC and CoV.
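The paired sample t-test used in this comparison can be sketched with the Python standard library. The sample values below are invented purely for illustration and are not the values measured in this study; the function name is hypothetical.

```python
import math
import statistics


def paired_t_statistic(sample_a, sample_b):
    """Return (t, degrees of freedom) for a paired sample t-test."""
    if len(sample_a) != len(sample_b):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(sample_a, sample_b)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two pairs")
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Illustrative values only: one metric measured two ways on four components.
metric_a = [2.0, 3.0, 4.0, 5.0]
metric_b = [1.0, 1.0, 3.0, 3.0]
t_value, dof = paired_t_statistic(metric_a, metric_b)
print(round(t_value, 4), dof)  # prints: 5.1962 3
```

The resulting t value is then compared with the critical value of the t distribution for the given degrees of freedom and confidence level.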

Table 3 Displays the CIC value and the frequency of the characteristics
Table 4 Depicts the CBM value
Fig. 1 Std. deviation and mean of CIC and CoV

Table 5 Std. Deviation and Mean of CIC and CoV

When a paired sample t-test is applied to the data (results in Table 6), the cohesion measured by CoV is found to be greater than that measured by CIC [Tomer and Yadav]. The mean values show that the cohesion values of the proposed metric are greater than those of the CIC suggested by Yadav and Tomer (Singha et al. 2018c). The t-test value is 5.0654, which is significant at the 99 percent confidence level. This indicates that CoV yields a significantly higher cohesion value (Fig. 2).

Table 6 T-Test on paired samples
Fig. 2 Comparison between CM and CoM graph

When a paired sample t-test is applied to this data (results in Table 6), CoM is found to be considerably more cohesive than CM [Tomer and Yadav] (Singha et al. 2018c). The comparison between CM and CoM (Fig. 2 and Table 7) shows that the cohesion of the proposed metric CoM (Cohesion of Methods) is higher than that of CBM (Cohesion between Methods). Table 7 and Fig. 2 present the mean and standard deviation of CoM and CM, and the paired sample t-test value is given in Table 8. The mean values show that the cohesion value of the proposed metric (CoM) is higher than that of the CM proposed by Tomer and Yadav (Singha et al. 2018c). The t-test value is 5.0654, which is significant at the 99 percent confidence level. This implies that the value of the proposed metric (CoM) is significantly higher. The analysis shows that, compared with CIC and CM [Tomer and Yadav] (Singha et al. 2018c), the proposed metrics (CoV and CoM) are significant. The measures CIC and CBM were proposed by Tomer and Yadav (Singha et al. 2018c) and are based on the attributes that are used by the various methods of the components. CoV and CoM have since been upgraded, giving the cohesion of variables within a component (COVC) and the cohesion of methods within a component (COMC) (Rana and Singh 2016). These two measures are also affected by the number of variables and methods, but the authors classify the variables into standard, moderate, and critical categories and apply weights to normalise the values. COVC represents the frequency of the various types of variables that bind, or strengthen, a component. The cohesion of methods refers to the relatedness of a component's methods and instance variables.
This measure considers how a component's methods interact with one another (Rana and Singh 2016). The mean and standard deviation of CohM and CBM (Table 9) likewise show that the cohesion of the proposed metric CohM (Cohesion of Methods) is higher than that of CBM (Cohesion between Methods), which was proposed by Yadav and Tomer (Singha et al. 2018c). The t-test value is 4.838, which is significant at the \(99\%\) confidence level (Table 10). This means that the value of CohM is significantly higher.

Table 7 Mean and standard deviation of CoM and CM
Table 8 Paired sample T-test
Table 9 Mean and Std. deviation of CohM and CBM
Table 10 Paired sample T test

6.5 Cohesion of variables in a component (COVC)

COVC = \(\sum_{i=0}^{N}\frac{FV}{TV}\)

FV = \(\sum_{i=0}^{N}\{\left[f\left({v}_{i}\right)\cdot{w}_{i}\right]+\left[f\left({v}_{mi}\right)\cdot{w}_{mi}\right]+\left[f\left({v}_{ci}\right)\cdot{w}_{ci}\right]\}\)

Here,

FV = frequency of a component's instance variables.

TV = total number of instance variables in a component.

\(f\left({v}_{i}\right)\) = frequency of standard variables.

\(f\left({v}_{mi}\right)\) = frequency of moderate variables.

\(f\left({v}_{ci}\right)\) = frequency of critical variables.

The weight factors for the standard, moderate, and critical types of variables are \({w}_{i}\), \({w}_{mi}\), and \({w}_{ci}\), respectively.
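The weighted frequency FV and the resulting COVC value can be sketched in Python. The sketch assumes a single weight per variable type and computes COVC as FV divided by TV; the weight values, the function name, and the example frequencies are hypothetical and are not those used by Rana and Singh (2016).

```python
def covc(standard_freqs, moderate_freqs, critical_freqs,
         w_standard, w_moderate, w_critical):
    """COVC sketch: weighted variable frequencies (FV) over variable count (TV)."""
    total_variables = (len(standard_freqs) + len(moderate_freqs)
                       + len(critical_freqs))  # TV
    if total_variables == 0:
        raise ValueError("component has no instance variables")
    # FV: each usage frequency is weighted by the weight of its variable type.
    fv = (w_standard * sum(standard_freqs)
          + w_moderate * sum(moderate_freqs)
          + w_critical * sum(critical_freqs))
    return fv / total_variables


# Hypothetical component: two standard, one moderate and one critical variable,
# with illustrative weights 1.0, 2.0 and 3.0.
value = covc([2, 1], [3], [1], w_standard=1.0, w_moderate=2.0, w_critical=3.0)
print(value)  # (3*1.0 + 3*2.0 + 1*3.0) / 4 = 3.0
```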

The ten-component Python project is used for the empirical analysis, which is conducted on the Python components with regard to each Java Beans component. Table 11 displays the COVC value as well as the frequency of the various types of variables.

It is inferred that a component that uses moderate variables frequently is more reusable when a new application is created. The measures of Rana and Singh (2016) are subjected to correlation analysis. This leads to the conclusion that both the type and the frequency of the variables affect the component's complexity (coupling or cohesion). The results in Table 11 show that these characteristics have an impact on the component's complexity. Although the proposed complexity measure seems rational and matches intuition, it is not the sole factor in determining how complex a component-based system is overall. One of our upcoming efforts will involve further empirical studies that apply the suggested measures to real CBSS. Data from projects already implemented in industry will make it possible to investigate the relationship between the suggested metric values and a number of CBS quality parameters. For standard-type variables, the Pearson correlation value (Table 12) is −0.654, which indicates that COVC tends to decrease as the number of standard-type variables in a component increases. For moderate-type variables, the correlation value is 0.5678, which indicates that COVC tends to increase as the number of moderate-type variables in a component increases. For critical-type variables, the Pearson correlation is 0.675, which indicates that COVC also tends to increase as the frequency of critical-type variables in a component increases.
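The Pearson correlation used in this analysis can be sketched in plain Python. The data below are invented solely to show the calculation and do not reproduce the values in Table 12; the function name is hypothetical.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient r, a unitless value between -1 and 1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long samples with at least two points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return covariance / math.sqrt(var_x * var_y)


# Illustrative data: variable frequency per component against a metric value.
frequency = [1, 2, 3, 4]
rising_metric = [2.0, 4.0, 6.0, 8.0]
falling_metric = [8.0, 6.0, 4.0, 2.0]
print(pearson_correlation(frequency, rising_metric))
print(pearson_correlation(frequency, falling_metric))
```

A positive r corresponds to a metric that tends to rise with the frequency, and a negative r to one that tends to fall.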

Table 11 Displays the COVC value as well as the frequency of various types of variables
Table 12 Demonstrates the Pearson correlation between cohesiveness measurements and the frequency of several types of variables

7 Conclusion and future work

Since moderate instance variables are used frequently within a component, there is a significant likelihood that the component can be reused when a new application is created. To explore the relationship between the cohesion measure and the frequency of the different kinds of variables (critical, moderate, and standard), the Pearson correlation approach is applied to the metrics developed by Rana and Singh (2016). For standard-type variables, the Pearson correlation value (Table 12) is −0.654, which indicates that COVC tends to decrease as the number of standard-type variables in a component increases. For moderate-type variables, the correlation value is 0.5678, and for critical-type variables it is 0.675. In both cases COVC tends to increase as the frequency of these variables in a component increases. As a result, it is advisable to use fewer standard-type variables and more moderate-type and critical-type variables. A component needs a high cohesion value and a low coupling value to be independent. In conclusion, the use of moderate instance variables within a component should be high to strengthen the component, which in turn makes it reusable for creating a new application. Projects completed using component-based systems are frequently delivered on time and within budget (Rajmohan and Ramasubramanian 2023; Pakrooh and Bohlooli 2021; Pawar et al. 2019; Li and Mao 2017; Upadhya 2023; Annepu and Rajesh 2020; Edla et al. 2020; Alimi et al. 2019; Miura and Suzuki 2003; Alam 2022). Metrics are created to assess the complexity of such projects. An experimental assessment of the existing and proposed metrics has been carried out on Python components.
The proposed measures have been analysed and compared extensively with various existing cohesion metrics. Using a statistical tool, the proposed metrics CoM and CoV are compared with CM and CIC. A t-test on the data reveals that the proposed measures, CoM and CoV, have higher levels of cohesion than CM and CIC. The Pearson correlation approach is used to show the relationship between the cohesion measure and the frequency of the various types of variables and of the methods that use them. COVC and COMC are upgraded versions of CoV and CoM. The analysis of COVC and COMC shows that using moderate variables makes the component's strength, cohesion, and likelihood of being reused for creating new applications high. According to the findings of the case study, the complexity of a component depends on the type and frequency of its variables, and the results demonstrate that these characteristics affect the component's complexity. The proposed cohesion-based complexity measure seems rational and fits intuition. However, it is not the only factor determining how complex a component-based system is. The primary drawbacks of the suggested technique relate to component trustworthiness: components are black-box program units, and users may not have access to a component's source code. It is therefore unwise to trust the components. Another restriction is that trade-offs between ideal requirements and available components are a constant in the system definition and design process. The use of moderate instance variables within a component, and of methods that use moderate instance variables, must be on the higher side to obtain the best results for strengthening the component, which in turn supports its reusability for developing a new application. Soft computing techniques and MATLAB may be used in future work to optimise the output of the given measures (Ezugwu et al. 2022; Agushaka et al. 2022, 2023; Hu et al. 2023; Zare et al. 2023; Abualigah et al. 2023).