1 Introduction

One of the best-known techniques for the stability calculation and design of frames is the combination of elementary mechanisms, which is based on analytical methods [1,2,3]. Like other methods for solving engineering problems, it has limitations that prevent its use in every practical case. The presence of rigid joints is one of the main problems in the plastic analysis of frames, and it has been solved using linear programming by Charnes and Greenberg [4]. Moreover, the plastic analysis and design of frames have some limitations [5, 6]. Recently, some researchers have analyzed the behavior of frames under different conditions [7,8,9,10,11].

Nguyen et al. presented nonlinear inelastic response history analysis of steel frame structures using plastic zone method [12]. Nguyen and Kim have performed nonlinear elastic dynamic analysis of space steel frames with semi-rigid connections [13, 14]. Also, distributed plasticity approach has been used to perform a time-history analysis of steel frames [15, 16]. In another study, the second-order spread of plasticity approach has been applied for nonlinear analysis of the space semi-rigid steel frames under time-history conditions [17]. Recently, the plastic zone method has been utilized to perform such an analysis [18]. Furthermore, the effects of various base restraints have been investigated on the nonlinear inelastic static and seismic responses of steel frames [19]. Greco et al. used a genetic algorithm to predict the seismic collapse of frame structures [20]. Moreover, a robust predictive model was presented by Aminian et al. based on the shear of steel frame structures using hybrid genetic programming [21]. Also, the neural network approach has been used to analyze the frames of conventional buildings [22]. Toward the developments in this field, a new algorithm has been presented for the automatic evaluation of plastic collapse conditions for planar frames with vertical irregularities [23]. Besides, an improved ant colony optimization algorithm was introduced for the analysis of frame structures [24].

Among the studies conducted in this regard, there are two main methods for the design and analysis of frames. The first is the finite element method (FEM), which computes the stiffness matrix of each individual element and assembles these matrices into the global stiffness matrix. The resulting system of equations is then solved to obtain the system response. The FEM always requires the loading history so that the structure can be analyzed incrementally up to failure, which is a time-consuming process [5, 6].

The second comprises algebraic methods, which belong to the class of direct methods. In these techniques, direct computation of the stiffness matrix is not essential, and the system is assumed to be at the onset of failure. The elementary mechanisms are first calculated by applying Gaussian elimination to a specific matrix. Next, they are combined to obtain the final collapse mechanism, whose load factor is lower than that of every other possible combination of elementary mechanisms. This process yields the failure mechanism of the structure. In this method, the complete loading history need not be analyzed; only the final collapse mechanism and the associated collapse load factor (CLF) are evaluated.

However, this method has a major difficulty that prevents its use in a wide range of analyses. For example, in a two-dimensional frame, increasing the number of bays and stories increases the number of mechanisms that must be considered in the combination process and makes the solution more complex [25]. Hence, finding a combination of elementary mechanisms that leads to a highly accurate response (i.e., one that minimizes the CLF) is very costly and time-consuming. In view of these difficulties, many attempts have been made in recent years to identify and develop faster techniques for determining the CLF with high accuracy. Artificial intelligence (AI) systems can be used as an alternative approach in these cases. AI systems such as artificial neural networks (ANNs) have been used in many optimization and prediction problems. The most important advantage of these methods is their ability to solve complex problems and to converge regardless of the application domain, mainly because these networks adapt easily to different problems. Hence, they are applied to optimize engineering problems in various industries. Cao et al. [26], Aydin et al. [27], Yahya et al. [28], Jiang et al. [29], and Jahanshahi et al. [30] have used ANNs for the analysis of accrued failures and damage in engineering structures and frames (2D and 3D).

The main aim of the present research is to develop a new approach based on ANNs to investigate the influence of the effective parameters of planar frames on the CLF. To this end, two planar frames were considered and analyzed as examples for the modeling process. A group of 30 samples was gathered for each of these two examples in order to build the ANN structure. Next, the developed networks were fine-tuned and their parameters were adjusted carefully.

2 A brief description of the formation of elementary mechanisms

Firstly, independent mechanisms were calculated for an assembly of pin-jointed rigid bars (Fig. 1). Here, the elongation of every bar is calculated via displacements of its two ends as follows:

$$e \, = \, \left( {d_{xj} - d_{xi} } \right)\cos \alpha \, + \, (d_{yj} - d_{yi} )\sin \alpha$$

where e is the elongation of the bar, dxi and dyi are the displacements of end i along the X and Y axes, respectively, dxj and dyj are the corresponding displacements of end j, and α is the inclination of the bar to the X axis. Equation (1) can be rewritten as a matrix equation that involves all the members.
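As a minimal sketch of Eq. (1), the elongation of one bar can be evaluated directly from its end displacements. The function name `bar_elongation` and the sample numbers are illustrative assumptions, not part of the original study; the angle is taken in radians.

```python
import math


def bar_elongation(end_i, end_j, alpha):
    """Elongation e of a pin-jointed bar, following Eq. (1).

    end_i, end_j: (dx, dy) displacement pairs of ends i and j.
    alpha: inclination of the bar to the X axis, in radians.
    """
    (dxi, dyi), (dxj, dyj) = end_i, end_j
    return (dxj - dxi) * math.cos(alpha) + (dyj - dyi) * math.sin(alpha)


# Hypothetical horizontal bar (alpha = 0): only the X displacements
# contribute, so the transverse movement of end j adds no elongation.
e_horizontal = bar_elongation((0.0, 0.0), (0.002, 0.5), 0.0)

# A rigid-body translation (both ends move identically) gives e = 0.
e_translation = bar_elongation((1.0, 1.0), (1.0, 1.0), math.pi / 3)
```

Writing one such row per bar, with the cosine and sine terms as coefficients of the joint displacements, gives the coefficient matrix of the matrix form.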

Fig. 1 End displacements of a pin-jointed rigid bar

$${\mathbf{e}} \, = \, {\mathbf{Cd}}$$

where e represents the elongation vector, C is the coefficient matrix, and d is the vector of joint displacements. The elementary mechanisms are based on the principle that, in an ideal mechanism, the bars do not elongate (Cd = 0). The assembly is not stable with all joints pinned (the structure is not considered as a truss). Therefore, the difference between the number of columns and the number of rows of the matrix C is equal to the number of independent mechanisms. Performing Gaussian elimination on Cd = 0 gives:

$$\left[ {{\mathbf{I}} \vdots {\mathbf{C}}_{d} } \right]\left\{ {\begin{array}{*{20}c} {{\mathbf{d}}^{i} } \\ {{\mathbf{d}}^{d} } \\ \end{array} } \right\} = 0$$

Rearranging Eq. (3) gives di in terms of dd:

$${\mathbf{d}}^{i} = -\,{\mathbf{C}}_{d} {\mathbf{d}}^{d}$$

To simplify the computation, the displacement vectors can be constructed using the superposition principle (setting one of the dependent displacements to unity and the others to zero). According to the Deeks method [31], the independent mechanisms can be refined by removing excess hinges to obtain a set of potential collapse mechanisms. This is achieved by checking whether each independent mechanism contains the complete set of active hinges of another independent mechanism. The procedure is repeated until no further changes occur [30].
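The elimination step of Eqs. (3) and (4) can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' code: the helper names `rref` and `independent_mechanisms`, the use of exact `Fraction` arithmetic, and the small portal example (two fixed bases, joint displacements ordered as d2x, d2y, d3x, d3y) are all hypothetical. The hinge-refinement step of the Deeks method is not included.

```python
from fractions import Fraction


def rref(matrix):
    """Reduced row echelon form by Gauss-Jordan elimination.

    Returns the reduced rows and the list of pivot column indices.
    """
    a = [[Fraction(v) for v in row] for row in matrix]
    rows, cols = len(a), len(a[0])
    pivots, r = [], 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if a[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot here: this column is a free displacement
        a[r], a[pivot] = a[pivot], a[r]
        pv = a[r][c]
        a[r] = [v / pv for v in a[r]]
        for i in range(rows):
            if i != r and a[i][c] != 0:
                factor = a[i][c]
                a[i] = [vi - factor * vr for vi, vr in zip(a[i], a[r])]
        pivots.append(c)
        r += 1
        if r == rows:
            break
    return a, pivots


def independent_mechanisms(C):
    """One displacement vector d with C d = 0 per free column of C."""
    reduced, pivots = rref(C)
    cols = len(C[0])
    free = [c for c in range(cols) if c not in pivots]
    mechanisms = []
    for f in free:
        d = [Fraction(0)] * cols
        d[f] = Fraction(1)  # superposition: one free displacement set to unity
        for row, p in zip(reduced, pivots):
            d[p] = -row[f]  # pivot displacements follow from the reduced rows
        mechanisms.append(d)
    return mechanisms


# Hypothetical portal: column 1-2 (vertical), beam 2-3 (horizontal),
# column 4-3 (vertical); joints 1 and 4 are fixed.
C = [
    [0, 1, 0, 0],   # column 1-2: e = d2y
    [-1, 0, 1, 0],  # beam 2-3:   e = d3x - d2x
    [0, 0, 0, 1],   # column 4-3: e = d3y
]
mechanisms = independent_mechanisms(C)
```

For this example C has four columns and three rows, so a single independent mechanism is expected; the vector returned is the sway mode in which joints 2 and 3 translate horizontally by the same amount.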

3 Calculation of the CLF

In the current study, the CLF was calculated by utilizing the virtual work theorem in which the ratio of internal to external virtual works represents the value of collapse load factor (λ) as follows:

$$\lambda = \frac{{M^{T} \left| r \right|}}{{P^{T} d}}$$

where P and d represent the joint forces and the corresponding joint displacements in the same direction, respectively, and M and r denote the plastic moments and the rotations at the hinges, respectively. Moreover, earlier work notes that: “As the joint mechanisms are neglected throughout the formation of independent mechanisms, it is necessary to search out the locations of hinges in the members. These locations are determined to minimize the internal virtual work. If a joint is restrained against rotation, hinges are formed in all the members connected to this joint. However, if the joint is not restrained against rotation, hinges are formed in (n − 1) members among n members connected to that joint” [25].
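The ratio in Eq. (5) can be sketched in a few lines. The function name and the numbers below are hypothetical and serve only to illustrate the arithmetic: a sway mechanism with four hinges of equal plastic moment 100, each rotating by a unit angle, and a horizontal load of 50 acting at a height of 4, so that the loaded joint moves by 4 per unit rotation.

```python
def collapse_load_factor(moments, rotations, loads, displacements):
    """Collapse load factor from Eq. (5): internal over external virtual work."""
    internal = sum(m * abs(r) for m, r in zip(moments, rotations))
    external = sum(p * d for p, d in zip(loads, displacements))
    if external <= 0:
        raise ValueError("the mechanism does no positive external work")
    return internal / external


# Hypothetical sway mechanism: internal work 4 * 100 * 1 = 400,
# external work 50 * 4 = 200, hence a load factor of 2.0.
lam = collapse_load_factor(
    moments=[100.0, 100.0, 100.0, 100.0],
    rotations=[1.0, -1.0, 1.0, -1.0],
    loads=[50.0],
    displacements=[4.0],
)
```

The absolute value of the rotations reflects that a plastic hinge dissipates energy regardless of its direction of rotation. Evaluating this ratio for every candidate mechanism and taking the smallest value corresponds to the minimization described in the introduction.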

4 Artificial neural networks

The ANN is one of the best-known and most efficient soft computing tools and has been used to model many engineering problems. The method is inspired by the human brain and biological neural systems [32,33,34]. ANNs, as highly interconnected arrays of computational neurons, have been widely used and have proven to be flexible interpolation functions. These networks can adapt to fit complex information and are capable of prediction and optimization [35, 36]. The principles of ANN modeling concerning the performance and application of biological and artificial neurons have been studied in various works [37,38,39]. The main advantage of an ANN over conventional numerical analysis procedures, provided that the predicted results fall within acceptable tolerances, is that results can be produced quickly, requiring orders of magnitude less computational effort than the common procedures [40]. A single neuron computes the sum of its inputs, each multiplied by a coefficient called the weight, adds a bias term, and passes the result through a transfer function to produce the output. Generally, linear, tangent sigmoid (Tansig), and logarithmic sigmoid (Logsig) functions are used as the transfer functions. These transfer functions are mathematically represented as follows:

$$Linear:\chi \left( x \right) = x$$
$$Tansig:\phi \left( x \right) = \frac{2}{{1 + e^{ - 2x} }} - 1$$
$$Logsig:\psi \left( x \right) = \frac{1}{{1 + e^{ - x} }}$$
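The three transfer functions and the single-neuron computation described above can be sketched as follows. The function names are illustrative, and the linear function is assumed here to be the identity, which is the usual convention but is not stated explicitly in the text.

```python
import math


def linear(x):
    """Linear transfer function (assumed to be the identity)."""
    return x


def tansig(x):
    """Tangent sigmoid; algebraically identical to tanh(x)."""
    return 2.0 / (1.0 + math.exp(-2.0 * x)) - 1.0


def logsig(x):
    """Logarithmic sigmoid, mapping any real x into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron(inputs, weights, bias, transfer):
    """Weighted sum of the inputs plus bias, passed through a transfer function."""
    u = sum(w * p for w, p in zip(weights, inputs)) + bias
    return transfer(u)


# Hypothetical neuron with two inputs and a tansig transfer function.
a = neuron([1.0, 2.0], [0.5, -0.25], 0.1, tansig)
```

Tansig is bounded in (-1, 1) and Logsig in (0, 1), which is why such functions are typically used in hidden layers, while a linear output leaves the range of the prediction unrestricted.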

Structurally, each ANN consists of an input layer, one or more hidden layers, and an output layer [41]. The structure of an ANN model is determined by the number of its layers, the number of nodes in each layer, and the nature of the transfer function [42]. Figure 2 presents a neural network with input p and output a, weight matrices w, bias vectors b, linear combiner u, and transfer function f.

Fig. 2 One-layer neural network fed with r inputs and producing s outputs

4.1 Implementation of ANN

To simulate a subject with an ANN, two main steps, network training and network testing, are required. The main difference between the two steps is the data used: the data used for testing are not used during training. The training step calculates the values of the weights and biases by adapting to, learning from, and evaluating the training patterns. Sets of known input and output data are used to train the network. There is no generally accepted rule for obtaining the optimal ANN structure with the smallest error; selecting the optimum structure by trial and error is therefore one of the difficult steps in ANN modeling [43]. Generally, this is done by training networks with different structures and comparing them until the errors fall within acceptable ranges. In this study, well-known algorithms such as the feed-forward error back-propagation (BP) algorithm, which uses a gradient descent technique to minimize the error for a particular training pattern, are employed to train the networks. The accuracy of the developed model can increase with the number of data sets [44]. In the present study, the CLF of the planar frames was modeled using the ANN technique. The length of the bay, the height of the story, the loads, and the plastic moments are regarded as the inputs, and the CLF is considered as the output of the networks.

4.2 Performance evaluation of ANN

The efficiency of the ANN models developed in this research was assessed with several statistical criteria by comparing the experimental results with those predicted by the ANN. Four criteria were used for this purpose, namely the coefficient of correlation (R2), the root mean square error (RMSE), the mean relative error (MRE), and the mean absolute error (MAE) (Eqs. 7–10):

$$R^{2} = \frac{{\sum\nolimits_{i = 1}^{n} {\left( {f_{EXP,i} - F_{EXP} } \right)} \left( {f_{ANN,i} - F_{ANN} } \right)}}{{\sqrt {\sum\nolimits_{i = 1}^{n} {\left( {f_{EXP,i} - F_{EXP} } \right)^{2} } \sum\nolimits_{i = 1}^{n} {\left( {f_{ANN,i} - F_{ANN} } \right)^{2} } } }}$$
$$RMSE = \sqrt {\frac{{\sum\nolimits_{i = 1}^{n} {\left( {f_{EXP,i} - f_{ANN,i} } \right)^{2} } }}{n}}$$
$$MRE = \frac{1}{n}\sum\limits_{i = 1}^{n} {\frac{{\left| {f_{EXP,i} - f_{ANN,i} } \right|}}{{f_{EXP,i} }}} \times 100$$
$$MAE = \frac{1}{n}\sum\limits_{i = 1}^{n} {\left| {f_{EXP,i} - f_{ANN,i} } \right|}$$

where n is the number of samples used in the modeling step, fEXP,i is the experimental value, and fANN,i is the value predicted by the network. The mean values FEXP and FANN are calculated as follows:

$$F_{EXP } = \frac{1}{n} \mathop \sum \limits_{i = 1}^{n} f_{EXP,i}$$
$$F_{ANN } = \frac{1}{n} \mathop \sum \limits_{i = 1}^{n} f_{ANN,i}$$
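The four criteria can be sketched as plain functions. The names and the sample values are illustrative assumptions. The correlation measure is implemented in the standard Pearson form, with the product of the two sums of squares under the square root; this is an assumption about the intent of Eq. (7). The MRE is returned in percent and, following Eq. (9), divides by the experimental value, so it assumes nonzero (positive) targets.

```python
import math


def mean(values):
    return sum(values) / len(values)


def correlation(f_exp, f_ann):
    """Pearson-type correlation between experimental and predicted values."""
    m_exp, m_ann = mean(f_exp), mean(f_ann)
    num = sum((e - m_exp) * (a - m_ann) for e, a in zip(f_exp, f_ann))
    den = math.sqrt(
        sum((e - m_exp) ** 2 for e in f_exp)
        * sum((a - m_ann) ** 2 for a in f_ann)
    )
    return num / den


def rmse(f_exp, f_ann):
    """Root mean square error."""
    return math.sqrt(mean([(e - a) ** 2 for e, a in zip(f_exp, f_ann)]))


def mre(f_exp, f_ann):
    """Mean relative error in percent; assumes nonzero experimental values."""
    return 100.0 * mean([abs(e - a) / e for e, a in zip(f_exp, f_ann)])


def mae(f_exp, f_ann):
    """Mean absolute error."""
    return mean([abs(e - a) for e, a in zip(f_exp, f_ann)])


# Hypothetical CLF targets and network predictions.
f_exp = [1.0, 2.0, 3.0, 4.0]
f_ann = [1.1, 1.9, 3.2, 3.8]
scores = {
    "R": correlation(f_exp, f_ann),
    "RMSE": rmse(f_exp, f_ann),
    "MRE": mre(f_exp, f_ann),
    "MAE": mae(f_exp, f_ann),
}
```

RMSE and MAE carry the units of the output, whereas the correlation measure and the MRE are dimensionless, which makes the latter two easier to compare across the two examples.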

4.3 Generating model function

After accomplishing successful training and achieving the optimum network structure, the values of weights and biases were obtained. The model function (in a four-layer network) can be determined as follows:

$$a^{1} = \, f^{1} \left( {w^{1} i + b^{1} } \right)$$
$$a^{2} = \, f^{2} \left( {w^{2} a^{1} + b^{2} } \right)$$
$$a^{3} = \, f^{3} \left( {w^{3} a^{2} + b^{3} } \right)$$
$$a^{4} = \, f^{4} \left( {w^{4} a^{3} + b^{4} } \right)$$
$$M\left( {m\left( 1 \right)} \right) = \, a^{4} = \, f^{4} \left( {w^{4} f^{3} \left( {w^{3} f^{2} \left( {w^{2} f^{1} \left( {w^{1} i + b^{1} } \right) + \, b^{2} } \right) \, + b^{3} } \right) + \, b^{4} } \right)$$

where i is the vector of input parameters fed to the network, a1, a2, and a3 are the outputs of the first, second, and third layers, respectively, and a4 is the output of the fourth layer, which equals the function M(m(1)). The function M collects the values of all input parameters that are fed to the network. The desired CLF output is denoted by m(1).
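The nested model function can be sketched as a loop in which the output of each layer becomes the input of the next. The weights, biases, and layer sizes below are hypothetical placeholders chosen so that the result is easy to check by hand; they are not the trained values of the study.

```python
import math


def linear(x):
    return x


def tansig(x):
    return math.tanh(x)


def layer_output(weights, biases, inputs, transfer):
    """a = f(w * inputs + b) for one layer; weights holds one row per neuron."""
    return [
        transfer(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def model_function(layers, inputs):
    """Compose the layers: each layer's output feeds the next layer."""
    a = inputs
    for weights, biases, transfer in layers:
        a = layer_output(weights, biases, a, transfer)
    return a


# Hypothetical two-layer network with linear transfer functions:
# the first layer passes the inputs through unchanged, the second
# forms 2*x1 + 3*x2 + 1.
layers = [
    ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0], linear),
    ([[2.0, 3.0]], [1.0], linear),
]
output = model_function(layers, [1.0, 2.0])
```

Replacing `linear` with `tansig` in the hidden layers and extending the list to four entries gives the same nesting as the four-layer expression above.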

4.4 The used methodology of ANN

The ANN methodology is based on error convergence, assessed with the well-known criteria R2, RMSE, MRE, and MAE. When R2 is close to 1 and both RMSE and MAE are close to zero, the developed network can be regarded as sufficiently accurate. Based on the results reported by Elangovan et al. [45] and Maleki et al. [37, 46, 47], R2 values greater than 0.99 are acceptable. The methodology used in the current study is shown in Fig. 3.

Fig. 3 Methodology used in this study, consisting of network training, investigation of the ANN results, and evaluation of the results

5 Numerical results

In this section, two numerical examples are presented (Fig. 4) to analyze the obtained results. The first example is a one-bay, one-story frame and the second is a two-bay, two-story frame, both under horizontal and vertical forces and with given plastic moments. The exact CLF was determined using the method of combination of elementary mechanisms. The data of each example (Tables 1 and 2) were used to develop an ANN for the modeling process. A set of 30 samples was used for training and testing the networks in each example: 25 samples (83%) were used as the training data set, and the remaining 5 samples (17%), which were not used during training, were used for testing. The conceptual structure of the four-layer ANN used for both examples, with the feed-forward BP algorithm and fully interconnected neurons, is illustrated in Fig. 5.
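The 25/5 partition of the 30 samples can be sketched as follows. How the samples were actually assigned to the two sets is not stated, so the random shuffle, the fixed seed, and the function name `split_samples` are assumptions made only for illustration.

```python
import random


def split_samples(samples, n_train, seed=0):
    """Shuffle a copy of the samples and split it into training and testing sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # local generator: reproducible split
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical sample indices 0..29 standing in for the rows of Table 1 or 2.
train, test = split_samples(range(30), n_train=25)
```

Keeping the testing samples entirely out of training is what allows the testing errors to indicate how well the network generalizes to unseen frames.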

Fig. 4 Geometry, plastic moments, and loadings of the example frames: a one-bay, one-story frame for example 1 and b two-bay, two-story frame for example 2

Table 1 Values of effective parameters for 30 sample frames of example 1
Table 2 Values of effective parameters for 30 sample frames of example 2
Fig. 5 Conceptual architecture of the ANN according to the considered input and output parameters of the network, using the feed-forward back-propagation algorithm, for a example 1 and b example 2

Various networks were trained to find the optimal structure for each example and to generate the corresponding model functions. Information on the 5 networks trained by trial and error for modeling the CLF of examples 1 and 2 is presented in Tables 3 and 4, respectively. After investigating the trained networks, the ANN models with structures of 6 × 12 × 14 × 1 and 8 × 14 × 14 × 1, which have the highest values of R2 and the lowest values of RMSE, MRE, and MAE, were selected as the optimal structures for generating the model functions. Figures 6 and 7 compare the predicted and experimental values for the training and testing samples of examples 1 and 2, respectively. Figure 8 shows the relative error (RE) values of the CLF for the training and testing samples of examples 1 and 2.

Table 3 Information on the 5 networks trained for modeling the CLF of example 1
Table 4 Information on the 5 networks trained for modeling the CLF of example 2
Fig. 6 Comparison of the predicted (ANN response) and experimental values for the a training and b testing samples of example 1

Fig. 7 Comparison of the predicted (ANN response) and experimental values for the a training and b testing samples of example 2

Fig. 8 Relative errors obtained for the training and testing data sets of a example 1 and b example 2

Table 5 presents the data used for the evaluation of the ANN via mentioned statistical criteria for both training and testing data sets.

Table 5 Obtained values of R2, RMSE, MRE, and MAE for trained and tested network

According to the results of the network training and testing processes, the values of R2 are more than 0.999 and the values of RMSE, MRE, and MAE are close to 0, which is acceptable for predicting the output parameter (i.e., the CLF) in both examples 1 and 2. Therefore, it is concluded that the networks were trained efficiently and adjusted carefully. In the modeling of each example, some abrupt jumps are observed in the errors of individual samples. This is attributed to the inability of the neural network trained by the error back-propagation algorithm to converge when simulating data over a wider range. In addition, the values of the statistical criteria increased only negligibly in the testing process for both examples, and the error values in both models were less than 1%, which is acceptable.

After evaluating the developed networks, the model functions of examples 1 and 2 were generated separately. Afterward, for a parametric analysis of the plastic behavior of the considered planar frames, we determined the effects of the effective parameters, including the length of the bay, the height of the story, the loads, and the plastic moments, on the CLF. The generated functions of examples 1 and 2 are described by Eqs. 14a and 14b, respectively:

$$\lambda_{{{\text{ex}}.1}} = f \, \left( {h, \, l, \, M_{1} , \, M_{2} , \, F_{x} , \, F_{y} } \right)$$
$$\lambda_{{{\text{ex}}.2}} = f \, \left( {h, \, l, \, M_{1} , \, M_{2} , \, M_{3} , \, M_{4} , \, F_{x} , \, F_{y} } \right)$$

Figures 9 and 10 present the effects of the input parameters on the CLF obtained via the generated model functions of the selected networks for examples 1 and 2, respectively. Figure 9a–e presents the effects on the CLF of the length of the bay, the height of the story, the horizontal and vertical loads, and the plastic moments, respectively.

Fig. 9 Parametric analysis of the influence of the input parameters, including the length of bay l, the height of story h, and the plastic moments M, on the CLF obtained via the generated model function of the selected network for example 1; effects of a length of bays, b height of stories, c horizontal force, d vertical force, and e plastic moments and the other effective parameters all together on the CLF

Fig. 10 Parametric analysis of the influence of the input parameters, including the length of bay l, the height of story h, and the plastic moments M, on the CLF obtained via the generated model function of the selected network for example 2; effects of a length of bays, height of stories, and applied forces, and b plastic moments and the other effective parameters all together on the CLF

According to the results for example 1, to obtain a greater CLF, l has to decrease when h increases and, conversely, h has to decrease when l increases. Also, if the applied loads (both Fx and Fy) increase, l has to be reduced, whereas if Fy decreases, h has to rise to obtain a greater CLF. On the other hand, if Fx increases, h must decrease. Regarding the plastic moments, to obtain a greater CLF when M1 increases, the parameters l, h, and Fy have to decrease, whereas M2 and Fx have to increase simultaneously. Moreover, regarding the applied loads, a decrease in Fy has to be accompanied by an increase in Fx.

In addition, a comparison of the influences of some of the mentioned parameters in example 1 for several specific values is presented in “Appendix 1”. As can be seen, the plastic moment M1 has an interesting effect: to obtain a higher CLF with greater l and h, a turning point appears. Before that point, M1 has to decrease, and after it, M1 has to rise. The same feature occurs for the plastic moment M2 of example 2.

For the parametric analysis of example 2, the effects of some of the parameters are investigated (Fig. 10). For instance, it can be observed that when l decreases, the parameters h, Fx, and Fy must be reduced as well. Overall, to obtain a greater CLF, the length of the bays, the height of the stories, and the applied loads have to be as low as possible; in other words, the CLF is inversely related to l, h, Fx, and Fy. Figure 10b illustrates the effects of all applied moments on the CLF, each of which has a different specific influence. For example, if M4 rises, the other moments M1, M2, and M3 have to become lower to increase the CLF. As in example 1, a comparison of the influences of some of the mentioned parameters in example 2 for several specific values is presented in “Appendix 2”.

After the parametric analysis, the sensitivity and relative importance of the input variables of each example were assessed using the neural network weight matrix and the Garson equation, which is based on partitioning the connection weights [30]:

$$I_{j} = \frac{{\sum\nolimits_{m = 1}^{{N_{h} }} {\left( {\left( {\frac{{\left| {W_{jm}^{ih} } \right|}}{{\sum\nolimits_{k = 1}^{{N_{i} }} {\left| {W_{km}^{ih} } \right|} }}} \right) \times \left| {W_{mn}^{ho} } \right|} \right)} }}{{\sum\nolimits_{k = 1}^{{N_{i} }} {\left\{ {\sum\nolimits_{m = 1}^{{N_{h} }} {\left( {\frac{{\left| {W_{km}^{ih} } \right|}}{{\sum\nolimits_{k = 1}^{{N_{i} }} {\left| {W_{km}^{ih} } \right|} }}} \right) \times \left| {W_{mn}^{ho} } \right|} } \right\}} }}$$

where Ij is the importance of jth input variable related to the output variable, Ni and Nh are the numbers of inputs and hidden neurons, respectively, W is the connection weight, and the superscripts i, h, and o refer to input, hidden and output neurons, respectively.
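The Garson equation can be sketched for a network with a single hidden layer and one output, which is the form in which Eq. (15) is written. The function name, the weight layout (`w_ih[j][m]` from input j to hidden neuron m, `w_ho[m]` from hidden neuron m to the output), and the numbers are hypothetical; how the equation was applied to the networks of this study, which have more than one hidden layer, is not detailed in the text.

```python
def garson_importance(w_ih, w_ho):
    """Relative importance of each input by partitioning connection weights.

    w_ih[j][m]: weight from input j to hidden neuron m.
    w_ho[m]:    weight from hidden neuron m to the single output.
    """
    n_in, n_hidden = len(w_ih), len(w_ho)
    contribution = [0.0] * n_in
    for m in range(n_hidden):
        column_sum = sum(abs(w_ih[k][m]) for k in range(n_in))
        if column_sum == 0.0:
            continue  # a hidden neuron with no input weights contributes nothing
        for j in range(n_in):
            contribution[j] += abs(w_ih[j][m]) / column_sum * abs(w_ho[m])
    total = sum(contribution)
    return [c / total for c in contribution]


# Hypothetical weights: 2 inputs, 2 hidden neurons, 1 output.
w_ih = [[1.0, 1.0], [3.0, 1.0]]
w_ho = [2.0, 2.0]
importance = garson_importance(w_ih, w_ho)
```

Because only absolute values of the weights enter the expression, the result ranks the inputs by strength of influence and carries no information on whether an input raises or lowers the CLF.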

The relative importance of the inputs is shown in Fig. 11. All variables were found to affect the CLF strongly in each of the considered frames. The ranking and importance value of each input are given in Table 6.

Fig. 11 Relative importance of each parameter for the CLF in a example 1 and b example 2

Table 6 Ranking and importance value of each input in both considered examples

This study was conducted to investigate the parametric plastic analysis of two simple planar frames via ANN. The obtained results showed that the ANN can be used as an alternative and efficient tool for this analysis. In addition, the results can be useful in the design of frames. In future studies, this method can be applied to more complex structures with a larger number of bays, stories, loads, and plastic moments.

6 Conclusion

In the present study, the influences of the effective parameters on the collapse of planar frames were investigated and modeled via ANN. To develop the networks, the length of the bay, the height of the story, the loads, and the plastic moments were considered as inputs, and the collapse load factor (CLF) was considered as the output parameter. Afterward, the models were trained and tested. The statistical criteria used in the training and testing processes showed that the values of R2 are more than 0.99 and the values of RMSE, MRE, and MAE are very close to zero for each considered example, all of which are acceptable. Then, the related model functions were generated and the effects of each input parameter were studied. The results revealed that the CLF has an inverse relationship with the height and the length: to obtain a greater CLF, the length must decrease if the height increases and, conversely, the height must decrease if the length increases. Moreover, for some of the plastic moment parameters, a turning point appears as the height and length increase. According to these results, when the networks are fine-tuned and their parameters are adjusted carefully, satisfactory results can be obtained. As a result, the ANN can be employed as an efficient tool for the plastic analysis and design of frames, even when the structures become more complex, with a higher number of bays, stories, loads, and plastic moments.