1 Introduction

Disruptions such as the COVID-19 pandemic and other crises can affect an organization’s performance. In a crisis situation, managers must make rapid, high-risk decisions based on the available information (Rauner et al. 2018). To withstand any kind of disruption, an organization needs three essential components: organizational resilience (OR), crisis management (CM), and business continuity management (BCM). These components work together to create a comprehensive concept of integrating corporate recovery management systems, and the foundation of this concept is organizational resilience (Ewertowski 2022). The pandemic crisis revealed varying levels of organizational resilience among enterprises (Ewertowski and Butlewski 2021). Companies try to cope with increasing uncertainty to survive the crisis, adapt to the new situation, and ensure stability and safety (Ma et al. 2018). To become more resilient, organizations should anticipate and respond to threats and opportunities arising from sudden or gradual changes in both the internal and external context. Effective risk management helps to achieve this (Ewertowski and Butlewski 2022). Proper utilization of relevant information about past and current disruptions, as well as employees’ perceptions of organizational resilience characteristics, is crucial for effectively managing the risks linked with organizational resilience. This approach allows solutions to be set and properly modified on the basis of the obtained information (Nehézová et al. 2022). The decision-making procedure for strengthening the organizational resilience potential can thus be refined by employing innovative technological solutions for the proactive handling of information.

Technological innovation used to outpace social change; however, Phillips maintained that this era is over (Phillips 2008), and subsequent events seemed to support this opinion. Phillips described a so-called “circle of innovations” and claimed that innovation creates not only new products and services but also new ways of using them, changing the ways we organize and interact with each other. Smart organizations generate new needs that are met by the next technical innovations (Phillips 2014). In a smart organization, increasing amounts of data need to be collected, analyzed, and interpreted in order to manage technology and innovation and to detect discontinuous changes. Data mining is the process of extracting knowledge from data using algorithms that identify hidden relationships that are usually not noticeable at first glance (Pejic et al. 2022). Data mining models based on artificial intelligence allow for greater automation of these activities (Mühlroth and Grottke 2022). Regression is also widely used for prediction and forecasting (Nalcaci et al. 2019). The numerous supervised regression models and algorithms used in artificial intelligence (AI) and machine learning (ML) call for a comparative examination of AI models, which can help managers select the most suitable model, as suggested by Elmousalami (2021). According to Valášková et al., the most important challenges in the implementation of intelligent production solutions include the implementation of AI/ML (77% of respondents) and Internet of Things (IoT) (74% of respondents) technologies.
The reasons why business organizations plan to use AI include, inter alia, innovation management (42% of respondents) and data analysis (42% of respondents) (Valášková et al. 2020). It has been observed that only a small number of methods are designed specifically to measure resilience, or more precisely, resilience potential (Tew et al. 2008). Tew et al. also point out that there is a significant gap in the assessment of resilience using quantitative methods. Another problem is which method is optimal for predicting the organizational resilience potential. From a managerial perspective, it is important to know how to predict the weakest points of the organizational resilience potential and which methods are best suited to do so. The use of machine learning to support analytical processes improves management decision making (Graczyk-Kucharska et al. 2022). This allows companies to implement effective preventive measures that reduce the likelihood of a recurrence of crises or incidents. According to Phillips and Linstone, the true objective of forecasting is to handle the future in an optimistic, safe, and cost-effective manner. Achieving this goal involves crafting the most promising visions of the future and equipping the organization to grasp the potential outcomes and to be responsive to both expected and unexpected events (Phillips and Linstone 2016).

The proposed research adopts a novel approach by addressing a cognitive gap in the existing literature, namely the scarcity of findings from a keyword search on organizational resilience and machine learning in the Scopus database: this search yielded a mere 20 relevant documents. The main research hypothesis of the study concerns the possibility of optimizing the results of estimating the organizational resilience potential by implementing the most accurate regression or ML technique to predict the importance of the attributes of organizational resilience. The second hypothesis concerns a possible difference between the importance of the surveyed variables influencing the potential of organizational resilience in the studied subscales. The dataset was obtained from the results of our survey, based on a questionnaire consisting of 48 items built primarily on the OR attributes defined in ISO 22316:2017 (2017) “Organizational Resilience—Principles And Attributes”. The main objective of the research is to propose the most accurate technique for predicting the attributes of the organizational resilience potential.

2 Literature review

First, the literature review explains different definitions and attributes of organizational resilience based on the ISO 22316:2017 (2017) standard. In the second part, additional factors influencing organizational resilience, such as situational awareness and a proactive attitude, are briefly characterized.

2.1 Organizational resilience

The concept of resilience was borrowed from physics, where it describes the property of materials that return to their original shape after deformation under pressure. This physical property, called resilience, is also used in other sciences such as psychology and organizational management (Holling 1973; Xiao and Cao 2017). The concept of organizational resilience has also been included in international standards such as ISO 22316:2017 (2017). Many authors have attempted to define this concept; some of those definitions are presented chronologically in Table 1.

Table 1 Selected definitions of organizational resilience

In the scientific literature, these definitions are blurred; therefore, for the purposes of this article, the definition of organizational resilience was adopted from ISO 22316:2017 (2017): “The organization’s ability to absorb and adapt to a changing environment”. According to this standard, organizational resilience facilitates the adaptability and stability that are vital for the survival and growth of enterprises, and it consists of the following nine attributes: a shared vision and a clear sense of purpose (C1), comprehension and influence of the context (C2), effective and empowered leadership (C3), a culture that promotes organizational resilience (C4), shared information and knowledge (C5), access to resources (C6), development and coordination of management disciplines (C7), support for continuous improvement (C8), and the capacity to anticipate and manage change (C9).

2.2 Situational awareness and proactive attitude

Two additional factors related to the concept of resilience are proactivity and situational awareness. Organizations looking to improve their resilience should proactively scan and monitor their environment, as this supports the development of situational awareness and ultimately allows for a continuous exchange and review of information (Burnard et al. 2018). A proactive organization is one that puts the greatest emphasis on long-term strategic planning. Being proactive has many benefits, both in terms of business and in dealing with potential problems. One cannot precisely predict future opportunities and threats to one’s business, but one can make rational forecasts by performing an in-depth analysis. Skillful use of opportunities is a positive feature of being proactive. Awareness of changes, and of the impact they may have on an enterprise, allows a proactive enterprise to plan an alternative strategy before the threats become real (Hill 2017). The other factor related to the concept of resilience is situation awareness. Resilience is a function of “situation awareness, management of keystone vulnerabilities and adaptive capacity” in a complex, dynamic, and connected environment (Rzegocki 2015). According to Gilson, the concept of situation awareness was first formulated during World War I by Oswald Boelke, who realized “the importance of gaining enemy’s awareness before the enemy gained similar awareness and devised methods of achieving it” (Woods 1988). Over the years, situation awareness has become a research topic in a variety of fields where people perform tasks in complex and dynamic systems, such as military operations, aviation, air traffic control, driving, and C4I systems (Salmon et al. 2006). There is no single measure that would define the level of organizational resilience; it is a latent (unobservable) variable that cannot be measured directly. Likewise, situation awareness and a proactive attitude cannot be measured directly. To describe them, we use a set of attributes that indicate the presence of such a feature.

3 Materials and methods

First, this section explains the data set structure, its reliability, and the sampling method. In the second part, the applied regression techniques are characterized. In the third part, machine learning and ensembling techniques are briefly characterized.

3.1 Data set and a sample method

In order to examine the organizational resilience potential, a questionnaire consisting of 48 questions was developed on the basis of the nine attributes presented above (according to ISO 22316:2017 (2017)) and two additional factors, i.e., a proactive attitude and situation awareness. Each attribute and additional factor was assessed with the help of 4–6 questions. The answers in the questionnaire were given on a five-point Likert scale, where 1 means “I strongly disagree” and 5 means “I strongly agree”. The Likert scale can be treated as an ordinal variable; however, assuming that there are equal distances between the answers, it can be used as an interval variable (Joshi et al. 2015). Variables on an interval scale with a length of at least five degrees can be treated as continuous variables (Weziak-Bialowolska 2011). This assumption was made in the following analyses, and quantitative methods were used.

After the validation of the questionnaire, the main study was conducted, focusing specifically on production plants of the foundry industry operating in Poland. The sample was selected on the basis of data obtained from the CEIDG (Central Register and Information on Economic Activity) Polish government database, where companies were searched by legal form (active legal entities), valid email addresses, and business sector. The authors opted for a non-probability sampling method known as purposive sampling. This approach was employed based on the researchers’ discretion, taking into consideration the specific objectives of the study and their understanding of the target group. After screening the list of companies, the authors picked out those that fulfilled the criteria of company sector and size and used them as the sampling frame. Out of a total of 233 companies in the sector, the authors narrowed the selection down to medium and large companies only, because these companies were deemed to better represent the characteristics of the target group under investigation. Consequently, the inclusion criterion was met by a final count of 47 companies. The authors used a fixed periodic interval to select respondents from the companies listed in the sampling frame: starting with a random respondent between 1 and 9, followed by every 4th respondent thereafter (8, 12, 16, 20, 24, 28, 32, 36, 40, 44). From this process, data were collected from 10 units.

The categorization of enterprise size as medium or large is based on the number of employees, as universally applied throughout Europe and outlined in the Commission Recommendation of 6 May 2003 concerning the definition of micro, small and medium-sized enterprises (Journal of Laws UE L 124 of 20.05.2003, p. 36). The size categories are as follows: medium (from 50 to 249 employees) and large (more than 249 employees). The authors took 250 as the approximate average number of employees across all 47 companies, which gave an estimated total population of 11,750 employees. Using a sample size calculator (calculator.net, 2022), the authors calculated that a sufficient sample size for the survey would be 373 respondents, based on the number of employees, with a confidence level of 95%.
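The required sample size can be reproduced with the standard finite-population formula that such calculators implement. The sketch below (in R) is an illustration only; the 5% margin of error is an assumption, as the text reports only the 95% confidence level and the resulting n = 373.

```r
# Minimal sketch of the finite-population sample size formula used by
# common online calculators; the 5% margin of error is an assumption.
z <- qnorm(0.975)   # z-score for a 95% confidence level
p <- 0.5            # most conservative response proportion
e <- 0.05           # assumed margin of error
N <- 11750          # estimated population (47 companies x ~250 employees)

n0 <- z^2 * p * (1 - p) / e^2            # infinite-population sample size
n  <- ceiling(n0 / (1 + (n0 - 1) / N))   # finite-population correction
n                                        # 373
```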

The study took place from 27 January 2021 to 8 May 2021. During this time, 374 people correctly completed the questionnaires, which means that the authors obtained a sufficient sample. The forms were sent electronically. The vast majority of respondents were employees with more than 3 years of experience (84%), meaning that the study was attended mainly by experienced people who know well the enterprise in which they work. The diversity of the respondents who completed the questionnaires is presented in Table 2.

Table 2 Features of the studied employees

Table 3 presents the questionnaire structure used in this study. To evaluate the relationship between two continuous variables, the Spearman correlation coefficient was employed, since the data set did not have a normal distribution, as confirmed by the Kolmogorov–Smirnov test at a significance level of 0.05. The item-total correlation values ranged from 0.21 to 0.68, indicating mostly moderate correlation strength, with a few weak and strong values. To assess the questionnaire’s reliability and the internal consistency of the questions, the overall Cronbach’s alpha coefficient was used; its value of 0.94 indicates a high degree of internal consistency. This result confirms the questionnaire’s reliability for data evaluation, with a Cronbach’s alpha coefficient greater than the recommended threshold of 0.70 (Akoğlu 2018).
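These reliability statistics correspond to standard R tooling; a minimal sketch follows, assuming the 48 Likert items are held in a data frame `items` (a hypothetical name).

```r
library(psych)  # alpha() for Cronbach's alpha and item-total statistics

# `items` is assumed to hold the 48 Likert-scale columns of the survey
rel <- alpha(items)
rel$total$raw_alpha    # overall Cronbach's alpha (0.94 reported above)
rel$item.stats$r.drop  # corrected item-total correlations (0.21-0.68)

# Per-item normality screening at alpha = 0.05, as in the text
# (ties in Likert data will trigger a warning from ks.test)
ks.test(items[[1]], "pnorm", mean(items[[1]]), sd(items[[1]]))
```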

Table 3 Questionnaire and evaluation of its individual indicators—Part I

3.2 Regression techniques

Two well-known regression algorithms are employed in the analyses: Multiple Linear Regression (MLR) and Multivariate Adaptive Regression Splines (MARS), the latter being a more advanced and adaptable regression technique for high-dimensional data.

3.2.1 Multivariate adaptive regression splines (MARS)

Introduced by Friedman (1991), MARS allows the user to model nonlinear structures in the data by recursively assigning different piecewise linear functions (i.e., basis functions, BFs) to individual intervals of the feature space defined by the knot locations. In MARS, the slopes of the individual piecewise BFs may change at the knots without breaks or extreme jumps; in this way, the continuity of the fully fitted function is assured. Complex nonlinear relationships between the predictors and the response are also accommodated by employing tensor products of BFs. The MARS model building process is controlled by two primary tuning parameters (Hastie et al. 2001): (i) \(max_{BFs}\), which determines the maximum number of BFs to be included in the forward step and adjusts the model’s flexibility (also increasing its complexity) to capture the nonlinear structures in the input data space; and (ii) \(max_{DEG}\), which allows the user to model interactions and statistical dependencies between different predictor variables.
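As a minimal illustration, both tuning parameters map onto arguments of the earth package used later in this paper: `nk` bounds the number of terms added in the forward pass (\(max_{BFs}\)) and `degree` sets the maximum interaction order (\(max_{DEG}\)). The data frame `dat`, response `y`, and parameter values below are placeholders.

```r
library(earth)

# Hypothetical data frame `dat` with a numeric response y and predictors
mars_fit <- earth(y ~ ., data = dat,
                  nk = 21,      # max terms in the forward step (max_BFs)
                  degree = 2)   # allow pairwise interactions (max_DEG)
summary(mars_fit)  # selected BFs, knot locations, and GCV statistics
```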

3.2.2 Multiple linear regression (MLR)

In this subsection, the basics of MLR are introduced based on Kelley and Bolin (2013). MLR is a generalization of simple linear regression. When assessing the relationship between two variables, i.e., a response and a predictor, MLR takes the impact or contribution of the other predictors into consideration. By doing so, the effect imposed by the other variables is removed, so that the relationship between the two variables of interest is isolated and measured. The contribution (or impact) of each predictor (i.e., X) on the response (i.e., Y) is captured through a linear equation with regression coefficients, which is often expressed as:

$$\begin{aligned} Y_{i} = \beta _{0} + \beta _{1} X_{1i} + \cdots + \beta _{K} X_{Ki} + \varepsilon _{i}, \end{aligned}$$
(1)

where \(Y_{i}\) is the \(i_{\text {th}}\) dependent variable \((i=1,\ldots ,N)\), \(\beta _{0}\) is the Y-intercept, \(\beta _{k}\) is the regression coefficient (i.e., slope) for the \(k_{\text {th}}\) predictor \((k=1,\ldots ,K)\), and \(\varepsilon _{i}\) denotes the residual term.

3.3 Machine learning and ensembling techniques

In machine learning, classification algorithms predict class labels from input data. The experiment conducted here is based on a Likert scale, and such problems are treated separately from multi-class classification or regression models when no ordering is assumed within the classes. In this study, Random Forests (RF), the Naive Bayes Classifier (NBC), and Support Vector Machines (SVM) are also applied to estimate the organizational resilience potential.

3.3.1 Random forests (RF)

As its name directly implies, an RF is composed of a number of individual classification and regression trees (CARTs) (Breiman et al. 1983) that function as an ensemble; the main logic is based on the idea of bagging, also known as bootstrap aggregating (Breiman 2004a). The bagging concept is composed of two parts: (i) bootstrapping (i.e., sampling), in which a number of data subsets are randomly drawn from the training data with replacement and then employed to train the trees, so that at the end of the training process an ensemble of several different models is generated; and (ii) aggregation, in which a single estimate is obtained by combining the outputs from each individual model. This two-stage process reduces the model’s variance by averaging multiple estimates measured from random samples of the population data (Breiman 2004b).

The model building process in RFs is mainly controlled by a single parameter, the number of trees in the forest, \(N_{tree}\). The other two parameters involved in model building are the number of potential directions to split at each node (\(M_{try}\)) and the number of examples at each tree leaf below which splitting is not allowed (\(L_{size}\)).
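These three parameters map directly onto the arguments of the randomForest package; the sketch below uses a placeholder data frame `dat`, response `y`, and illustrative parameter values.

```r
library(randomForest)

# Hypothetical data frame `dat` with response y; the three arguments
# correspond to the parameters described above
rf_fit <- randomForest(y ~ ., data = dat,
                       ntree    = 500,  # N_tree: trees in the forest
                       mtry     = 3,    # M_try: split candidates per node
                       nodesize = 5)    # L_size: minimum examples per leaf
print(rf_fit)  # out-of-bag error summary across the bagged trees
```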

3.3.2 Naive Bayes classifier (NBC)

Basically, Bayes’ theorem gives the probability of an event occurring given that another event has occurred. In a classification problem, the NBC additionally assumes independence among the predictor variables: the existence of a certain feature in a class is not linked (or related) to the existence of any other feature (Bhowmik 2015). The posterior probability of a data sample with attributes X (i.e., predictors) belonging to a certain class \(C_i\) can be calculated by the NBC via the following expression (Bhowmik 2015):

$$\begin{aligned} P(C_i|X)= \frac{P(X|C_i)P(C_i)}{P(X)}, \end{aligned}$$
(2)

where \(P(C_i|X)\) is the posterior probability of class \(C_i\) (i.e., the target) given predictors X, \(P(C_i)\) denotes the prior probability of class \(C_i\), \(P(X|C_i)\) is the likelihood, i.e., the probability of the predictors given class \(C_i\), and finally, P(X) stands for the prior probability of predictor X. Since P(X) is constant for all classes, only the product \(P(X|C_i)P(C_i)\) needs to be maximized. The prior probabilities of the classes are calculated as given below:

$$\begin{aligned} P(C_i)= \frac{\text {training samples of class} \ C_i}{m \ \text {(total training samples)}}. \end{aligned}$$
(3)

Using the conditional independence assumption between attributes, the likelihood can be written as:

$$\begin{aligned} P(X|C_i)= \prod _{t=1}^{n} P(X_t|C_i), \end{aligned}$$
(4)

where the \(X_t\) are the values of the attributes in the sample X. The probabilities \(P(X_t|C_i)\) can be estimated from the training dataset by calculating frequencies over each attribute column.
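For illustration, this computation is available off the shelf in the e1071 package; a minimal sketch with a placeholder data frame `dat` and factor response `y`:

```r
library(e1071)

# Hypothetical data frame `dat` with a factor response y
nb_fit <- naiveBayes(y ~ ., data = dat)
nb_fit$apriori                            # class counts behind P(C_i), Eq. (3)
head(predict(nb_fit, dat))                # MAP class labels
head(predict(nb_fit, dat, type = "raw"))  # posteriors P(C_i|X), Eq. (2)
```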

3.3.3 Support vector machines (SVMs)

The idea of support vectors originated in the earlier studies of Vapnik (1982). The general idea behind the support vector concept in classification, which was later extended to deal with regression problems (i.e., support vector regression, SVR) (Simionescu 2022), is to develop a decision function that depends on a certain set of data instances once a test sample is introduced. These data points, upon which the decision surface (i.e., the separating hyperplane) is founded, are referred to as support vectors. When deciding the optimal separating hyperplane in a traditional binary classification case, the SVM attempts to (i) find the margins with the maximum separating distance between the two classes and, at the same time, (ii) decrease the number of misclassified instances (Vapnik 2000). The optimal separating hyperplane is expressed as:

$$\begin{aligned} {{\varvec{w}}}^T {{\varvec{x}}}+b=0, \end{aligned}$$
(5)

where x is the vector of predictor variables, w determines the direction of the hyperplane, and b is the bias term. In the case of data that are not linearly separable, the input data are mapped into a higher-dimensional feature space using soft margins and kernel methods. After introducing these conditions, the decision function takes the following form:

$$\begin{aligned} f(x) = \text {sgn} \bigg ( \sum _{i=1}^{r} \alpha _{i} y_{i} K ({{\varvec{x}}},{{\varvec{x}}}_\textit{i}) + b \bigg ), \end{aligned}$$
(6)

where \({{\varvec{x}}}_\textit{i}\) represents the vector of predictors for each of the r training cases, with \(y_i\) defining the class membership, the \(\alpha _{i} (i=1, \ldots, r)\) are Lagrange multipliers, and \(K({{\varvec{x}}},{{\varvec{x}}}_\textit{i})\) specifies the kernel function that redistributes the data points, allowing the insertion of a linear hyperplane. The magnitude of the \(\alpha _{i}\) is controlled by the constant \(C > 0\), a “tuning parameter” that adjusts the trade-off between margin maximization and the rate of misclassification.
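A minimal sketch of such a soft-margin, kernelized SVM with the e1071 package; the data frame `dat` and response `y` are placeholders, and the radial kernel and cost value are illustrative choices:

```r
library(e1071)

# cost is the constant C trading margin width against misclassification
svm_fit <- svm(y ~ ., data = dat, kernel = "radial", cost = 1)
head(predict(svm_fit, dat))  # class decisions following Eq. (6)
```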

3.3.4 Ensemble methods

In ensemble learning, multiple instances of classical ML algorithms are combined to achieve a more feasible (i.e., closer to optimal) solution to a specific engineering or scientific problem. Each individual ML method involved in the ensemble is a base learner, and the results obtained from these base learners are aggregated to generate more accurate and robust predictions. There are several ensemble learning techniques for classification and regression, such as (majority) voting, bagging, boosting, and stacking. The voting method combines the predictions of different classifiers to achieve better prediction performance for a classification or regression problem: if voting is applied to a regression task, the average of the predictions is taken; if the problem is a classification task, the votes for each label are summed and the label with the majority vote is predicted. Bagging, proposed by Breiman (2004a), is an effective technique for regression and classification tasks; it increases the strength of each model, improving accuracy and decreasing variance to cope with overfitting. Boosting is another ensemble technique, which aims to minimize errors in training, decreasing both variance and bias, by combining weak learners added iteratively to generate a strong learner. This technique was proposed for classification problems, but it can also be used for regression problems (Schapire 2005). Stacking (stacked generalization) combines the predictions of multiple machine learning algorithms: all the algorithms are first trained on the training set, and their predictions are given as inputs to a meta-classifier algorithm that makes the final prediction. The meta-classifier tries to minimize the instability while maximizing the strengths of the models, and any machine learning model can be used as a meta-classifier, such as k-nearest neighbors (KNN), random forests, or SVM. A general illustration of a stacked ensemble is shown in Fig. 1.

Fig. 1 A stacked generalization model
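A minimal sketch of such a stack, assuming the caretEnsemble interface, a data frame `train` with a factor response `C_cat` (its construction from the survey scores is described in Sect. 4), and the resampling settings of the experiment reported there:

```r
library(caret)          # trainControl(), train()
library(caretEnsemble)  # caretList(), caretStack()

# Fivefold CV repeated 10 times, as in the experiment of Sect. 4;
# classProbs requires syntactically valid factor level names
ctrl <- trainControl(method = "repeatedcv", number = 5, repeats = 10,
                     savePredictions = "final", classProbs = TRUE)

# Base learners: RF, naive Bayes (needs the klaR package), radial SVM
base_models <- caretList(C_cat ~ ., data = train, trControl = ctrl,
                         methodList = c("rf", "nb", "svmRadial"))

# KNN meta-classifier stacked on the base learners' predictions
stack_knn <- caretStack(base_models, method = "knn", trControl = ctrl)
```

Any caret-compatible learner could replace KNN as the meta-classifier here; KNN is used because it is the meta-classifier adopted in Sect. 4.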

4 Results and discussions

The dataset contains 374 observations and 48 variables gathered in three subscales: the dependent variables A, B, and C, which are given in Table 4. The variables indicate the degree (expressed on a Likert scale) to which individuals believe in specific attributes and factors. The surveyed units (organizations) are measured on the scale of organizational resilience potential by the variable C. The dependent variable C is the main response and is built from 38 independent variables described by 38 questions (items). The total organizational resilience potential consists of the 9 attributes of C, where C1—shared vision and clarity of purpose (4 independent variables, C11–C14), C2—understanding and influencing context (4 independent variables, C21–C24), C3—effective and empowered leadership (4 independent variables, C31–C34), C4—a culture supportive of organizational resilience (6 independent variables, C41–C46), C5—shared information and knowledge (4 independent variables, C51–C54), C6—availability of resources (4 independent variables, C61–C64), C7—development and coordination of management disciplines (4 items, C71–C74), C8—supporting continual improvement (4 independent variables, C81–C84), and C9—ability to anticipate and manage change (4 independent variables, C91–C94). Additionally, the questionnaire provides two further subscales: variable A, the measure of situational awareness (5 independent variables, A1–A5), and variable B, the measure of proactive posture (5 independent variables, B1–B5). In addition to these variables, there is also a variable D. The subscale of variable D (5 items) is called “demographics” (D1—gender, D2—age, D3—education, D4—experience, and D5—position of respondents).

Using regression analysis, machine learning algorithms, and an ensemble technique, we investigated (i) the possibility of optimizing the results of estimating the organizational resilience potential by an appropriate machine learning method, and (ii) the possible existence of a difference between the importance of the surveyed variables influencing the potential of organizational resilience in the studied subscales. The A, B, and C variables are continuous and on an interval scale; the D variables are categorical, on both nominal and ordinal scales. When Principal Component Analysis (PCA) was considered for combining the different measures, it was judged that the loss of information could be very high and that it might eliminate variables with high importance. Therefore, A, B, C, and D are introduced by taking the means of their own items. The mean of the independent variables C11, C12, C13, and C14 is calculated and defined as a new variable, C1; the same process is applied for the variables C2–C9. Hence, C1–C9 are added to the table, and the dependent variable C is introduced by taking the mean of these variables. The dataset is divided by taking 70% for training and 30% for testing. The questionnaire dataset is checked for multicollinearity by calculating the variance inflation factor (VIF), since VIF > 10 would indicate a multicollinearity problem; the process is then continued with MLR. Using MLR, a significant relationship has been found between the dependent variable C and its independent variables C1–C9, with a p-value \(< 0.001\) for each, for both the training and testing sets.
While MLR captures the significance between the continuous dependent variable and the continuous independent variables, the method fails to explain the relationship between the continuous dependent variable C and the categorical independent variables D, as shown in Table 5.
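These two checks correspond to standard lm-based tooling; a minimal sketch with car’s vif function, assuming a training data frame holding the mean-aggregated variables constructed above:

```r
library(car)  # vif()

# Assumed: data frame `train` with attributes C1..C9 and response C
mlr_fit <- lm(C ~ C1 + C2 + C3 + C4 + C5 + C6 + C7 + C8 + C9, data = train)
summary(mlr_fit)  # per-attribute p-values (< 0.001 reported for each)
vif(mlr_fit)      # values above 10 would flag a multicollinearity problem
```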

Table 4 Questionnaire and evaluation of its individual indicators—Part II
Table 5 Results of multiple linear regression

Since linear models assume linearity, the results can be poor if there is any nonlinear relationship within the dataset. Within this perspective, MARS can be regarded as a useful method for revealing the nonlinear relationships in the dataset. The plots given below illustrate model selection, showing the generalized cross-validation (GCV) (Friedman 1991) \(R^2\) (left-hand y-axis and solid black line) against the number of terms retained in the model (x-axis), which are constructed from a fixed number of original predictors (right-hand y-axis). The vertical dashed line marks the optimal number of terms retained, beyond which the marginal increase in GCV \(R^2\) is less than 0.001. The earth package of R (Milborrow 2011) is used to evaluate the results, and the model summaries for both the training and testing sets are given below for C and its attributes C1–C9 in Fig. 2, C and A in Fig. 3, C and B in Fig. 4, and C and D in Fig. 5, respectively.
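Such model-selection plots can be produced directly from a fitted earth object; a sketch under the same data assumptions as above:

```r
library(earth)

# Assumed: training data frame with C and the aggregated attributes C1..C9
m_c <- earth(C ~ ., data = train[, c("C", paste0("C", 1:9))], degree = 2)
plot(m_c, which = 1)  # model selection: GRSq (GCV R^2, solid line) vs the
                      # number of terms retained; right axis counts predictors
```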

Fig. 2 Model summary of GCV \(R^2\), where the x-axis is the number of terms retained and the y-axis is the number of predictors used, between C and C1–C9

Fig. 3 Model summary of GCV \(R^2\), where the x-axis is the number of terms retained and the y-axis is the number of predictors used, between C and A1–A5

Fig. 4 Model summary of GCV \(R^2\), where the x-axis is the number of terms retained and the y-axis is the number of predictors used, between C and B1–B5

Fig. 5 Model summary of GCV \(R^2\), where the x-axis is the number of terms retained and the y-axis is the number of predictors used, between C and D1–D5

Additionally, RF, NBC, and SVM are applied and tested on the dataset for their predictions of the scales. To apply the chosen ensemble method, the dependent variable is redefined by rounding its value to the closest integer. After rounding the column C, instead of a 5-point Likert scale, it is recategorized as a 4-point scale: Strong Disagree, Somewhat Disagree, Somewhat Agree, and Strong Agree. KNN is chosen as the meta-classifier, and fivefold cross-validation is applied, with the process repeated 10 times. The accuracy (Acc) metrics of the machine learning algorithms and the KNN-based stacked ensemble are given in Table 6 (the bold-face numbers indicate the best performance).
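The recategorization and the held-out evaluation can be expressed compactly; a sketch, where the exact mapping of rounded scores onto the four labels is an assumption of this illustration:

```r
# Round the mean response to the closest integer and convert to a factor;
# in this dataset the rounded scores fall into four categories, relabeled
# here with syntactically valid names (the integer-to-label mapping is
# assumed, not stated in the text)
dat$C_cat <- factor(round(dat$C))
levels(dat$C_cat) <- c("strong_disagree", "somewhat_disagree",
                       "somewhat_agree", "strong_agree")

# Held-out accuracy of the stacked model from Sect. 3.3.4 (the return type
# of predict() may vary with the caretEnsemble version)
mean(predict(stack_knn, newdata = test) == test$C_cat)
```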

Table 6 Accuracy (Acc) metrics of the machine learning algorithms and the stacked ensemble

For determining the variable importance scores, the varImp function of the caret package is utilized with the RF model. In cases where there are more variables than observations, the RF model is run on each variable individually. It should be noted that the magnitudes of the variable importance values may change, but the order of importance remains the same; the relative values maintain a similar order and can still be differentiated from less relevant variables. Even when additional highly correlated variables are added to the most important variables, the prediction set for each RF model still displays the most important variable, a behavior already reported by Genuer et al. (2010). According to the RF with the response variable C, the most important variable among C1–C9 is C5, and the least important one is C1. For A1–A5, the best one is A4, while A5 is the worst. The most important variable for B1–B5 is B5, and the least important is B4. Among D1–D5, the best one is D4, whereas the worst one is D1.

The study acknowledges certain limitations that should be taken into consideration. Firstly, the utilization of a non-probability sampling method, specifically purposive sampling, introduced subjectivity into the sample selection process rather than relying on random selection; however, companies were still drawn from this selected pool. Moreover, during the survey phase, the study focused exclusively on medium and large companies within the foundry industry, which may have introduced a potential selection bias. Nevertheless, the authors recognized these limitations and justified their choice by considering these companies as the target group for their research. To address these limitations, future studies should consider conducting additional surveys that encompass companies of different sizes and from various industries. This approach would allow for a more comprehensive analysis and enhance the generalizability of the findings.

5 Conclusions

The two main findings of the research are that (i) it is possible to optimize the results of estimating the organizational resilience potential with an appropriate machine learning method, and (ii) there exists a difference between the importance of the surveyed variables influencing the potential of organizational resilience in the studied subscales. The best technique in terms of accuracy is the ensemble method. Regression analysis is employed on numerical variables; when data on a nominal or ordinal scale are used as predictors in a regression task, they are often replaced by numeric dummy variables, which may fall short of accurately encoding the information represented in the original structure of the data. This situation often causes regression models to misinterpret the relationship between these numbers and degrades their performance, as in the case of MARS and MLR, which are also prone to outliers. This paper uses a KNN filtering-based data pre-processing technique for the stacked ensemble classifier. The stacking is achieved with three base classifiers (i.e., RF, NBC, and SVM), KNN is chosen as the meta-classifier, and a comparative analysis among the 4 classifiers is performed. The proposed stacked ensemble classifier with KNN filtering performs best among all the techniques. As far as the importance of the surveyed variables is concerned, the most important variables are C5, A4, B5, and D4 in the studied subscales, respectively.

The preferred technique should support the quality of the managerial decision-making process by (i) increasing the accuracy of organizational resilience potential prediction, and (ii) indicating the importance of the attributes and factors affecting the potential for organizational resilience. Predicting the attributes of the organizational resilience potential is crucial for corporate recovery management systems, enhancing the ability of an organization to survive and cope with adversity and uncertainty during a disruption. It assists with crisis management and business continuity planning by identifying which attributes require additional attention and which aspects of organizational resilience are more susceptible to errors and weaknesses. This can lead to a reduction in the cost of crisis preparation, ultimately lowering the overall cost for the organization.

The potential of these solutions is substantial and continuously growing, which makes artificial intelligence a promising field for research and implementation. Before ML techniques for assessing the organizational resilience potential can reach their full potential, however, some problems remain to be resolved. The proposed innovations should be implemented systemically, in accordance with the circle of innovation: we should consider new requirements and problems, new ways to organize the organizational resilience assessment, new social interactions, and so on. For example, one new problem is the availability of skilled staff to implement these new techniques; it will take time to find the right staff and train them. It is also important to keep in mind that the value of a forecast is not solely determined by its accuracy, as there are other factors that influence its usefulness. Ultimately, the decision-makers who act upon the forecast must consider the potential consequences of any variance between the forecast and the actual outcome. This raises the question of how much discrepancy is required to warrant a change in decision, a topic that has been discussed by Phillips and Linstone (2016). This task will be accomplished by preparing the organization to understand the range of possibilities and to respond flexibly to events within and beyond the prepared scope. As a future study, each category in the resilience questionnaire can be considered as an individual source of data; to establish the similarity matrix in SVR, a Multiple Kernel Learning (MKL) algorithm can be deployed so that each category of questions refers to a separate kernel function within the kernel combination. A further step could be the generalization of MKL to Infinite Kernel Learning (IKL) (Ozögür-Akyüz and Weber 2010a, b; Özögür-Akyüz et al. 2016), which has been applied to various classification problems in the literature. Inspired by the IKL approach, one can employ the IKL-embedded SVR technique for the organizational resilience survey questions. Furthermore, it is also planned to use conic and robustified versions of MARS, named RMARS (Özmen and Weber 2014), CMARS (Weber et al. 2012), and RCMARS (Özmen et al. 2014). Finally, this paper also provides information on the benefits that companies can obtain by adopting the proposed innovation technique, mainly by improving the quality of the decision-making process associated with organizational resilience and corporate recovery management systems.