1 Introduction

Modern life depends on minerals and mineral products such as base metals, precious metals, coking coal, iron sands, limestone, and industrial minerals. According to King (2009), these minerals are necessary for civil infrastructure, automobile manufacture and fuel, computers and other devices, communications, medicine and dentistry, agricultural production, and power generation and transmission. Mining is therefore essential: it provides the minerals that modern society requires, creates jobs, and contributes significantly to economic and social development.

Underground mining is regarded as one of the most hazardous industries worldwide. Roof fall is a common and unanticipated hazard in underground mining that endangers miners. According to US mine accident statistics (Monforton and Windsor 2010), 7,737 miners were injured in underground mining from 1996 to 2005. Coal mines had the highest rate of roof falls, with 1.75 per 200,000 h of underground employment. In 2006, the US Mine Safety and Health Administration (MSHA) reported 430 non-fatal and seven fatal incidents. The most common cause of roof fall is inappropriate roof support installation. Driving galleries to extract coal induces forces on the surrounding rock, sides, and supporting coal pillars, and these induced forces put the roof and pillars under strain. A roof fall occurs when the applied pressure exceeds the load-bearing limit of the side pillars and the roof.

Roof failures cost both lives and time. Miners are injured, severely disabled, or killed by roof falls. Roof falls also cause equipment breakdown, interruptions to mining operations, disrupted ventilation, blocked pathways, and other obstacles that add time to the process.

As discussed above, roof falls threaten both the miners and the economics of underground mining. Because of the problem's importance, many studies have sought to establish relationships between influential factors and roof falls. Existing approaches to predicting and preventing roof falls can be separated into two classes:

1.1 Non-fuzzy based approaches

Molinda et al. (2000) used a regression technique to find the association between the Roof Fall Rate (RFR) and other features. The authors used data from 37 US coal mines to predict RFR. They concluded that roofs with lower Coal Mine Roof Rating (CMRR) values were more vulnerable to collapse than those with higher CMRR values. RFR was positively correlated with Primary Roof SUPport (PRSUP) but negatively correlated with Intersection Diagonal Span (IDS). However, according to Deb (2003), the associations between attributes were not accurately identified in Molinda et al. (2000). Depth of Cover (DoC), as a proxy for vertical and horizontal stress, is one of the critical factors affecting roof geology and pillar strength (Mark et al. 2001; Molinda et al. 2000). In their roof bolt design guidelines, Mark et al. (2001) stated that deeper mines were more likely to have a high RFR, yet they found statistically that the influence of DoC was relatively small when all other geotechnical variables were held constant. Biswas and Karl (2003) developed the "Taxonomic Analysis", which entails a systematic organization of data based on observation, description, and classification to identify the root cause of an incident and to make recommendations for roof fall prevention. They reported more roof fall incidents in supported regions (\(67.6\%\)) than in unsupported regions (\(32.4\%\)). However, unsupported or partially supported roofs have been documented as more susceptible to failure (Sanjay and Samir 2009). Palei and Das (2008) used sensitivity analysis to examine the effect of contributing components on support safety factors in 14 roof fall occurrences in India's underground coal mines.
Using DoC, gallery width, seam thickness, immediate roof, Mining Height (MH), and roof support status, Sanjay and Samir (2009) applied binary logistic regression to evaluate the severity of roof falls in five underground bord and pillar coal mines in India. They assumed minimum and maximum thresholds for each parameter to convert the raw data into binary form. However, such threshold values may be difficult to determine, and the conversion risks losing vital information. Isleyen et al. (2020, 2021) discussed roof fall hazard detection with a convolutional neural network using transfer learning and, in the later work, explained the advantage of adding a deep learning approach. Based on expert knowledge of the mine, the authors chose images showing hazardous and non-hazardous roof conditions in the Subtropolis limestone mine. In underground coal mines, however, standard camera configurations may not capture roof conditions accurately because of the darkness. Capturing the precise shape of the roof requires strong lighting, which may not be viable given the combustible nature of coal. Małkowski and Juszyński (2021) applied an artificial neural network to assess roof fall hazards in Poland's copper mines.

1.2 Fuzzy based approaches

Deb (2003) used the Mamdani principle and fuzzy rules to form a fuzzy relational matrix in order to map the relationship between three attributes (CMRR, PRSUP, and IDS) and the target variable RFR on US coal mine data. The model used nine fuzzy rules derived from the fuzzified sets of CMRR, PRSUP, and IDS. However, it did not consider other relevant factors such as DoC and MH, so RFR may not be predicted accurately. Ghasemi and Ataei (2013) employed a fuzzy model based on the Mamdani principle to predict RFR. This model used 180 fuzzy rules based on expert knowledge and considered only the CMRR, PRSUP, IDS, and DoC parameters from the US coal mine roof fall data (Molinda et al. 2000). However, the authors did not include MH, which, according to Fotta and Mallett (1997), is an essential parameter for roof fall estimation. Razani et al. (2013) discussed a Takagi-Sugeno-based inference model. The Membership Functions (MFs) of the parameters are calculated through subtractive clustering, which produces only 84 fuzzy rules. This model requires five inputs to predict RFR: CMRR, PRSUP, IDS, DoC, and MH. However, reducing the size of the rule base further may allow RFR to be predicted more accurately. Javadi et al. (2017) discussed a fuzzy Bayesian network model for roof fall risk analysis in underground coal mines operated by the longwall method.

2 Motivation and contribution

The mining sector contributes to the global economy by creating jobs and meeting human needs for crucial minerals, metals, and coal. However, roof fall, particularly in underground mines, is a severe problem for the mining industry. It affects miners and mining companies directly, and everyone else indirectly. The existing fuzzy-based approaches (discussed in Section 1.2) may not predict RFR accurately because they consider only a few parameters. Although a few approaches (e.g., Razani et al. 2013) consider all parameters available in the dataset to forecast RFR, they still suffer from the curse of dimensionality of the rule base and from prediction inaccuracy. Non-fuzzy-based approaches, on the other hand, may not predict RFR accurately when the uncertainty of the geological parameters is taken into consideration.

Motivated by the above-mentioned issues, in this paper we propose a fuzzy inference system that uses a genetic algorithm and pattern search to predict RFR in underground coal mines more precisely. The proposed approach yields a Takagi-Sugeno fuzzy inference system with only 63 rules. The major contributions of this work are outlined as follows:

  (1) This study uses GA to learn the rules and to reduce the size of the rule base; only 63 rules are produced.

  (2) The proposed method predicts RFR more precisely than the existing models.

3 Preliminaries

This section summarizes the existing algorithms used in this work.

3.1 Fuzzy inference system

Fuzzy logic was first introduced by Zadeh (1996) in 1965. It is a helpful tool for solving complex problems because it can deal with ambiguity and vagueness. The Fuzzy Inference System (FIS), implemented by Mamdani (1974) and Mamdani and Assilian (1975), is the most useful application of fuzzy logic. An FIS is a fuzzy rule-based system whose inputs and outputs are real-valued attributes. First, the FIS uses MFs to convert real-valued inputs into fuzzy linguistic variables, which can take values such as low, medium, and high. The heart of an FIS is fuzzy inferencing based on fuzzy linguistic rules, which are expressed as "if-then" statements. The architecture of an FIS is shown in Figure 1. It is divided into three layers, described as follows:

  (1) Fuzzification: uses the MFs to convert a crisp input value into a fuzzy linguistic value. There are various types of MFs; the most commonly used are triangular, trapezoidal, and Gaussian.

  (2) Inference system: uses the knowledge base and the output of the fuzzification layer for reasoning. The knowledge base consists of a rule base and a data base. The data base stores all information about the input data and MFs. The rule base holds expert knowledge in the form of fuzzy rules.

  (3) Defuzzification: transforms the fuzzy outputs into a single crisp output value. Defuzzification strategies include the maximum or mean-max membership principles, the centroid approach, and the weighted average method.

Fig. 1 Fuzzy inference system
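As a minimal sketch of the fuzzification and defuzzification layers described above, the following Python snippet evaluates triangular MFs and a weighted-average defuzzifier. The term names, the breakpoints in `TERMS`, and the function names are illustrative assumptions and are not taken from this work.

```python
def triangular_mf(x, a, b, c):
    """Membership degree of x in a triangular fuzzy set with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical linguistic terms for an input scaled to 0..100 (illustrative only).
TERMS = {
    "low": (-50.0, 0.0, 50.0),
    "medium": (0.0, 50.0, 100.0),
    "high": (50.0, 100.0, 150.0),
}


def fuzzify(x, terms=TERMS):
    """Fuzzification: map a crisp value to its membership in every linguistic term."""
    return {name: triangular_mf(x, *abc) for name, abc in terms.items()}


def weighted_average(strengths, levels):
    """Weighted-average defuzzification of rule output levels by firing strength."""
    total = sum(strengths)
    if total == 0.0:
        raise ValueError("no rule fired, so the crisp output is undefined")
    return sum(w * z for w, z in zip(strengths, levels)) / total
```

For instance, `fuzzify(25.0)` assigns equal membership to "low" and "medium" and none to "high", which is the kind of partial overlap that lets neighbouring rules fire together.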

3.2 Genetic algorithm

A genetic algorithm (GA) is a meta-heuristic search technique based on Charles Darwin's theory of natural evolution. Traditional optimization algorithms operate on the parameters themselves and therefore rely on the continuity of the parameters and the derivative of the function. GA, by contrast, works on an encoding of the parameters, which lets it handle multivariate optimization problems that are difficult for traditional optimization algorithms. GA starts with a randomly created population represented by chromosomes. A chromosome is a collection of genes that encodes the parameters. Genetic operators are applied iteratively to the chromosomes to produce successive generations that improve over time. The genetic operators are outlined as follows:

  (1) Selection: depends on the objective function, which assigns a fitness score to each individual. Chromosomes with a higher fitness value are more likely to produce offspring for the following generation. A variety of selection methods are available, including the tournament method, the roulette wheel, and the ranking system (Davis 1991).

  (2) Crossover: changes the encoding of chromosomes from one generation to the next. First, two chromosomes, known as parents, are chosen at random from the mating pool. The parents exchange information through mating and pass new chromosomes on to the next generation. These offspring represent points of the search space that have not yet been explored.

  (3) Mutation: introduces and maintains diversity in the population by making small random changes to chromosomes. It is applied with a low probability \(p_\text{m}\), because a high value of \(p_\text{m}\) reduces GA to a random search. Mutation increases the likelihood of convergence to the global optimum. The interested reader can find more details on GA in Goldberg et al. (1989), Horner and Goldberg (1991), and Mirjalili (2019).
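The three operators can be sketched in a few lines of Python. This is a minimal illustration on bit-string chromosomes and not the configuration used in this work: the function name `run_ga`, the tournament size of two, single-point crossover, and all default parameter values are assumptions.

```python
import random


def run_ga(fitness, n_genes, pop_size=30, generations=60,
           p_crossover=0.8, p_mutation=0.02, seed=0):
    """Maximize `fitness` over bit-string chromosomes (n_genes >= 2)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]
    best = max(pop, key=fitness)

    def tournament():
        # Selection: the fitter of two randomly drawn individuals becomes a parent.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        children = []
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            c1, c2 = p1[:], p2[:]
            # Crossover: swap the tails of the parents at a random cut point.
            if rng.random() < p_crossover:
                cut = rng.randint(1, n_genes - 1)
                c1, c2 = p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
            # Mutation: flip each gene with the small probability p_mutation.
            for child in (c1, c2):
                for i in range(n_genes):
                    if rng.random() < p_mutation:
                        child[i] = 1 - child[i]
                children.append(child)
        pop = children[:pop_size]
        # Remember the best chromosome seen in any generation.
        best = max(pop + [best], key=fitness)
    return best
```

Calling `run_ga(sum, 20)` searches for the 20-bit string with the most ones. In a rule-learning setting, a chromosome would instead encode which candidate rules are kept, and the fitness would combine prediction error with rule count.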

3.3 Pattern search

Pattern Search (PS) was first introduced by Hooke and Jeeves (1961). PS starts by defining a geometric structure (the pattern) and a step length around the pattern's starting point, referred to as the center. The test points at the pattern's extremities are evaluated with an objective function to decide which point is best; this step is called exploration. The pattern then moves towards that point. When none of the pattern's extreme points is better than its center point, the step length is reduced. When the step length falls below the specified tolerance, the method terminates. The flowchart of PS is illustrated in Figure 2.

Fig. 2 Flowchart of PS algorithm

The PS method is initialized with a guessed initial position \(p_o\), an initial step length \(l_o\), and a set of vectors \(v_i\), called patterns, that identify the coordinate directions. The set can be written as a matrix \(V = \{v_1,..., v_i,..., v_{2n}\}\), where \(n = 1, 2,...\) is the dimension of the search space. The procedure is iterative: in the t-th iteration, \(f(p_t + l_t v_i)\) is calculated for the 2n vectors of V until a vector \(v_i\) is found such that \(f(p_t + l_tv_i) < f(p_t)\). If at least one vector improves the objective function value, the pattern moves to that point. If the objective function value does not decrease along any vector of V, the step length \(l_t\) is reduced. The process stops when the step length is small enough.

To check for convergence, \(l_t\) is compared with the predefined stopping tolerance \(l_{\text{tol}}\) after each unsuccessful iteration. The search ends with \(p^* = p_{t+1}\) when the step length falls below \(l_{\text{tol}}\). The search cannot terminate after a successful iteration because \(l_t\) is shortened only after unsuccessful ones. PS is listed in Algorithm 1.

Algorithm 1 Pattern search (PS)
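The procedure described above can be sketched in Python as follows. This is a hedged illustration rather than a reproduction of Algorithm 1: the function name `pattern_search`, the halving of the step length after an unsuccessful iteration, and the `max_iter` safeguard are assumptions.

```python
def pattern_search(f, p0, l0=1.0, l_tol=1e-6, max_iter=10_000):
    """Minimize f by coordinate pattern search starting from p0."""
    n = len(p0)
    # Pattern V: the 2n positive and negative coordinate directions.
    directions = []
    for i in range(n):
        for sign in (1.0, -1.0):
            v = [0.0] * n
            v[i] = sign
            directions.append(v)

    p, step = list(p0), l0
    fp = f(p)
    for _ in range(max_iter):
        if step < l_tol:
            break  # step length fell below the tolerance: stop
        improved = False
        for v in directions:
            trial = [pj + step * vj for pj, vj in zip(p, v)]
            f_trial = f(trial)
            if f_trial < fp:
                # Successful exploration: move the pattern centre to the trial point.
                p, fp, improved = trial, f_trial, True
                break
        if not improved:
            step /= 2.0  # unsuccessful iteration: shrink the step length (assumed factor)
    return p, fp
```

For example, `pattern_search(lambda p: (p[0] - 1) ** 2 + (p[1] + 2) ** 2, [0.0, 0.0])` walks along the coordinate axes to the minimum at (1, -2) and then only shrinks the step until it drops below `l_tol`.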

3.4 Proposed method

In this research, a Takagi-Sugeno-Kang (TSK) FIS is hybridized with GA and PS to discover hidden and complex patterns in the dataset. A TSK FIS uses singleton output MFs that are either constant or a linear combination of the input variables. Sugeno defuzzification is more efficient and robust than Mamdani defuzzification because it uses a weighted average or weighted sum of a few data points rather than locating the centroid of a two-dimensional area. Each rule produces a weighted output level equal to the product of the rule firing strength \(W_i\) and the rule output level \(R_i\), which are defined in Eqs. (1) and (2), respectively. In Eq. (1), f(p) and f(q) denote the membership values of the inputs p and q. The output of the TSK model is calculated as given in Eq. (3). The constant coefficients \(a_i\), \(b_i\), and \(c_i\) are estimated from the training data using the least-squares method. The proposed methodology is divided into two phases, rule learning and parameter tuning, as shown in Figure 3.

Fig. 3 Flowchart of proposed method

$$\begin{aligned} W_i= & {} \text{AndMethod}(f(p), f(q)) \end{aligned}$$
(1)
$$\begin{aligned} R_i= & {} a_i \cdot p + b_i \cdot q + c_i \end{aligned}$$
(2)
$$\begin{aligned} y= & {} \frac{\sum _{i=1}^n W_i \cdot R_i}{\sum _{i=1}^n W_i} \end{aligned}$$
(3)
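
Eqs. (1) to (3) can be illustrated with a small two-input sketch. The product is assumed as the AND method, the MFs are assumed triangular, and the two rules in `RULES` are made up for illustration; none of these values come from the trained model.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


def tsk_predict(p, q, rules):
    """First-order TSK output for inputs p and q.

    Each rule is (mf_p, mf_q, (a, b, c)): two triangular MF parameter
    triples and the linear consequent coefficients of Eq. (2).
    """
    num = den = 0.0
    for mf_p, mf_q, (a, b, c) in rules:
        w = tri(p, *mf_p) * tri(q, *mf_q)  # Eq. (1), product assumed as AND method
        r = a * p + b * q + c              # Eq. (2)
        num += w * r
        den += w
    if den == 0.0:
        raise ValueError("no rule fires for this input")
    return num / den                       # Eq. (3)


# Two hypothetical rules on inputs scaled to 0..10 (not taken from this work).
RULES = [
    ((-10.0, 0.0, 10.0), (-10.0, 0.0, 10.0), (0.0, 0.0, 1.0)),  # low, low -> 1
    ((0.0, 10.0, 20.0), (0.0, 10.0, 20.0), (0.0, 0.0, 5.0)),    # high, high -> 5
]
```

At the midpoint `tsk_predict(5.0, 5.0, RULES)` both rules fire with equal strength, so the output is the average of the two consequents; at `(0.0, 0.0)` only the first rule fires.
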
  • Phase 1: As the number of attributes in the data and the number of partitions per feature grow, the number of rules grows exponentially. This is called the curse of dimensionality of the rule base. The goal is to minimize the rule base while maximizing the precision of the predicted results, which is a two-objective optimization problem. We use GA for rule learning because it is a derivative-free algorithm based on natural selection that is less prone to being trapped in local optima and can solve both constrained and unconstrained optimization problems. The total number of rules produced by GA in this model is 63, and sample rules are illustrated in Figure 4.

  • Phase 2: After rule learning, we tune the MFs of the individual parameters, since minor changes to their shapes can improve model performance. The proposed method uses PS, a member of the Direct Search (DS) family, for this tuning. PS also needs no derivative information to search for the best solution, and it converges quickly. After applying PS, the MFs of DoC change considerably, whereas those of PRSUP, CMRR, MH, and IDS stay the same. Figure 5 shows the DoC MFs before and after tuning. Figure 6 depicts the final MFs of the other parameters, which remain unchanged.

Fig. 4 Rules for prediction

Fig. 5 Effect of tuning on DoC

Fig. 6 Final MFs value of other parameters

Because the number of rules grows with the number of attributes and the number of partitions per feature, GA is used to overcome the curse of dimensionality of the rule base.
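To make this growth concrete, the sketch below enumerates a full grid rule base. Using three linguistic terms per input is an assumption made only for illustration, and the helper name `full_rule_base` is hypothetical.

```python
from itertools import product


def full_rule_base(partitions):
    """Enumerate every antecedent combination for the given term sets."""
    names = list(partitions)
    return [
        dict(zip(names, combo))
        for combo in product(*(partitions[name] for name in names))
    ]


# The five inputs used in this work, each with three assumed linguistic terms.
PARTITIONS = {
    name: ("low", "medium", "high")
    for name in ("CMRR", "PRSUP", "IDS", "DoC", "MH")
}

rules = full_rule_base(PARTITIONS)
print(len(rules))  # 3 ** 5 = 243 candidate rules
```

Under this assumption the full grid holds 243 candidate rules, compared with the 63 rules retained after GA-based rule learning.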

4 Data description

The proposed method is applied to the US underground coal mine dataset of Molinda et al. (2000) to demonstrate its applicability and robustness. The dataset contains 109 tuples of 21 attributes, and the target variable is RFR. Based on the existing related work and repeated interviews with experts, this section focuses on the identified parameters, which are listed below. The statistical description of the data is given in Table 1. The input parameters are divided into two groups according to whether they can be controlled (as depicted in Figure 7).

Fig. 7 Division of input parameters according to whether they are men-controlled

Table 1 Statistical description of roof fall data
  (1) CMRR: The CMRR index, introduced by Molinda and Mark (1994), rates roof rock quality in coal mines on a scale from 0 to 100. A higher CMRR indicates a roof that is less prone to fall. The score depends on a wide range of natural causes of roof fall, such as roof rock strength, groundwater, bedding, and other discontinuities.

  (2) PRSUP: Roof bolting is the main primary support system in underground coal mines. In many cases, increasing the roof bolt density is an easier way to reduce the likelihood of roof fall. PRSUP is a roof bolt density indicator calculated by Eq. (4), where \(B_\text{l}\) is the bolt length in metres (m), \(B_\text{r}\) is the number of bolts per row, c is the bolt capacity in kilonewtons (kN), \(B_\text{s}\) is the spacing between rows of bolts in m, and \(E_\text{w}\) is the entry width in m.

    $$\begin{aligned} PRSUP = \frac{B_\text{l} \times B_\text{r}\times c}{14.5 \times B_\text{s} \times E_\text{w}} \end{aligned}$$
    (4)

  (3) IDS: Many reports and studies suggest that intersections are more likely to fall than entries or crosscuts (Molinda et al. 2000). The rock load applied on the roof at an intersection is proportional to the cube of the span (Molinda et al. 1998), which is not the case for entries and crosscuts. IDS is the sum of the two intersecting diagonals, as depicted in Figure 8. Shortening the span reduces the likelihood of intersection falls.

  (4) DoC: DoC is a principal cause of roof fall in underground coal mines. Deeper in the mine, the vertical and horizontal stress levels in the rock mass increase (Ghasemi and Ataei 2013). As a result, achieving sufficient stability at greater depths is challenging.

  (5) MH: MH is one of the most influential parameters in roof fall incidents. A pillar becomes weaker and more susceptible to failure as its height grows, which increases the risk of roof falls.
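As a small worked example of Eq. (4) and of the IDS definition, the following sketch computes both quantities. The function names and the input values in the usage note are hypothetical and chosen only to show the arithmetic.

```python
def prsup(bolt_length_m, bolts_per_row, capacity_kn, row_spacing_m, entry_width_m):
    """Primary roof support rating according to Eq. (4)."""
    return (bolt_length_m * bolts_per_row * capacity_kn) / (
        14.5 * row_spacing_m * entry_width_m
    )


def intersection_diagonal_span(diagonal_1_m, diagonal_2_m):
    """IDS: the sum of the two intersecting diagonals."""
    return diagonal_1_m + diagonal_2_m
```

With illustrative inputs, `prsup(1.5, 4, 100.0, 1.2, 6.0)` evaluates 600 / 104.4, and doubling the row spacing halves the result, which reflects the lower bolt density.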

Fig. 8 Method of estimating IDS (Molinda et al. 2000)

The performance of the proposed method was evaluated using three statistical measures: \(R^2\), RMSE, and MAE. Their formulas are given in Eqs. (5) to (7), where \(actual_i\) and \(predict_i\) are the observed and predicted RFR of the i-th record, \(\overline{actual}\) is the mean observed RFR, and n is the number of records.

$$\begin{aligned} R^2= & {} 1-\frac{\sum _{i=1}^{n}(actual_i- predict_i)^2}{\sum _{i=1}^{n}(actual_i-\overline{actual})^2} \end{aligned}$$
(5)
$$\begin{aligned} \text{RMSE}= & {} \sqrt{\frac{\sum _{i=1}^{n}(predict_i - actual_i)^2}{n}} \end{aligned}$$
(6)
$$\begin{aligned} \text{MAE}= & {} \frac{\sum _{i=1}^n|predict_i - actual_i |}{n} \end{aligned}$$
(7)
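
The three measures can be computed with a short helper such as the one below. It is a plain-Python sketch of Eqs. (5) to (7); the function name `regression_metrics` and the sample values in the usage note are assumptions for illustration.

```python
import math


def regression_metrics(actual, predicted):
    """Return (R^2, RMSE, MAE) for paired observed and predicted values."""
    n = len(actual)
    mean_actual = sum(actual) / n
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot                                      # Eq. (5)
    rmse = math.sqrt(ss_res / n)                                    # Eq. (6)
    mae = sum(abs(p - a) for a, p in zip(actual, predicted)) / n    # Eq. (7)
    return r2, rmse, mae
```

For `actual = [1, 2, 3, 4]` and `predicted = [1, 2, 3, 5]`, a single error of one unit gives an \(R^2\) of 0.8, an RMSE of 0.5, and an MAE of 0.25.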

5 Results and discussion

The FIS integrating GA and PS is applied to 25 records of the dataset that were not used during the model's training. The predictions of the proposed model on the testing dataset are presented in Table 2. The model was constructed with the Fuzzy Logic Toolbox in MATLAB R2021b (academic use), using the other 84 records of the US coal mine dataset, which were not used during testing. Triangular MFs are used for the input parameters, GA is used for rule learning, and PS is used to optimize the input MFs. In GA, the maximum population size, distance metric, validation tolerance, fitness scaling function, and cross-over fraction are set to 100, 0.100, rank-based scaling function, and 0.8, respectively. The initial mesh size, number of iterations, and step tolerance for PS are 1, 100, and \(1.0 \times 10^{-6}\), respectively. The efficacy and reliability of the proposed model in predicting RFR in underground coal mines are evaluated using three performance metrics.

Table 2 Testing dataset to validate the proposed model’s performance

In Table 3, we compare the model of Razani et al. (2013) with our proposed model on the testing dataset. The proposed model surpasses all other models on all performance indicators. The MAE and RMSE values represent the model testing errors. Our model's MAE is 0.4919, which is much lower than that of ANN (1.711), MVR (2.834), and FIS (1.119). The RMSE of our proposed model (0.8745) is also lower than that of ANN (2.54), MVR (4.033), and FIS (1.72). The \(R^2\) score is a measure of goodness of fit for regression. It indicates the proportion of variance in the dependent variable that is explained by the independent variables taken together. According to Shamseldin et al. (1997), \(R^2\) values below 0.8 are undesirable for regression models. MVR and ANN have \(R^2\) scores of 0.039 and 0.687, respectively, which are unsatisfactory for this task. The proposed model's \(R^2\) score is 0.9512, which is higher than that of FIS (0.872). The \(R^2\) scores of the fuzzy-based models suggest that fuzzy logic is a valuable paradigm for accurately predicting RFR in underground coal mines.

Table 3 Statistical results of the models on the testing dataset

The performance of the proposed method depends on the size of the input data; with more training data, the model's performance can be enhanced. As the amount of data grows, combining fuzzy logic with optimization algorithms may predict RFR even more accurately, which is left as future work.

The proposed model predicts RFR values precisely using only 63 rules during reasoning, whereas FIS used 84 rules. Given its RFR predictions for underground coal mines, the proposed model is regarded as credible. Based on the men-controlled parameters described in Section 4, real-time prevention and safety procedures to limit or eliminate roof fall risk in underground mines can be identified as follows:

  (1) Intersections with a higher IDS are more vulnerable to roof falls. Creating intersections with the minimum possible span reduces the roof fall risk. To improve safety, the density of roof bolts in intersection regions may be increased.

  (2) Roof falls are more likely to occur in wide galleries than in narrow ones. This should be taken into account when designing the gallery width.

  (3) Roofs that are unsupported or only partially supported are more likely to collapse. Providing adequate support can also help to minimize the risk of roof fall.

6 Conclusions

To estimate RFR, this research developed a TSK FIS that integrates GA with PS. The model is applied to five attributes of the 109 records in the US underground coal mine dataset: CMRR, PRSUP, IDS, DoC, and MH. The proposed model outperforms the other fuzzy-based models in both prediction accuracy and the number of rules required. The proposed model's MAE (0.4919), RMSE (0.8745), and \(R^2 \, (0.9512)\) on the testing dataset corroborate this conclusion. Mining industries can set up control mechanisms and other safety precautions based on this model's RFR predictions.