Introduction

Underground mining safety has long been a critical issue [1, 2]. The belt conveyor is the core component of coal transportation and directly affects the production efficiency of coal mining enterprises. Most damage to belt conveyors is caused by iron tools (such as anchor bolts and channel steel) or large foreign objects (such as large pieces of coal gangue) entering the conveyor. If large foreign objects can be identified, screened and classified as soon as they enter the belt conveyor, an early warning can be issued and the objects can be removed in time. This prevents damage, keeps the belt conveyor running stably, and supports safe and intelligent production in coal mining enterprises.

Machine learning and deep learning technologies have been widely used in nonlinear control [3, 4], fault diagnosis [5] and other industrial fields. GPU-based parallel computing greatly improves computational efficiency, and machine vision has developed as a branch of artificial intelligence [6, 7]. Machine vision technology includes edge extraction, image segmentation, texture analysis, and the recognition and classification of targets in images [8]. Systems based on machine vision are safe and reliable, simple to install and maintain, and inexpensive to deploy, so they are widely used in various fields. Consequently, foreign object detection has been studied by many scholars, and many machine-vision-based methods have been proposed. Chen et al. [9] used machine vision to detect foreign objects on the belt conveyor in real time. First, the quality of the captured low-quality images is improved with the KinD++ low-light image enhancement algorithm. Then, the YOLOv4 algorithm is combined with optimized anchor boxes to detect foreign objects on the belt conveyor effectively. Wang et al. [10] proposed a video detection method for foreign objects on the belt surface based on SSD. Their experimental results show that the improved algorithm outperforms the original SSD algorithm, with the average accuracy increasing from 87.1% to 90.2%. For the conveyor belts that carry various raw materials in steelworks, Saran et al. [11] captured images with a polarization camera and processed the raw frames in parallel with a series of image processing techniques to detect foreign objects among the raw materials. However, foreign object detection methods based on deep learning models involve a large amount of computation and place high demands on GPU computing power. They also require a large sample data set, which increases the cost of engineering applications.

In this paper, a new machine vision detection method is proposed for identifying and screening out various large foreign objects on coal belt conveyor lines. The paper first describes the significance of detecting large foreign objects on belt conveyors in coal mines, builds the framework of the visual detection system, and determines the detection process, which supports the subsequent algorithm implementation. An improved multi-scale Retinex image enhancement algorithm is proposed to improve image quality under the dark conditions in the mine. The correlation-based template matching algorithm is improved to identify and screen large foreign objects, and an appropriate number of pyramid layers is selected to increase the recognition speed. Two texture parameters, energy and contrast, are selected as the input vector of the MLP classification algorithm to classify the foreign objects.

Related work

Template matching methods

Template matching is a simple image detection algorithm that can detect different types of objects simply by changing the template, without a tedious training process. It has therefore been studied extensively. Han et al. [12] proposed a method to improve the reliability of template matching. The template image obtained by the vision sensor provides a depth template, which can be used to predict how the image changes with 3D orientation and object size. Using these predicted changes, the template is calibrated to be close to the given image before template matching is performed. A closed-form solution is proposed to avoid tedious recursive or training processes. Peng et al. [13] proposed a multi-scale template training method to improve the sensitivity to template depth. During template matching, the test image is first divided into several regions, and a training template of similar depth is then selected according to the depth of each region, which improves the speed of the algorithm. Park et al. [14] introduced a fast segmentation model based on template matching. Short-term matching enhances target localization by focusing on adjacent frames, while long-term matching improves fine details and handles changes in object shape by considering distant frames. Eba et al. [15] proposed a template matching method that achieves multi-category classification by scanning the input image only once, focusing on the classification ability of each pixel in the template image. Manana et al. [16] proposed a license plate recognition method for identifying vehicles with similar license plates. The method uses license plate template matching instead of character segmentation and recognition: license plates are located and extracted using only edge detection and line-ratio calculation, and the extracted license plate templates are then compared for matching. Wang et al. [17] proposed a discriminant distance template matching (DDTM) method for image recognition. It is compatible with both hand-crafted features and features learned automatically by a DCNN. DDTM can distinguish whether instances and templates match, and can identify arrows through template matching. Tian et al. [18] proposed a template matching target tracking algorithm based on improved efficient second-order minimization (ESM). Building on the ESM algorithm, the improved algorithm is used to track the moving template. This method performs well in real-time augmented reality tracking and registration of moving objects.

Multilayer perceptron networks

The fully connected neural network, also known as the multilayer perceptron (MLP), is a feedforward artificial neural network with a relatively simple connection pattern. Shawky et al. [19] adopted an enhanced multilayer perceptron based on the Adagrad optimizer as the deep classifier and proposed an effective classification model, CNN-MLP. The model was evaluated on public remote sensing data sets of three VHR images, and the experimental results show that it improves classification performance compared with the most advanced methods. Ying [20] proposed an improved label propagation method to propagate labels from labeled data to unlabeled data. The results show that, under the framework of global and local consistency, soft labels of unlabeled data give more effective predictions. Using the additional soft labels of unlabeled data, simulation results on toy examples demonstrate the label propagation performance of the MLP. Azad et al. [21] proposed an intelligent integrated classification method for breast cancer based on a multilayer perceptron neural network. The method includes two stages: parameter optimization and ensemble classification. The proposed IEC-MLP method reduces the complexity of the MLP-NN, effectively selects the optimal subset of features, and minimizes the cost of misclassification. IEC-MLP was evaluated on different breast cancer data sets, and the prediction results were good. Nilesh et al. [22] used a multilayer perceptron classifier based on an artificial neural network to classify the tool state and obtained a classification accuracy of 97.33%, which shows that the MLP gives high classification accuracy. He et al. [23] proposed a multi-scale MLP that aggregates adjacent patches of multi-scale shape to obtain rich spectral-spatial information. In addition, a soft MLP was proposed to further enhance the classification performance by applying soft segmentation operations. Finally, label smoothing was introduced to alleviate the overfitting problem in the soft MLP (Soft MLP-L), which greatly improves the classification performance of MLP-based methods. Tang et al. [24] proposed a rice hyperspectral image classification model based on an MLP network and residual learning. The results show that this model has higher classification accuracy than other commonly used classification models.

Underground coal mines foreign object detection methods

Many scholars have also studied foreign object detection on belt conveyors. Wu et al. [25] proposed a foreign object identification model for coal conveyor belts based on fast RCNN. By analyzing the characteristics of data transmission and target detection, Xu et al. [26] proposed belt foreign object detection based on edge computing, together with a target detection optimization algorithm suitable for edge devices. Zhu et al. [27] proposed a deep-learning-based foreign object recognition method for coal mine belt transportation and improved the target detection method based on CenterNet. Du et al. [28] proposed an enhanced YOLOv3 target detection model and applied it to foreign object detection on coal mine conveyors, which improved the detection speed while retaining high detection accuracy.

Discussion

Visual feature extraction and detection of foreign objects are difficult in the dusty and harsh environment of underground coal mines. Moreover, the original template matching algorithm is sensitive to illumination, and its matching results are unstable under fluctuating lighting conditions. Therefore, an improved normalized cross-correlation template matching (NCC-TM) algorithm is proposed to effectively identify target objects with indistinct shape features and blurred focus. Even if a small area of the target object is occluded, the detection result is not affected. The proposed algorithm is described in the following sections.

The proposed method

Basic theory

A. Retinex image enhancement. In Retinex theory, the image \(S(x,y)\) seen by human eyes can be decomposed into the incident image \(L(x,y)\) and the reflected image \(R(x,y)\). The imaging mechanism is shown in Fig. 1, where \(S(x,y)\) denotes the image received by the image acquisition device. The incident component \(L(x,y)\) is estimated and removed from \(S(x,y)\) to obtain the reflection component \(R(x,y)\), which reflects the original color of the target object. Retinex image enhancement thus removes or reduces the influence of the incident component \(L(x,y)\) and retains the reflection component \(R(x,y)\) to keep the original color of the object, which improves the output image quality for further analysis and processing.

Fig. 1 Schematic diagram of object imaging

The single-scale Retinex (SSR) algorithm uses a Gaussian function to estimate the incident image from the original image:

$$ \log [R(x,y)] = \log [S(x,y)] - \log [F(x,y)*S(x,y)], $$
(1)
$$ F(x,y) = \frac{1}{{2\pi \sigma^{2} }}e^{{\frac{{ - (x^{2} + y^{2} )}}{{2\sigma^{2} }}}} , $$
(2)
$$ c^{2} = 2\sigma^{2} , $$
(3)

where \(*\) denotes convolution and \(c\) is the only adjustable parameter of the SSR image enhancement algorithm. When \(c\) is too large, image details are lost to a certain extent. When \(c\) is too small, image details are preserved better, but the color information is not well retained.
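Equations (1)–(3) can be illustrated with a minimal pure-Python sketch. It is not the implementation used in this paper: the function names, the kernel radius, the edge-replicating border handling and the small offset `eps` that avoids \(\log 0\) are all assumptions made for illustration, and the truncated Gaussian is renormalized to sum to 1.

```python
import math

def gaussian_kernel(c, radius):
    """Gaussian surround F(x, y) with c^2 = 2*sigma^2 (Eqs. 2-3).

    Assumption: the truncated kernel is renormalized so its entries sum to 1.
    """
    sigma2 = c * c / 2.0
    k = [[math.exp(-(x * x + y * y) / (2.0 * sigma2))
          for x in range(-radius, radius + 1)]
         for y in range(-radius, radius + 1)]
    total = sum(map(sum, k))
    return [[v / total for v in row] for row in k]

def convolve(img, kernel):
    """2-D convolution with edge replication (assumed border handling)."""
    h, w = len(img), len(img[0])
    r = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    ii = min(max(i + di, 0), h - 1)
                    jj = min(max(j + dj, 0), w - 1)
                    acc += kernel[di + r][dj + r] * img[ii][jj]
            out[i][j] = acc
    return out

def single_scale_retinex(img, c=3.0, radius=2, eps=1.0):
    """log R = log S - log(F * S), Eq. (1); eps (assumed) avoids log(0)."""
    blurred = convolve(img, gaussian_kernel(c, radius))
    return [[math.log(s + eps) - math.log(b + eps) for s, b in zip(rs, rb)]
            for rs, rb in zip(img, blurred)]
```

On a uniformly lit image the estimated incident component equals the image itself, so \(\log R\) is zero everywhere; a pixel brighter than its surroundings yields a positive \(\log R\).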

B. Template matching algorithm. The idea of template matching is to select a target object, create a template from it, and then search the image to be detected for the area most similar to the template. Different template matching algorithms can be selected according to the characteristics of the target object or the environment. According to the matching method, template matching can be divided into the following two types.

  1. Shape-based template matching.

    The shape-based template matching algorithm extracts the contour of the object in the template image to obtain the shape of the target, and then checks whether an object in the image to be detected has a similar shape in order to locate the target. The flow chart of shape-based template matching is shown in Fig. 2.

  2. Gray-level-based template matching.

    Gray-level-based template matching uses the gray values of the image as the similarity measure: it computes the similarity between the gray values of the target object in the template image and those of the candidate region in the image to be detected. The flow chart of gray-level-based template matching is shown in Fig. 3.

Fig. 2 Flow chart of template matching based on shape

Fig. 3 Flow chart of template matching based on gray scale

C. MLP neural network. The multilayer perceptron (MLP), a type of artificial neural network, is an extension of the perceptron model. The original perceptron has only two layers of neurons: the input layer and the output layer. Such a simple perceptron has very limited learning ability because it has only one layer of functional neurons and can only handle simple linearly separable problems.

To solve the nonlinear problem, the original perceptron model is expanded, and several hidden layers are added to the original model to obtain the current multi-layer perceptron model. Among many classification algorithms, MLP neural network classification algorithm has the advantages of simple structure, good fault tolerance, easy implementation, strong nonlinear mapping ability and good robustness. Its structure diagram is shown in Fig. 4.

Fig. 4 MLP structure

Figure 4 shows the network structure of the multi-layer perceptron, which consists of the input layer, the hidden layer in the middle and the output layer. First, an \(n\)-dimensional vector \(X = \{ x_{1} ,x_{2} ,x_{3} ,...,x_{n} \}\) is fed into the input layer. The input to the \(i\)-th hidden layer node is then computed as follows:

$$ a_{i} = \sum\limits_{j = 1}^{n} {W_{ji}^{1} x_{j} + b_{i}^{1} } , $$
(4)

where \(W_{ji}^{1}\) is the weight from input layer node \(j\) to hidden layer node \(i\), and \(b_{i}^{1}\) is the offset (bias) of hidden layer node \(i\).

When calculating the output of the hidden layer, an activation function must be introduced to add nonlinearity to the multi-layer perceptron so that it can approximate nonlinear functions. Commonly used activation functions are the tanh function and the relu function. Their formulas are given below, and their curves are shown in Fig. 5.

$$ \tanh (x) = \frac{{e^{x} - e^{ - x} }}{{e^{x} + e^{ - x} }}, $$
(5)
$$ {\text{relu}} (x) = \begin{cases} 0, & x \le 0 \\ x, & x > 0 \end{cases} . $$
(6)
Fig. 5 Graphs of tanh function and relu function
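The hidden-layer computation of Eqs. (4)–(6) can be sketched as follows. This is only an illustration under assumed conventions: the function names are hypothetical, `W[j][i]` stores the weight from input node \(j\) to hidden node \(i\), and the example weights are arbitrary.

```python
import math

def relu(x):
    """Eq. (6): 0 for x <= 0, x otherwise."""
    return x if x > 0 else 0.0

def hidden_layer(x, W, b, activation=math.tanh):
    """Eq. (4) followed by an activation (tanh by default, Eq. 5).

    x: input vector of length n
    W: n rows, one per input node; W[j][i] is the weight to hidden node i
    b: offsets of the hidden nodes
    """
    pre_activation = [
        sum(W[j][i] * x[j] for j in range(len(x))) + b[i]
        for i in range(len(b))
    ]
    return [activation(a) for a in pre_activation]

# Arbitrary example: 2 inputs, 2 hidden nodes.
x = [1.0, 2.0]
W = [[1.0, -1.0],
     [0.5, 0.0]]
b = [0.0, 0.5]
print(hidden_layer(x, W, b, relu))   # pre-activations are 2.0 and -0.5
print(hidden_layer(x, W, b))         # same pre-activations passed through tanh
```

With relu the negative pre-activation is clipped to zero, whereas tanh maps it to a small negative value, which is the difference visible in the two curves of Fig. 5.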

Before the multi-layer perceptron can be used to classify objects, it must be trained by iteratively updating its weight coefficients to improve the classification accuracy. The training flow chart is shown in Fig. 6.

Fig. 6 Multi-layer perceptron training process

Framework

A machine vision framework is proposed to realize intelligent detection of large foreign objects on coal mine conveyors, as shown in Fig. 7.

Fig. 7 Framework of visual detection system for large foreign objects

In the machine vision detection system, an industrial camera array collects images of the coal being conveyed, and image preprocessing (filtering, denoising and enhancement) is then performed. Next, template matching is applied for preliminary identification of foreign objects, and large foreign objects that meet the criteria are screened out by combining the frame difference and area methods. If there is no large foreign object in the image, the image is cleared from the system to release memory. If large foreign objects are identified and screened out, their texture features are extracted, and the optimized MLP is used to classify them accurately so that they can be prevented from entering the belt conveyor from the coal source. The specific detection process is shown in Fig. 8.

Fig. 8 Flowchart of large foreign objects detection

Image enhancement using the improved MSR with adaptive weight

It is difficult for the SSR to balance the color information and the detail information of the image, so it cannot be applied in harsh environments. The multi-scale Retinex (MSR) was proposed to solve this problem. The MSR is calculated as follows.

$$ \log [R_{{{\text{MSR}}}} (x,y)] = \sum\limits_{n = 1}^{N} {\omega_{n} \{ \log [S(x,y)] - \log [F_{n} (x,y)*S(x,y)]} \} , $$
(7)
$$ \sum\limits_{n = 1}^{N} {\omega_{n} = 1} , $$
(8)

where \(\omega_{n}\) is the weight coefficient and \(N\) represents the number of scales; when \(N = 1\), it is the case of single scale Retinex.

The weights \(\omega_{n}\) in Eq. (7) are conventionally fixed in advance, so they cannot adapt to the image content. Therefore, this paper proposes an improved MSR with adaptive weights. The principle and the processing steps of the proposed method are as follows:

  • Step 1: convolve three Gaussian functions with different scale parameters with the three color channels of the image, and calculate the weighted average value of the pixels in the Gaussian neighborhood.

  • Step 2: in each color channel, move the Gaussian function by the distance of one Gaussian template from left to right and from top to bottom, and calculate the weighted average value of the pixels in the Gaussian template at the current pixel until the traversal ends.

  • Step 3: sum the convolution outputs of the three channels separately to obtain the convolution average value of each channel, which is calculated as

    $$ x_{1} = \frac{1}{n}\sum {\frac{1}{{2\pi \sigma_{1}^{2} }}e^{{\frac{{ - (x^{2} + y^{2} )}}{{2\sigma_{1}^{2} }}}} *s} , $$
    (9)

    where \(s\) is the original image in the Gaussian template, and \(\sigma_{1}\) is the scale parameter.

  • Step 4: to assign larger weight to the channel with more incident components, the softmax function is adopted to process the convolution average value output from the three channels. The calculation is described as follows:

    $$ y_{i} = \frac{{e^{{x_{i} }} }}{{\sum_{j = 1}^{m} {e^{{x_{j} }} } }}, $$
    (10)
    $$ \omega_{1} = \frac{{e^{{x_{1} }} }}{{\sum_{j = 1}^{3} {e^{{x_{j} }} } }}, $$
    (11)

    where \(m = 3\) is the number of convolution average values and \(\omega_{1}\) is the weight coefficient assigned to the channel with more incident components. Because the three convolution average values are obtained by convolving three different Gaussian functions with the three channels of the original image, and the incident component to be removed is \(F(x,y)*S(x,y)\), a larger convolution average value in a channel means that more incident component has to be removed from that channel.
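Step 4 can be sketched as a softmax over the three convolution averages, following Eqs. (10) and (11). The function name is hypothetical, the input values are made up, and the sketch assumes that the averages have been scaled to a small range such as \([0, 1]\); with raw 0–255 averages the softmax would saturate toward a single weight.

```python
import math

def adaptive_weights(conv_means):
    """Softmax over the convolution averages (Eqs. 10-11).

    Subtracting the maximum before exponentiating does not change the
    result; it only guards against overflow.
    """
    shift = max(conv_means)
    exps = [math.exp(x - shift) for x in conv_means]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up convolution averages x1, x2, x3 (assumed scaled to [0, 1]).
weights = adaptive_weights([0.2, 0.5, 0.3])
print(weights)        # the largest average receives the largest weight
print(sum(weights))   # the weights satisfy the constraint of Eq. (8)
```

Because the outputs are positive and sum to 1, they can be used directly as the \(\omega_{n}\) of Eq. (7).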

The image enhancement effects of the traditional and the improved MSR are shown in Figs. 9 and 10, where an original image from an underground mine is taken as the test object. The comparison shows that, with the improved MSR, the image contrast is significantly enhanced, the color information is enriched, and the image quality is significantly improved. The improved MSR yields better image quality than the traditional MSR because its enhanced image is closer to the real scene as seen by human eyes.

Fig. 9 Image enhancement effect of traditional MSR

Fig. 10 Image enhancement effect of improved MSR

Foreign object recognition based on correlation between vectors and template matching

A template matching algorithm based on the correlation between vectors is developed to identify the large foreign objects. The similarity measure of the template matching adopts the normalized correlation coefficient between the template image and the image to be detected. The best matching position is found by comparing the normalized correlation coefficient between each position of the images.
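The similarity measure described above can be sketched in pure Python as a zero-mean normalized correlation coefficient evaluated at every position of the image. This is an illustrative baseline, not the improved algorithm of this paper; the function names and the toy data are assumptions.

```python
import math

def ncc(patch, template):
    """Normalized correlation coefficient of two equally sized 2-D lists."""
    p = [v for row in patch for v in row]
    t = [v for row in template for v in row]
    mean_p, mean_t = sum(p) / len(p), sum(t) / len(t)
    num = sum((a - mean_p) * (b - mean_t) for a, b in zip(p, t))
    den = math.sqrt(sum((a - mean_p) ** 2 for a in p)
                    * sum((b - mean_t) ** 2 for b in t))
    return num / den if den > 0 else 0.0   # flat patches score 0

def match_template(image, template):
    """Slide the template over the image; return (best_score, (row, col))."""
    th, tw = len(template), len(template[0])
    best = (-2.0, (0, 0))                  # NCC is always >= -1
    for i in range(len(image) - th + 1):
        for j in range(len(image[0]) - tw + 1):
            patch = [row[j:j + tw] for row in image[i:i + th]]
            score = ncc(patch, template)
            if score > best[0]:
                best = (score, (i, j))
    return best

# Toy example: the template appears at row 2, column 1 with a different
# brightness and contrast (2 * value + 10).
template = [[1, 2], [3, 5]]
image = [[0] * 5 for _ in range(5)]
for r in range(2):
    for c in range(2):
        image[2 + r][1 + c] = 2 * template[r][c] + 10
print(match_template(image, template))
```

Because the coefficient subtracts the mean and divides by the standard deviations, a linear change in brightness leaves the score at the true position essentially equal to 1. In OpenCV, `cv2.matchTemplate` with the `cv2.TM_CCOEFF_NORMED` mode computes a comparable score map.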

Because template matching is computationally expensive, it is difficult to meet the requirements of real-time detection of the target object. Therefore, an appropriate number of image pyramid layers must be set to shorten the computation time. Each additional pyramid layer reduces the number of points in both the image to be detected and the template image by a factor of 4, so the matching is about 16 times faster, which greatly accelerates the matching. However, when the number of pyramid layers is too large, useful information in the image is lost and the accuracy of target object recognition decreases.
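The factor-of-4 reduction per layer can be checked with a small sketch that builds a pyramid by averaging 2×2 blocks. The function names are hypothetical, the sketch assumes image sides that stay even at every level, and the speed-up is the idealized figure implied by the point counts, not a measured one.

```python
def downsample(img):
    """One pyramid step: average each 2x2 block (assumes even dimensions)."""
    return [[(img[i][j] + img[i][j + 1]
              + img[i + 1][j] + img[i + 1][j + 1]) / 4.0
             for j in range(0, len(img[0]), 2)]
            for i in range(0, len(img), 2)]

def build_pyramid(img, levels):
    """Return a list with `levels` images; entry 0 is the original."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid

def theoretical_speedup(levels):
    """Idealized speed-up: image points and template points both shrink
    by a factor of 4 per extra layer, so the work shrinks by 16."""
    return 16 ** (levels - 1)

image = [[float(i + j) for j in range(8)] for i in range(8)]
pyramid = build_pyramid(image, 3)
print([len(p) * len(p[0]) for p in pyramid])   # point counts per layer
print(theoretical_speedup(2))
```

In practice the coarse match is refined on the finer layers, so the achievable gain is smaller than this upper bound, and too many layers discard the detail needed for recognition.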

In this study, image pyramids with different numbers of layers are evaluated by trial testing. When the number of pyramid layers is 6, the recognition accuracy of foreign objects exceeds 97% and the matching time is about 12 ms; when the number of pyramid layers is set to 7, the recognition accuracy is lower than 95.3%. Beyond that, the more pyramid layers are used, the lower the matching and recognition accuracy. Figure 11 shows the results of foreign object recognition by the proposed template matching with 6 pyramid layers.

Fig. 11 Template matching foreign object recognition. Subplots (a) and (b) show the created matching template and the recognition result, respectively

If the pixels of a foreign object reach 1/8 of the total image size, the area is determined to be a large foreign object area (Fig. 12).
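The area criterion can be written as a one-line ratio test. The sketch below assumes a binary mask of the matched object is already available and treats the 1/8 ratio as a lower bound (greater than or equal to); both the function name and that reading of the threshold are assumptions.

```python
def is_large_foreign_object(mask, ratio_threshold=1 / 8):
    """mask: 2-D list of 0/1 flags marking the pixels of the matched object.

    Returns True when the object covers at least `ratio_threshold`
    of the image (assumed interpretation of the 1/8 criterion).
    """
    total_pixels = len(mask) * len(mask[0])
    object_pixels = sum(1 for row in mask for v in row if v)
    return object_pixels / total_pixels >= ratio_threshold

# A 4 x 8 image has 32 pixels, so 4 object pixels are exactly 1/8.
mask = [[0] * 8 for _ in range(4)]
for col in range(4):
    mask[1][col] = 1
print(is_large_foreign_object(mask))
```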

Fig. 12 Large foreign object area after screening

Classifying large foreign objects using MLP

Through the investigation of different coal mining enterprises, it is statistically concluded that the large foreign objects on the belt conveyor in the process of coal transportation are mainly divided into three categories: stone (coal gangue and ordinary stone), iron and wood. It is possible to classify the types of the large foreign objects to count the occurrence times of different types of large foreign objects in the coal mine and analyze the causes to prevent the large foreign objects from entering the belt conveyors (Fig. 13).

Fig. 13 Foreign object classification based on MLP

In this study, the MLP neural network is employed to perform the classification task, as shown in Fig. 13. In the large foreign object classification, it is particularly important to select the appropriate feature vector as the input of the MLP. The surface texture of the target object is an important feature. Therefore, the texture features of different large foreign objects are extracted as the input of MLP neural network classification algorithm. Considering that there is a certain gray-level relationship between any two pixels in the image, the gray-level co-occurrence matrix can reflect the change characteristics of the gray level on the surface of the objects. Based on the gray-level co-occurrence matrix, 14 parameters describing the texture features can be calculated. The four commonly used parameters are energy, contrast, correlation and difference moment. The specific values of the texture feature parameters of different large foreign objects and coal are shown in Table 1.

Table 1 Texture feature parameter values

According to the above table, different kinds of large foreign objects and coal present obvious differences in the two texture features of the energy and contrast. Therefore, the energy and contrast are selected as the input vector of the MLP.
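The two selected features can be computed from a gray-level co-occurrence matrix as sketched below. The sketch uses a single horizontal offset and takes energy to be the sum of squared matrix entries (the angular second moment); the offset, the number of gray levels and the function names are assumptions, and some libraries define energy as the square root of this quantity.

```python
def glcm(img, levels, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix for the offset (dx, dy).

    img holds integer gray levels in range(levels).
    """
    counts = [[0] * levels for _ in range(levels)]
    h, w = len(img), len(img[0])
    for i in range(h):
        for j in range(w):
            ii, jj = i + dy, j + dx
            if 0 <= ii < h and 0 <= jj < w:
                counts[img[i][j]][img[ii][jj]] += 1
    total = sum(map(sum, counts))
    return [[c / total for c in row] for row in counts]

def energy(p):
    """Sum of squared entries: large for uniform, regular textures."""
    return sum(v * v for row in p for v in row)

def contrast(p):
    """Sum of (i - j)^2 * p(i, j): large for strong local gray changes."""
    return sum((i - j) ** 2 * v
               for i, row in enumerate(p) for j, v in enumerate(row))

smooth = [[0, 0], [0, 0]]
checker = [[0, 1], [1, 0]]
for name, img in (("smooth", smooth), ("checker", checker)):
    p = glcm(img, levels=2)
    print(name, energy(p), contrast(p))
```

A smooth surface gives high energy and zero contrast, while a rapidly alternating surface gives lower energy and higher contrast, which is why the pair separates materials with different surface textures.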

There are three category labels in the data set: large stone, wood and iron. The energy and contrast values of 150 groups of foreign objects are used as the training data set of the network, with the three categories in a 1:1:1 ratio.

To further improve the accuracy of the MLP for classifying the large foreign matters, the gray wolf algorithm is introduced to optimize the MLP model. The gray wolf algorithm aims to find the optimal parameters of interest by simulating the hunting behavior of the gray wolves and achieving a balance between the local optimization and global search. The gray wolf population has a strict social hierarchy, as shown in Fig. 14.

Fig. 14 Social hierarchy of gray wolf population

The social hierarchy is divided into four levels. As the leader of the pack, the \(\alpha\) wolf is the decision maker for hunting operations. The \(\beta\) wolf assists the \(\alpha\) wolf in making decisions, and the \(\delta\) wolf obeys the decisions of the \(\alpha\) wolf and the \(\beta\) wolf. The rest are the \(\omega\) wolves, which make up the majority of the pack and follow the first three wolves. The predation process of the gray wolf population is mainly divided into three stages: rounding up, hunting and attacking.

In the traditional gray wolf algorithm, the wolves rely only on the positions of the first three wolves to locate the target, which limits the speed and accuracy of the search. Therefore, in the improved gray wolf algorithm, the central positions of the first three wolves are also used as guidance information for the pack, which reduces the probability of falling into a local optimum. The central positions of the first three wolves are calculated from the current positions by

$$ \left\{ {\begin{array}{*{20}c} {X_{{\alpha \beta }} (t) = \frac{1}{2}|X_{\alpha } (t) - X_{\beta } (t)|} \\ {X_{{\alpha \delta }} (t) = \frac{1}{2}|X_{\alpha } (t) - X_{\delta } (t)|} \\ {X_{{\beta \delta }} (t) = \frac{1}{2}|X_{\beta } (t) - X_{\delta } (t)|} \\ \end{array} } \right., $$
(12)
$$ \left\{ {\begin{array}{*{20}c} {D_{\alpha \beta } = |C_{4} X_{\alpha \beta } (t) - X(t)|} \\ {D_{\alpha \delta } = |C_{5} X_{\alpha \delta } (t) - X(t)|} \\ {D_{\beta \delta } = |C_{6} X_{\beta \delta } (t) - X(t)|} \\ \end{array} } \right., $$
(13)
$$ \left\{ {\begin{array}{*{20}c} {X_{4} = |X_{\alpha \beta } (t) - A_{4} D_{\alpha \beta } |} \\ {X_{5} = |X_{\alpha \delta } (t) - A_{5} D_{\alpha \delta } |} \\ {X_{6} = |X_{\beta \delta } (t) - A_{6} D_{\beta \delta } |} \\ \end{array} } \right., $$
(14)
$$ X(t + 1) = \frac{{X_{1} + X_{2} + X_{3} + X_{4} + X_{5} + X_{6} }}{6}, $$
(15)

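where \(X_{\alpha \beta } (t)\), \(X_{\alpha \delta } (t)\) and \(X_{\beta \delta } (t)\) are the center positions between the first three wolves, and \(D_{\alpha \beta }\), \(D_{\alpha \delta }\) and \(D_{\beta \delta }\) are the distances between the wolf pack and these center positions. As in the standard gray wolf algorithm, \(X_{1}\), \(X_{2}\) and \(X_{3}\) are the candidate positions guided by the \(\alpha\), \(\beta\) and \(\delta\) wolves, and \(A_{k}\) and \(C_{k}\) are the coefficient vectors.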

The convergence factor has a great impact on the global and local search ability of the gray wolf algorithm: the larger the convergence factor, the wider the global search range but the slower the search. In the initial stage, the search should be slowed down to preserve the global search ability of the algorithm; in the middle and late stages, the search should be sped up to strengthen the local search ability of the wolves. The cosine function is therefore used to optimize the convergence factor, which is calculated as

$$ a = 2\cos \left( {\frac{t}{{t_{\max } }} \cdot \frac{\pi }{2}} \right). $$
(16)
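The behavior of Eq. (16) can be checked numerically. The sketch below compares it with the linear decay from 2 to 0 used in the standard gray wolf algorithm; the function names are hypothetical, and \(t_{\max } = 140\) is taken from the iteration limit stated in this section.

```python
import math

def convergence_factor(t, t_max):
    """Cosine schedule of Eq. (16): starts at 2 and ends at 0."""
    return 2.0 * math.cos((t / t_max) * (math.pi / 2.0))

def linear_factor(t, t_max):
    """Linear decay of the standard gray wolf algorithm, for comparison."""
    return 2.0 * (1.0 - t / t_max)

t_max = 140
for t in (0, 35, 70, 105, 140):
    print(t, round(convergence_factor(t, t_max), 3),
          round(linear_factor(t, t_max), 3))
```

The cosine schedule stays above the linear one throughout the run: it decreases slowly at first, which keeps the search range wide, and falls quickly toward the end, which matches the early global and late local search behavior described above.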

The procedure for optimizing the MLP with the improved gray wolf algorithm is as follows:

  • Step 1: initialize the MLP model parameters, encode the weights and offsets as matrices, and use them as the input of the gray wolf algorithm.

  • Step 2: initialize the parameters of the gray wolf optimization, set the wolf population size to 200 and the maximum number of iterations to 140.

  • Step 3: calculate the fitness values of the gray wolf individuals, and record the three individuals with the best fitness values as \(X_{\alpha } (t)\), \(X_{\beta } (t)\) and \(X_{\delta } (t)\), i.e., the positions of the \(\alpha\), \(\beta\) and \(\delta\) wolves.

  • Step 4: update the convergence factor in the current iteration and the position information of each wolf.

  • Step 5: when the maximum iteration value or convergence accuracy is reached, the position information of the wolf \(\alpha\) is output as the optimal solution, which is converted into the corresponding weight and offset parameters of the MLP model.

  • Step 6: extract the texture features of the images to train the optimized MLP model.

The trained MLP model is tested on 150 selected images of large foreign objects, and the classification accuracy is shown in Fig. 15, which compares the traditional MLP model with the present one. As can be seen in the figure, the present MLP model outperforms the traditional MLP model. The gray-wolf-optimized MLP model can accurately classify the large foreign objects, including stone, iron and wood, with an accuracy of 98.8% and a computation time of 20 ms.

Fig. 15 MLP classification accuracy

Experimental results and analysis

Experimental test platform

To evaluate the performance of the proposed method for large foreign object detection, an experimental platform is built, as shown in Fig. 16.

Fig. 16 Laboratory experiment platform

Relevant parameters of hardware are shown in Table 2.

Table 2 Relevant design parameters of the test bench

The foreign object detection algorithm is implemented mainly in the Python programming language, and the OpenCV library is used to process the image data. The final results are processed on a Jetson Xavier NX. Because the platform uses a small belt conveyor, the large foreign objects are selected according to the size ratio between the objects and the belt conveyor.

In the experimental test, 180 samples of the large foreign objects with different areas and colors are prepared and divided into three groups, including group A for stone, group B for iron and group C for wood. Each group contains 60 samples. Furthermore, the foreign objects in each group are subdivided into four subgroups with 15 samples according to the size of the object area. Multiple experiments are carried out to detect the large foreign objects in each group. The experimental results are shown in Table 3.

Table 3 Experimental results

As can be seen in Table 3, the screening accuracy of the proposed method for large foreign objects is more than 95% and the classification accuracy is more than 96%, indicating that the proposed machine vision method can accurately recognize, screen and classify large foreign objects. To verify the effectiveness of the proposed algorithm, it is compared with state-of-the-art measurement algorithms. As far as we know, there is no dedicated measurement algorithm for large foreign objects, so measurement algorithms from other fields, namely root collar diameter measurement (RCDM) [29], tool wear detection (TWD) [30], chisel edge wear measurement (CEWM) [31] and surrounding rock dynamic deformation measurement (SRDDM) [32], are selected for comparison with the proposed foreign object detection algorithm. These algorithms are based on machine vision and are competitive in their fields. Each algorithm is re-implemented according to the corresponding paper, and the comparison results are listed in Table 4.

Table 4 Comparison results with other state-of-the-art algorithms

In Table 4, the terms high, medium and low refer to target images with different resolutions: \(\left( {1920,1920} \right)\), \(\left( {1280,1280} \right)\) and \(\left( {640,640} \right)\), respectively. The weighted score is introduced to summarize the overall performance across precision at the different resolutions and time consumption. It is obtained as follows:

$$ w_{{{\text{score}}}} = \sum\limits_{i = 1}^{N} {\lambda_{{\text{res - norm}}}^{i} \frac{{{\text{pre}}_{i} }}{{t_{i} }}} . $$
(17)

Foreign object detection requires both high accuracy and high execution efficiency. Therefore, in the comprehensive evaluation, the weighted score is positively correlated with the precision \({\text{pre}}_{i}\) and negatively correlated with the detection time \(t_{i}\). Here, \(N\) denotes the number of resolution types, \(N = 3\); \({\text{pre}}_{i}\) and \(t_{i}\) denote the average precision and the time consumption for images of the corresponding resolution; and \(\lambda_{{\text{res - norm}}}^{i}\) denotes the weighting coefficient for each image resolution. Since high-resolution images account for the majority of actual belt conveyor foreign object detection, they are assigned the largest coefficient: the values of \(\lambda_{{\text{res - norm}}}^{i}\) are 0.50, 0.33 and 0.17 for high-, medium- and low-resolution images, respectively.
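As a minimal sketch of how Eq. (17) can be evaluated, the following Python function sums the weighted precision-to-time ratios over the three resolutions. The function name `weighted_score` and the input values in the usage line are hypothetical and are not taken from Table 4; the units of precision and time are assumed only to be the same for every algorithm being compared.

```python
from typing import Sequence

# Resolution coefficients quoted in the text for
# high-, medium- and low-resolution images, in that order.
RES_WEIGHTS = (0.50, 0.33, 0.17)


def weighted_score(
    precisions: Sequence[float],
    times: Sequence[float],
    weights: Sequence[float] = RES_WEIGHTS,
) -> float:
    """Evaluate Eq. (17): sum over i of lambda_i * pre_i / t_i."""
    if not (len(precisions) == len(times) == len(weights)):
        raise ValueError("precisions, times and weights must have equal length")
    if any(t <= 0 for t in times):
        raise ValueError("time consumption must be positive")
    return sum(w * p / t for w, p, t in zip(weights, precisions, times))


# Hypothetical inputs (NOT values from Table 4): precision as a fraction,
# time in seconds, ordered high, medium, low resolution.
example_score = weighted_score([0.96, 0.95, 0.93], [0.20, 0.10, 0.05])
```

Because each term divides precision by time, an algorithm that is slightly less precise but much faster can still obtain a higher score, which is why the same units must be used for all compared algorithms.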

To realize terminal detection at the belt end, we also use an embedded device, as shown in Fig. 17.

Fig. 17 Jetson Xavier NX

With an accelerated computing capacity of up to 21 TOPS, the Jetson Xavier NX can run modern neural networks in parallel and process data from multiple high-resolution sensors. Meanwhile, because coal is transported continuously, foreign object detection must also maintain high accuracy on moving objects. We therefore run the belt conveyor at different speeds to check the real-time performance of the algorithm. The experimental results are shown in Table 5.

Table 5 Detection performance of foreign object detection on Jetson Xavier NX

Table 5 shows that, even on an embedded device with limited resources, the algorithm can still detect foreign objects effectively. Different belt speeds have some impact on the detection accuracy, but the algorithm still maintains high detection accuracy.

Industrial field test

To verify the feasibility of the proposed detection method, an industrial field test is carried out in the Gaoyang coal mine, China. Because the underground environment is dark and lacks natural light, the quality of images captured in the underground coal seam is very poor. Image enhancement is therefore applied to the collected images, and the result is shown in Fig. 18. The processed image is significantly improved in terms of background noise and brightness, which enriches the detail and color information of the image and improves the reliability of the subsequent large foreign object recognition.

Fig. 18 Image enhancement effect. Subplots (a) and (b) show the original image and the image enhanced using the proposed MSR method
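For orientation, the following is a minimal sketch of standard multi-scale Retinex (MSR) on a grayscale image, written with NumPy only. It is not the improved MSR with adaptive weights used in this paper: the scale weights here are fixed and equal by default, and the function names, the default Gaussian scales (15, 80, 250) and the final min-max stretch are assumptions made for illustration.

```python
import numpy as np


def gaussian_kernel(sigma: float) -> np.ndarray:
    """Normalized 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def gaussian_blur(img: np.ndarray, sigma: float) -> np.ndarray:
    """Separable Gaussian blur with edge-replicated borders."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    padded = np.pad(img, radius, mode="edge")
    # Filter along rows, then along columns; "valid" undoes the padding.
    rows = np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="valid"), 0, rows)


def multi_scale_retinex(gray, sigmas=(15, 80, 250), weights=None):
    """Standard MSR: weighted sum of log(image) - log(blurred image)."""
    if weights is None:
        weights = [1.0 / len(sigmas)] * len(sigmas)  # equal weights (assumption)
    if len(weights) != len(sigmas):
        raise ValueError("weights and sigmas must have equal length")
    img = np.asarray(gray, dtype=np.float64) + 1.0  # offset avoids log(0)
    log_img = np.log(img)
    msr = np.zeros_like(img)
    for w, sigma in zip(weights, sigmas):
        msr += w * (log_img - np.log(gaussian_blur(img, sigma)))
    # Stretch the result to the displayable 0..255 range.
    lo, hi = msr.min(), msr.max()
    if hi - lo < 1e-12:  # constant image: nothing to enhance
        return np.zeros(img.shape, dtype=np.uint8)
    return ((msr - lo) / (hi - lo) * 255.0).astype(np.uint8)
```

The blur above is deliberately simple and becomes slow for large images and large scales; in practice an optimized Gaussian filter such as OpenCV's `cv2.GaussianBlur` would normally replace `gaussian_blur`.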

In the industrial field test, 120 large foreign objects and 40 coal samples with different areas and colors are prepared to evaluate the proposed method, as shown in Fig. 19. The test samples are divided into four groups: group A for stone, group B for iron, group C for wood and group D for coal. All of these groups are mixed together and placed on the belt conveyor for online detection using the proposed machine vision method. Before the online detection, offline training is performed to build the MLP model by extracting the energy and contrast texture features from the offline training images of the belt conveyor. Then, the proposed method is applied in real time to the belt conveyor images acquired by the camera array.

Fig. 19 Large foreign objects
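The energy and contrast texture features mentioned above can be illustrated with a short sketch. It assumes that both features are computed from a gray-level co-occurrence matrix (GLCM), which this section does not state explicitly, and it defines energy as the sum of squared GLCM entries (some libraries instead report the square root of that sum). The function name and the pixel offset are hypothetical.

```python
from typing import List, Tuple


def glcm_energy_contrast(
    image: List[List[int]], levels: int, dx: int = 1, dy: int = 0
) -> Tuple[float, float]:
    """Return (energy, contrast) of a symmetric, normalized GLCM.

    image  : 2-D list of integer gray levels in the range [0, levels)
    dx, dy : pixel offset defining which neighbor pairs are counted
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                # Count the pair in both directions so the matrix is symmetric.
                counts[i][j] += 1
                counts[j][i] += 1
                total += 2
    if total == 0:
        raise ValueError("image is too small for the given offset")

    energy = 0.0
    contrast = 0.0
    for i in range(levels):
        for j in range(levels):
            p = counts[i][j] / total      # co-occurrence probability
            energy += p * p               # large for uniform textures
            contrast += (i - j) ** 2 * p  # large for strong local variation
    return energy, contrast


# A uniform patch has maximal energy and zero contrast.
uniform_features = glcm_energy_contrast([[0, 0], [0, 0]], levels=2)
```

Under these assumptions, the resulting two-element feature vector per image region is the kind of input that would be passed to the MLP classifier.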

The online test was repeated several times, and the detection results demonstrate that the proposed machine vision method can recognize large foreign objects on the coal belt conveyor line in real time. As shown in Table 6, the screening accuracy is more than 95% and the classification accuracy is more than 95.5%, which meets the industrial detection requirements.

Table 6 Industrial field test results

Conclusions

In this study, a new machine vision method is proposed for online detection of large foreign objects on the coal belt conveyor line. An improved MSR with adaptive weights is proposed for image enhancement. An improved normalized cross-correlation template matching (NCC-TM) algorithm is proposed to identify foreign objects effectively, and appropriate pyramid layers are selected to improve the recognition speed. The recognition accuracy of the algorithm for large foreign objects exceeds 97%. The energy and contrast texture parameters are selected as the input vector of the MLP classification algorithm, and the classification accuracy reaches 98.7%. Industrial experiments were also carried out to verify the effectiveness of the proposed method, and the results indicate that the proposed machine vision method can be applied in industrial practice. However, making the foreign object detection template in this paper still relies on experienced personnel. Therefore, how to design the foreign object detection template in a more systematic way requires further research.