1 Introduction

Industrial loading, which aims to load materials precisely and quantitatively, is widely used in mining, transportation, and other fields. However, when target loading is achieved with a conventional manual programmable-logic-controller system, the real-time loading parameters, including the truck speed and the chute flow (shown in Fig. 1), are usually predicted and adjusted from manual experience. Operators then stop the truck and replenish the underloaded amount according to the actual-target error. This process often leads to unbalanced loading, economic losses (e.g., about 10% of the annual cost of coal mining enterprises in China), and even railway accidents. Thus, removing the manual interference from the prediction of multi-adjustment values has long been a challenge in the industrial loading field. How to precisely obtain multi-adjustment values for balanced loading from historical experience is the critical issue explored in this paper.

Fig. 1
figure 1

The diagram of multi-adjustment parameters

Nowadays, the prediction of industrial time-series targets has been advanced by many innovative learning methods (e.g., deep learning models [1]). In particular, hybrid deep learning, which integrates the advantages of individual learners, has become a significant focus for improving model generalization in industrial fields [2,3,4]. Ensemble-based learning models have been proposed for single-target prediction [5,6,7]. For example, Li et al. [5] proposed a long short-term memory recurrent neural network to predict short-term power load. Zhou et al. [6] and Ren et al. [7] provide industrial prediction methods based on the convolutional neural network (CNN) and the long short-term memory network (LSTM). However, these models are limited in learning the hidden and temporal correlations of collaborative features. Namely, their extraction and forecasting abilities are weak in the industrial loading field because of the loss of prior historical knowledge and their small receptive fields for time-series data. To pursue high extraction capability and efficient prediction performance, some researchers have explored combinations of neural networks and machine learning methods [8,9,10]. In particular, the convolutional neural network and light gradient-boosting decision trees have provided sound feature extraction and regression effects. The convolutional neural network relies on a layer-by-layer processing mechanism to learn sequential features from raw data [11, 12], and the expansive decision-tree methods adopt gradient descent to accelerate convergence [13]. However, such hybrid models cannot efficiently capture long-distance features because weight-feedback adjustment becomes slow with many deep network layers. In a word, their application to collaborative feature extraction is relatively limited.

Based on the above analysis, the temporal convolutional network (TCN), an expansive convolutional neural network with dilated causal convolution layers, has been proposed to achieve a wide receptive field [14]. The method integrates the advantages of the parallel distributed extraction of the convolutional neural network and the temporal regression of the recurrent neural network [15], making it suitable for parallel and dynamic nonlinear feature extraction. However, because the multi-adjustment values of industrial loading can be positive or negative, its application to collaborative feature prediction is relatively weak. In addition, gradient-boosting decision tree (GBDT) algorithms have become popular because of their distributed and fast processing of massive data [16, 17]. Among them, the gradient boosting machine (GBM) subsamples the low-gradient data to reduce time and space overhead, which is an advantage when predicting single targets with positive and negative values but a shortcoming for multi-objective tasks. Thus, it is not easy to accurately predict multi-adjustment values for industrial balanced loading.

To accurately predict multi-adjustment values for balanced loading, this paper proposes a joint learning model (CTCN-LightGBM) based on the composited-residual-block TCN (CTCN) and the Light-GBM. The novelty of the work is that the CTCN-LightGBM integrates a wide receptive field and dimensionality reduction convolution in the CTCN with negative-gradient ensemble learners in the parallel Light-GBM. The model improves predictive accuracy through auxiliary branches and optimizes the data-regression performance of the expansive GBDT. We also provide a feature re-enlargement (FR) method that reconstructs the collaborative feature matrix with the original features to improve the extraction ability of the CTCN. Experimental results show that the CTCN-LightGBM model achieves significant and reasonable improvements over other contrast models in the industrial loading field. The main contributions of the paper are as follows:

  1.

    We extract collaborative features through a composited residual block in the TCN, replacing the 1 × 1 convolutional shortcut with a side-road dimensionality reduction convolutional branch. The branch can acquire auxiliary features to improve the generalization ability and preserve the sign characteristics of multi-adjustment values.

  2.

    The feature re-enlargement method (FR method) is proposed to enlarge the extraction accuracy of the CTCN. We process the original features with the extracted speed-flow element ratios and integrate them with the collaborative feature matrix extracted by the CTCN. Further, the reconstructed feature matrix will be used as the input of the Light-GBM for predicting accurate multi-adjustment values (i.e., the truck speed and chute flow).

  3.

    This paper is academic research driven by actual industrial demands: loading parameters must be adjusted in real engineering scenarios to achieve target loading, and only by accurately predicting the multi-adjustment values can a balanced loading plan be made. The CTCN-LightGBM model effectively meets these practical industrial demands and is of essential significance.

The remainder of the paper is organized as follows: Sect. 2 overviews the related work of hybrid learning models for industrial target prediction. Section 3 proposes the structure of the CTCN-LightGBM model. Section 4 gives some experimental results and theoretical analysis. Finally, the conclusion and future work are given in Sect. 5.

2 Related Work

We review the related research work in two main areas in this paper, including the industrial hybrid model via neural networks and the optimized gradient method via decision trees.

2.1 The Industrial Hybrid Model via Neural Networks

The industrial hybrid model via neural networks has proven successful for parameter forecasting [6, 7, 18, 19] and target detection [20,21,22] in related industrial fields. For example, Li et al. [18] propose a deep learning algorithm composed of long short-term memory and fully connected (FC) layers to predict photovoltaic power generation; because of the simple structure of the FC layer, however, the hidden distribution of features cannot be efficiently exploited for data prediction. Geng et al. [19] propose a novel gated-convolutional-neural-network-based transformer for dynamic soft-sensor modeling of industrial processes, which can adaptively filter the essential features. Further, Zhou et al. [6] provide a hybrid model to improve the load-decomposition accuracy of electrical equipment. Ding et al. [7] propose a model based on convolutional neural networks and a gate recurrent unit to identify rough-stored express deliveries intelligently. Xia et al. [20] and Qiang [21] propose deep neural networks for industrial control. Siegel [22] proposes an anomaly detection mechanism based on the convolutional neural network and the generative adversarial network for industrial equipment. However, these heterogeneous neural networks are weak at extracting and predicting multi-adjustment values in industrial loading.

2.2 The Optimized Gradient Method via the Decision Trees

The light gradient boosting decision tree and expansive models are adopted to achieve precise regression/classification [23, 24]. Zhang et al. [23] propose a gradient-boosting decision tree-based fault prediction tool for cyber-physical production systems. The online test results prove that the model has high prediction accuracy. Yan and Wen [24] propose a light gradient boosting machine to detect power theft from power companies. However, the learning ability of these single decision tree models is insufficient to process the multi-distribution features. Nakamura et al. [25] use a hybrid model based on the bidirectional long short-term memory and the gradient-boosted decision tree for the binary classification of radiology reports. Lu et al. [26] integrate the long short-term memory with the gradient boosting machine to predict end-to-end inferences. Dan et al. [27] combine a convolutional neural network with the gradient-boosting decision tree for temperature prediction. Also, Ju et al. [28] propose a convolutional neural network and light-GBM model to predict wind power. Due to the limitation of the receptive field, these models have a poor learning effect on temporal feature relationships. Y. Wang et al. [29] propose a short-term load forecasting model based on the temporal convolutional network and the gradient boosting machine for industrial customers. Experiments show that the TCN-LightGBM model can predict electrical loads in multiple industrial scenarios.

However, existing hybrid models rarely address, and are ill-suited to, collaborative feature extraction in the industrial loading field. Thus, this paper explores the CTCN-LightGBM model to achieve accurate multi-adjustment value prediction.

3 Structure of the CTCN-LightGBM Model

The CTCN-LightGBM prediction model consists of three parts: data preprocessing and normalization, feature extraction based on the CTCN, and the Light-GBM prediction. The detailed process of the CTCN-LightGBM model is shown in Fig. 2.

Fig. 2
figure 2

The detailed process of the proposed CTCN-LightGBM

3.1 The Data Preprocessing and Normalization

The dataset features consist of speed-related features (i.e., Feature_1), flow-related features (i.e., Feature_2), and labels. The raw dataset usually contains missing/abnormal instances because of manual experience inference and recording accuracy errors. We adopt the data processing methods listed in Table 1 to deal with this problem. In particular, we use the unit-adjustment value (i.e., \(\Delta V,\Delta Q = 0.0001\)) to replace zero values in actual instances, which improves the data accuracy while conforming to industrial conditions. In addition, we set the data selection deviations according to actual industrial requirements in Table 2, which ensures the prediction effect and uniformly regulates the loading target standards.

Table 1 The raw dataset processing methods
Table 2 The data selection deviations of the actual loading requirements
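As a concrete illustration, the zero-replacement and deviation-selection steps might be sketched as below in numpy. The threshold-based filter is a hypothetical stand-in: the exact selection rules live in Tables 1 and 2, so treat the function names and thresholds as assumptions.

```python
import numpy as np

UNIT_ADJ = 1e-4  # unit-adjustment value for zero Delta V / Delta Q (Sect. 3.1)

def replace_zero_adjustments(labels):
    """Replace exact-zero adjustment values with the unit-adjustment value,
    keeping the labels consistent with actual industrial conditions."""
    labels = np.asarray(labels, dtype=float).copy()
    labels[labels == 0.0] = UNIT_ADJ
    return labels

def filter_by_deviation(values, target, max_dev):
    """Drop instances whose actual-target deviation exceeds a selection
    threshold (a hypothetical stand-in for the Table 2 rules)."""
    values = np.asarray(values, dtype=float)
    return values[np.abs(values - target) <= max_dev]
```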

Further, we use the MinMaxScaler function to preprocess raw inputs \(X = [L,T,V,M,C,H,Q]\) (e.g.,\(L:0.118\,{\text{m}},T:2.03\,{\text{s}},V:0.061\,{\text{m/s}},M:0.579\,{\text{t}},C:0.279\,{\text{t/s}},H: - 0.021\,{\text{m}},Q:0.186\,{\text{t/s}}\)). Also, the MaxAbsScaler function is utilized to normalize the labels \([\Delta V,\Delta Q]\) (e.g., \(\Delta V: \, - 0.002\,{\text{m/s}}\) and \(\Delta Q: - 0.015\,{\text{t/s}}\)). These methods can improve the accuracy of feature extraction and accelerate the speed of gradient descent. Formula (1) and (2) describe normalized operations of features and labels.

$$X^{\prime}_i = \frac{{\hat{X}_i - X_i .\min ({\text{axis}} = 0)}}{{X_i .\max ({\text{axis}} = 0) - X_i .\min ({\text{axis}} = 0)}}$$
(1)
$$O_{Zi} = \frac{{\hat{O}_i }}{{\max ({\text{abs}}(O_i ),{\text{axis}} = 0)}}$$
(2)

where \(X_i\) is the ith column vector of the raw feature input \(X\), \(X^{\prime}_i\) is the normalized version of \(X_i\), \(X_i .\max ( \cdot )\) and \(X_i .\min ( \cdot )\) are the maximum and minimum values of the ith column vector, and \(\hat{X}_i\) is an element of the ith column vector to be normalized. \(O_i\) is the ith column vector of the raw label input \(O\), \(O_{Zi}\) is the normalized version of \(O_i\), and \({\text{abs}}( \cdot )\) denotes the element-wise absolute value.
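Formulas (1) and (2) can be written directly in numpy (they correspond to sklearn's MinMaxScaler and MaxAbsScaler, which the paper uses); a minimal sketch:

```python
import numpy as np

def min_max_scale(X):
    """Formula (1): column-wise min-max normalization of the feature matrix X."""
    mn, mx = X.min(axis=0), X.max(axis=0)
    return (X - mn) / (mx - mn)

def max_abs_scale(O):
    """Formula (2): column-wise max-abs normalization of the label matrix O;
    dividing by max(|O_i|) maps labels into [-1, 1] and preserves their signs."""
    return O / np.abs(O).max(axis=0)
```

Max-abs scaling is chosen for the labels precisely because the multi-adjustment values can be negative; min-max scaling would destroy the sign information.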

3.2 The Feature Extraction Based on the CTCN

In this section, we propose the feature extraction module based on the CTCN, and the details are as follows.

3.2.1 Dilated causal convolution

The dilated causal convolution of the CTCN is proposed to solve the problem of limited receptive fields in temporal-domain convolution. Interval sampling can be achieved with multiple dilated convolutional layers by changing the convolution kernel’s size or the expansion factor’s value. For the one-dimensional features \(X^{\prime} = (x^{\prime}_0 ,x^{\prime}_1 , \ldots ,x^{\prime}_t , \ldots ,x^{\prime}_T )\) and the kernel \(f = \{ 0,1, \ldots ,n - 1\}\), the dilated convolution operation \(H( \cdot )\) at each time step \(T\) is defined in Formula (3). Further, the final output \(F(X^{\prime})\) of the transformation branch is described in Formula (4).

$$H(T) = (X^{\prime} \ast_d f)(T) = \sum_{i = 0}^{n - 1} {f(i) \cdot x^{\prime}_{T - d \cdot i} } ,\quad i = 0,1,\ldots,n - 1$$
(3)
$$F(X^{\prime}) = \psi [H_1 (T),H_2 (T)]$$
(4)

where \(n\) is the kernel size, \(d\) is the dilation factor, and \(T - d \cdot i\) indexes the past direction. \(f(i)\) denotes the ith kernel weight. \(\psi [ \cdot ]\) is a series of transformation operations, including the dilated convolution, weight normalization, ReLU activation, and dropout layers.
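Formula (3) can be checked with a direct numpy sketch; zero-padding in the past direction is an assumption, since the paper does not state its padding scheme:

```python
import numpy as np

def dilated_causal_conv(x, f, d):
    """H(T) = sum_{i=0}^{n-1} f(i) * x[T - d*i]; indices before t = 0 contribute 0."""
    out = np.zeros(len(x))
    for T in range(len(x)):
        for i in range(len(f)):
            if T - d * i >= 0:          # causal: only look into the past
                out[T] += f[i] * x[T - d * i]
    return out
```

With kernel size n = 2 and dilation d = 2, each output sees inputs two steps apart, so stacking layers with d = 1, 2, 4, 8 (the factors used in Sect. 4) grows the receptive field exponentially.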

3.2.2 Composited dimensionality reduction convolution

The simple shortcut (i.e., the 1 × 1 convolutional layer) in residual blocks may prevent the TCN from generalizing well to collaborative features. In contrast, the hidden bottleneck layer of existing autoencoders reconstructs the raw data through reduced feature dimensions, preserving the necessary features and thereby improving the accuracy of feature extraction or prediction. Inspired by this method, and to reduce time consumption, we provide a side-road dimensionality reduction convolutional branch to replace the 1 × 1 convolutional shortcut in the residual block. The branch can easily extract auxiliary features and preserve the positive or negative characteristics of the labels. First, the one-dimensional features \(X^{\prime} = (x^{\prime}_0 ,x^{\prime}_1 , \ldots ,x^{\prime}_t , \ldots ,x^{\prime}_T )\) are processed by an initial 1 × 1 convolutional layer, which reduces the number of parameters and effectively lowers the computational load. Second, valuable features are extracted through a one-dimensional convolutional layer (e.g., with kernel size \(1 \times k\)) that reduces the feature dimension by a ratio \(b\), as described in Formula (5). This convolutional layer records additional features of the multi-adjustment values while eliminating some low-relevance features. Finally, we perform a linear projection (i.e., a 1 × 1 convolution) at the end of the branch to preserve the characteristics of the convolution branch. The batch normalization function improves the convergence speed during training, and the LeakyReLU function handles the negative values in the activation layer. The final output \(F^{\prime}(X^{\prime})\) of the side-road dimensionality reduction convolutional branch is described in Formula (6).

$$\begin{aligned} {\text{out}}(N,c_{{\text{out}}} ) = & {\mkern 1mu} {\text{bias}}(c_{{\text{out}}} ) + \sum_{k = 0}^{c_{{\text{in}}} - 1} {{\text{weight}}(c_{{\text{out}}} ,k)} \\ & \quad \times {\text{input}}(N,k),\quad c_{{\text{out}}} = b \cdot c \\ \end{aligned}$$
(5)

where \(N\) is the batch size, \(c_{{\text{out}}}\) is the number of convolution kernels (the output dimension), and \(c_{{\text{in}}}\) is the input dimension. \({\text{bias}}( \cdot )\) is the bias vector (e.g., \({\text{bias}} = 1\)), \(k\) is the channel index, and \(c\) is the input dimension of the next layer.

$$F^{\prime}(X^{\prime}) = \psi^{\prime}[{\text{out}}(N,c_{{\text{out}}} ),(X^{\prime})]$$
(6)

where \(\psi^{\prime}[ \cdot ]\) is a series of transformation operations, including the dimensionality reduction convolution, the batch normalization, and the activation layer.
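The side-road branch can be sketched in numpy as below (Formulas (5)–(6)). The weight shapes are illustrative assumptions, and batch normalization is omitted for brevity; this is a sketch of the data flow, not the authors' exact implementation:

```python
import numpy as np

def leaky_relu(x, alpha=0.3):           # alpha = 0.3 as in the experimental settings
    return np.where(x >= 0, x, alpha * x)

def conv1x1(X, W):                      # pointwise convolution: mixes channels only
    return X @ W                        # X: (T, c_in), W: (c_in, c_out)

def causal_conv1d(X, W):                # W: (k, c_in, c_out); zero-padded past
    T, c_in = X.shape
    k, _, c_out = W.shape
    Xp = np.vstack([np.zeros((k - 1, c_in)), X])
    return np.stack([sum(Xp[t + k - 1 - i] @ W[i] for i in range(k))
                     for t in range(T)])

def side_branch(X, W_in, W_red, W_out):
    """1x1 conv -> 1xk conv reducing channels by ratio b (Formula (5)) ->
    1x1 linear projection -> LeakyReLU (Formula (6))."""
    h = conv1x1(X, W_in)         # cut parameters before the reduction layer
    h = causal_conv1d(h, W_red)  # c_out = b * c_in: keep useful features, drop low-relevance ones
    h = conv1x1(h, W_out)        # project back so the branch can join the residual sum
    return leaky_relu(h)         # LeakyReLU keeps negative adjustment information
```

The LeakyReLU at the end is what lets the branch preserve the sign of negative multi-adjustment values, which a plain ReLU would clip to zero.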

3.2.3 Composited residual block

The conventional residual block [30] consists of a transformation branch and a shortcut, as shown in Fig. 3a, and the output \(X^{(l)}\) of the l-th residual block can be expressed as Formula (7). This paper proposes a composited residual block consisting of a dilated causal convolution branch and a composited dimensionality reduction convolution branch, as shown in Fig. 3b. The output \(X^{{\prime}(l)}\) of the l-th composited residual block can be expressed as Formula (8). When the residual connection operations are completed, we obtain a two-dimensional collaborative feature matrix as the output of the extraction module, as described in Formula (9).

$$X^{(l)} = \delta (F(X^{(l - 1)} ) + X^{(l - 1)} )$$
(7)
$$X^{{\prime}(l)} = \delta (F(X^{{\prime}(l - 1)} ) + F^{\prime}(X^{{\prime}(l - 1)} ))$$
(8)
$$\begin{aligned} Y & = H_{{\text{map}}} [X^{{\prime}(1)} ,X^{{\prime}(2)} , \ldots ,X^{{\prime}({\text{final)}}} ] \\ & \Rightarrow Y_V + Y_Q = (v_0 ,v_1 , \ldots ,v_T )^T + (q_0 ,q_1 , \ldots ,q_T )^T \\ \end{aligned}$$
(9)

where \(\delta\) represents the activation operation. \(F( \cdot )\) represents a series of convolutional exchange operations (e.g., dilated convolution, dropout, weight normalization). \(H_{{\text{map}}} [ \cdot ]\) is the feature map produced in residual blocks \(1,2, \ldots ,{\text{final}}\). \(Y_V = (v_0 ,v_1 ,\ldots,v_T )^T\) is the speed-related element of the output matrix Y. \(Y_Q = (q_0 ,q_1 , \ldots ,q_T )^T\) is the flow-related element of the output matrix Y.

Fig. 3
figure 3

The structure diagram of two residual blocks. a is the conventional residual block, b is the composited residual block (CRB) of the CTCN

Notably, some associated characteristics (e.g., the displacement \(L\) is a constant feature of 0.118 m, and both the \(V\)–\(T\) and \(Q\)–\(T\) pairs are associated with \(L\)) should play an important role in collaborative feature extraction. However, the collaborative extraction process of the CTCN module may ignore some of these inherent properties and emphasize the multi-adjustment relationship by changing the feature distribution. Thus, the feature re-enlargement method is proposed to reconstruct and enlarge the relational properties of the collaborative features from the original relations. The FR method improves the extraction ability of the CTCN and the accuracy of the parallel Light-GBM prediction module, and it further helps keep the sign features consistent between the collaborative and original features. We process the original features according to the speed-flow element ratios of the two-dimensional collaborative feature matrix. Formulas (10) and (11) express the speed-related and flow-related features derived from the original features, and the final reconstructed feature matrix \(Z = [V^{\prime},Q^{\prime}]/2\) is described in Formula (12).

$$Y^{\prime}_V = (x_0^v = \frac{v_0 }{{\sqrt {v_0^2 + q_0^2 } }} \times x_0 , \ldots ,x_T^v )^T$$
(10)
$$Y^{\prime}_Q = (x_0^q = \frac{q_0 }{{\sqrt {v_0^2 + q_0^2 } }} \times x_0 , \ldots ,x_T^q )^T$$
(11)
$$[V^{\prime},Q^{\prime}] = \left[ {\begin{array}{*{20}c} {(v^{\prime}_0 ,v^{\prime}_1 , \ldots ,v^{\prime}_T )} \\ {(q^{\prime}_0 ,q^{\prime}_1 , \ldots ,q^{\prime}_T )} \\ \end{array} } \right]^T = \left[ {\begin{array}{*{20}c} {Y_V + Y^{\prime}_V } \\ {Y_Q + Y^{\prime}_Q } \\ \end{array} } \right]^T$$
(12)
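The FR reconstruction (Formulas (10)–(12)) reduces to a few array operations. A numpy sketch, assuming x holds the original feature values aligned with the T + 1 time steps:

```python
import numpy as np

def feature_re_enlargement(Y_V, Y_Q, x):
    """Reconstruct the feature matrix Z from the collaborative features
    (Y_V, Y_Q) and the original features x via speed-flow element ratios."""
    norm = np.sqrt(Y_V**2 + Y_Q**2)
    Yp_V = (Y_V / norm) * x          # Formula (10): speed-related share of x
    Yp_Q = (Y_Q / norm) * x          # Formula (11): flow-related share of x
    return np.stack([Y_V + Yp_V, Y_Q + Yp_Q], axis=1) / 2   # Formula (12)
```

Because the projection coefficients carry the signs of Y_V and Y_Q, the re-enlarged features keep the same sign pattern as the collaborative features, which is the consistency property the FR method targets.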

3.3 The Light-GBM Optimized Prediction

The light gradient boosting machine (Light-GBM) is an upgraded gradient-boosting framework based on decision trees, which is widely applied to classification and regression tasks [31, 32]. First, the Light-GBM adopts a gradient-based one-side sampling method to exclude the lowest-gradient samples and calculate the information gain on the large-gradient samples. Second, we use a histogram algorithm to obtain optimal splitting points, and the leaf-wise strategy reduces unnecessary splitting overhead for lower-gain leaf nodes. Third, we set the maximum depth of all decision trees to prevent overfitting. Notably, the Light-GBM adopts the negative gradient (i.e., \(- g_t (x)\)) to determine the new function increment, as described in Formula (13), so the classical least-squares minimization can be used to simplify the objective function, as denoted in Formula (14).

$$g_t (x) = E_y \left[ {\frac{{\partial \Psi (y^{\prime},f(x))}}{\partial f(x)}|x} \right]_{f(x) = \hat{f}^{t - 1} (x)}$$
(13)
$$(\rho_t ,\theta_t ) = \mathop {\arg \min }\limits_{\rho ,\theta } \sum_{i = 1}^N {[ - g_t (x_i ) + \rho h(x_i ,\theta )]^2 }$$
(14)

where \(f\) is the functional model between input features and response outputs, \(\{ g_t (x_i )\}_{i = 1}^N\) is the gradient, \(\Psi (y^{\prime},f(x))\) is the specific loss function, and \(f(x) = \hat{f}^{t - 1} (x)\) is the (t-1)th function estimate, also called the boost. \(h(x,\theta )\) denotes a custom base-learner function, and \((\rho ,\theta )\) denotes the optimization parameters (i.e., the step size and the functional dependence parameters).
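For the squared loss, the negative gradient \(-g_t(x)\) is simply the residual, so each boosting round fits a base learner to the current residuals (Formula (14)). A minimal sketch with one-split stumps as hypothetical base learners (Light-GBM's trees, sampling, and histograms are omitted):

```python
import numpy as np

def fit_stump(x, r):
    """Least-squares one-split stump: a stand-in for the base learner h(x, theta)."""
    best_err, best = np.inf, None
    for s in np.unique(x)[:-1]:
        lv, rv = r[x <= s].mean(), r[x > s].mean()
        err = ((r - np.where(x <= s, lv, rv)) ** 2).sum()
        if err < best_err:
            best_err, best = err, (s, lv, rv)
    return best

def gbm_fit_predict(x, y, rounds=50, lr=0.1):
    """Gradient boosting for squared loss: -g_t(x) = y - f(x), fitted per round."""
    f = np.full(len(y), y.mean())           # initial weak learner L_0
    for _ in range(rounds):
        s, lv, rv = fit_stump(x, y - f)     # fit the negative gradient (residual)
        f = f + lr * np.where(x <= s, lv, rv)   # additive update with step size lr
    return f
```

Light-GBM adds the gradient-based one-side sampling and histogram splitting described above on top of exactly this additive scheme.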

Finally, we suppose that the input sequence of the samples after the gradient-based one-side sampling method is \([Z,O] = [(Z_1 ,O_1 ),(Z_2 ,O_2 ),\ldots,(Z_N ,O_N )]\), where \(O = (O_1 ,O_2 ,\ldots,O_N )\) denotes the actual values of the multi-adjustment instances. The final output function of the parallel Light-GBM prediction module is shown in Formula (15), and the prediction values of the multi-adjustment parameters (\(\hat{Z}_v ,\hat{Z}_q\)) are described in Formulas (16) and (17). Further, the algorithm of the CTCN-LightGBM is given as follows.

$$F_{{\text{final}}} (o_g ) = L_0 (z_g ) + \sum_{m = 1}^M {\left[ {\nu _m \cdot \sum_{j = 1}^J {c_{m,j} I(z_g \in \Re _{m,j} )} } \right]} ,(z_g ,o_g ) \in [Z,O]$$
(15)
$$\Rightarrow \hat{Z}_v = F_{{\text{tree}}0} (v) + F_{{\text{tree}}1} (v) + \cdots + F_{{\text{tree}}M} (v)$$
(16)
$$\Rightarrow \hat{Z}_q = F_{{\text{tree}}0} (q) + F_{{\text{tree}}1} (q) + \cdots + F_{{\text{tree}}M} (q)$$
(17)
figure a

where \(L_0 (z_g )\) is the value of the initial weak learner. \(F_m ( \cdot )\) denotes the output of the m-th decision tree. \((\nu_1 ,\nu_2 , \ldots ,\nu_M )\) are the weights of the trees, \(M\) is the number of trees, \(\Re_{m,j} ,j = 1,2, \ldots ,J\) denotes the leaf-node area of the m-th decision tree, and \(c_{m,j}\) is the output value of leaf node \(j\). Further, \(I = 1\) if \(z_g \in {\text{leaf}}_{m,j}\); otherwise \(I = 0\). \(F_{{\text{final}}} (o_g )\) is the final output of the Light-GBM module.

In Algorithm 1, steps 1–6 denote the initial dataset preparation, the feature matrix extraction, and the feature reconstruction process. Steps 7–12 denote the prediction of multi-adjustment values based on the parallel Light-GBM module.

4 Experiments

4.1 The Experimental Settings and Performance Metrics

This paper collects real loading datasets from two coal mines (i.e., Huaibei Mining Co., Ltd and Linhuan Mining Co., Ltd) in Anhui Province, China. We take the whole carriage as a single research object (i.e., comprising 117 loading-point instances) and collect 50 carriages’ instances from each coal mine (e.g., each carriage is loaded on average four times a month). The historical loading data and corresponding multi-adjustment values from Apr 1st, 2021, to Nov 1st, 2021 are used to carry out the experiments. The dataset is split into a training set and a testing set in the proportion 8:2. The experimental environment is Python 3.9 with the Keras library, an NVIDIA RTX 3090 GPU, an AMD R7-5800X CPU, and 32 GB of memory. Further, the CTCN-LightGBM model and other contrast models are studied in this paper. These include two kinds of models: the classical learning models (i.e., the Light-GBDT [23], the Light-GBM [24], and the TCN) and the hybrid learning models (i.e., the LSTM-CNN, the CNN-LSTM [6], the LSTM-LightGBM [26], the CNN-LightGBM [28], the TCN-LightGBM [29], and the CTCN-LightGBDT).

Because of the zero and extreme values in the training data, the mean absolute percentage error is not suitable as an evaluation criterion. Instead, we adopt the mean absolute error (MAE), the root mean square error (RMSE), and the determination coefficient (\(R^2\)) as evaluation metrics for model prediction.

$${\text{MAE}} = \frac{1}{W}\sum_{w = 1}^W {{|}Z_w - \hat{Z}_w {|}}$$
(18)
$${\text{RMSE}} = \sqrt {\frac{1}{W}\sum_{w = 1}^W {(Z_w - \hat{Z}_w )^2 } }$$
(19)
$$R^2 = 1 - \frac{{{\text{SSE}}}}{{{\text{SST}}}} = 1 - \frac{{\sum_{w = 1}^W {(\hat{Z}_w - Z_w )^2 } }}{{\sum_{w = 1}^W {(\overline{Z}_w - Z_w )^2 } }}$$
(20)

where \(W\) denotes the number of testing instances. \(Z_w\), \(\overline{Z}_w\), and \(\hat{Z}_w\) represent the actual, the average actual, and predicted multi-adjustment values in the w-th instance, respectively.
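Formulas (18)–(20) in numpy form, as a quick reference sketch:

```python
import numpy as np

def regression_metrics(z, z_hat):
    """MAE (18), RMSE (19), and R^2 (20) for actual z and predicted z_hat."""
    mae = np.abs(z - z_hat).mean()
    rmse = np.sqrt(((z - z_hat) ** 2).mean())
    r2 = 1.0 - ((z_hat - z) ** 2).sum() / ((z.mean() - z) ** 2).sum()
    return mae, rmse, r2
```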

Further, since the range of the normalized instances is [− 1, 1], the predicted and actual values may have different signs. We select the F1-Score and the area under the curve (AUC) as evaluation metrics for model classification.

$$F1 - {\text{Score}} = \frac{{2 \times {\text{precision}} \times {\text{recall}}}}{{{\text{precision}} + {\text{recall}}}}$$
(21)
$${\text{precision}} = \frac{{{\text{TP}}}}{{\text{TP + FP}}}$$
(22)
$${\text{recall}} = \frac{{{\text{TP}}}}{{\text{TP + FN}}}$$
(23)
$${\text{AUC}} = \frac{{\sum_{i \in {\text{positive}}\;{\text{Class}}} {{\text{rank}}_i - W^+ (1 + W^+ )/2} }}{W^+ \times W^- }$$
(24)

where \({\text{precision}}\) is the precision value and \({\text{recall}}\) is the recall value. \({\text{TP}}\) is the number of true-positive values classified by the model, \({\text{FP}}\) is the number of false-positive values, \({\text{FN}}\) is the number of false-negative values, and \({\text{TN}}\) is the number of true-negative values. \(W^+\) and \(W^-\) are the numbers of positive and negative examples. \({\text{rank}}_i\) is the serial number of the ith sample when the samples are arranged by ascending probability score.
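The rank-based AUC of Formula (24) can be computed directly; tie handling is omitted in this sketch for simplicity:

```python
import numpy as np

def auc_rank(scores, labels):
    """Formula (24): ranks are 1-based positions after sorting scores ascending;
    labels hold 1 for positive and 0 for negative examples."""
    scores, labels = np.asarray(scores), np.asarray(labels)
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)   # rank_i, ascending scores
    pos = labels == 1
    w_pos, w_neg = pos.sum(), (~pos).sum()         # W+ and W-
    return (ranks[pos].sum() - w_pos * (w_pos + 1) / 2) / (w_pos * w_neg)
```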

4.2 Comparison Results for the Feature Dimensionality Reduction Ratio of the CRB

In this experiment, we compare the CRB with different dimensionality reduction ratios to explore the feature extraction ability of the CTCN-LightGBM model. The dimensionality reduction ratios are [0.5, 0.25, 0.125, 2], and the dataset collected at Huaibei Mining Co., Ltd from Apr 1st, 2021, to July 1st, 2021 is used to run the simulations. We randomly select completed loading data as testing instances, and the data include 117 loading-point instances. The loss function is MSE, the evaluation metric is RMSE, and the parameter settings of the contrast modules are as follows:

  1.

    CTCN: The factors of the dilated causal convolution branch are [1, 2, 4, 8], the filters are 64/64/16/16, and the convolutional kernel size is 2. The LeakyReLU value is 0.3, and the batch normalization value is 0.9. Additionally, the chosen model optimizer is Adam, the gradient calculation function is MSE, the hidden layers are 32/16, the dropout value is 0.25, and the training epoch is 800.

  2.

    Light-GBM: The number of trees is 500, the maximum depth is 6, the model learning rate is 0.01, the bagging fraction is 0.5, the feature fraction is 0.9, and the boosting method is GBDT.
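The Light-GBM settings above might be expressed as the following parameter dictionary; the names follow the lightgbm Python API's native naming, and the regression objective is an assumption based on Sect. 3.3, so treat this as a sketch rather than the authors' exact configuration.

```python
# Hypothetical lightgbm parameter dictionary matching the settings above.
lgbm_params = {
    "boosting_type": "gbdt",      # the boosting method
    "num_iterations": 500,        # the number of trees
    "max_depth": 6,               # maximum tree depth, limits overfitting
    "learning_rate": 0.01,        # the model learning rate
    "bagging_fraction": 0.5,
    "feature_fraction": 0.9,
    "objective": "regression",    # assumption: squared-loss regression (Sect. 3.3)
}
```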

Figures 4a and 5a show the training loss of the extraction module with different dimensionality reduction ratios for the multi-adjustment parameters, and Figs. 4b and 5b show the corresponding fitted loss curves. Table 3 lists the detailed results for the extraction module with different dimensionality reduction ratios. Among them, the CTCN-LightGBM model with a dimensionality reduction ratio of b = 0.25 acquires the best extraction performance (e.g., the RMSE metrics are about \(0.107 \times 10^{ - 2} \,{\text{m/s}}\) and \(0.286 \times 10^{ - 2} \,{\text{t/s}}\), where \({\text{m/s}}\) is the speed unit and \({\text{t/s}}\) is the flow unit).

Fig. 4
figure 4

The training loss and related fitting curve of the extraction module for the truck speed

Fig. 5
figure 5

The training loss and related fitting curve of the extraction module for the chute flow

Table 3 The extraction results of different ratios

4.3 Ablation Experiments for the Extraction Ability of the CTCN

To explore the performance of the composited dimensionality reduction convolution branch in the CTCN, we compare the CTCN-LightGBM with the CTCN+-LightGBM (i.e., the CTCN-LightGBM without the composited dimensionality reduction convolution branch, but with a 1 × 1 shortcut), the CTCN++-LightGBM (i.e., the CTCN-LightGBM without the dimensionality reduction convolutional layer, as shown in Fig. 6a), and the CTCN+++-LightGBM (i.e., the CTCN-LightGBM without the one-dimensional convolutional layer, Fig. 6b). The dataset collected at Huaibei Mining Co., Ltd from Apr 1st, 2021, to July 1st, 2021 is used to run the simulations. Further, the completed data (including 117 loading points) are adopted to demonstrate the predictive ability of the contrast models intuitively. We test three randomly selected complete multi-adjustment records in the testing dataset to eliminate model contingency. The parameter settings of the contrast modules are as follows:

  1.

    CTCN: The factors of the dilated causal convolution branch are [1, 2, 4, 8], the filters are 64/64/16/16, and the convolutional kernel size is 2. The LeakyReLU value is 0.3, and the batch normalization value is 0.9. Additionally, the feature dimensionality reduction ratio is 0.25, the hidden layers are 32/16, and the training epoch is 800.

  2.

    Light-GBM: The number of trees is 500, the maximum depth is 6, the model learning rate is 0.01, the bagging fraction is 0.5, the feature fraction is 0.9, and the boosting method is GBDT.

Fig. 6
figure 6

The residual block structure of the CTCN++-LightGBM (a) and the CTCN+++-LightGBM (b)

Tables 4 and 5 show the ablation results for the prediction of speed-adjustment values, and Tables 6 and 7 show the ablation results for the prediction of flow-adjustment values. Among them, the modified CTCN modules (i.e., CTCN++ and CTCN+++) obtain better performance than the CTCN+-LightGBM for predicting the multi-adjustment values but are weaker than the CTCN-LightGBM model. Namely, the generalization and sign-preserving ability are improved (i.e., R2: 0.909/0.895 vs. 0.925/0.924) by adding the side-road dimensionality reduction convolutional branch to the CRBs. Further, without the one-dimensional convolutional layer, the performance of the CTCN+++-LightGBM is slightly poorer than that of the CTCN-LightGBM (i.e., R2: 0.918/0.915 vs. 0.925/0.924). This means that both the one-dimensional convolutional layer and the dimensionality reduction convolutional layer can emphasize related useful features, but the dimensionality reduction convolutional layer contributes more to generalization and sign retention; namely, it improves model performance through its dimension-reducing filters. Further, comparing the time consumption of the CTCN++-LightGBM with that of the CTCN+-LightGBM (i.e., 0.228 s vs. 0.226 s), we find that the added 1 × 1 convolutional layer reduces parameter computation and makes the auxiliary features effective while taking only slightly more time. In a word, with the help of the features extracted by the CRBs, the generalization and sign-retention ability of the CTCN-LightGBM is improved.

Table 4 Ablation results of each model for the truck speed adjustment
Table 5 Ablation classification results of each model for the truck speed adjustment
Table 6 Ablation results of each model for the chute flow adjustment
Table 7 Ablation classification results of each model for the chute flow adjustment

4.4 Comparison Results for the Extraction Ability of the FR Method

This experiment compares the equal-parameter TCN-LightGBM with a dimensionality reduction layer (denoted TCN*-LightGBM) and the CTCN-LightGBM without the FR method (denoted CTCN*-LightGBM) against the CTCN-LightGBM to verify the extraction ability of the FR method in the feature extraction module. In terms of parameters, the conventional residual block requires \((2k + 1) \cdot c^2\) parameters, whereas the CRB needs \([(2k + 1) + (k + 1) \times b] \cdot c^2\), where \(c\) denotes the feature dimension of the input and intermediate layers. Further, Fig. 7 shows the structure of the composited residual block for the TCN*, and the filter parameters of each dilated causal convolution layer are described in Table 8. The related dataset is the same as in Sect. 4.3. We test three randomly selected complete loading sequences of multi-adjustment values in the testing dataset to eliminate model contingency. The parameter settings of contrast modules are as follows:

  1. 1.

    TCN*: The temporal convolutional network is built by the Keras library. The dilated convolution factors of the temporal convolution network are [1, 2, 4, 8], the referenced filters are 64/64/16/16, and the convolutional kernel size is 2. The neuron number of the hidden layer is 16, the dropout value is 0.25, and the training epoch is 800.

  2. 2.

    CTCN: The factors of the dilated causal convolution branch are [1, 2, 4, 8], the filters are 64/64/16/16, and the convolutional kernel size is 2. The LeakyReLU value is 0.3, and the batch normalization value is 0.9. Additionally, the feature dimensionality reduction ratio is 0.25, the hidden layers are 32/16, and the training epoch is 800.

  3. 3.

    Light-GBM: The number of trees is 500, the maximum depth is 6, the model learning rate is 0.01, the bagging fraction is 0.5, the feature fraction is 0.9, and the boosting method is GBDT.
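The parameter counts quoted above can be made concrete with a short sketch. The formulas \((2k + 1) \cdot c^2\) and \([(2k + 1) + (k + 1) \times b] \cdot c^2\) are taken directly from the text; the reading of \(b\) as the number of side-branch layers is our assumption, since the text does not define it at this point, and the function names are ours.

```python
def conventional_block_params(k, c):
    """Parameters of a conventional residual block: (2k + 1) * c^2."""
    return (2 * k + 1) * c * c


def crb_params(k, c, b):
    """Parameters of a composited residual block: [(2k + 1) + (k + 1) * b] * c^2.

    b is assumed to count the side-road branch layers (not defined in the text).
    """
    return ((2 * k + 1) + (k + 1) * b) * c * c
```

For example, with the paper's kernel size k = 2 and a feature dimension c = 64, a conventional block costs 20,480 parameters while a CRB with b = 1 costs 32,768, i.e., the auxiliary branch adds \((k + 1) \cdot b \cdot c^2 = 12{,}288\) parameters.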

Fig. 7
figure 7

The structure of the composited residual block for the TCN*

Table 8 The filter parameters of dilated causal convolution layers

Tables 9 and 10 indicate the prediction and classification results of speed-adjustment values, and Tables 11 and 12 indicate those of flow-adjustment values. Among them, it is evident that the CTCN-LightGBM model achieves the best prediction performance (e.g., R2: 0.917/0.924, RMSE: 0.107/0.285, MAE: 0.077/0.208, F1-Score: 0.918/0.946, AUC: 0.906/0.935). Further, the CTCN-LightGBM model with the CRBs requires more computational time than the other models (e.g., about 0.23 s), but this is acceptable considering the improvement in model accuracy.

Table 9 Prediction results of each model for the truck speed adjustment
Table 10 Classification results of each model for the truck speed adjustment
Table 11 Prediction results of each model for the chute flow adjustment
Table 12 Classification results of each model for the chute flow adjustment

Figure 8a and b presents the actual prediction results of truck speed-adjustment values, and Fig. 9a and b presents the actual prediction results of chute flow-adjustment values. Among them, the CTCN-LightGBM obtains the best evaluation metrics (e.g., RMSE: 0.106/0.283, MAE: 0.071/0.203). Further, the TCN*-LightGBM is worse than the CTCN-LightGBM (e.g., R2: 0.908/0.912 vs. 0.921/0.929), and the CTCN-LightGBM likewise outperforms the CTCN*-LightGBM (e.g., R2: 0.921/0.929 vs. 0.899/0.909). Namely, the FR method improves the extraction ability of the CTCN module and helps the Light-GBM module acquire superior performance in predicting both sign characteristics (e.g., F1-Score: 0.945/0.961, AUC: 0.918/0.917) and values (e.g., R2: 0.921/0.929).
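The evaluation throughout this section pairs value metrics (R2, RMSE, MAE) with sign-classification metrics, since the sign of an adjustment value decides whether to speed up or slow down. A self-contained sketch of how such scores might be computed follows; this is pure Python for illustration, not the authors' evaluation code, and treats the positive sign as the positive class.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (rmse, mae, r2) for predicted adjustment values."""
    n = len(y_true)
    mean = sum(y_true) / n
    sse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    sst = sum((t - mean) ** 2 for t in y_true)
    rmse = math.sqrt(sse / n)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    r2 = 1.0 - sse / sst
    return rmse, mae, r2


def sign_f1(y_true, y_pred):
    """F1-score for recovering the sign (positive class) of each adjustment."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t > 0 and p > 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t <= 0 and p > 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t > 0 and p <= 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

A model can score well on RMSE yet flip signs near zero, which is why the paper reports F1-Score and AUC alongside the regression metrics.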

Fig. 8
figure 8

Actual prediction results for truck speed adjustment

Fig. 9
figure 9

Actual prediction results for chute flow adjustment

4.5 Prediction for Multi-Adjustment Values

This experiment compares the CTCN-LightGBM model with the listed models for multi-adjustment value prediction (i.e., truck speed and chute flow). We randomly select complete temporal loading data (i.e., 117 loading points) as the testing instances to evaluate the predictive effects of each contrast model.

  1. (1)

    Experiment-1: In this experiment, the historical loading data collected at Huaibei Mining Co., Ltd. from Apr 1st, 2021, to Jul 1st, 2021, are utilized to run simulations. The experimental environments and models are listed in Sect. 4.1. The parameter settings of contrast modules are as follows:

    1. (1)

      TCN: The temporal convolutional network is built by the Keras library. The dilated convolution factors of the temporal convolution network are [1, 2, 4, 8], the filters are 128/64/32/16, and the convolutional kernel size is 2. The neuron number of the hidden layer is 16, and the dropout value is 0.25.

    2. (2)

      CTCN: The factors of the dilated causal convolution branch are [1, 2, 4, 8], the filters are 128/64/32/16, and the convolutional kernel size is 2. The LeakyReLU value is 0.3, and the batch normalization value is 0.9. The dimensionality reduction ratio is 0.25, and the hidden layers are 32/16.

    3. (3)

      CNN: The convolutional neural network consists of convolutional layers and fully connected layers. The number of convolutional layers is 4, the filters of each convolutional layer are 128/64/32/16, and the kernel size of the filters is 2. The number of fully connected layers is 2, and the numbers of neurons are 16/1.

    4. (4)

      LSTM: The number of hidden layers is 4, and the hidden neurons are designed as 256/128/64/16. The fully connected layers are 2, and the number of neurons is 16/1.

    5. (5)

      Light-GBDT: The number of trees is 500, the maximum depth is 6, and the model learning rate is 0.01. The minimum sample split is 2, and the minimum sample leaf is 1.

    6. (6)

      Light-GBM: The number of trees is 500, the maximum depth is 6, the learning rate is 0.01, the bagging fraction is 0.5, the feature fraction is 0.9, and the boosting method is GBDT.
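The Light-GBM settings listed above map directly onto LightGBM's native parameter names (`bagging_fraction`, `feature_fraction`, and `boosting` are the library's own names; `num_iterations` corresponds to the number of trees). A configuration sketch, with the dict name ours:

```python
# Hyperparameters of the Light-GBM module as listed above, keyed by
# LightGBM's native parameter names.
lightgbm_params = {
    "boosting": "gbdt",        # boosting method
    "num_iterations": 500,     # number of trees
    "max_depth": 6,
    "learning_rate": 0.01,
    "bagging_fraction": 0.5,   # row subsampling per iteration
    "feature_fraction": 0.9,   # column subsampling per tree
}

# With the lightgbm package installed, a regressor could then be trained via
# lightgbm.train(lightgbm_params, lightgbm.Dataset(X, y)) or the equivalent
# sklearn-style wrapper.
```

The aggressive bagging fraction (0.5) paired with a small learning rate (0.01) over 500 trees is a common regularization trade-off for noisy industrial data.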

Tables 13 and 14 indicate that the CTCN-LightGBM model, with the help of the CTCN and the FR method, outperforms the other classical and hybrid models. For example, the R2 is 0.917/0.926, the RMSE is 0.109/0.285, the MAE is 0.082/0.206, the F1-Score is 0.946/0.964, and the AUC is 0.964/0.964. Among the hybrid models, the extraction ability of the expansive LSTM models is relatively poorer than that of the expansive CNN models (i.e., R2: 0.867/0.864 for the LSTM-LightGBM, 0.882/0.876 for the CNN-LightGBM, and 0.904/0.904 for the TCN-LightGBM). Further, the time consumption of the CTCN-LightGBM model, about 0.26 s, is longer than that of the other hybrid convolutional models (i.e., TCN-LightGBM, TCN-LightGBDT, CNN-LightGBM).

Table 13 The prediction and classification results of all models for the truck speed adjustment
Table 14 The prediction and classification results of all models for the chute flow adjustment

Figure 10a and b shows the absolute prediction errors of the expansive Light-GBM models. Among them, the CTCN-LightGBM model obtains lower errors than the other models, and its fluctuation trend is relatively stable. These results show that the CTCN-LightGBM model, based on the FR method and the CRBs, achieves high accuracy and is thus suitable for multi-adjustment value prediction in actual industrial loading.

  1. (2)

    Experiment-2: In this experiment, the historical loading data collected at Linhuan Mining Co., Ltd. from Aug 1st, 2021, to Nov 1st, 2021, are utilized to run simulations. The experimental environments and models are listed in Sect. 4.1. The parameter settings of contrast modules are as follows:

    1. (1)

      TCN: The temporal convolutional network is built by the Keras library. The dilated convolution factors of the temporal convolution network are [1, 2, 4, 8], the filters are 256/64/32/16, and the convolutional kernel size is 3. The hidden layers are 16/1, and the dropout value is 0.25.

    2. (2)

      CTCN: The factors of the dilated causal convolution branch are [1, 2, 4, 8], the filters are 256/64/32/16, and the convolutional kernel size is 3. The LeakyReLU value is 0.3, and the batch normalization value is 0.9. The feature dimensionality reduction ratio is 0.25, and the hidden layers are 16/1.

    3. (3)

      CNN: The convolutional neural network consists of convolutional layers and fully connected layers. The number of convolutional layers is 4, the filters of each convolutional layer are 256/64/32/16, and the kernel size of the filters is 3. The number of fully connected layers is 2, and the numbers of neurons are 16/1.

    4. (4)

      LSTM: The number of hidden layers is 4, and the hidden neurons are designed as 128/64/64/32. The fully connected layers are 2, and the number of neurons is 16/1.

    5. (5)

      Light-GBDT: The number of trees is 500, the maximum depth is 6, and the model learning rate is 0.01. The minimum sample split is 2, and the minimum sample leaf is 1.

    6. (6)

      Light-GBM: The number of trees is 500, the maximum depth is 6, the model learning rate is 0.01, the bagging fraction is 0.5, the feature fraction is 0.9, and the boosting method is GBDT.

Fig. 10
figure 10

a Absolute error values of actual prediction results for truck speed adjustment. b Absolute error values of actual prediction results for chute flow adjustment

Similar to Experiment 1, Tables 15 and 16 indicate that the CTCN-LightGBM model outperforms the other contrast models. The CTCN-LightGBM model can distinguish and predict the positive and negative adjustment values with the highest evaluation scores (i.e., R2: 0.921/0.929, RMSE: 0.107/0.284, MAE: 0.077/0.200, F1-Score: 0.947/0.967, and AUC: 0.952/0.958). Further, the CTCN module of the CTCN-LightGBM model can accurately extract collaborative features and fit them well with the machine learning module, and the expansive GBDT prediction module can improve the prediction performance for multi-adjustment values. Because each module of the CTCN-LightGBM model has more parameters, the complexity and computational time are slightly increased (i.e., about 0.27 s), but this is acceptable for industrial loading applications.

Table 15 The prediction and classification results of all models for the truck speed adjustment
Table 16 The prediction and classification results of all models for the chute flow adjustment

Figure 11a and b shows the expansive Light-GBM models' absolute prediction errors and related trend distributions. Among them, the CTCN-LightGBM model precisely matches the truck speed and chute flow values, and the fluctuation of its absolute prediction errors is relatively stable. Thus, it can be well applied to forecasting multi-adjustment values in industrial loading.

Fig. 11
figure 11

a Absolute error values of actual prediction results for truck speed adjustment. b Absolute error values of actual prediction results for chute flow adjustment

4.6 Discussion and Analysis

This paper compares the proposed CTCN-LightGBM with other models to demonstrate its superior prediction effects. Some intuitive results and theoretical analyses follow:

  1. 1.

    In classical learning models, the Light-GBDT and Light-GBM better fit the actual prediction targets, whether positive or negative values. The computational times of the expansive Light-GBDT models are significantly less than that of the TCN. Theoretically, the reason is that ensemble learners (i.e., decision trees) with negative-gradient fitting can decrease the loss along the gradient direction. Further, the gradient-based one-side sampling method and the histogram algorithm of the Light-GBM reduce the data size, preserve the basic features, and accelerate convergence. Also, the leaf-wise strategy with depth limitation plays an essential role in avoiding overfitting.

  2. 2.

    In hybrid learning models, the extraction performances of models via the LSTM are worse than those of models via the expansive CNN. Because of the long time span and nonlinear feature distribution, the LSTM is not suitable for extracting hidden collaborative relationships. Further, the dilated convolution and residual blocks of the TCN obtain a wider receptive field than the CNN, as shown in Table 17, while the limitations of the CNN lead to poor performance when capturing temporal information. Formulas (25) and (26) give the receptive field sizes of convolutional layers and dilated residual blocks, respectively.

    $$r_c = r_{c - 1} + \left[ (k_c - 1) \cdot \prod_{l = 1}^{c - 1} s_l \right]$$
    (25)
    $$\omega_l = 1 + \sum_{i = 0}^{l - 1} (k_l - 1) \cdot \gamma^i = 1 + (k_l - 1) \cdot \frac{\gamma^l - 1}{\gamma - 1}$$
    (26)

    where \(r_c\) is the receptive field size of the c-th convolutional layer, \(k_c\) is the kernel size of the c-th convolutional or pooling layer, and \(\prod s_l\) is the product of the convolutional strides of the previous (c-1) layers. Also, \(\omega_l\) denotes the receptive field size of the l-th dilated residual layer, \(k_l\) denotes the kernel size of the l-th layer, and \(\gamma\) denotes the dilation factor (i.e., \(\gamma = 2\)).

  3. 3.

    The CTCN-LightGBM model, which integrates the strengths of the CTCN and the Light-GBM, achieves the best forecasting effect among all models. The receptive fields of the proposed CTCN-LightGBM model are significantly wider than those of the TCN-LightGBM, which improves feature extraction ability. Namely, the CRB adopts a side-road dimensionality reduction convolutional branch to replace the 1 × 1 convolutional shortcut in the conventional residual block. Further, the CTCN-LightGBM model requires more time due to the auxiliary branch parameters, but predictive performance should come first in industrial loading. In addition, the FR method reduces the abnormal loss and improves the Light-GBM module's prediction effect by reconstructing and enlarging the hidden relationships of the collaborative feature matrix.

  4. 4.

    Training time complexity of all contrast models: based on the above models (e.g., LSTM, CNN, and Light-GBM), the training time complexities of the hybrid learning models are given in Tables 18 and 19, where B is the number of training dataset instances, D is the feature dimension, k is the kernel size, n is the number of branch points, and N is the number of convolutional layers. \(T_1\), \(T_2\), and \(T\) are the time complexities of the dimension reshaping between two single models. \(a\%\) is the selected top \(a \times 100\%\) of the data and \(b\%\) is the randomly selected \(b \times 100\%\) of the data in different data subsets. \(N_1\) is the number of cells in the LSTM layer, and \(N_2\) is the number of convolutional layers in the CRBs. \(depth\_max\) is the maximum depth of the decision trees. Among them, the CNN-LSTM and LSTM-CNN models cost more time than the other hybrid learning models because these two neural networks need to reshape and match the feature sizes by increasing the channels, whereas the other hybrid models based on the Light-GBM reduce a feature channel with decision tree algorithms to obtain a low training time complexity.
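Formulas (25) and (26) above can be checked numerically. The sketch below is a direct transcription of the two formulas (pure Python; the function names are ours), with formula (26) using the closed form of the geometric sum:

```python
def conv_receptive_field(kernels, strides):
    """Formula (25): r_c = r_{c-1} + (k_c - 1) * prod(s_1 .. s_{c-1}).

    kernels/strides list the kernel size and stride of each layer in order;
    the receptive field of the input itself is 1.
    """
    r, stride_prod = 1, 1
    for k, s in zip(kernels, strides):
        r += (k - 1) * stride_prod
        stride_prod *= s
    return r


def dilated_receptive_field(k, gamma, layers):
    """Formula (26): w_l = 1 + (k_l - 1) * (gamma^l - 1) / (gamma - 1)."""
    return 1 + (k - 1) * (gamma ** layers - 1) // (gamma - 1)
```

With the paper's kernel size k = 2 and dilation base γ = 2, four dilated residual layers reach a receptive field of 16 samples, whereas four plain stride-1 convolutions of the same kernel size reach only 5, which is the quantitative basis of the Table 17 comparison.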

Table 17 Receptive field size of the CNN, the TCN, and the STRB
Table 18 Comparison of the model’s time complexity
Table 19 Abbreviations and meanings

5 Conclusion

This paper proposes the CTCN-LightGBM model, which combines the CTCN and a parallel Light-GBM, to accurately predict real-time loading values for balanced industrial loading. The composited residual blocks in the CTCN effectively extract the collaborative features of multi-adjustment values. We then utilize the FR method to reconstruct the collaborative features extracted by the CTCN and enlarge the related properties for better multi-target prediction, and we adopt the reconstructed feature matrix as the input of the parallel Light-GBM to accurately predict multi-adjustment values. Experiments show that the CTCN-LightGBM model significantly outperforms the contrast models in predicting industrial loading parameters. However, some problems remain unsolved; for example, the computational complexity of the proposed method increases with the number of composited residual blocks. In the future, we will explore optimizing the structure of the composited residual blocks in the CTCN to reduce time consumption and apply the model to more related industrial fields.