Abstract
Process optimization and fault diagnosis technology, represented by production process monitoring, design and adjustment of production conditions, plays an important role in the modern petroleum industry. An accurate inventory reconciliation model is the basis of process optimization and fault diagnosis. To eliminate the inventory reconciliation error caused by different metering systems, an error prediction method for inventory reconciliation during the storage and transportation process is proposed, based on partial least squares (PLS) and a least squares support vector machine optimized by a modified fruit fly optimization algorithm (MFOA-LSSVM). The general flow of the error prediction method is provided, and the principles of PLS and MFOA-LSSVM are elaborated in detail. First, PLS is used to exclude the interference of unrelated factors and extract the factors most relevant to the inventory reconciliation error. Then a modified three-dimensional fruit fly optimization algorithm with diminishing steps and good global search capability is adopted to select the LSSVM model parameters and build the error prediction model. Finally, the sample data are corrected using the predicted values to verify the validity of the proposed method. Experimental modeling was carried out with PLS and MFOA-LSSVM. Compared with other forecasting methods, this method is faster to compute and predicts the reconciliation error well.
Introduction
An accurate inventory reconciliation model plays an important role in improving fault diagnosis capability and in reducing both the misdiagnosis rate and the missed-diagnosis rate during the storage and transportation process (Chen et al. 2010; United States Environmental Protection Agency 1995). The inventory reconciliation error is related to source measurements, terminal measurements and process loss. In theory, it follows a normal distribution with zero mean. Different measurement systems, however, use different instruments based on various principles, and these instruments differ in accuracy, nonlinearity, zero drift and other characteristics, which are in turn affected by the internal and external environment. Because the sources and terminals are measured by different methods, and because of the influence of process technology, process parameters, process loss and environmental changes, the inventory reconciliation error in practice follows a normal distribution whose mean is not zero. It is difficult to establish a precise error prediction model of inventory reconciliation from first principles.
In recent years, machine learning methods such as neural networks and support vector machines have been widely used for instrument error modeling, forecasting and other tasks (Austina et al. 2013; Wang et al. 2015). He et al. (2014) used a GM (1, N) system optimized by a neural network to correct the nonlinear error of a sensor, so that the corrected sensor had the desired input and output characteristics. Peng et al. (2013) used a BP neural network optimized by a genetic algorithm to solve the problem of sensor temperature compensation. Pang et al. (2012) improved the performance of fluxgate magnetometers by using an RBF neural network to establish a compensation model for the bias and scale factors. Ye et al. (2016) and Zhang et al. (2012) used LSSVM to predict and compensate the temperature error of instruments, and explored parameter optimization for LSSVM. However, these studies addressed error compensation or estimation for a single sensor only and did not consider the error between different metering systems. How to establish an error prediction model that makes full use of the collected and stored process data has become an important research topic for the storage and transportation process.
To address the inventory reconciliation error caused by different metering systems, this paper proposes an error prediction method for inventory reconciliation during the storage and transportation process based on PLS and MFOA-LSSVM. The method first uses PLS to extract the key factors of the inventory reconciliation model equations. Then, to avoid falling into a local optimum, an LSSVM optimized by MFOA is adopted for modeling. Finally, the validity of the method is verified by an oil storage and transportation experiment on an advanced process control experimental platform. To simplify the calculation, the error prediction model is established for a pump frequency of 42.5 Hz.
The basic principles of prediction method
The flow of prediction method
The general error prediction flow of inventory reconciliation is shown in Fig. 1:
First, analyze the source and terminal metering systems, establish the formula for the inventory reconciliation error under the dynamic steady-state condition of the storage and transportation process, and collect experimental data for the key factors. Then eliminate gross errors from the data using the \(3\sigma\) rule and normalize the data, so that the trained LSSVM model is not affected by gross errors or by the different dimensions of the variables. Next, use PLS to extract the principal components of the independent variables that are relevant to the dependent variable. This step also eliminates redundant or irrelevant data and reduces the amount of input data. After that, use the training data and MFOA to find the optimum parameter values of the LSSVM and establish the LSSVM prediction model. Finally, feed the test data into the prediction model to obtain the predicted error.
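The preprocessing stage of this flow can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the \(3\sigma\) rule is applied to the output variable alone, and "normalize" is taken to mean z-score standardization of each input column. The function names `remove_gross_errors` and `standardize` are hypothetical.

```python
import numpy as np

def remove_gross_errors(X, y, k=3.0):
    """Drop samples whose output lies more than k standard deviations
    from the mean (assumption: the rule is applied to y only)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    keep = np.abs(y - y.mean()) <= k * y.std()
    return X[keep], y[keep]

def standardize(X):
    """Z-score each column; constant columns are left at zero
    (assumption: this is the normalization that is meant)."""
    X = np.asarray(X, dtype=float)
    mu = X.mean(axis=0)
    sd = X.std(axis=0)
    sd = np.where(sd > 0, sd, 1.0)
    return (X - mu) / sd
```

A typical call would be `X_clean, y_clean = remove_gross_errors(X, y)` followed by `Z = standardize(X_clean)`, after which `Z` has comparable scale in every column.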
The error of inventory reconciliation and influential factors
According to the source and terminal of storage and transportation process, the error of inventory reconciliation can be established as formula (1).
where \(V_{\text{e}}\) is the error of inventory reconciliation, \(V_{{{\text{loss}}ij}}\) is the process loss from the \(j\)th source to \(i\)th terminal, \(V_{{{\text{T}}i}}\) is the amount of media changed in the \(i\)th terminal, \(V_{{{\text{S}}j}}\) is the amount of media changed in the \(j\)th source, and n and m are the number of sources and terminals, respectively.
\(V_{\text{e}}\) follows a normal distribution with mean \(\mu\). The value of \(\mu\) depends on the measurement accuracy of \(V_{{{\text{loss}}ij}}\), \(V_{{{\text{T}}i}}\) and \(V_{{{\text{S}}j}}\). The measurement accuracy of \(V_{{{\text{loss}}ij}}\) is related to pipe diameter, length, friction, medium velocity and other factors. The terminals and sources can be measured by level gauges, pressure sensors and flow meters. The measurement accuracy of \(V_{{{\text{T}}i}}\) and \(V_{{{\text{S}}j}}\) is affected by the internal mechanical structure of the metering instruments and by the external environment, and it is related to medium temperature, ambient temperature, level, flow and other factors.
Feature extraction by PLS
Assume that influential factors (independent variables) are \(X \in R^{n \times p}\) and the inventory reconciliation error (dependent variable) is \(Y \in R^{n \times 1}\).
where n is the number of samples and \(p\) is the number of influential factors.
To obtain principal components \(t_{i}\) (\(i = 1,2, \ldots ,q\), where \(q\) is the number of principal components) that both represent the independent variables and explain the dependent variable as much as possible, a nonlinear iterative algorithm is used (You et al. 2013). The specific steps are as follows:
- Step 1: Center \(X\) and \(Y\), and denote the centered results by \(E_{0}\) and \(F_{0}\). Let \(i = 1\).
- Step 2: Calculate the weight vector \(w_{i}\), score vector \(t_{i}\), loading vector \(P_{i}\) and internal regression coefficient \(r_{i}\).
$$w_{i} = E_{i - 1}' F_{i - 1} /\left\| {E_{i - 1}' F_{i - 1} } \right\|$$(3)
$$t_{i} = E_{i - 1} w_{i}$$(4)
$$P_{i} = E_{i - 1}' t_{i} /\left\| {t_{i} } \right\|^{2}$$(5)
$$r_{i} = F_{i - 1}' t_{i} /\left\| {t_{i} } \right\|^{2}$$(6)
- Step 3: Let \(E_{i} = E_{i - 1} - t_{i} P_{i} '\) and \(F_{i} = F_{i - 1} - r_{i} t_{i}\).
- Step 4: Calculate \({\text{press}}_{i}\), \({\text{ss}}_{i}\) and \(Q_{i}^{2}\) by cross validation.
$${\text{press}}_{i} = \sum\limits_{z = 1}^{n} {\left( {y_{z} - \hat{y}_{i( - z)} } \right)^{2} }$$(7)
$${\text{ss}}_{i} = \sum\limits_{z = 1}^{n} {\left( {y_{z} - \hat{y}_{iz} } \right)^{2} }$$(8)
$$Q_{i}^{2} = 1 - {\text{press}}_{i} /{\text{ss}}_{i}$$(9)
where \(\hat{y}_{i( - z)}\) is the value predicted for sample \(z\) by the regression model with \(i\) principal components fitted after sample \(z\) has been deleted, and \(\hat{y}_{iz}\) is the corresponding fitted value when all samples are used.
- Step 5: A new score vector \(t_{i}\) significantly improves the performance of the extracted components when \(Q_{i}^{2} \ge 0.0975\) (Abdi and Williams 2010). Check whether \(Q_{i}^{2} \ge 0.0975\) holds. If it does, let \(i = i + 1\) and return to Step 2 to continue the calculation. Otherwise, output the principal components \(t_{i}\) and the weight vectors \(w_{i}\).
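Steps 1 to 3 can be sketched for a single response variable as follows. This is a minimal illustration, not the authors' implementation: the number of components `q` is passed in directly instead of being chosen by the cross-validated \(Q_{i}^{2}\) test of Steps 4 and 5, and the function name `pls1_components` is hypothetical.

```python
import numpy as np

def pls1_components(X, y, q):
    """Extract q PLS components for one response (Steps 1-3).

    X: (n, p) float array, y: (n,) float array.
    Returns weights W (p, q), scores T (n, q), loadings P (p, q)
    and internal regression coefficients r (q,).
    """
    E = X - X.mean(axis=0)        # Step 1: centre X
    F = y - y.mean()              # Step 1: centre y
    W, T, P, r = [], [], [], []
    for _ in range(q):
        w = E.T @ F
        w /= np.linalg.norm(w)    # Eq. (3): unit-length weight vector
        t = E @ w                 # Eq. (4): score vector
        tt = t @ t
        p = E.T @ t / tt          # Eq. (5): loading vector
        ri = F @ t / tt           # Eq. (6): internal regression coefficient
        E = E - np.outer(t, p)    # Step 3: deflate X
        F = F - ri * t            # Step 3: deflate y
        W.append(w); T.append(t); P.append(p); r.append(ri)
    return (np.column_stack(W), np.column_stack(T),
            np.column_stack(P), np.array(r))
```

Because \(E_{i}\) is deflated by \(t_{i} P_{i}'\) at every pass, successive score vectors come out mutually orthogonal, which is what removes the collinearity among the original inputs.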
The MFOA-LSSVM prediction model of inventory reconciliation error
LSSVM replaces the inequality constraints of SVM with equality constraints, which reduces the complexity of the model and improves its performance (Miranian and Abdollahzade 2013; Mellit et al. 2013).
The training data are \((x_{i} ,y_{i} )_{n\, \times \,(p + 1)}\), where \(n\) is the number of samples and \(p\) is the dimension of the input variables. Using LSSVM for data fitting or classification is equivalent to solving the optimization problem shown in formula (10):
where \(C\) is the penalty factor, \(\omega\) is the weight vector, \(b\) is the bias, \(\varepsilon_{i}\) is the error variable, and \(\varphi (x_{i} )\) is the mapping function.
Construct the Lagrangian function from formula (10):
where \(\alpha = [\alpha_{1} ,\alpha_{2} , \ldots ,\alpha_{n} ]^{T}\) is the vector of Lagrange multipliers.
Setting the partial derivatives of the Lagrangian function to zero gives formula (12):
where \(1_{n \times 1}\) is an \(n \times 1\) vector of ones, \(E_{n \times n}\) is the identity matrix of order \(n\), \(\varOmega_{ij} = \varphi \left( {x_{i} } \right)^{T} \varphi \left( {x_{j} } \right) = K\left( {x_{i} ,x_{j} } \right)\), and \(K\left( {x_{i} ,x_{j} } \right)\) is the kernel function.
Solve formula (12) to obtain the LSSVM model:
Common kernel functions include the linear kernel, the polynomial kernel, the Gaussian radial basis kernel and the sigmoid kernel. Two things need to be considered when selecting a kernel function: its ability to handle nonlinearity and the number of parameters to be determined. This paper selects the Gaussian radial basis kernel, that is, \(K\left( {x_{i} ,x_{j} } \right) = \exp \left[ { - \left\| {x_{i} - x_{j} } \right\|^{2} /\left( {2\sigma^{2} } \right)} \right]\).
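As an illustration of how such a model is fitted, the sketch below assumes the standard LSSVM regression formulation, in which the objective is \(\tfrac{1}{2}\omega^{T}\omega + \tfrac{C}{2}\sum \varepsilon_{i}^{2}\) and training reduces to one linear system in \(b\) and \(\alpha\) with the matrix \(\varOmega + E/C\). Whether the paper uses exactly this scaling of \(C\) is an assumption, and the names `rbf_kernel`, `lssvm_fit` and `lssvm_predict` are hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, sigma2):
    """K(a, b) = exp(-||a - b||^2 / (2 * sigma2)) for every row pair."""
    d2 = ((A ** 2).sum(axis=1)[:, None]
          + (B ** 2).sum(axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma2))

def lssvm_fit(X, y, C, sigma2):
    """Solve [[0, 1^T], [1, Omega + I/C]] [b; alpha] = [0; y].

    Assumes the standard LSSVM regression system; returns (b, alpha).
    """
    n = X.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(X, X, sigma2) + np.eye(n) / C
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]

def lssvm_predict(X_train, b, alpha, sigma2, X_new):
    """y(x) = sum_i alpha_i * K(x, x_i) + b."""
    return rbf_kernel(X_new, X_train, sigma2) @ alpha + b
```

Only two hyperparameters remain to be chosen, \(C\) and \(\sigma^{2}\), which is why the Gaussian kernel pairs conveniently with a two-parameter search.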
When building an LSSVM model, it is difficult to select the penalty factor and the kernel parameter by experience alone, so an optimization algorithm is needed to tune these parameters and improve the error prediction. The fruit fly optimization algorithm is an optimization method based on the foraging behavior of fruit flies, proposed by Pan in 2012 (Pan 2012; Dai et al. 2014; Si et al. 2016). Because the original fruit fly optimization algorithm easily falls into a local optimum, a modified version with a small amount of calculation and good global search capability is adopted to select the LSSVM model parameters. The specific steps of the improved three-dimensional fruit fly algorithm with diminishing steps are as follows.
- Step 1: Initialize the parameters: maximum number of iterations (maxgen), population size (popsize), maximum step (\(L_{\max }\)), minimum step (\(L_{\min }\)), the initial positions of the fruit flies \(X(i)\), \(Y(i)\) and \(Z(i)\), and the best location \(X\_{\text{axis}}\), \(Y\_{\text{axis}}\) and \(Z\_{\text{axis}}\). Let the iteration counter \({\text{gen}} = 1\).
- Step 2: Set the foraging directions and distances of the fruit flies.
$$\begin{aligned} X(i,:) &= X\_{\text{axis}} + L*{\text{rands}}(1,2) \\ Y(i,:) &= Y\_{\text{axis}} + L*{\text{rands}}(1,2) \\ Z(i,:) &= Z\_{\text{axis}} + L*{\text{rands}}(1,2) \\ L &= L_{\max } - \frac{{(L_{\max } - L_{\min } )*{\text{gen}}^{2} }}{{{\text{maxgen}}^{2} }} \end{aligned}$$(14)
where \(L\) is the step length of the fruit flies.
- Step 3: Calculate the concentration values \(S(i,1)\) and \(S(i,2)\) of each fruit fly.
$$\begin{aligned} S(i,1) &= \frac{1}{{\sqrt {X(i,1)^{2} + Y(i,1)^{2} + Z(i,1)^{2} } }} \\ S(i,2) &= \frac{1}{{\sqrt {X(i,2)^{2} + Y(i,2)^{2} + Z(i,2)^{2} } }} \end{aligned}$$(15)
- Step 4: Let \(C = S(i,1)\) and \(\sigma^{2} = S(i,2)\) in LSSVM, and train the LSSVM with the training data. Take the mean square error (MSE) on the test samples as the smell concentration function, that is, \({\text{smell}}(i) = {\text{MSE}}(i)\).
- Step 5: Find the fruit fly with the minimum concentration value: \([{\text{bestsmell}},{\text{index}}] = \min ({\text{smell}})\). Let \(X\_{\text{axis}} = X({\text{index}},:)\), \(Y\_{\text{axis}} = Y({\text{index}},:)\) and \(Z\_{\text{axis}} = Z({\text{index}},:)\).
- Step 6: Let \({\text{gen}} = {\text{gen}} + 1\) and repeat Steps 2–5 until the maximum number of iterations is reached. Output the location of the optimum concentration and the LSSVM model.
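The search loop of Steps 1 to 6 can be sketched as below. Several details are assumptions rather than statements from the paper: `rands(1,2)` is taken to draw uniformly from \([-1, 1]\), the swarm centre is moved only when a generation improves on the best value found so far, the initial centre is drawn from \([0.5, 1.5]\), and the `fitness` callable stands in for "train an LSSVM with \(C = S(i,1)\), \(\sigma^{2} = S(i,2)\) and return the test MSE". The names `step_length` and `mfoa` are hypothetical.

```python
import numpy as np

def step_length(gen, maxgen, l_max, l_min):
    """Diminishing step of Eq. (14): l_max at gen = 0, l_min at gen = maxgen."""
    return l_max - (l_max - l_min) * gen ** 2 / maxgen ** 2

def mfoa(fitness, maxgen=100, popsize=20, l_max=1.0, l_min=0.01, seed=0):
    """Minimise fitness(c, sigma2) with a 3-D fruit fly search.

    Returns (best_parameters, best_fitness).
    """
    rng = np.random.default_rng(seed)
    # Rows are X_axis, Y_axis, Z_axis; columns are the two parameters.
    axis = rng.uniform(0.5, 1.5, size=(3, 2))      # assumed initial range
    best_smell, best_s = np.inf, None
    for gen in range(1, maxgen + 1):
        L = step_length(gen, maxgen, l_max, l_min)
        # Step 2: random flight around the current centre.
        pos = axis + L * rng.uniform(-1.0, 1.0, size=(popsize, 3, 2))
        # Step 3: concentration value = reciprocal distance to the origin.
        S = 1.0 / np.sqrt((pos ** 2).sum(axis=1))  # shape (popsize, 2)
        # Step 4: evaluate every fly.
        smell = np.array([fitness(s[0], s[1]) for s in S])
        # Step 5: keep the best fly (assumption: only if it improves).
        idx = int(smell.argmin())
        if smell[idx] < best_smell:
            best_smell = float(smell[idx])
            best_s = S[idx].copy()
            axis = pos[idx].copy()
    return best_s, best_smell
```

The quadratic schedule keeps the step close to \(L_{\max}\) for most of the run and shrinks it quickly near the end, so early generations explore widely while late generations refine the current best location.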
Experimental modeling of inventory reconciliation error prediction
Experimental preparation
The experiments were carried out on a THJ-4 advanced process control system platform (Fig. 2), which simulates oil transmission operations using water instead of oil. The detectors on the experimental platform are a diffused silicon pressure transmitter, a Pt100 temperature sensor and a turbine flow meter. The monitoring system, shown in Fig. 3, uses the MCGS configuration software.
The calculation of inventory reconciliation error
In the experiment, water was transported from the storage tank to the medium tank via the pump and the flow meter. Under the dynamic steady-state condition, the inventory reconciliation equation can be established as shown in formula (16).
where \(V_{\text{in}}\) is the volume change in the medium tank, \(V_{\text{loss}}\) is the volume of medium lost, and \(V_{\text{out}}\) is the volume passed through the flow meter. The error \(V_{\text{e}}\) is calculated over a time interval \(\Delta t\) (\(V_{\text{loss}}\) is neglected over this interval):
where \(D\) is the diameter of the medium tank, \(h_{1}\) is the water level at time \(t\), \(h_{2}\) is the water level at time \(t + \Delta t\), and \(v\) is the reading of the flow meter.
As can be seen from Eq. (17), the factors that affect the error of the inventory reconciliation model include flow error, level error and integration time. In addition, the medium temperature, ambient temperature and pump frequency influence the error through their effect on \(v\) and \(h\).
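A small numerical sketch shows how such an error could be computed from the quantities defined above. It is not a reproduction of Eq. (17): it assumes a cylindrical tank, a flow rate that is constant over \(\Delta t\), consistent units, and the sign convention "tank volume change minus metered volume". The function name `reconciliation_error` is hypothetical.

```python
import math

def reconciliation_error(d, h1, h2, flow, dt):
    """Tank volume change minus metered volume over dt, loss neglected.

    d: tank diameter, h1/h2: levels at t and t + dt,
    flow: volumetric flow meter reading (assumed constant over dt).
    The sign convention is an assumption.
    """
    v_in = math.pi * d ** 2 / 4.0 * (h2 - h1)   # cylinder volume change
    v_out = flow * dt                            # volume seen by the meter
    return v_in - v_out
```

With perfectly consistent instruments the result is zero; any systematic offset between the level gauge and the flow meter appears as a non-zero mean of this quantity.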
Data preparation
To meet the needs of status monitoring for the storage and transportation process, let \(\Delta t = 1\,s\). The operating frequency of the pump was set to 42.5, 45 and 47.5 Hz in turn. The ambient temperature varied between 20 and 25 °C, and the medium temperature varied between 25 and 30 °C. Measurements were repeated at different levels. The pump pressure, medium temperature, medium velocity, pump frequency, water level of the medium tank and ambient temperature were selected as the model inputs \(x_{1}\)–\(x_{6}\), and the calculated error was taken as the model output. A total of 25,907 data sets were collected: 9167, 8803 and 7937 sets at pump frequencies of 42.5, 45 and 47.5 Hz, respectively.
The average error results of three different frequencies when the ambient temperature is 20 °C, the medium temperature is 30 °C, and the level ranges from 2 to 14 cm are shown in Fig. 4.
At the frequency of 42.5 Hz, the average error results for different ambient temperature and medium temperature when the level ranges from 2 to 4 cm are shown in Fig. 5.
As can be seen from Fig. 4, the error varies with the level at a given frequency, and at a given level the model error differs considerably between frequencies. These results show that the error of the inventory reconciliation model does exist and is related to the pump frequency, the medium velocity and the medium level. As shown in Fig. 5, the error also varies with the medium temperature and the ambient temperature.
Data preprocessing
Gross errors are eliminated from the data according to the \(3\sigma\) rule. Eleven, ten and thirty-three gross errors are found at pump frequencies of 42.5, 45 and 47.5 Hz, respectively. The excluded gross errors are listed in Table 1, and the data before and after gross error elimination are compared in Fig. 6.
Feature extraction
The correlation coefficient between the input variables is calculated. As shown in Table 2, there is a high linear correlation between some input variables such as: \(x_{1}\) and \(x_{3}\), \(x_{3}\) and \(x_{4}\), \(x_{1}\) and \(x_{4}\).
Standardize the data and then adopt the PLS to extract the principal components.
The first principal component \(t_{1}\):
Because \(Q_{1}^{2} = 1 \ge 0.0975\), extraction of the second principal component \(t_{2}\) is continued.
Because \(Q_{2}^{2} = 0.0103 < 0.0975\), extraction of principal components is stopped.
where \(X' = \left[ {x_{1} ',\;x_{2} ',\; \ldots ,\;x_{6} '} \right]^{T}\) and \(x_{i} '\;(i = 1,2, \ldots ,6)\) are the input variables after standardization.
Prediction model
To reduce the amount of calculation, the error prediction model was established only for a pump frequency of 42.5 Hz. Following the common 80/20 split of the dataset, 1831 samples taken from the principal component data at intervals of 5 were used as prediction data, and the remaining samples were used for training. During training, the LSSVM parameters \(C\) and \(\sigma^{2}\) were optimized by MFOA. After 100 iterations, the values \(C = 0.0073\) and \(\sigma^{2} = 0.0096\) were obtained. The training process is shown in Figs. 7 and 8.
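The sample count can be checked with a short sketch. At 42.5 Hz, 9167 sets were collected and 11 gross errors removed, leaving 9156 samples; taking every fifth one gives 1831. That the selection starts at the fifth sample (rather than the first) is an assumption chosen because it reproduces the reported count, and `split_every_kth` is a hypothetical helper.

```python
import numpy as np

def split_every_kth(n_samples, k=5):
    """Hold out the k-th, 2k-th, 3k-th ... sample; train on the rest."""
    idx = np.arange(n_samples)
    test = idx[k - 1::k]                 # assumed offset: start at sample k
    train = np.setdiff1d(idx, test)
    return train, test

# 9167 collected sets minus 11 gross errors at 42.5 Hz.
train_idx, test_idx = split_every_kth(9167 - 11)
print(len(train_idx), len(test_idx))     # 7325 1831
```

A systematic split of this kind keeps the held-out samples spread evenly over the whole operating range instead of concentrating them at one level or temperature.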
Comparison of methods
The method detailed in this paper is taken as the base for comparison and is referred to below as method one. To verify its effectiveness, three other methods were selected for comparison (Wang et al. 2012; Sedighizadeh and Kashani 2014): method two, a least squares support vector machine optimized by the fruit fly algorithm (FOA-LSSVM); method three, partial least squares regression with a least squares support vector machine optimized by the fruit fly algorithm (PLS and FOA-LSSVM); and method four, a least squares support vector machine optimized by the particle swarm optimization algorithm (PSO-LSSVM).
The related parameters were set as shown in Table 3. The four methods were used to predict the error of the balance model and to correct the model with the predicted error. The results are shown in Figs. 9 and 10.
To show the differences between the methods clearly, they are also compared in terms of the root mean square error (RMSE), the mean absolute error (MAE), the relative mean absolute error (\(E_{\text{ave}}\)), the simulation time and the distribution of the absolute relative error. The results are shown in Table 4 and Fig. 11.
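For reference, the three accuracy measures can be computed as in the sketch below. RMSE and MAE follow their usual definitions; the formula used for \(E_{\text{ave}}\), the mean of \(|y - \hat{y}|/|y|\), is an assumption, since the paper's exact definition is not given in this section. The function names are hypothetical.

```python
import numpy as np

def rmse(y, y_hat):
    """Root mean square error."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def mae(y, y_hat):
    """Mean absolute error."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return float(np.mean(np.abs(y - y_hat)))

def relative_mae(y, y_hat):
    """Mean of |y - y_hat| / |y| (assumed definition of E_ave; y must be non-zero)."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return float(np.mean(np.abs((y - y_hat) / y)))
```

RMSE penalizes occasional large deviations more heavily than MAE, while the relative measure makes errors comparable across operating points with different error magnitudes.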
The predicted values of all four methods can be used to correct the model error. The comparison above shows that all four methods eliminate the influence of systematic error on the inventory reconciliation model to some extent, but their performance differs. Overall, the LSSVM model optimized by the fruit fly algorithm has a smaller error than the one optimized by particle swarm optimization. Because feature extraction by partial least squares excludes the effects of noise and resolves the multicollinearity among the input variables, the methods that use it take less time and are more accurate than those that do not. The PSO-LSSVM method consumed the longest time and had the largest errors in predicting the error of the inventory reconciliation model. The PLS and MFOA-LSSVM method used here not only shortens the simulation time and improves the accuracy, but also gives a better relative error distribution than the other methods. Compared with the other three methods, it improved RMSE, MAE and \(E_{\text{ave}}\) by 1.48, 0.93 and 2.91%, respectively, and reduced the modeling time by up to 17.6%.
Conclusion
Building on instrument error prediction methods, this paper uses a least squares support vector machine to predict the error of inventory reconciliation and then eliminate it. On the one hand, feature extraction is performed on the independent variables by partial least squares regression. On the other hand, the problems of heavy computation and low accuracy in error prediction are addressed by using the modified fruit fly algorithm to optimize the parameters of the least squares support vector machine. Comparison with three other prediction methods shows experimentally that the PLS and MFOA-LSSVM method can predict the error of the inventory reconciliation model effectively.
References
Abdi H, Williams LJ (2010) Principal component analysis. Wiley Interdiscip Rev Comput Stat 2:433–459
Austina N, Kumarb PS, Kanthavelkumaranc N (2013) Artificial neural network involved in the action of optimum mixed refrigerant (domestic refrigerator). Int J Eng 26:1025–2495
Chen C, Feng Y, Xu H, Rong G (2010) Multiscale data rectification method with application to material balance in petrochemical enterprises. CIESC J 61:1919–1926
Dai H, Zhao G, Lu J, Dai S (2014) Comment and improvement on “A new fruit fly optimization algorithm: taking the financial distress model as an example”. Knowl Based Syst 59:159–160
He W, Song X, Ganyi Li H, Ihara T (2014) Research on optimized grey neural network modeling method for sensor calibration. Yi Qi Yi Biao Xue Bao/Chin J Sci Instr 35:504–512
Mellit A, Pavan AM, Benghanem M (2013) Least squares support vector machine for short-term prediction of meteorological time series. Theor Appl Climatol 111:297–307
Miranian A, Abdollahzade M (2013) Developing a local least-squares support vector machines-based neuro-fuzzy model for nonlinear and chaotic time series prediction. IEEE Trans Neural Netw Learn Syst 24:207–218
Pan WT (2012) A new fruit fly optimization algorithm: taking the financial distress model as an example. Knowl Based Syst 26:69–74
Pang H, Luo F, Chen D, Pan M, Luo S (2012) Temperature compensation model of fluxgate magnetometers based on RBF neural network. Yi Qi Yi Biao Xue Bao/Chin J Sci Instr 33:695–700
Peng J, Lv W, Xing H, Wu X (2013) Temperature compensation for humidity sensor based on improved GA-BP neural network. Yi Qi Yi Biao Xue Bao/Chin J Sci Instr 34:153–160
Sedighizadeh M, Kashani MF (2014) A tribe particle swarm optimization for parameter identification of proton exchange membrane fuel cell (technical note). Int J Eng Trans A Basics 28:16–24
Si L, Wang Z, Liu X, Tan C, Liu Z, Xu J (2016) Identification of shearer cutting patterns using vibration signals based on a least squares support vector machine with an improved fruit fly optimization algorithm. Sensors 16:90
Wang H, Hong R, Chen J, Tang M (2015) Intelligent health evaluation method of slewing bearing adopting multiple types of signals from monitoring system. Int J Eng Trans A Basics 28:573–581
Wang X, Kang DU, Qin B, Hai-Jun XU (2012) Drying rate modeling based on FOALSSVR. Control Eng China 19:630–638
United States Environmental Protection Agency (1995) Introduction to statistical inventory reconciliation for underground storage tanks. EPA Web. https://www.epa.gov/ust/introduction-statistical-inventory-reconciliation-underground-storage-tanks. Accessed Sept 1995
Ye Y, Lu J, Qian Z, Wang Y (2016) Study on the temperature error prediction of mechanical temperature instrument based on LS-SVM. Yi Qi Yi Biao Xue Bao/Chin J Sci Instr 37:57–66
You L, Liu J, Yang T, Sun W (2013) NOx emission characteristic modeling based on feature extraction using PLS and LS-SVM. Yi Qi Yi Biao Xue Bao/Chin J Sci Instr 34:2418–2424
Zhang C, Jiang J, Yanmei LI, Chen S, Zha C, Wang C (2012) Temperature compensation of sensor based on CMPSO-LSSVM. Chin J Sens Actuators 25:472–477
Acknowledgements
Funding was provided by Department of material and fuel, General Logistics Department (Grant No. oil20130208).
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Shui, A., Fang, H., Zong, F. et al. The error prediction of inventory reconciliation during storage and transportation process based on PLS and MFOA-LSSVM. J Petrol Explor Prod Technol 7, 895–904 (2017). https://doi.org/10.1007/s13202-016-0292-0
DOI: https://doi.org/10.1007/s13202-016-0292-0