Introduction

Incorrect estimates of two- or three-phase flow rates may prevent optimum production and can damage surface production facilities. In addition, during the oil production period of a field, predicting oilfield output is vital for efficient production management (Guo and Deng 2009) and for extending well lifetime. Many multiphase choke correlations relating liquid flow rate, gas/liquid ratio, wellhead pressure and choke size have been proposed, for example by Gilbert (1954), Baxendell (1958), Ros (1960), Achong (1961) and Mesallati et al. (2000). The general form of these correlations and their coefficients are listed in Table 1. However, because the flow mechanisms and regimes in pipes, wellheads and orifices are highly complicated, no comprehensive theoretical relation has been accepted for all operating conditions.

Table 1 Multiphase flow choke correlations: $Q = A\,P_{wh}\,S^{B} / GLR^{C}$

Uncertainty and risk analysis greatly help to explore the nature of complex systems such as choke behavior.

Basically, engineering judgment is needed when the lack of historical ‘hard data’ becomes an important issue (Apeland et al. 2002). Aven and Pörn (1998), among others, stated that quantitative risk assessment distinguishes between epistemic and aleatory (stochastic) uncertainties (Vaidogas and Juocevicius 2009). In the predictive, epistemic approach to risk analysis, however, probabilities express uncertainty about the future values of observable quantities over a time period (Apeland et al. 2002).

In the real world, especially in the oil and gas industry, different kinds of these uncertainties are associated with measured or calculated parameters. Several statistical approaches, which cover the gathering, classification, analysis and interpretation of data, have been developed to cope with them. Well-known examples include Monte Carlo sampling, goodness-of-fit tests and hypothesis tests.

In parametric statistics, linear regression is a mathematical method in which a straight line is fitted to a number of points to measure the effect of explanatory variables on a scalar dependent variable (Hu 2011).

Monte Carlo simulation is relatively new in production engineering and has been shown to be versatile for predicting the performance of individual wells during a field development plan. Given its computational cost and time, Monte Carlo simulation is very useful for assessing the accuracy of various estimation methods and for optimizing them in a wide range of applications (Ambrozic 2011).

In this paper, statistical analysis is used for a parameter study of a recently proposed multiphase choke correlation (Safar Beiranvand and Babaei Khorzoughi 2012). The interrelationship between some of its independent and dependent variables is investigated first. Using a rich data bank, the distribution function of each parameter of the new correlation is then obtained. Finally, Monte Carlo analysis is performed to explore the range of flow rate uncertainty obtained from the proposed correlation at various operating conditions.

Proposed correlation

In the previous paper (SPE-158649-PA), 182 production test data points from different Iranian wells were used to propose the following correlation (Safar Beiranvand and Babaei Khorzoughi 2012):

$$Q = \frac{A\,P_{wh}^{F}\,S^{B}\left(1-\frac{BS\&W}{100}\right)^{D}\left(\frac{T}{T_{sc}}\right)^{E}}{GLR^{C}}$$
(1)

where Pwh is wellhead pressure (psig), GLR is gas/liquid ratio (scf/stb), Q is gross liquid flow rate (bbl/day), S is choke size (64ths of an inch), T is temperature (°R), Tsc is the temperature at standard conditions and BS&W is the volume percentage of basic sediment and water in the produced fluids. The ranges of these parameters are given in Table 2. The field test data used for this correlation are given in SPE-158649-PA.

Table 2 Range of data used for correlation
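As a minimal sketch, the product-of-powers form of correlation (1) can be written as a small Python function. It assumes that the GLR term sits in the denominator, as in Gilbert-type correlations, and that Tsc is 60 °F (519.67 °R). The function name, the argument names and the coefficient values are illustrative placeholders, not the fitted values of Table 5.

```python
def choke_flow_rate(p_wh, s, glr, bsw, t, *, a, b, c, d, e, f, t_sc=519.67):
    """Gross liquid rate Q (bbl/day) from the functional form of correlation (1).

    p_wh: wellhead pressure (psig); s: choke size (64ths of an inch);
    glr: gas/liquid ratio (scf/stb); bsw: BS&W (vol %); t: temperature (deg R).
    t_sc is an ASSUMED standard temperature (60 deg F = 519.67 deg R).
    """
    if not 0.0 <= bsw < 100.0:
        raise ValueError("BS&W must be in [0, 100) percent")
    water_term = (1.0 - bsw / 100.0) ** d
    temp_term = (t / t_sc) ** e
    return a * p_wh**f * s**b * water_term * temp_term / glr**c


# Placeholder coefficients for illustration only (NOT the values of Table 5).
coeffs = dict(a=0.1, b=1.9, c=0.5, d=1.0, e=1.0, f=1.0)
q = choke_flow_rate(500.0, 32.0, 800.0, 10.0, 560.0, **coeffs)
print(f"Q = {q:.0f} bbl/day")
```

With D = E = 0 and F = 1 the function reduces to the three-variable form shared by the earlier correlations of Table 1, which is one way to see the proposed correlation as an extension of that form.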

The previous paper showed that including (1 − BS&W/100) and (T/Tsc) greatly improves the liquid flow rate prediction of the proposed correlation compared with other existing models. To explore the relation between these two parameters and liquid flow rate, the Pearson product-moment correlation coefficient is calculated. It is obtained by dividing the covariance of the two variables by the product of their standard deviations. The correlation analysis is conducted with the Wessa (2013) statistics software.

The Pearson product-moment correlation coefficient is obtained by the following relation (Rodgers and Nicewander 1988). It takes real values between −1 and +1, where +1 and −1 correspond to perfect positive and negative correlation, respectively (Dowdy and Wearden 1983):

$$\rho_{xy} = \frac{\mathrm{COV}(X, Y)}{\sigma_x \sigma_y} = \frac{E\left[(X-\mu_x)(Y-\mu_y)\right]}{\sigma_x \sigma_y}$$
(2)

where COV is covariance, E is the expected value, μ is the mean and σ is the standard deviation.
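The definition in relation (2) can be sketched directly in Python: the covariance of the two variables divided by the product of their standard deviations. The paper uses the Wessa (2013) software for this step, so the function below is only an illustration, and the five data pairs are made-up values rather than field data.

```python
import math


def pearson_r(x, y):
    """Pearson coefficient: population covariance over the product of std devs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / n
    sd_x = math.sqrt(sum((xi - mean_x) ** 2 for xi in x) / n)
    sd_y = math.sqrt(sum((yi - mean_y) ** 2 for yi in y) / n)
    return cov / (sd_x * sd_y)


# Made-up illustrative values, not the 182 production tests.
p_wh = [300.0, 450.0, 520.0, 610.0, 700.0]       # wellhead pressure (psig)
q = [1500.0, 2600.0, 2400.0, 3900.0, 4100.0]     # liquid rate (bbl/day)
r = pearson_r(p_wh, q)
print(f"r = {r:.3f}")
```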

Figures 1, 2 and 3 present scatter plots of (1 − BS&W/100), Pwh and (T/Tsc) versus Q, respectively. As these figures show, (1 − BS&W/100) and Pwh correlate more strongly with Q than (T/Tsc) does. In addition to the scatter plot, each figure shows the histogram of each parameter. Q increases with (1 − BS&W/100) and with Pwh, which is consistent with the physics of the problem. For (T/Tsc), on the other hand, the relationship is increasing for Q < 4,000 bbl/day, but for Q > 4,000 bbl/day (T/Tsc) is almost constant. The usefulness of (T/Tsc) therefore depends strongly on the other parameters.

Fig. 1 Pearson product-moment correlation scatter plot of (1 − BS&W/100) vs. Q

Fig. 2 Pearson product-moment correlation scatter plot of Pwh vs. Q

Fig. 3 Pearson product-moment correlation scatter plot of (T/Tsc) vs. Q

All of these parameters show a moderate correlation with flow rate, with coefficients in the range of 0.3–0.7 (Table 3). The new parameters (1 − BS&W/100) and (T/Tsc) therefore have correlation coefficients with Q comparable to that between Pwh and Q; indeed, the correlation between Q and (1 − BS&W/100) is even stronger than that between Q and Pwh. This analysis shows that these parameters are of almost the same importance in predicting Q, so larger prediction errors should be expected from previously published choke flow correlations than from the proposed correlation, as will be explored shortly.

Table 3 Correlation coefficients between Q and (1 − BS&W/100), Pwh and (T/Tsc)

In addition to its functional form and variables, the parameters of a correlation (A, B, C, D, E and F in this case) strongly control its performance. As shown in Table 1, all the previous correlations share the same functional form and variables but have different parameters. It is therefore of great importance to obtain these parameters correctly. Two optimization methods are used for this purpose, as explained in the next section.

Determination of correlation parameters

Two optimization methods, linear regression and Nelder–Mead, are used to determine the unknown parameters A, B, C, D, E and F of the proposed choke correlation. The performance of these methods is quantified by calculating two types of error:

$$\text{error} = \frac{Q_{test} - Q_{correl}}{Q_{test}}$$
(3)
$$\text{absolute error} = \left|\frac{Q_{test} - Q_{correl}}{Q_{test}}\right|$$
(4)
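The two error measures of relations (3) and (4) can be sketched as follows. Averaging the absolute error over all test points and reporting it in percent is an assumption made here for illustration; the text does not state how the per-point errors are aggregated. The three data points are made-up values.

```python
def relative_errors(q_test, q_correl):
    """Signed relative error of relation (3) for each test point."""
    return [(qt - qc) / qt for qt, qc in zip(q_test, q_correl)]


def mean_abs_percent_error(q_test, q_correl):
    """Average of the absolute error of relation (4), in percent (assumed aggregation)."""
    errs = relative_errors(q_test, q_correl)
    return 100.0 * sum(abs(err) for err in errs) / len(errs)


# Made-up measured and predicted rates (bbl/day), for illustration only.
q_test = [2000.0, 3500.0, 5000.0]
q_correl = [2100.0, 3400.0, 5000.0]
print(relative_errors(q_test, q_correl))
print(f"{mean_abs_percent_error(q_test, q_correl):.2f} %")
```

The signed error can cancel between over- and under-predictions, which is why the absolute error is the more useful measure of overall fit.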

Nelder–Mead algorithm

In this approach, bounds based on expert knowledge and previous models are applied to the model parameters, i.e., A, B, C, D, E and F in the present study. At each iteration, the values calculated from correlation (1) within the proposed bounds (Table 4) are compared with the actual values and the error is computed, until an acceptable answer is reached. A MATLAB script performs the Nelder–Mead algorithm by means of the MATLAB function fminsearch, starting from arbitrary initial values for A, B, C, D, E and F. The results are tabulated in the first row of Table 5. The small error of 2.89 % for the proposed correlation indicates that the predicted flow rates are close to the measured values, compared with the very unsatisfactory results of the previous correlations, whose errors are in the range of 60–160 %.

Table 4 Arbitrary initial values for A, B, C, D, E and F
Table 5 Equation coefficients for different correlations
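The fitting step can be sketched in Python with SciPy, whose `minimize(..., method="Nelder-Mead")` plays a role comparable to MATLAB's fminsearch. This is not the authors' script: the data are synthetic, the "true" coefficients and starting values are placeholders, the objective (mean absolute relative error) is an assumed choice, and no parameter bounds are applied.

```python
import numpy as np
from scipy.optimize import minimize


def predict_q(params, p_wh, s, glr, water, temp):
    """Correlation (1); `water` is (1 - BS&W/100) and `temp` is T/Tsc."""
    a, b, c, d, e, f = params
    return a * p_wh**f * s**b * water**d * temp**e / glr**c


def mean_abs_error(params, data, q_test):
    """Objective: mean absolute relative error, following relation (4)."""
    q_correl = predict_q(params, *data)
    return np.mean(np.abs((q_test - q_correl) / q_test))


# Synthetic inputs standing in for the production tests (assumed ranges).
rng = np.random.default_rng(0)
n = 60
data = (
    rng.uniform(200, 900, n),    # p_wh (psig)
    rng.uniform(16, 64, n),      # s (64ths of an inch)
    rng.uniform(300, 1500, n),   # glr (scf/stb)
    rng.uniform(0.6, 1.0, n),    # 1 - BS&W/100
    rng.uniform(1.0, 1.2, n),    # T/Tsc
)
true_params = np.array([0.1, 1.9, 0.5, 1.0, 1.0, 1.0])  # placeholders
q_test = predict_q(true_params, *data)

start = np.array([0.2, 1.5, 0.6, 0.8, 0.8, 0.9])  # arbitrary initial guess
result = minimize(
    mean_abs_error, start, args=(data, q_test), method="Nelder-Mead",
    options={"maxiter": 5000, "xatol": 1e-8, "fatol": 1e-10},
)
print("fitted parameters:", np.round(result.x, 3))
print("mean absolute error:", result.fun)
```

Because Nelder–Mead is derivative-free, it works directly on the original power-law form and on a non-smooth objective. The simplex can stall in a local minimum, so the result may depend on the starting values.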

Linear regression

In this study, the general linear regression form is given by:

$$y = c_1 x_1 + c_2 x_2 + \cdots + c_n x_n + \varepsilon$$
(5)

where y is the dependent variable, c_i (i = 1, 2, 3, …, n) are the model parameters, x_i are the independent variables and ε is the error.

Rewriting correlation (1) in logarithmic form gives the following relation, which is more suitable for linear regression:

$$\ln(Q) = \ln(A) + F\ln(P_{wh}) + B\ln(S) + D\ln\left(1-\frac{BS\&W}{100}\right) + E\ln\left(\frac{T}{T_{sc}}\right) - C\ln(GLR)$$
(6)

Compared with the general linear regression form, ln(A), B, C, D, E and F are the model parameters and ln(Pwh), ln(S), ln(1 − BS&W/100), ln(T/Tsc) and ln(GLR) are the physically measurable parameters (explanatory variables).

In this study, a MATLAB script is written to perform the linear regression. The results are shown in Table 5. The table shows that the regression yields a negative value for C, so the resulting correlation is not dimensionally suitable.
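The log-linear fit can be sketched with NumPy's least-squares solver. The sketch assumes that the GLR term enters relation (6) with a negative sign (GLR in the denominator of correlation (1)); the data are synthetic, generated from placeholder coefficients with a small multiplicative noise, and are not the field tests.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
p_wh = rng.uniform(200, 900, n)     # psig
s = rng.uniform(16, 64, n)          # 64ths of an inch
glr = rng.uniform(300, 1500, n)     # scf/stb
water = rng.uniform(0.6, 1.0, n)    # 1 - BS&W/100
temp = rng.uniform(1.0, 1.2, n)     # T/Tsc

# Synthetic "measurements" from placeholder coefficients plus 2 % log-noise.
a, b, c, d, e, f = 0.1, 1.9, 0.5, 1.0, 1.0, 1.0
q = a * p_wh**f * s**b * water**d * temp**e / glr**c
q *= np.exp(rng.normal(0.0, 0.02, n))

# Design matrix for the log form: 1, ln Pwh, ln S, ln(1-BS&W/100), ln(T/Tsc), -ln GLR
X = np.column_stack([
    np.ones(n), np.log(p_wh), np.log(s), np.log(water), np.log(temp), -np.log(glr),
])
coef, *_ = np.linalg.lstsq(X, np.log(q), rcond=None)
ln_a_hat, f_hat, b_hat, d_hat, e_hat, c_hat = coef

print(f"A={np.exp(ln_a_hat):.3f} F={f_hat:.3f} B={b_hat:.3f} "
      f"D={d_hat:.3f} E={e_hat:.3f} C={c_hat:.3f}")
```

Least squares in log space minimizes the error in ln(Q) rather than in Q, and it places no sign constraint on the exponents, so a physically unexpected sign such as a negative C can appear when the explanatory variables are correlated or noisy.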

Uncertainty analysis of the proposed correlation

Statistical analysis

Statistical analysis is used here to investigate the effect of inherent data uncertainty on the liquid flow rate calculated from the proposed correlation. Pwh, S, (1 − BS&W/100), (T/Tsc) and GLR are considered as independent random variables and Q as the dependent variable. It is worth emphasizing that a random variable is not necessarily an unknown parameter; it may also be a parameter with some uncertainty, as is the case here.

A random variable is completely described by its probability density function (PDF). It is therefore necessary to determine the best PDF for each of the aforementioned parameters. This is achieved by functional analysis of the histogram of each variable. The PDFs that best fit the histograms of the recorded variables are given in Tables 6 and 7; the distributions are assigned on the basis of the chi-square goodness-of-fit test. By applying these PDFs in the Monte Carlo simulation of the new model, the most probable value of choke flow rate can be determined, as explained in the next section. How choke flow rate changes as a function of the operating variables considered is of utmost importance in production engineering and management.

Table 6 Probability distributions for random variables
Table 7 Most probable values based on the assigned distributions
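A chi-square goodness-of-fit comparison between candidate distributions can be sketched as follows. The candidate families (normal and exponential), the bin edges and the synthetic sample are assumptions chosen for illustration; they are not the distributions of Table 6.

```python
import math
import random
import statistics


def normal_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def exponential_cdf(x, mean):
    return 0.0 if x <= 0.0 else 1.0 - math.exp(-x / mean)


def chi_square_statistic(samples, edges, cdf):
    """Sum over bins of (observed - expected)^2 / expected."""
    n = len(samples)
    stat = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        observed = sum(1 for x in samples if lo <= x < hi)
        expected = n * (cdf(hi) - cdf(lo))
        stat += (observed - expected) ** 2 / expected
    return stat


# Synthetic "wellhead pressure" sample (psig); not field data.
rng = random.Random(0)
samples = [rng.gauss(500.0, 100.0) for _ in range(500)]
mu = statistics.fmean(samples)
sigma = statistics.pstdev(samples)
edges = [-math.inf, 350.0, 425.0, 500.0, 575.0, 650.0, math.inf]

candidates = {
    "normal": lambda x: normal_cdf(x, mu, sigma),
    "exponential": lambda x: exponential_cdf(x, mu),
}
scores = {name: chi_square_statistic(samples, edges, cdf)
          for name, cdf in candidates.items()}
best = min(scores, key=scores.get)
print(scores, "-> best fit:", best)
```

The candidate with the smallest statistic is the one whose expected bin counts are closest to the observed histogram. A full test would also compare the statistic with a chi-square critical value for the appropriate degrees of freedom.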

Monte Carlo simulation

Monte Carlo simulation is a common method widely used in probability analysis and risk management. It deals with uncertain variables to explore the impact of their variation on an objective function in engineering or other mathematical and physical systems. In Monte Carlo simulation, uncertain variables are modeled on the basis of their statistical distributions and random number sampling. In recent decades, this approach has attracted researchers' attention for estimating uncertain parameters and performing risk analysis in the oil and gas industry.

As previously discussed, one of the most important issues in production engineering is predicting flow behavior in multiphase systems. However, flow rate prediction in two-phase systems is prone to error, including instrumentation errors as well as human (operator) errors. All of the variables gathered from the field to predict flow rate contain these errors. Therefore, a calibration or an index of accuracy is required to obtain more accurate results.

In the current investigation, Monte Carlo simulation is used to forecast the probability density function (PDF) of choke flow rate, Q. Comparing the flow rates measured in the field with the simulated values shows the level of uncertainty. The PDFs of the independent physical variables are sampled to obtain the choke flow rate from correlation (1). A total of 1,000 trials is performed to cover the ranges of the physical variables and to provide a proper flow rate distribution. The results are shown in Table 7 and in Fig. 4. A gamma distribution is obtained for Q, with a most probable value of 4,120 stb/day. This simulated distribution compares very well with the actual choke flow rate distribution (Figs. 4 and 5). Although the distributions are not exactly the same, the analysis makes clear that the proposed correlation can reproduce the actual choke flow rate distribution from the considered random variables within their recorded physical ranges. The difference between the two probability density functions is caused by the uncertain variables.

Fig. 4 Probability distribution of Q from Monte Carlo simulation

Fig. 5 Probability distribution of actual Q from field data
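The sampling procedure can be sketched in Python using only the standard library: draw each input from its assumed distribution, evaluate correlation (1), and repeat for 1,000 trials. The input distributions and coefficients below are hypothetical placeholders, not those of Tables 5 and 6, so the resulting numbers do not reproduce Fig. 4.

```python
import random
import statistics


def choke_flow_rate(p_wh, s, glr, water, temp, a, b, c, d, e, f):
    """Correlation (1); `water` is (1 - BS&W/100) and `temp` is T/Tsc."""
    return a * p_wh**f * s**b * water**d * temp**e / glr**c


# Placeholder coefficients (NOT the fitted values of Table 5).
COEFFS = dict(a=0.1, b=1.9, c=0.5, d=1.0, e=1.0, f=1.0)


def sample_inputs(rng):
    """Draw one set of inputs from ASSUMED distributions (not Table 6)."""
    return dict(
        p_wh=rng.lognormvariate(6.2, 0.3),     # psig
        s=rng.uniform(16.0, 64.0),             # 64ths of an inch
        glr=rng.lognormvariate(6.6, 0.4),      # scf/stb
        water=rng.uniform(0.6, 1.0),           # 1 - BS&W/100
        temp=rng.triangular(1.0, 1.2, 1.1),    # T/Tsc
    )


rng = random.Random(42)
n_trials = 1000
q_samples = sorted(
    choke_flow_rate(**sample_inputs(rng), **COEFFS) for _ in range(n_trials)
)

deciles = statistics.quantiles(q_samples, n=10)
p10, p90 = deciles[0], deciles[-1]
median = statistics.median(q_samples)
print(f"P10={p10:.0f}  median={median:.0f}  P90={p90:.0f} bbl/day")
```

The sorted samples can then be binned into a histogram and compared with the histogram of measured flow rates, in the spirit of the comparison between Figs. 4 and 5.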

Summary and recommendations

It is shown that the empirical correlations of Gilbert (1954), Baxendell (1958), Ros (1960), Achong (1961) and Mesallati et al. (2000) fail to predict the choke flow rate for 182 production test data points from Iranian wells; they give prediction errors of 60–160 %. Two new physical variables, (1 − BS&W/100) and (T/Tsc), are shown to be correlated with choke flow rate. Including these variables in a new correlation greatly improves its liquid flow rate prediction. Determining the coefficients of the proposed correlation by the Nelder–Mead method results in a prediction error of only 2.89 % on the production test data.

Monte Carlo simulation of the proposed correlation showed a gamma distribution for choke flow rate Q, with a most probable value of 4,120 stb/day. This was in very good agreement with the actual choke flow rate distribution from the recorded test data. The proposed correlation can therefore be used effectively to predict choke flow rate when information is lacking or when the recorded production data carry large uncertainty.

The regression results show that linear regression is not appropriate for this problem, because the negative value obtained for C disturbs the dimensional consistency of the correlation. This shows that the optimization method should be selected according to the physical properties of the problem. The Nelder–Mead method is therefore applicable to some production problems.

Based on the difference between the actual and forecasted distributions of Q, the coefficients of the proposed equation should be improved to obtain more accurate results.

Other new optimization techniques, especially evolutionary algorithms such as Genetic Algorithm, Artificial Ant Colony, Simulated Annealing and Imperialist Competitive Algorithm, are recommended for further research in production engineering, such as multiphase flow prediction through the surface choke.