Construction cost estimation of spherical storage tanks: artificial neural networks and hybrid regression—GA algorithms
Abstract
One of the most important processes in the early stages of a construction project is estimating the cost involved. This process involves a wide range of uncertainties, which makes it a challenging task. Because of these unknowns, the conventional approaches to cost estimation rely on expert experience or on similar past cases. The current study presents data-driven methods for cost estimation based on artificial neural network (ANN) and regression models. The ANNs are trained with the Levenberg–Marquardt and the Bayesian-regulated learning algorithms. Moreover, the regression models are hybridized with a genetic algorithm to obtain better estimates of their coefficients. The methods are applied to a real case, where the input parameters of the models are assigned based on the key issues involved in spherical tank construction. The results reveal that, while a high correlation exists between the estimated cost and the real cost, both ANNs perform better than the hybridized regression models. In addition, the ANN with the Levenberg–Marquardt learning algorithm (LMNN) obtains a better estimate than the ANN with the Bayesian-regulated learning algorithm (BRNN). The correlation between real data and estimated values is over 90%, while the mean squared error is around 0.4. The proposed LMNN model can be effective in reducing uncertainty and complexity in the early stages of a construction project.
Keywords
Cost estimation; Manufacturing project; Spherical storage tanks; Neural networks; Genetic algorithm; Regression method
Introduction
Cost estimation in the early stages of construction projects involves an extensive amount of uncertainty. Thus, there is a high demand for an effective method to reduce uncertainty in cost estimation. An effective cost estimation technique could facilitate time/cost control in construction projects. One conventional way to obtain a rough cost estimate and to conduct a feasibility study within a predefined budget is to consult experts. However, continuous access to these experts is not easy, which motivates developing another method to estimate the cost of construction projects, especially in their early stages. Such a method could be based on data generated from previous similar projects.
As artificial intelligence became popular in the 1980s, a new approach was introduced for construction cost estimation, and several studies employed different methods to estimate costs in a wide range of industrial applications. Later, in the 1990s, neural networks (NNs), a branch of artificial intelligence, were employed as an alternative for estimating construction costs. This method does not require the determination of a cost estimating function that mathematically relates the cost to the variables with the most effect on it. A feature-based NN for modelling cost estimation was developed by Zhang et al. (1996) for packaging products. Shtub and Versano (1999) compared the performances of NNs and regression analysis in estimating the cost of a steel-pipe-bending process. Gwang and SungHoon (2004) investigated the accuracies of several cost estimation methods, namely multiple regression analysis (MRA), NNs, and case-based reasoning (CBR), based on 530 historical construction costs of residential buildings. General contractors conducted these projects between 1997 and 2000 in Seoul, Korea. CBR is a methodology that has received increasing attention for cost estimation during the early phases of a project. This method exploits existing knowledge from past cases to produce better estimates than would be possible without it. De Soto and Adey (2015) investigated the CBR retrieval process to estimate resources in construction projects. Cavalieri et al. (2004) compared parametric and NN models for the estimation of production costs and concluded that the NN performs better and is more reliable. Kim et al. (2004) employed a back-propagation neural-network (BPNN) approach combined with genetic algorithms (GAs) to estimate construction costs of residential buildings. The aim of using GAs in their work was to determine the BPNN's parameters and to improve the accuracy of the estimation.
Murat and Ceylan (2006) implemented an artificial neural-network (ANN) process to model transport energy demand. Verlinden et al. (2008) developed MRA and ANN-based models to estimate the cost of sheet metal parts. Wang et al. (2013) developed a cost estimator model based on an NN. The learning procedure of their NN was completed by means of a particle-swarm optimization algorithm. Zima (2015) presented an approach to estimate the unit price of construction elements with the use of the CBR method. The CBR system provides a knowledge base that supports cost estimation at the early stage of a construction project.
In addition to the applications of artificial intelligence in cost estimation, a large number of works have evaluated the effectiveness of machine-learning methods for prediction and forecasting. Geem and Roper (2009) used an ANN model to anticipate the energy demand in South Korea. Geem (2011) developed machine-learning-based models to forecast South Korea's transport energy demand. Taking socioeconomic indicators as input, Assareh et al. (2010) used particle-swarm optimization (PSO) and a GA to predict the demand for oil in Iran. Gudarzi Farahani et al. (2012) applied a Bayesian vector autoregressive methodology to forecast Iran's energy consumption and discussed its potential implications. In addition, Rodger (2014) used variables such as price, operating expenditures, drilling cost, the cost of turning the gas on, the price of oil, and royalties to predict gas demand. The method implemented was a fuzzy regression nearest-neighbor ANN model.
The above-mentioned literature applies different estimation models in various fields, where a specific model has been developed for each case based on the type of data, the input values, and the required accuracy. However, there has been no study on estimating the cost of pressure vessel construction. As such, the novelty of the current work is to introduce a simple, high-accuracy, data-based model for cost estimation in construction projects of pressure vessel tanks. In other words, this paper deals with the problem of estimating the cost involved in constructing a spherical storage tank during its early stages, a case that has been overlooked in the past. Two main steps are taken to tackle the problem: (1) identifying the input variables and (2) evaluating the performance of the proposed cost estimation methods using real construction data. Four different models, i.e., NNs with Levenberg–Marquardt and Bayesian-regulated training algorithms, a linear regression model, and an exponential regression model, are applied in this paper to estimate the cost. In addition, a genetic algorithm is employed to find better estimates of the parameters of the linear and the exponential regression models.
The structure of this paper is as follows. A brief background on spherical storage tank construction is presented in "Construction cost of spherical storage tank" section. The proposed modeling techniques are presented in "Artificial neural network" section. Comparative analyses to evaluate the performance of the proposed models come in "Results" section. Finally, conclusions are made in "Conclusion" section.
Construction cost of spherical storage tank
Artificial neural network
The ANN used in this paper is the multilayer perceptron (MPNN), one of the best known and most widely applied NNs. To design an MPNN, the data set is divided into three groups: training, validation, and testing. The training group is used to update the weights and the bias factors. Two learning algorithms are used to train the NNs in this paper: the Levenberg–Marquardt and the Bayesian regulated. Of the data, 70% is assigned for training, 15% for testing, and 15% for validation (Wang et al. 2002).
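The 70/15/15 partition described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `split_indices`, the random shuffling, the seed, and the sample count of 100 are hypothetical and are not taken from the paper.

```python
import random

def split_indices(n_samples, train_frac=0.70, val_frac=0.15, seed=0):
    """Shuffle sample indices and cut them into training, validation and test groups.

    The fractions follow the 70/15/15 partition in the text; the shuffling
    procedure and the seed are assumptions made for this sketch.
    """
    order = list(range(n_samples))
    random.Random(seed).shuffle(order)
    n_train = round(train_frac * n_samples)
    n_val = round(val_frac * n_samples)
    train_idx = order[:n_train]
    val_idx = order[n_train:n_train + n_val]
    test_idx = order[n_train + n_val:]  # whatever remains is used for testing
    return train_idx, val_idx, test_idx

# Hypothetical data set of 100 projects.
train_idx, val_idx, test_idx = split_indices(100)
print(len(train_idx), len(val_idx), len(test_idx))
```

Only the training indices would be used to update the weights and biases; the validation group is typically used to decide when to stop training, and the test group is held back for the final comparison.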
Genetic algorithm
Integration of a GA
In this section, the application of the GA in the proposed regression cost estimation models is discussed. The aim of using the GA is to find the combination of coefficients of a linear and an exponential regression model that minimizes the MSE. The use of a GA for this kind of task was demonstrated in Hasheminia and Niaki (2006), who proposed a GA to find the best regression model among some candidates. The GA parameters are set as follows.

Population size (n): 20.

Iterations (number of the generation): 20,000.

Mutation percentage: 70%.

Crossover percentage: 30%.
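To make the role of these parameters concrete, the following is a minimal sketch of a real-coded GA that searches for the coefficients of a linear cost model by minimizing the MSE. It is not the authors' implementation. The blend crossover, the Gaussian mutation with step `sigma`, the elitist survivor selection, the reading of the crossover and mutation percentages as the share of the population size produced as offspring, and the synthetic data are all assumptions made for illustration. The number of generations is reduced from 20,000 to 300 so that the example runs quickly.

```python
import random

def predict(coeffs, row):
    # Linear cost model: b0 + b1*x1 + ... + bk*xk
    return coeffs[0] + sum(b * x for b, x in zip(coeffs[1:], row))

def mse(coeffs, X, y):
    # Mean squared error of the model over all samples
    return sum((t - predict(coeffs, r)) ** 2 for r, t in zip(X, y)) / len(y)

def ga_fit(X, y, pop_size=20, generations=300, crossover_frac=0.3,
           mutation_frac=0.7, sigma=0.1, seed=0):
    rng = random.Random(seed)
    n_coef = len(X[0]) + 1
    # Initial population: random coefficient vectors (assumed range [-1, 1])
    pop = [[rng.uniform(-1, 1) for _ in range(n_coef)] for _ in range(pop_size)]
    n_cross = round(crossover_frac * pop_size)
    n_mut = round(mutation_frac * pop_size)
    for _ in range(generations):
        children = []
        for _ in range(n_cross):
            # Blend crossover: weighted average of two distinct parents
            pa, pb = rng.sample(pop, 2)
            a = rng.random()
            children.append([a * u + (1 - a) * v for u, v in zip(pa, pb)])
        for _ in range(n_mut):
            # Gaussian mutation of a randomly chosen parent
            parent = rng.choice(pop)
            children.append([u + rng.gauss(0, sigma) for u in parent])
        # Elitist selection: keep the pop_size individuals with the lowest MSE
        pop = sorted(pop + children, key=lambda c: mse(c, X, y))[:pop_size]
    return pop[0], mse(pop[0], X, y)

# Synthetic, normalized data with three inputs (hypothetical values).
data_rng = random.Random(1)
X = [[data_rng.random() for _ in range(3)] for _ in range(30)]
y = [0.2 + 0.5 * a + 0.3 * b - 0.1 * c for a, b, c in X]
best, best_mse = ga_fit(X, y)
print(best, best_mse)
```

An exponential model could be fitted with the same loop by replacing `predict` with the exponential form, since the GA only needs the MSE of each candidate coefficient vector.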
Results
Table 1 Training stage MSE for different numbers of hidden layers
| Number of hidden layers | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Levenberg–Marquardt neural network (MSE) | 9.1e-4 | 4.73e-4 | 5.50e-4 | 2.53e-4 | 8.15e-4 | 4.77e-4 | 4.66e-4 | 4.67e-4 | 6.36e-4 | 4.70e-4 | 3.84e-4 |
| Bayesian regularized neural network (MSE) | 5.24e-4 | 5.20e-4 | 6.17e-4 | 8.00e-4 | 7.20e-4 | 5.07e-4 | 5.73e-4 | 5.81e-4 | 6.10e-4 | 6.80e-4 | 6.06e-4 |
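Table 2 Testing performance of the LMNN and BRNN
| | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | MSE |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Target values | 0.477 | 0.729 | 0.822 | 0.596 | 0.351 | 0.578 | 0.355 | 0.710 | 0.616 | 0.650 | 0.267 | |
| Estimated values by LMNN | 0.470 | 0.717 | 0.760 | 0.562 | 0.326 | 0.588 | 0.323 | 0.698 | 0.632 | 0.663 | 0.285 | 0.291 |
| Estimated values by BRNN | 0.364 | 0.700 | 0.714 | 0.521 | 0.300 | 0.423 | 0.281 | 0.602 | 0.616 | 0.601 | 0.235 | 0.333 |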
The mean squared error (MSE) is used to evaluate the performance of the NNs on 11 randomly selected test samples. The results given in Table 2 show the better performance of the LMNN. The costs predicted by the LMNN range from 0.285 to 0.760 and those predicted by the BRNN from 0.235 to 0.714, while the actual costs range from 0.267 to 0.822.
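As an illustration of the two measures used in this comparison, the sketch below computes the MSE and the Pearson correlation directly from the target and estimated values listed above. The helper names `mse` and `pearson` are hypothetical. The figures in the MSE column of the table may have been computed on a different scale, so the sketch is not meant to reproduce them; it only shows how the measures are defined and how the two networks compare on the listed values.

```python
from math import sqrt

# Target and estimated values as listed for the 11 test samples.
target = [0.477, 0.729, 0.822, 0.596, 0.351, 0.578, 0.355, 0.710, 0.616, 0.650, 0.267]
lmnn = [0.470, 0.717, 0.760, 0.562, 0.326, 0.588, 0.323, 0.698, 0.632, 0.663, 0.285]
brnn = [0.364, 0.700, 0.714, 0.521, 0.300, 0.423, 0.281, 0.602, 0.616, 0.601, 0.235]

def mse(actual, predicted):
    # Mean of the squared differences between actual and predicted values
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def pearson(a, b):
    # Pearson correlation coefficient between two equally long sequences
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)

for name, estimate in (("LMNN", lmnn), ("BRNN", brnn)):
    print(name, "MSE:", mse(target, estimate), "correlation:", pearson(target, estimate))
```

On these values the LMNN estimates lie closer to the targets than the BRNN estimates, which agrees with the ranking reported in the text.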
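Table 3 MSE of the linear and the exponential regression cost models
| | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | MSE |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Target values | 0.588 | 0.650 | 0.782 | 0.603 | 0.284 | 0.305 | 0.839 | 0.356 | 0.267 | 0.318 | |
| Estimated values (linear) | 0.633 | 0.744 | 0.884 | 0.706 | 0.336 | 0.315 | 0.926 | 0.416 | 0.309 | 0.261 | 0.5 |
| Estimated values (exponential) | 0.676 | 0.773 | 0.871 | 0.667 | 0.332 | 0.336 | 0.885 | 0.430 | 0.330 | 0.302 | 0.4 |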
Conclusion
This paper presented data-driven models consisting of ANNs and regression models to estimate the construction cost of spherical storage tank projects. The variables considered in these models were thickness, tank diameter, and length of the weld. The learning algorithms of the developed multilayer perceptron NNs were the Levenberg–Marquardt and the Bayesian regulated. Moreover, a GA was employed to find near-optimum values of the parameters of a linear as well as an exponential regression model. While it was shown that the NNs had better capabilities in cost estimation compared to the regression models, the LMNN performed better than the BRNN in terms of MSE. In general, we showed that ANN models could play a very important role in efficiently estimating construction project costs in their early stages.
The level of uncertainty can be reduced by increasing the number of samples in the training data. Moreover, careful data gathering is important to form a reliable and effective data set. For this purpose, valid and up-to-date resources should be available. In addition, based on the results obtained in the current study, choosing an appropriate model to describe the trend of the data is an important task in this regard.
Future work may focus on comparing the performance of the proposed ANN method with that of other ANN approaches hybridized with a metaheuristic such as GA, Bees Algorithm, Artificial Bee Colony, Ant Colony, etc.
References
American Society of Mechanical Engineers (ASME) Codes Sec VIII (2015). https://www.asme.org/shop/standards/newreleases/boilerpressurevesselcode/pressurevessels. Accessed 7 Feb 2016
Assareh E, Behrang MA, Assari MR, Ghanbarzadeh A (2010) Application of PSO (particle swarm optimization) and GA (genetic algorithm) techniques on demand estimation of oil in Iran. Energy 35:5223–5229
Cavalieri S, Maccarrone P, Pinto R (2004) Parametric vs. neural network models for the estimation of production costs: a case study in the automotive industry. Int J Prod Econ 91:165–177
De Soto BG, Adey BT (2015) Investigation of the case-based reasoning retrieval process to estimate resources in construction projects. Procedia Eng 123:169–181
Geem ZW (2011) Transport energy demand modeling of South Korea using artificial neural network. Energy Policy 39:4644–4650
Geem ZW, Roper WE (2009) Energy demand estimation of South Korea using artificial neural network. Energy Policy 37:4049–4054
Gonzalez RL (2008) Neural networks for variation problems in engineering. Ph.D. Thesis, Department of Computer Languages and Systems, Technical University of Catalonia
Gudarzi Farahani Y, Varmazyari B, Moshtaridoust S (2012) Energy consumption in Iran: past trends and future directions. Procedia Soc Behav Sci 62:12–17
Gwang H, SungHoon A (2004) Comparison of construction cost estimating models based on regression analysis, neural networks and case-based reasoning. Build Environ 39:1235–1242
Hasheminia H, Niaki STA (2006) A genetic algorithm approach to find the best regression/econometric model among the candidates. Appl Math Comput 183:337–349
Kim GH, Yoon JE, An SH, Cho HH, Kang KI (2004) Neural network model incorporating a genetic algorithm in estimating construction costs. Build Environ 39:1333–1340
Murat YS, Ceylan H (2006) Use of artificial neural networks for transport energy demand modeling. Energy Policy 34:3165–3172
Rodger JA (2014) Fuzzy nearest neighbor neural network statistical model for predicting demand for natural gas and energy cost savings in public buildings. Expert Syst Appl 41:1813–1829
Shtub A, Versano R (1999) Estimating the cost of steel pipe bending, a comparison between neural networks and regression analysis. Int J Prod Econ 62:201–207
Valipour M (2012) Ability of Box-Jenkins models to estimate of reference potential evapotranspiration (a case study: Mehrabad synoptic station, Tehran, Iran). IOSR J Agric Vet Sci (IOSRJAVS) 1:1–11
Valipour M (2013) Increasing irrigation efficiency by management strategies: cutback and surge irrigation. ARPN J Agric Biol Sci 8:1990–6145
Valipour M (2014) Application of new mass transfer formulae for computation of evapotranspiration. J Appl Water Eng Res 2:33–46
Valipour M (2016) How much meteorological information is necessary to achieve reliable accuracy for rainfall estimations? Agriculture 6:53
Valipour M, Montazar A (2012) An evaluation of SWDC and WinSRFR models to optimize of infiltration parameters in furrow irrigation. Am J Sci Res 69:128–142
Verlinden B, Duflou JR, Collin P, Cattrysse D (2008) Cost estimation for sheet metal parts using multiple regression and artificial neural networks: a case study. Int J Prod Econ 111:484–492
Wang J, Jiang D, Zhang Y (2002) A recurrent neural network for solving Sylvester equation with time-varying coefficients. IEEE Trans Neural Netw 13:1053–1063
Wang HS, Wang YN, Wang YC (2013) Cost estimation of plastic injection molding parts through integration of PSO and BP neural network. Expert Syst Appl 40:418–428
Zhang Y, Fuh J, Chan W (1996) Feature-based cost estimation for packaging products using neural networks. Comput Ind 32:95–113
Zima K (2015) The case-based reasoning model of cost estimation at the preliminary stage of a construction project. Procedia Eng 122:57–64
Copyright information
Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.