Review on stochastic modeling methods for building stock energy prediction

Growing urbanization has increased the risks to energy security and raised greenhouse gas emissions, prompting the need for urban energy demand prediction and management, in which building energy consumption is the dominant component. This paper reviews recent advances and the state of the art in modeling building stock energy consumption, including both top-down and bottom-up approaches. The study compares and summarizes the strengths and weaknesses of each primary method. Specific focus is placed on the bottom-up stochastic engineering modeling methods, which rest on a sound quantitative theoretical basis while accounting for uncertain real-world conditions. Stochastic building stock energy models account for the uncertainties that are the main limitation of existing building stock models. The modeling process used in current stochastic building stock energy models is discussed. Challenges and possible future directions for improving stochastic building stock energy models are examined.


Introduction
More than 50% of the world's population lived in urban areas in 2015 (WHO-UN Habitat 2016), and the urban share is expected to rise to 72% by 2050 (Department of Economic and Social Affairs 2012). Urban areas consume two-thirds of the world's energy and account for 71% of global greenhouse gas emissions (Birol 2008). In the U.S., residential and commercial buildings account for 22% and 19% of the total energy use, respectively (EIA 2016). Policymakers have been actively establishing regulations and incentives to promote strategies for energy and greenhouse gas reduction in urban areas. Supporting such decisions requires accurate and dynamic prediction and analysis of urban energy needs and development trends, especially for building stocks. However, extensive surveys and audits of urban building energy use are not only costly but also inadequate for predicting the dynamic characteristics of energy demand. Simulation tools, based on reasonable inputs from public databases, are thus necessary for this purpose. Validated modeling tools can also be used to predict and analyze the implications of new technologies, products, and policies for the current and future energy use of a city.
Building energy models have been used since the 1970s to predict building energy demands such as electricity and gas use. Current building energy models, including physical models (e.g., eQUEST, EnergyPlus, and TRNSYS) and data-driven black-box models (e.g., regression models, artificial neural networks (ANN), and support vector machines (SVM)), can only predict the energy and thermal performance of individual buildings. To predict the energy consumption of building stocks over time, researchers have developed a number of modeling methodologies. Swan and Ugursal (2009) grouped them into two categories: the top-down method and the bottom-up method. The top-down methods start with the aggregated energy consumption for a given region and time. These methods typically factor in the interrelationships between the energy sector and other variables such as econometric and technological factors. The aggregated energy use can then be divided into sections according to building function or spatial proximity. The bottom-up methods work at an individual level. They calculate the energy consumption of individual end-uses or buildings and then sum them up to represent the required region. The bottom-up methods can be further divided into two categories, the statistical and engineering approaches, as shown in Fig. 1. In this review, two sub-groups of the bottom-up engineering method are suggested: the deterministic and stochastic approaches.
This paper provides a comprehensive review of various methods and models for building stock energy prediction. Both the strengths and weaknesses of each approach are discussed. Specific focus has been paid to the bottom-up stochastic engineering building stock energy models, which hold solid and quantitative theoretical grounds as well as taking into account uncertain reality conditions.
The following sections of this paper provide the definitions and comparisons of all methods and models, as well as discuss the opportunities and challenges for improving these methods. Section 2 reviews remarkable and recent top-down and bottom-up building stock models. Section 3.1 discusses the uncertainty issues for the existing building stock energy models. The stochastic approaches to overcoming the uncertainty issues are then introduced and reviewed (Sections 3.2 and 3.3). Discussions on the challenges and applications of the stochastic bottom-up engineering building stock models are presented in Section 4.

Existing building stock energy prediction models
Top-down methods are primarily based on the statistical relationships between historical aggregated energy use and socio-economic factors such as population, fuel prices, climate conditions, and gross domestic product, and they show the connections between the energy sector and economic output (Swan and Ugursal 2009; Kavgic et al. 2010). Most of these top-down models can predict the macroeconomic performance of a building stock and the impacts of various "what-if" scenarios over time. The demand module of the National Energy Modeling System (NEMS) (EIA 2009) is the best-known top-down prediction tool. The purpose of NEMS is to forecast the energy, economic, environmental, and security impacts of alternative energy policies and different assumptions in the United States. Other examples of top-down models include the demand modules for the residential and commercial sectors in other integrated energy system models: MARKAL (Fishbone and Abilock 1981), TIMES (Loulou et al. 2005), AIM (Matsuoka et al. 1995), and GCAM (Edmonds et al. 1994; Kyle et al. 2010). The energy demands in these integrated energy system models are resolved only coarsely in time (e.g., annual), space (e.g., regional or national), and end use (e.g., total commercial, total heating energy) rather than at the level of individual buildings.
These top-down methods capture the interaction between the energy sector and other economic indicators, so they can reflect the impact of socio-economic policies. Because top-down models depend on historical data, however, it is difficult to use them to examine the consequences of specific technological advances and policies. Furthermore, they give only a coarse analysis of individual end-uses and do not explicitly consider the physical factors specific to each type of building. Therefore, top-down building stock models are not suitable for analyzing the relationships among building energy use, building design, and operational specifications.
Bottom-up methods calculate the energy consumption of individual end-uses, individual buildings, or groups of buildings, and then use these representative models to predict the regional or national energy consumption by different weighting approaches (Swan and Ugursal 2009). Bottom-up methods can be further divided into two sub-categories based on their modeling mechanisms: statistical and engineering-based (physical) methods.
Statistical methods use historical data and analyze the relationships between available building information and energy use data. These methods rely on energy utility billing data (e.g., electricity, gas) from energy providers and survey data that includes human behaviors and building characteristics. Researchers have studied diverse approaches to analyzing the building stock energy consumption utilizing this information.
Recent bottom-up statistical models have used geographic information systems (GIS) for data acquisition to build models and visualize results. Girardin et al. (2010) developed EnerGIS, based on GIS, to model the energy requirements of Geneva, Switzerland. A linear regression model was used to obtain the annual consumption for heating, cooling, electricity, and hot water production, and these predictions were then used to compute the required temperatures for the district heating system. Visualizing the energy consumption on a map facilitates a better understanding of energy planning and helps in analyzing the large-scale integration of renewable energy. Kolter and Ferreira (2011) presented a statistical approach to model energy consumption in residential and commercial buildings in Cambridge, MA, USA. They used utility data together with features from publicly available tax assessor and GIS data, such as building value, square footage, and building type. To develop the model, two statistical techniques were examined: linear regression and Gaussian process (GP) regression. While the GP method showed the best overall performance, the linear regression model was chosen for its succinctness and computational efficiency. Using the linear models, the authors developed a tool, EnergyView, which provides a visual map and compares a building's energy use to that of similar buildings. Howard et al. (2012) also used linear regressions to derive an annual energy use intensity (EUI) for several building functions in New York City. The estimated EUIs were apportioned into several end uses by ratios derived from the Residential Energy Consumption Survey (RECS) (EIA 2005) and the Commercial Building Energy Consumption Survey (CBECS) (EIA 2003). An interactive map of the spatial distribution of building energy use was produced based on GIS data (http://sel.columbia.edu). The model is unable to consider the construction type and building vintage for energy conservation measures, although they play a major role in identifying energy saving potential. Mastrucci et al. (2014) adopted GIS data to disaggregate the measured energy consumption from the aggregated postcode level to dwellings based on an archetype.
Most of the bottom-up statistical models are based on regression techniques. One advantage of statistical models is that they are relatively easy to build once enough information is available to attribute building energy consumption to the relevant building characteristics and data. Another benefit is that statistical models are capable of considering demographics and occupant behaviors, which have a significant influence on energy consumption. On the other hand, statistical methods have limitations when exploring new technologies. Some of the information needed for statistical models (e.g., energy data, building type, number of occupants) is not always accessible because of privacy restrictions or insufficient data. These limitations reduce the accuracy and usefulness of statistical models.
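As a minimal illustration of the regression idea behind these statistical models, the sketch below fits a one-variable ordinary least squares line relating floor area to annual energy use. The data values, the variable names, and the choice of a single predictor are hypothetical and chosen only for illustration; the reviewed studies use richer feature sets and real utility data.

```python
def fit_ols(xs, ys):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical billing records: floor area (m2) and annual energy use (kWh).
floor_area_m2 = [100.0, 150.0, 200.0, 250.0, 300.0]
annual_kwh = [12000.0, 17500.0, 23000.0, 27500.0, 34000.0]

slope, intercept = fit_ols(floor_area_m2, annual_kwh)

# Predict the annual energy use of a building that has no billing record.
predicted_kwh = intercept + slope * 180.0
print(f"slope = {slope:.1f} kWh/m2, predicted = {predicted_kwh:.0f} kWh")
```

Such a model can only reproduce relationships present in the historical data, which is the limitation noted above for assessing new technologies.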
Bottom-up engineering models (also called physics models) explicitly account for the energy consumption of individual end-uses based on equipment use, heat transfer, and thermodynamic relationships. They predict the energy consumption according to building properties such as geometry, envelope, climate, occupancy schedule, HVAC system, and equipment.
The conventional engineering building energy simulations are deterministic. The deterministic approach does not take into account the inherent uncertainties in the building and its subsystems. It always produces the same single output from given inputs. Bottom-up engineering building stock models using the deterministic approach are discussed in this section. The uncertainty issues in the deterministic methods and the stochastic methods will be examined in the next section.
Several studies predict building stock energy use by modeling every individual building of the stock (Li et al. 2015b; Cerezo Davila et al. 2016). This requires substantial time and effort for data collection, model development, and simulation (Reinhart et al. 2013; Sehrawat and Kensek 2014). Therefore, most deterministic engineering approaches use archetypes (also called representative buildings or prototype buildings). This approach uses the following typical procedure: (1) Classify the building stock into several categories based on building geometry, non-geometry properties (e.g., function), and microclimate conditions; (2) Develop archetype buildings that represent the specific building stock of each type; (3) Predict the unit energy consumption (energy use of one building or energy use intensity) for each archetype by using a building energy simulation tool; (4) Obtain the total energy consumption by aggregating the predicted unit energy consumptions with proper weighting factors (e.g., the number of units or the floor area in each type of building sector).
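The aggregation in step (4) can be sketched as a weighted sum. The example below is a minimal illustration that assumes floor area as the weighting factor; the archetype names, energy use intensities, and floor areas are hypothetical values, not data from any of the reviewed studies.

```python
from dataclasses import dataclass

@dataclass
class Archetype:
    name: str
    eui_kwh_per_m2: float   # simulated annual energy use intensity, from step (3)
    floor_area_m2: float    # stock floor area assigned to this archetype, the step (4) weight

def stock_energy_kwh(archetypes):
    """Aggregate archetype results: sum of EUI times floor area over all archetypes."""
    return sum(a.eui_kwh_per_m2 * a.floor_area_m2 for a in archetypes)

# Hypothetical stock with three archetypes.
stock = [
    Archetype("pre-1980 detached house", 200.0, 1_000_000.0),
    Archetype("post-1980 apartment", 120.0, 500_000.0),
    Archetype("office", 250.0, 200_000.0),
]

total_kwh = stock_energy_kwh(stock)
print(f"{total_kwh / 1e6:.0f} GWh/year")
```

Because each archetype contributes a single EUI value, any mismatch between the archetype and the buildings it stands for is carried directly into the total.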
Most research efforts on bottom-up engineering methods have focused on residential buildings using archetypes. In the UK, several residential building stock models used BREDEM (Building Research Establishment Domestic Energy Model) (Anderson et al. 1985) as a core calculation engine for single houses to predict energy use for each archetype (Shorrock and Dunster 1997; Jones et al. 2000, 2007; Johnston 2003; Boardman et al. 2005; Boardman 2007; Natarajan and Levermore 2007a,b; Firth et al. 2010). More detailed comparisons of the UK models are given in the reviews by Kavgic et al. (2010) and Oladokun and Odesola (2015). In Canada, the Canadian Residential Energy End-use Model (CREEM), developed by Farahbakhsh et al. (1998), was based on 16 house archetypes that were derived from data on 8,767 actual houses. In Japan, the residential energy model developed by Shimoda et al. (2003) used 20 archetypes and 23 household types for Osaka City's residential sector. The occupants' schedule determined by the national time use survey was used to obtain the hourly energy use for hot water supply and the dynamic cooling and heating loads. The total energy consumption of the residential stock was estimated by multiplying the simulated energy use by the number of households and then summing the results. This concept of stock modeling has been developed and employed in a series of studies (Shimoda et al. 2004, 2010a; Yamaguchi et al. 2008). These residential building stock models have core simulation engines tailored to residential buildings and specific regions.
The bottom-up engineering building stock models have been expanded to the commercial sector and, thanks to advances in mapping technologies, integrated with GIS platforms for the acquisition and presentation of data. Yamaguchi et al. (2007a) developed 612 archetypes for the commercial sector in Osaka City and proposed a district clustering approach that develops several representative districts based on dominant building types. The district-level energy system simulation model was applied to estimate the annual energy use intensity of each representative district (Yamaguchi et al. 2003, 2005, 2007b; Hashimoto et al. 2007; Yamaguchi and Shimoda 2010). The model can quantify the energy use intensity for each representative district, but not for a specific individual building. Heiple and Sailor (2008) created 30 archetypes in the commercial and residential sectors of Houston, U.S., using national survey data (RECS, CBECS). The energy intensities calculated by eQUEST (Hirsch 2006) were aggregated to the city scale, providing hourly and seasonal energy uses, which were visualized and compared with the results of the top-down approach in a GIS platform. Creating the set of representative building simulations is a time-intensive part of this method. Caputo et al. (2013) developed 56 detailed archetypes considering the form factor of buildings for the city of Milan, Italy. The potential reduction in energy consumption derived from EnergyPlus was visualized in the GIS framework at the urban scale.
As energy policy decisions require more detailed information to address future planning interventions, recent building stock models have been developed for finer temporal and spatial resolution. CitySim (Kämpf 2009) is software developed at EPFL to predict the energy demands of areas ranging from small neighborhoods to an entire city. It can calculate the on-site energy use for heating, cooling, and lighting based on multiple physical models such as RC (resistor-capacitor network) models, radiation models, plant and equipment models, and behavioral models. One interesting feature of CitySim is that it can take into account heat gain from direct and diffuse solar radiation as well as long-wave exchanges between the walls of each building. Therefore, CitySim requires a field survey to check the glazing ratio and facade state for the 3D model information. The CitySim model has been used for optimization (Vermeulen et al. 2013), analysis of the impact of occupants' stochastic behaviors, and combination with other platforms (Perez et al. 2012; Dorer et al. 2013). The model developed by Fonseca and Schlueter (2015) can provide spatial (building location) and temporal (hourly) dimensions of analysis using a GIS framework. Statistical and analytical methods were integrated to calculate the hourly energy use and temperature requirements of energy services in 172 building archetypes. Statistical clustering algorithms were also integrated to classify the buildings in the area. The model result was validated against measured and simulated data. Li et al. (2015b) proposed an urban-scale building energy model that integrates a simplified energy model with GIS. The model considers the impact of urban context, such as the urban heat island effect and mutual shading. However, these models do not include calibration for more accurate prediction. Cerezo Davila et al. (2016) developed an urban building energy model that can estimate the hourly demand load at the individual building level for Boston, U.S. In the process, all 92,000 buildings in Boston were assigned to 76 archetypes that were simulated using multicore computer clusters for three days. The bottom-up engineering methods need an immense effort for processing data, creating archetypes, and running simulations. However, they can provide hourly energy use from a specific location up to city-wide scale, with detailed energy classifications such as general building use type and fuel type.
Since bottom-up engineering models are built on the details of individual buildings, they can consider the system variables and the physical and behavioral characteristics of buildings. In these models, the building stock is divided into several types based on physical and functional characteristics. The energy use of each type is quantified using archetype buildings that represent that building stock type. As a result, the most outstanding advantages of the bottom-up engineering models are their capability to examine new technologies and their independence from historical energy use data. The use of fundamental thermodynamics and heat transfer models enables the assessment of various energy conservation measures and new technologies in the absence of historical energy data (Swan 2010). However, the bottom-up engineering models require more detailed building information to calculate the energy consumption. The various model assumptions made when accurate data are lacking can cause significant uncertainties in the building energy models. Furthermore, in archetype-based bottom-up engineering models, no single archetype can reasonably represent the entire building sector of the same type. A more detailed classification of archetypes is often needed to avoid this shortcoming, but this can impose a massive computational burden on the simulation process. Another shortcoming of the engineering models is the assumption of occupant schedules, which can significantly alter energy consumption. Moreover, these models do not include market interactions and tend to neglect the correlations between energy use and macroeconomic activity. To address some of these shortcomings, the engineering models may apply statistical survey data (Kavgic et al. 2010), which allows the effects of occupant behaviors and of technological and social trends to be incorporated.
However, the deterministic approach of engineering models has inherent constraints to accommodate distributions of various building attributes. Table 1 summarizes the advantages and disadvantages of each modeling approach based on literature (Swan and Ugursal 2009;Kavgic et al. 2010).

Uncertainty issues in existing building stock models
The limitations of existing building stock models can be categorized into five areas (Booth et al. 2012):
(1) Accuracy: the models are limited in their capability to imitate the assumed processes that define energy transfer and consumption in the building. Building energy models are not perfectly accurate, even engineering-based ones.
(2) Data collection: the quantity of data required for the models can be difficult and time-consuming to gather.
(3) Computational time: engineering-based models are more computationally expensive than statistical models.
(4) Uncertainty: existing engineering-based models are unable to consider the influence of uncertainties in the models.
(5) Flexibility: statistical models are less flexible than engineering-based models in handling diverse predictive situations and assessing new technologies for energy conservation. Engineering-based models overcome this limitation but are more computationally expensive.
Deterministic bottom-up engineering building stock energy models overcome some limitations of top-down and bottom-up statistical models in terms of flexibility. They can evaluate various energy conservation measures for a building stock without end-use energy billing information for the region. However, they are still unable to address the uncertainty issues that arise during modeling.
Even for a single building, the engineering-based energy model encounters significant uncertainty challenges. Building energy simulation models are complex and require comprehensive inputs. Some of these inputs may be difficult to collect, so assumptions must be made to operate the models. The problem becomes more significant when expanding the simulation model from an individual building to the building stock, as only a limited number of archetype buildings are built to represent the entire building stock. Although buildings within the same archetype share some main features (e.g., function or age), the representative building cannot by any means represent the wide actual distributions of building features such as geometric properties (height, total floor area, window-to-wall ratio) and materials. Finer classification is desirable but comes at additional computational cost. Reinhart and Cerezo Davila (2016) indicated that the building energy prediction might differ largely from measured data because of the uncertainties associated with the definition of archetypes. In the bottom-up engineering building stock model, the errors at the individual building level are higher than at the aggregate level, since the inaccuracies at individual levels tend to average out at the aggregate level.
Moreover, most of the bottom-up engineering methods employ the deterministic building energy models such as EnergyPlus, eQUEST, and ESP-r. These energy models provide only one deterministic output for one building with given inputs. Hence, the deterministic bottom-up engineering methods have inevitably large errors in predicting the energy use of a city with many buildings of different sizes, types, ages, functions, and operating conditions.

Stochastic building energy models for individual buildings
The stochastic approach was proposed to incorporate and handle the uncertainty factors of a building model. It uses deterministic simulation models to perform stochastic analysis, carrying out an uncertainty analysis or sensitivity analysis through the Monte Carlo sampling method (Lomas and Eppel 1992; Macdonald 2002; de Wit and Augenbroe 2002; Reddy et al. 2007; Hopfe 2009; Corrado and Mechri 2009; Eisenhower et al. 2012a). In this approach, the values of the input parameters are sampled randomly from a given range and then fed into the energy model. The uncertainty and sensitivity are analyzed through the correlations between the variations of the input and output parameters.
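The following sketch illustrates this Monte Carlo procedure under strong simplifying assumptions. The deterministic model here is a toy steady-state heating formula standing in for a full simulation engine, and the parameter ranges, floor area, and degree-day value are hypothetical. Inputs are sampled uniformly, propagated through the model, and then related to the output through Pearson correlation coefficients.

```python
import random
import statistics

def heating_kwh(u_value, efficiency, area_m2=500.0, hdd=3000.0):
    """Toy deterministic model: envelope loss over heating degree days, divided by system efficiency."""
    return u_value * area_m2 * hdd * 24.0 / 1000.0 / efficiency

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long samples."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)   # fixed seed so the run is repeatable
n_samples = 2000

# Sample each uncertain input from an assumed range.
u_samples = [rng.uniform(0.3, 0.6) for _ in range(n_samples)]      # W/(m2 K)
eff_samples = [rng.uniform(0.80, 0.95) for _ in range(n_samples)]  # dimensionless

# Run the deterministic model once per sampled input set.
outputs = [heating_kwh(u, e) for u, e in zip(u_samples, eff_samples)]

# Uncertainty analysis: spread of the output.
mean_kwh = statistics.fmean(outputs)
std_kwh = statistics.stdev(outputs)

# Sensitivity analysis: correlation of each input with the output.
corr_u = pearson(u_samples, outputs)
corr_eff = pearson(eff_samples, outputs)
print(f"mean = {mean_kwh:.0f} kWh, std = {std_kwh:.0f} kWh")
print(f"corr(U) = {corr_u:.2f}, corr(efficiency) = {corr_eff:.2f}")
```

In this toy setting the U-value dominates because its assumed range is proportionally much wider than that of the efficiency; with a real simulation engine only the function `heating_kwh` would change.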
Some studies attributed the building model uncertainty to occupants' behaviors and developed models that can explain the presence and behaviors of occupants (Page et al. 2007; Haldi and Robinson 2011). Findings from these studies suggest that building occupants' behaviors have a significant impact on building energy performance.
Other studies focused more on the uncertainty arising from the distributions of various building parameters. The Monte Carlo method has been used to represent the distribution of various buildings in a building stock (Nishio and Asano 2006; Smith 2009; Korolija et al. 2013). The variety of physical characteristics of buildings is considered in order to establish probable distributions. In these studies, however, validation and calibration against measured data were not performed.
Calibration tunes the inputs of a building energy model so that the model outputs match the observed data. A detailed review of the calibration techniques for building simulation can be found elsewhere (Coakley et al. 2014; Fabrizio and Monetti 2015). Calibration can be conducted with either a deterministic or a probabilistic method. The deterministic method obtains the input parameters by manual trial and error, which is very time-consuming and makes the result dependent on the user. By contrast, the probabilistic approach can deal with this problem efficiently by handling the unknown inputs as random parameters with probability density functions (Tian et al. 2014).
For the probabilistic calibration, Kennedy and O'Hagan (2001) proposed a generic approach for the Bayesian calibration of computer models. The authors noted the Bayesian calibration improved traditional approaches in two ways. First, the predictions can consider all sources of uncertainties, including the remaining uncertainty over the fitted parameters. Second, the Bayesian calibration method attempts to correct any inadequacy of the model that is revealed by a discrepancy between the model predictions and the measured data from even the best-fitting parameters.
Bayesian inference is a statistical method that utilizes Bayes' theorem in Eq. (1) to obtain a posterior distribution for unknown parameters (θ) given the observed data (y). The model parameters are considered to be uncertain and have a probabilistic distribution based on their plausible values. The uncertain parameters of the energy model are updated to match the model prediction with the observed data. As a result, Bayesian calibration provides the posterior distribution p(θ | y) in the form of a plausible distribution of the calibration parameters.
$$p(\theta \mid y) \propto p(y \mid \theta)\, p(\theta) \qquad (1)$$
where p(θ) is the prior distribution assigned to the uncertain parameters and p(y | θ) is a likelihood function that measures how closely the model predictions match the observed data.
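To make the role of the prior and the likelihood in Bayes' theorem concrete, the sketch below calibrates a single parameter of a toy model with a random-walk Metropolis sampler. Everything in it is an assumption made for illustration: the model y = θ·x (energy proportional to degree days), the six data points, the uniform prior on [0, 5], the Gaussian observation error with standard deviation 5, and the sampler settings. It is not the procedure of any study reviewed here.

```python
import math
import random
import statistics

# Hypothetical monthly data: heating degree days (x) and metered energy in kWh (y).
degree_days = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
metered_kwh = [22.0, 37.0, 63.0, 78.0, 104.0, 118.0]

def log_prior(theta):
    """Uniform prior on [0, 5]; the bounds are an assumption of this sketch."""
    return 0.0 if 0.0 <= theta <= 5.0 else -math.inf

def log_likelihood(theta, sigma=5.0):
    """Gaussian observation error around the toy model y = theta * x."""
    return sum(-0.5 * ((y - theta * x) / sigma) ** 2
               for x, y in zip(degree_days, metered_kwh))

def metropolis(n_iter=5000, step=0.05, start=1.0, seed=1):
    """Random-walk Metropolis sampler whose target is prior times likelihood."""
    rng = random.Random(seed)
    theta = start
    log_post = log_prior(theta) + log_likelihood(theta)
    chain = []
    for _ in range(n_iter):
        proposal = theta + rng.gauss(0.0, step)
        if log_prior(proposal) > -math.inf:
            log_post_new = log_prior(proposal) + log_likelihood(proposal)
            # Accept with probability min(1, posterior ratio).
            accept_prob = math.exp(min(0.0, log_post_new - log_post))
            if rng.random() < accept_prob:
                theta, log_post = proposal, log_post_new
        chain.append(theta)
    return chain

chain = metropolis()
posterior = chain[1000:]                      # discard the burn-in part of the chain
posterior_mean = statistics.fmean(posterior)
posterior_sd = statistics.stdev(posterior)
print(f"theta = {posterior_mean:.3f} +/- {posterior_sd:.3f}")
```

The output is a set of posterior samples rather than a single tuned value, which is what allows the remaining parameter uncertainty to be carried into later predictions.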
Heo (Heo 2011; Heo et al. 2012) applied Bayesian calibration to the domain of building energy simulation. A normative energy model (based on the CEN-ISO standard (ISO 2004)) and Bayesian calibration were used to account for uncertainties in the retrofit analysis of existing individual buildings. Normative models approximately represent the energy performance of building systems with a small number of macro-level inputs based on a simplified description of a building and its systems. Therefore, they can drastically reduce the computational cost of the modeling and calibration process, although they are limited in their ability to make fine adjustments to a building's subsystems and components. Figure 2 illustrates the calibration process. The first step involves the quantification of uncertain parameters in the energy model based on expert knowledge collected from surveys, standards, and technical reports. The second step is a parameter screening to select the dominant parameters for calibration. In the third step, the selected parameters are calibrated using the given prior distributions for the parameters, the utility data, and the normative building energy model. Then, the calibrated model with the resulting posterior distribution is validated against the utility data. Last, the validated model propagates the uncertainty to compare energy conservation measures (ECMs). According to the results, the accuracy of the calibrated normative model is comparable to that of the calibrated transient model (EnergyPlus), while requiring much less computational time.
The Bayesian calibration procedure has been used in individual building analysis for calibration of unknown inputs (Heo et al. 2013, 2015a; Li et al. 2015a, 2016; Kang and Krarti 2016; Berger et al. 2016), retrofit analysis (Heo et al. 2013, 2015a), comparison with traditional calibration methods (Pavlak et al. 2013), use of simplified models (Kim et al. 2013; Pavlak et al. 2013), influence of uncertainties in the input data (Heo et al. 2015b), determination of informative energy data (Tian et al. 2016), and meta-model comparison (Kim 2016; Li et al. 2016).

Fig. 2
Procedure of Bayesian calibration for retrofit analysis by Heo (2011)

Stochastic building energy models for building stock
There have been few attempts to apply probabilistic calibration to building stock. Booth et al. (2012) discussed three sources that can cause uncertainties in building stock models: the first is uncertainty from the model accuracy, that is, how adequately the model represents the true process of energy in a building. Second, it arises from different building characteristics according to various building types. Finally, within a homogeneous building group, the uncertainty stems from two factors: differences in outcome due to random variation and a lack of knowledge on certain input parameters.
To consider the uncertainty of building stock models, Booth et al. (2012) suggested the Stochastic Urban Scale Domestic Energy Model (SUSDEM). SUSDEM is also based on the Bayesian calibration process of Kennedy and O'Hagan (2001). It used the Energy Performance Standard Calculation Toolkit (EPSCT) developed at the Georgia Institute of Technology (Lee et al. 2011) as the energy simulation tool because of its calculation speed and its ability to assess the impact of technological interventions. The overall process is similar to Heo's method (Heo 2011; Heo et al. 2012) for individual buildings. However, the calibration was performed against the average value of the building stock, and a bias function was added to the model to reduce the difference between the observed data and the computer model outputs. The equation below expresses the relationship between the observed data and the computer model output:
$$z_i = \rho\,\eta(x_i, \theta) + \delta(x_i) + e_i$$
where z_i is the observed data, ρη(x_i, θ) represents the emulator for the computer model, and e_i indicates the observation error. In this model, the bias function, δ(x_i), is dependent on the average daily external temperature and is used to account for the inability of the model to fully represent the true process.
The case study of 35 flats in the UK was performed to demonstrate the method. There are five steps in the Bayesian calibration.
Step 1 is the selection of the calibration parameters. The Factorial Sampling Analysis (FSA, or the Morris method) was used to select the dominant parameters. The authors indicated that calibrating too few parameters means that the uncertainty in the other (un-calibrated) parameters is subsumed into the chosen (calibrated) parameters, which distorts the physical meaning of the resulting posterior distributions. Calibrating too many parameters reduces the effectiveness and accuracy of the Bayesian inference.
Step 2 is a quantification of prior uncertainty. The initial distributions for calibration parameters were estimated based on any prior knowledge or assumption.
Step 3 is the formulation of the calibration framework. The authors used a statistical model (known as an emulator or meta-model) to replace the building energy simulation model.
Step 4 is the Bayesian calibration process. The posterior distribution of each unknown parameter was derived. The representative model was calibrated to the average daily value of the measured energy consumption, not to the energy consumption of each of the 35 flats.
Step 5, the last step, is the validation. The calibrated output is compared against the observation data.
The results showed that the Bayesian calibration almost eliminated any discrepancy between the observation data and the normative model output. However, since the calibration was performed to estimate the average energy usage of all the flats, the posterior distributions of the parameters could not represent the distribution of the parameters across all flats. The proposed SUSDEM was further refined and extended to a large scale by applying it to approximately 15,000 houses in the UK. Energy savings from retrofit measures were predicted considering the installation costs, the future prices of energy, the lifetime carbon savings, and increased thermal comfort.
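The parameter screening in Step 1 can be illustrated with a simplified elementary-effects calculation in the spirit of the Morris method. This is only a sketch: it perturbs one parameter at a time from random base points instead of building the trajectory design of the original method, and the toy model, the unit parameter ranges, and the step size are hypothetical.

```python
import random

def toy_model(x):
    """Hypothetical response: strong effect of x[0], weak nonlinear effect of x[1], no effect of x[2]."""
    return 10.0 * x[0] + 1.0 * x[1] ** 2 + 0.0 * x[2]

def mean_abs_elementary_effects(model, n_params, n_repeats=50, delta=0.25, seed=0):
    """Mean absolute elementary effect of each parameter on the unit hypercube."""
    rng = random.Random(seed)
    sums = [0.0] * n_params
    for _ in range(n_repeats):
        # Random base point, kept far enough from the upper bound to allow a +delta step.
        base = [rng.uniform(0.0, 1.0 - delta) for _ in range(n_params)]
        y_base = model(base)
        for i in range(n_params):
            moved = list(base)
            moved[i] += delta                      # perturb one parameter at a time
            sums[i] += abs((model(moved) - y_base) / delta)
    return [s / n_repeats for s in sums]

mu_star = mean_abs_elementary_effects(toy_model, n_params=3)
ranking = sorted(range(3), key=lambda i: mu_star[i], reverse=True)
print("mean |EE| per parameter:", [round(m, 2) for m in mu_star])
print("parameters ranked by influence:", ranking)
```

Parameters with a small mean absolute effect, such as the third one here, would be fixed at nominal values, and only the dominant ones would be passed on to the Bayesian calibration.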
Three primary challenges exist in current stochastic modeling of building stock energy use: (a) lack of building details for massive infrastructures (e.g., thermal properties such as insulation and glazing); (b) uncertainty caused by human behavior (e.g., occupant schedules, equipment power density); and (c) lack of actual building energy use data for validation or calibration to reduce errors. Obtaining energy usage data is one of the main challenges for calibration. In most cases, public data is only available at the macro level (i.e., at the district, urban, or national level). Another study suggested a method for calibrating micro-level models using macro-level data. A hierarchical framework was proposed that combines regression analysis and Bayesian inference. A top-down stochastic method was employed to infer the average energy consumption of different dwelling types from aggregated macro-level energy statistics. The top-down analysis at the macro level provided the data for calibrating the input parameters of the bottom-up stochastic engineering models at the micro level (i.e., at the individual building level). Tian and Choudhary (2012) applied the process to school buildings in London, U.K. (see Fig. 3). In the parameter estimation step, the authors considered two approaches for implementing the calibration: the linear inverse problem and Bayesian inference. The main difference between the two approaches lies in the assumption about the prior distributions. The linear inverse problem can only set the prior to a uniform distribution bounded by an upper and a lower limit, whereas Bayesian inference allows the shape and range of the prior distributions to be specified. The advantage of Bayesian calibration relative to the linear inverse problem is that one can use existing knowledge of the building stock from previous surveys, research, and reports. The distributions of the input parameters were inferred from the observed energy consumption data.
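The difference in prior assumptions between the two approaches can be made concrete with a short sketch. The parameter (a wall U-value), its bounds, and the survey-based mean and standard deviation are hypothetical values chosen only for illustration; the truncated normal is one possible informative shape, not the one used in the cited study.

```python
import math

def log_prior_uniform(value, lower, upper):
    # Flat prior between two bounds, the only form available
    # to the linear inverse problem.
    if lower <= value <= upper:
        return -math.log(upper - lower)
    return -math.inf

def log_prior_truncated_normal(value, lower, upper, mean, sd):
    # Informative prior for Bayesian inference: a normal shape taken from
    # prior knowledge (e.g., a survey), cut off at the same bounds.
    # The normalising constant is omitted, which is sufficient for MCMC.
    if lower <= value <= upper:
        return -0.5 * ((value - mean) / sd) ** 2
    return -math.inf

# Hypothetical wall U-value in W/(m2 K): bounds 0.3 to 2.0,
# assumed survey mean 1.0 and standard deviation 0.3.
LOWER, UPPER, MEAN, SD = 0.3, 2.0, 1.0, 0.3

flat_mid = log_prior_uniform(1.0, LOWER, UPPER)
flat_edge = log_prior_uniform(1.9, LOWER, UPPER)
informed_mid = log_prior_truncated_normal(1.0, LOWER, UPPER, MEAN, SD)
informed_edge = log_prior_truncated_normal(1.9, LOWER, UPPER, MEAN, SD)
```

Under the uniform prior every value inside the bounds is equally plausible, whereas the informative prior favors values close to the assumed survey mean.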
The inferred distributions of the input parameters were used to quantify the benefits of energy conservation measures (ECMs) for the building stock. The calibration procedure for the individual building is similar to Heo's procedure (Heo 2011; Heo et al. 2012). However, in this study, the calibration procedure was repeated to represent the building stock of secondary schools in London. The authors derived 2,000 samples from the energy consumption data in the technical report. Then, a calibration with 5,000 MCMC iterations was performed for each of the 2,000 sampled energy values. Since the resulting total number of samples is extremely large (5,000 × 2,000), the authors resampled 10,000 sets from the full set. Although a large number of calibrations must be run, this method reduces the simulation run-time by using a simple surrogate model (a linear regression model).
Fig. 3 Procedure of probabilistic building stock energy model proposed by Tian and Choudhary (2012)
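The pooling and resampling described above can be sketched at a reduced scale. The function `placeholder_calibration` is a hypothetical stand-in that simply draws values around a scaled energy figure; it is not an MCMC calibration, and sampling without replacement is an assumption, since the source does not state how the 10,000 sets were drawn.

```python
import random

# Counts reported in the text.
N_ENERGY_SAMPLES = 2000
N_MCMC_ITERATIONS = 5000
N_RESAMPLED = 10000
total_draws = N_ENERGY_SAMPLES * N_MCMC_ITERATIONS  # size of the full pooled set

def placeholder_calibration(energy_value, n_draws, rng):
    # Hypothetical stand-in for one calibration run: returns n_draws
    # "posterior" values of a parameter for one sampled energy value.
    return [rng.gauss(energy_value / 100.0, 0.1) for _ in range(n_draws)]

def pooled_resample(energy_values, n_draws, n_keep, seed=0):
    # Pool the draws of every calibration, then keep a random subset.
    rng = random.Random(seed)
    pool = []
    for value in energy_values:
        pool.extend(placeholder_calibration(value, n_draws, rng))
    return rng.sample(pool, n_keep)  # assumed: sampling without replacement

# Scaled-down run: 200 energy samples x 50 draws, resampled to 1,000 values.
data_rng = random.Random(1)
energy_values = [data_rng.gauss(150.0, 30.0) for _ in range(200)]
resampled = pooled_resample(energy_values, n_draws=50, n_keep=1000)
resampled_mean = sum(resampled) / len(resampled)
```

The resampled set retains the spread contributed by both the individual calibrations and the variation between the energy samples, while being far smaller than the full pooled set.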
In the study by Yamaguchi et al. (2013) (see Fig. 4), which quantified the energy use of supermarkets at the city scale, the authors also used the Bayesian calibration framework of Kennedy and O'Hagan (2001) and the building simulation and calibration process of Heo et al. (2012) and Booth et al. (2012). They stated that previous applications of the probabilistic approach were limited to calibrating the model parameters to a single resolution of simulation output (e.g., monthly energy consumption). The authors also noted that the uncertainty of the input parameters might change at different time resolutions. Their study revealed that building insulation performance might have a considerable impact on seasonal and weekly energy consumption, while its effect on annual energy use might be modest. The authors proposed a hierarchical calibration that considers annual and weekly parameter variations. In this calibration method, they selected parameters for annual and weekly variations to calibrate the developed archetype energy model. The input parameters were then calibrated against the weekly and annual energy consumption. The proposed calibration method can consider both the overall characteristics of the building stock and the influence of meteorological conditions on seasonal variations in energy consumption.
Fig. 4 Proposed calibration process by Yamaguchi et al. (2013)
Kim et al. (2015) introduced a "lifestyle factor" in the optimization method to consider the combined effect of occupancy variability, such as occupant presence, set-point temperature operation, lighting schedules, and equipment use. The "lifestyle factor" is treated as a stochastic variable together with four other variables: cooling coefficient of performance (COP), set-point temperature, internal gain, and infiltration. By considering the occupant factor, they reproduced a more realistic EUI distribution for 2,182 apartment units in Korea than a deterministic method.
Zhao et al. (2016) proposed an approach to replicate the energy use of the office building stock in Chicago, U.S. from survey data. This method used a linear inverse problem to derive the design parameters of the buildings. First, the authors defined 30 unknown parameters and quantified their prior uncertainty. After a sensitivity analysis, only 12 of the 30 parameters were chosen as calibration parameters. Multiple linear regression models using the 12 selected parameters were built to stand in for the normative building energy model and reduce the computational effort. If the inverse problem is overdetermined, a set of values for the unknown parameters can be sampled by approximately satisfying the linear equation using Markov Chain Monte Carlo (MCMC). In the validation, the predicted EUI distribution from the inverse problem was approximately identical to the EUI distribution of CBECS 2003 office buildings (765 samples, adjusted for Chicago's climate). However, the estimated parameter distributions were significantly different from the actual parameter distributions in CBECS 2003. The authors indicated that the predicted building parameter distributions should not be considered the real values, but a "best guess" of the real distributions that reproduces a similar real-world outcome. This "best guess" can be further improved by using more measurement data to quantify the measurement errors between the data and the model.
Sokol et al. (2017) compared stochastic building stock energy modeling based on Bayesian calibration with deterministic simulations whose inputs were taken from the literature, from building data, or from variables adjusted using energy data. The deterministic archetype method was unable to represent the measured EUI distribution. In the Bayesian calibration, calibrating with monthly energy data predicted seasonal trends more accurately than calibrating with annual energy data.
Table 2 lists stochastic building energy models and main features.

Differences in stochastic models for individual building and building stock
A review of stochastic building models reveals several differences between models for individual buildings and models for building stocks.

Development of representative building model
In the calibration for an individual building, the purpose of building energy modeling is to construct an energy model that is as similar as possible to the actual target building. In a building stock, the buildings in the target district are classified into several sub-groups. The representative archetype of each sub-group should cover all buildings of that type. The most common characteristics, such as the average building geometry and the most widely used type of HVAC system, can be used to build the representative building energy model.

Uncertainty in input parameters
The prior distributions for input parameters are treated as uncertainties of the input parameters in the individual building models. The uncertainties are caused by insufficient building information, construction error, or human behaviors. On the other hand, in the building stock model, the

Iteration of Bayesian calibration
Bayesian calibration for an individual building uses one set of energy data for the target building. If the building energy model is robust, the estimated posterior distribution of each input parameter can be regarded as the actual target value.
In a building stock, the Bayesian calibration is repeated for several energy data points sampled from the building energy distribution. The distributions obtained from each Bayesian calibration are then combined. This is similar to the kernel density estimation procedure (Shawe-Taylor and Cristianini 2004), as shown in Fig. 5. Therefore, the Bayesian calibration for a building stock requires more computational time.
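The analogy with kernel density estimation can be sketched as follows. Each calibration result is approximated here by a Gaussian with its own mean and standard deviation, which is an assumption made only for illustration, and the stock-level density is taken as their equal-weight mixture. The three (mean, standard deviation) pairs are hypothetical.

```python
import math

def normal_pdf(x, mean, sd):
    # Density of a normal distribution at x.
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def mixture_density(x, posteriors):
    # Equal-weight mixture of the individual posterior densities.
    # In kernel density estimation, each kernel plays the role that
    # one calibrated posterior plays here.
    return sum(normal_pdf(x, mean, sd) for mean, sd in posteriors) / len(posteriors)

# Hypothetical posteriors (mean, sd) from three separate calibrations.
posteriors = [(0.8, 0.10), (1.0, 0.15), (1.3, 0.10)]

# Numerical check that the combined curve is still a probability density.
grid_step = 0.001
grid = [i * grid_step for i in range(2501)]  # 0.0 to 2.5
area = sum(mixture_density(x, posteriors) for x in grid) * grid_step
```

Unlike standard kernel density estimation, the individual components need not share a common bandwidth, since each calibration yields its own posterior spread.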

Aggregation
Bayesian calibration for a building stock requires an aggregation process. There are two types of aggregation. The first is the aggregation from the representative building to the building stock: the posterior distributions from each Bayesian calibration iteration are combined to represent one type of building stock. The second is the aggregation of different building types: if there are several building classifications, the results for each building type are combined to represent the whole building stock.

Issues with existing stochastic building stock energy models
Probabilistic methods have been used in building energy simulation to address the uncertainties of deterministic building stock energy models. The Monte Carlo method enables a deterministic building energy model to produce probabilistic distributions. The use of calibration improves the predictive capability of the building stock model. Stochastic calibration methods using engineering-based models allow users to take into account the uncertainties of unknown input information in representative building models. Moreover, they can analyze the impacts of new energy conservation measures by taking advantage of the features of engineering-based models. However, limitations still exist in stochastic building stock energy models. This section therefore discusses the limitations of current stochastic building stock energy models and makes suggestions for improvement.

Computational time
Stochastic calibration methods such as Bayesian inference and the inverse problem incur a high computational cost. MCMC is often used to draw posterior distributions for unknown parameters. The number of MCMC iterations varies with model characteristics such as the number of unknown parameters and the range of the prior distributions. A single MCMC process for an individual building requires thousands to tens of thousands of iterations: 4,000 iterations in Kim et al. (2013), 5,000 in Tian and Choudhary (2012), 20,000 in Tian et al. (2014), and 25,000 in Zhao (2012). When the MCMC process is applied to a building stock, additional iterations are required to represent the distribution of the building stock.
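A rough estimate shows how quickly the cost grows at the stock level. The sketch below assumes one model evaluation per MCMC iteration and uses two hypothetical evaluation times (60 s for a dynamic simulation and 0.001 s for a regression surrogate); the iteration and calibration counts are those reported for Tian and Choudhary (2012).

```python
def calibration_model_runs(mcmc_iterations, n_calibrations=1):
    # Assumes one model evaluation per MCMC iteration.
    return mcmc_iterations * n_calibrations

def run_time_hours(model_runs, seconds_per_run):
    return model_runs * seconds_per_run / 3600.0

single_building_runs = calibration_model_runs(5000)
stock_runs = calibration_model_runs(5000, n_calibrations=2000)

# Hypothetical evaluation times, for illustration only.
hours_dynamic = run_time_hours(stock_runs, seconds_per_run=60.0)
hours_surrogate = run_time_hours(stock_runs, seconds_per_run=0.001)
```

Under these assumptions the stock-level calibration needs 10,000,000 model evaluations, which corresponds to roughly 167,000 hours with the dynamic simulation but under 3 hours with the surrogate.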
In an attempt to reduce computational time for dynamic building energy simulation, the following approaches have been proposed: simplified energy models, parameter screening, and meta-models.
Simplified models use simple descriptions of a building and its systems. They can drastically reduce simulation time compared to dynamic building energy simulation tools such as EnergyPlus, eQUEST, and TRNSYS. The normative model (e.g., EPSCT (Lee et al. 2011)) was employed in several studies (Heo 2011; Zhao 2012; Heo et al. 2012; Booth et al. 2012; Kim et al. 2013). A simple hourly method based on the RC network used in CEN-ISO 13790 was employed to calibrate and optimize the unknown parameters (Jacob et al. 2010; Pavlak et al. 2013; Henze et al. 2014). Simplified models often use few input data and a limited set of equations. They usually do not take into account multi-zone interactions in the building, which can cause large errors. Such simplified models are therefore limited in their ability to estimate the impact of new technologies.
Parameter screening performs a sensitivity analysis to identify the dominant parameters (inputs) affecting building energy use (output) in energy simulation models and observational studies. By selecting only key variables as calibration parameters, one can reduce the number of iterations in the MCMC process. For detailed reviews of sensitivity analysis, please refer to Saltelli et al. (2000), Tian (2013), and Pianosi et al. (2016). Menberg et al. (2016) compared three sensitivity analysis methods in terms of computational cost and extractable information. However, the importance of the parameters varies with the sensitivity analysis method and the target output (e.g., total, electricity, or gas energy). There is a need for a method that can determine the importance of parameters while considering various sensitivity methods and outputs. Moreover, there is insufficient research on how many parameters should be used for the calibration. Booth et al. (2012) pointed out that selecting the number of calibration parameters is a "balancing act." Reducing the number of calibration parameters causes the uncertainty in the uncalibrated parameters to be subsumed into the calibrated parameters. On the other hand, increasing the number of calibration parameters reduces the accuracy and effectiveness of the Bayesian calibration (Booth et al. 2012). It is necessary to study how the computational cost and accuracy of the model depend on the number of parameters selected.
A meta-model (also called a surrogate model) can be defined as a "model of model" (Eisenhower et al. 2012b), which is simpler and computationally faster than the original model. Different meta-models have been applied to reduce simulation time in many studies: multiple linear regression models (Zhao 2012; Tian et al. 2014; Manfren et al. 2013; Tian and Choudhary 2012), Gaussian process emulators (Manfren et al. 2013; Heo 2011; Heo et al. 2012; Booth et al. 2012), and support vector machines (Eisenhower et al. 2012b). Wei et al. (2015) investigated the predictive performance of six meta-models (full linear, Lasso, MARS, SVM, bagging MARS, and boosting) developed from measured data. Tian et al. (2015) compared the accuracy of eight meta-models for 114 campus buildings. Kim (2016) compared the prediction capability of the Gaussian process emulator and polynomial chaos expansion for uncertainty quantification.
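The parameter screening discussed above can be illustrated with a simplified one-at-a-time variant of the elementary effects idea behind the Morris method. This is a sketch under stated assumptions: `toy_energy_model` is a hypothetical stand-in for a building energy model with inputs scaled to [0, 1], and random base points are used instead of the structured trajectories of the full Morris design.

```python
import random

def toy_energy_model(params):
    # Hypothetical response: strong U-value effect, moderate infiltration
    # effect, weak lighting effect, plus one interaction term.
    u_value, infiltration, lighting = params
    return (50.0 * u_value + 20.0 * infiltration + 1.0 * lighting
            + 10.0 * u_value * infiltration)

def mean_abs_elementary_effects(model, n_params, n_base_points=20,
                                delta=0.1, seed=0):
    # For each random base point, perturb one input at a time by delta and
    # record the absolute change in output per unit change in input.
    rng = random.Random(seed)
    totals = [0.0] * n_params
    for _ in range(n_base_points):
        base = [rng.uniform(0.0, 1.0 - delta) for _ in range(n_params)]
        y_base = model(base)
        for i in range(n_params):
            shifted = list(base)
            shifted[i] += delta
            totals[i] += abs(model(shifted) - y_base) / delta
    return [total / n_base_points for total in totals]

mu_star = mean_abs_elementary_effects(toy_energy_model, n_params=3)
# Parameter indices ordered from most to least influential.
ranking = sorted(range(3), key=lambda i: mu_star[i], reverse=True)
```

In this toy case the third input has a mean absolute effect close to 1, far below the other two, so it would be a natural candidate to fix at a nominal value rather than calibrate.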
Only a few studies have examined the effect of meta-model accuracy on the calibration. Li et al. (2016) compared calibration performance using a Gaussian process emulator and multiple linear regression models. However, further research is required to determine which meta-models are suitable for the calibration of building stock models. Their effect on the calibration results should be compared in terms of accuracy and computational cost.
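The trade-off between meta-model simplicity and accuracy can be seen in a minimal sketch. Here `slow_simulator` is a hypothetical, mildly nonlinear stand-in for a detailed simulation with a single input, and the surrogate is an ordinary least-squares line fitted to 50 training runs. A study-grade surrogate would involve many inputs and a proper validation set.

```python
import random

def slow_simulator(u_value):
    # Hypothetical stand-in for a detailed simulation:
    # EUI (kWh/m2/yr) as a mildly nonlinear function of the wall U-value.
    return 80.0 + 60.0 * u_value + 5.0 * u_value ** 2

def fit_linear(xs, ys):
    # Ordinary least squares for one input: returns (intercept, slope).
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Training runs of the "slow" model over an assumed input range.
rng = random.Random(0)
train_x = [rng.uniform(0.3, 2.0) for _ in range(50)]
train_y = [slow_simulator(x) for x in train_x]
intercept, slope = fit_linear(train_x, train_y)

def surrogate(u_value):
    # Cheap linear meta-model used in place of slow_simulator.
    return intercept + slope * u_value

# Surrogate error at a few check points inside the training range.
check_x = [0.5, 1.0, 1.5]
max_abs_error = max(abs(surrogate(x) - slow_simulator(x)) for x in check_x)
```

The residual error of the surrogate enters the calibration as an additional discrepancy, which is one reason the choice of meta-model may affect the posterior distributions.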

Building stock information for representative building
Building stock models require a wide range of information, including geometric and non-geometric factors, to develop representative buildings (archetypes). In practice, such building stock information is rarely available for classifying and developing representative building models. Most existing stochastic building stock models are limited to one building type and were developed with sufficient building stock information. If information is sufficient, buildings are classified into archetypes based on building properties such as building function and age. Ballarini et al. (2014) classified residential archetypes at the European level by location, age, building size, and shape. Österbring et al. (2016) integrated measured energy use and envelope area from a 2.5D GIS model with building characteristics for an age-type building stock classification. Even when the classification is based on sufficient building stock information, it may be influenced by the analyst's subjective decisions. More research is required to provide appropriate and robust criteria for classifying archetypes. Classification and clustering methods using machine learning may be considered. On the other hand, if data are missing from the building stock information, further study is required on how to obtain alternative information to substitute for the required data. It is also necessary to explore how such insufficient information affects the accuracy of the building stock model.

Uncertainty in human behavior
Human behavior remains the greatest source of uncertainty in building energy modeling. Input parameters such as occupant presence schedules, appliance operation, equipment power density, lighting power density, and heating and cooling set-point temperatures are related to human behavior. In particular, the occupant schedule is one of the most important factors for classifying building types. Such parameters have a significant impact on energy use in individual buildings (Hong and Lin 2013; Clevenger et al. 2014; Silva and Ghisi 2014). However, in building energy modeling, human behavior has traditionally been represented using deterministic, simplified hourly schedules and peak loads. Much research is underway to characterize the uncertainty of human behavior in buildings. Most of this research has been based on surveys, but in recent years there have been attempts to develop stochastic occupancy profiles for individual buildings using occupancy sensors (Duarte et al. 2013; Wang et al. 2016; Diraco et al. 2015), Bluetooth positioning (Zhao et al. 2014), and random processes (Chen et al. 2015; O'Neill and Niu 2017). Identifying actual occupancy schedules may contribute to accurate building energy forecasting and occupancy-based control.
For the building stock, Evins et al. (2016) confirmed that physical variables such as fabric properties and geometries have a greater impact on energy use than behavioral variables such as occupancy and lighting. However, they found that stochastic schedules produce a much smoother hourly energy use profile because of the interactions between the different occupant and appliance profiles. They also recommended using stochastic rather than deterministic profiles when temporal behavior is important, such as when finding peak loads. He et al. (2015) emphasized the need for stochastic occupancy profiles for energy analysis at high temporal resolution (e.g., hourly), showing that deterministic profiles produce unrealistic peak loads. When thermal demands from deterministic and stochastic schedules were compared, there were substantial differences in hourly demand profiles but relatively small differences in total daily demands. As pointed out by He et al. (2015), for energy analysis with low temporal resolution (e.g., annual) or over large-scale regions, the effect of human behavior may not be significant because of an averaging effect. In addition to studying appropriate occupant behavior models, it is necessary to understand more precisely the impact of human behavior on building stock energy use at various temporal and spatial resolutions.

Energy use data for calibration
The calibration process requires high-quality energy use data. Most building stock models perform calibration with sufficient energy use data. However, most modelers have difficulty accessing energy utility data for individual buildings because of privacy issues and the absence of monitoring systems. It is necessary to study whether a model can be calibrated with insufficient energy use data. Tian et al. (2016) investigated how to determine informative energy data in Bayesian calibration using correlation analysis and a hierarchical clustering method. This method can improve the understanding of the amount and quality of energy usage data required for Bayesian calibration.

Calibration process and results
Another challenge arises from the calibration process. Most building stock energy models include a complicated Bayesian calibration process that requires specialized knowledge. The lack of transparency in the calibration process of existing models may make it difficult to access the models and interpret their results. Careful and systematic technical documentation of the Bayesian calibration is required for policymakers and the public.
There is also the question of whether the calibrated building parameter distributions can accurately represent the real distributions. Zhao et al. (2016) argued that the calibrated building parameter distributions should be considered a "best guess" of the real world. Future research should investigate the relationship between the estimated and actual parameter distributions, which will improve the understanding of energy conservation measures for the building stock.

Aggregation
Aggregation is one of the most distinctive aspects of stochastic building stock modeling compared to individual building modeling. There are two types of aggregation. The first is the aggregation from the representative building to one type of building stock. The most straightforward method is to calculate the energy use of the building stock by multiplying the EUI distribution of the representative building model by the total floor area. Tian and Choudhary (2012) used the number of students per school to estimate the gas consumption of all schools in London when the total floor area of the schools was not available. The second is the aggregation of different building stock types. To estimate energy use at the urban scale, with many built infrastructures of different sizes, types, ages, functions, and operating conditions, simply adding up the energy usage of individual buildings is not correct because buildings operate on different schedules and at different intensities. More research is required to combine the diverse building stock types in a target district and to explore how to aggregate across different temporal and spatial scales.
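The first type of aggregation can be sketched as follows. The EUI samples stand in for the calibrated distribution of one representative model and are drawn here from an assumed normal distribution; the mean, spread, and total floor area are hypothetical. The same function applies when a different size measure, such as the number of students, replaces floor area, provided the intensity is expressed per unit of that measure.

```python
import random
import statistics

def aggregate_to_stock(intensity_samples, stock_size):
    # Scale each sampled intensity (e.g., kWh/m2/yr) by the size of the
    # stock (e.g., total floor area in m2) to obtain stock-level energy use.
    return [value * stock_size for value in intensity_samples]

# Hypothetical EUI distribution of one building type, in kWh/m2/yr.
rng = random.Random(0)
eui_samples = [rng.gauss(150.0, 20.0) for _ in range(5000)]

# Hypothetical total floor area of that building type, in m2.
total_floor_area_m2 = 1_000_000.0

stock_kwh = aggregate_to_stock(eui_samples, total_floor_area_m2)
mean_gwh = statistics.mean(stock_kwh) / 1e6
relative_spread = statistics.stdev(stock_kwh) / statistics.mean(stock_kwh)
```

Because every sample is multiplied by the same constant, the relative spread of the stock-level result equals that of the EUI distribution. The sketch does not address the second type of aggregation, across building types, which the text identifies as an open question.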

Conclusions
Owing to recent interest in both urbanization and building energy consumption, research on building stock energy models has expanded. This paper examines existing building stock energy models, with a focus on bottom-up stochastic engineering methods. Top-down approaches are often used for supply analysis based on energy demand, taking into account historical data and macroeconomic indicators. They are appropriate for evaluating energy at the national level. However, top-down models lack detail on building characteristics and transparency in the correlations between the indicators. Moreover, top-down models are limited in their ability to disaggregate total energy consumption into the individual sectors of a building stock. Bottom-up statistical models work at a disaggregated level. Since statistical models are easy to develop, they can be useful if sufficient building data are available. However, they need large databases of empirical data to investigate the relationships between each component and energy use, and they have a limited capacity to assess the impact of energy conservation measures or new technologies. Bottom-up deterministic engineering building stock models can consider the impact of new technologies for reducing energy consumption in the building stock without historical data. However, these models have disadvantages in computational efficiency and in their capability to cover uncertainties.
To overcome the uncertainty issues, some stochastic building stock models have been proposed. To date, only a few studies have taken into account the uncertainty associated with large-scale building stocks. Current stochastic building stock energy models have considerable limitations, including high computational cost, limited access to detailed building information, and a lack of transparency in the calibration process. This review has highlighted several discussion points regarding the process used in current stochastic building stock energy models, and has examined current limitations and possible future directions for their development. The findings and discussion of this study can inform efforts to predict and evaluate building stock energy consumption and to cope with these challenges.