1 Introduction

With the development of the smart grid, demand side management (DSM) has been considered an effective means of improving the energy efficiency and economic operation of the grid [1]. Many research and industrial efforts have been made to encourage customers to participate in demand response (DR) programs. Generally, DR techniques can be categorized into two classes: direct load control (DLC, also known as dynamic load control or demand dispatch) [2] and price-based DR (also known as indirect load control) [3]. In DLC, the system operator directly adjusts the power demand through remote control devices to accomplish specific objectives (e.g., peak load reduction or frequency regulation). In price-based DR, customers actively adjust their energy consumption in response to price signals (e.g., time-of-use (TOU) pricing or real-time pricing (RTP)). Compared with price-based DR, DLC can provide more reliable services while aiming to minimize the disturbance to customers’ comfort.

Among different kinds of loads, thermostatically controlled loads (TCLs), such as air conditioners (ACs), water heaters and refrigerators, are often considered excellent candidates for DR programs because of their capability to store thermal energy [4]. Several industrial programs have been established to aggregate residential ACs for peak load shaving and emergency load management, such as the SmartAC program of Pacific Gas & Electric (PG&E) [5]. In the literature, the DLC of TCLs has also been well studied. Ramanathan and Vittal [6] outlined the fundamental requirements of DLC and presented a general optimization framework for feeder-scale load reduction that minimizes the disruption to residents’ thermal comfort. Hao et al. [7] transformed the conventional dead-band dynamic model of the TCL into a continuous power model, modelled the aggregated TCLs as a stochastic battery, and applied the aggregated battery model to frequency regulation. Bashash and Fathy [8] developed a Lyapunov-stable sliding mode controller for TCLs based on a partial differential equation framework. Vrettos and Anderson [9] proposed a two-stage control method for TCLs: in the first stage, a day-ahead scheduling model determines the optimal TCL dispatch by solving an AC optimal power flow; in the second stage, a real-time control model allocates the desired set points to individual TCLs. Mathieu et al. [10] used a Markov transition matrix to model a TCL population for load reduction. Luo et al. studied the day-ahead scheduling strategy of a building aggregator [11], proposed a decomposed dispatch framework for large-scale TCLs [12], studied the coordinated dispatch strategy for TCLs and generation units [13], and studied the impact of controllable TCLs on microgrid operation [14, 15].

In most existing works [6,7,8], the general thermal inertia model of the TCL is adopted to estimate the indoor temperature profile, and the residents’ thermal comfort is considered by scheduling the ON/OFF states of TCLs so that the indoor temperature stays within a pre-set temperature dead-band. With the development of two-way communication technology, it becomes feasible to establish more accurate models of users’ thermal comfort. Considerable effort has been made in this direction in the field of heating, ventilating and air conditioning (HVAC), where researchers have developed sophisticated thermal comfort estimation strategies to optimize the energy consumption of a house or building [16, 17].

Reference [11] applied the International Organization for Standardization (ISO) 7730 thermal comfort model [18] to the DLC scheme to design a day-ahead scheduling model for a single residential AC aggregator. The ISO 7730 thermal comfort model can estimate the thermal comfort of residents more precisely, and it has been widely used in real-time building energy management [17]. However, one technical obstacle to applying the ISO 7730 model in a day-ahead DLC scheme is that it requires many stochastic parameters, which are sometimes difficult to estimate accurately at the day-ahead stage.

In this paper, we report further work carried out along the direction of [11]. The major contributions of this paper include:

  1) A human thermal comfort estimation module is developed for load control. It includes two components reported in building environment science: (i) an advanced thermal inertia model, which can precisely capture the dynamic indoor temperature variations caused by the TCL, and (ii) a simplified thermal comfort model, which estimates the thermal comfort of the residents. The simplified thermal comfort model provides a good approximation of the ISO 7730 standard model while relying on only three stochastic parameters, which are easy to forecast at the day-ahead stage. To the best of our knowledge, this is the first study to apply the thermal inertia model and the simplified thermal comfort model to DLC;

  2) A 3-stage day-ahead TCL scheduling model is proposed. In the first stage, each TCL aggregator solves a capacity estimation model to estimate its maximum interruptible TCL capacity. In the second stage, the system operator solves a day-ahead dispatch model to determine the load shedding instructions for the TCL aggregators. In the third stage, based on the load shedding instructions, the TCL aggregators schedule the ON/OFF states of the TCLs to follow the instructions.

The remaining parts of this paper are organized as follows. In Section 2, the thermal comfort estimation module is introduced; in Section 3, the day-ahead TCL scheduling framework is presented; in Section 4, the approach to solve the proposed models is discussed; in Section 5, case study results are presented; finally, conclusions are drawn in Section 6.

2 Thermal comfort estimation module

In DLC, ensuring the resident’s comfort is a major concern. In this paper, a residential thermal comfort estimation module is developed. Two components are included in this module. Firstly, an advanced thermal inertia model is used, which can accurately capture the thermal dynamics of the units; secondly, based on the indoor temperature calculated by the thermal inertia model, a simplified thermal comfort model is used to estimate the residents’ thermal comfort degree.

2.1 Thermal inertia model

An accurate model of the building’s thermal dynamics is fundamental to ensuring the residents’ thermal comfort. Traditional thermal inertia models take into account parameters such as the internal and external temperatures, but consider only the thermal resistance of the walls and neglect their thermal capacitance (shown in Fig. 1a). In this paper, a more accurate two-parameter model is used, as shown in Fig. 1b. The unit is divided into two components: the interior of the house, and the additional thermal mass, such as the walls, with a much larger thermal capacitance.

Fig. 1 Traditional thermal inertia models

The indoor air temperature of a unit varies considerably when the thermal capacitance of the walls is considered. The reason is that the heat gain of a unit consists of two parts: the relatively steady-state transmission caused by the temperature difference between the indoor air and the outdoor surroundings, and the unsteady-state gain resulting from the changing intensity of solar radiation on the external walls. The unsteady-state heat flow across the walls is complicated because part of the heat passing through the walls is stored and later released to either the indoor air or the outdoor ambient. Hence, the thermal dynamics of the two-parameter model can be expressed as [19]:

$$\frac{{{\text{d}}T_{r} (t)}}{{{\text{d}}t}} = \frac{1}{{M_{a} \cdot Cp_{a} }}\left( {\frac{{{\text{d}}Q_{gain,a} \left( t \right)}}{{{\text{d}}t}} - \frac{{{\text{d}}Q_{ex,w,r} \left( t \right)}}{{{\text{d}}t}} - \frac{{{\text{d}}Q_{ac} \left( t \right)}}{{{\text{d}}t}}} \right)$$
(1)
$$\frac{{{\text{d}}T_{w} \left( t \right)}}{{{\text{d}}t}} = \frac{1}{{M_{w} \cdot Cp_{w} }}\left( {\frac{{{\text{d}}Q_{gain,w} \left( t \right)}}{{{\text{d}}t}} + \frac{{{\text{d}}Q_{ex,w,r} \left( t \right)}}{{{\text{d}}t}}} \right)$$
(2)
$$\frac{{{\text{d}}Q_{gain,a} \left( t \right)}}{{{\text{d}}t}} = \frac{{T_{amb} - T_{r} }}{{R_{eq} }}$$
(3)
$$\frac{{{\text{d}}Q_{ex,w,r} \left( t \right)}}{{{\text{d}}t}} = \frac{{T_{r} - T_{w} }}{{R_{wr} }}$$
(4)
$$\frac{{{\text{d}}Q_{ac} \left( t \right)}}{{{\text{d}}t}} = COP \cdot P_{ac}$$
(5)
$$\frac{{{\text{d}}Q_{gain,w} \left( t \right)}}{{{\text{d}}t}} = \frac{{T_{amb} - T_{w} }}{{R_{wa} }}$$
(6)

where \(COP\) is the coefficient of performance of the TCL; \(Cp_{a}\) is the heat capacity of air; \(T_{r} (t)\) represents the indoor air temperature at time t; \(M_{a}\) is the mass of the air inside the house; \(Q_{gain,a} \left( t \right)\) represents the heat gained by the indoor air from the ambient; \(Q_{gain,w} \left( t \right)\) represents the heat gained by the wall from the ambient; \(Q_{ac} \left( t \right)\) is the cooling energy delivered by the TCL; \(Q_{ex,w,r} \left( t \right)\) represents the heat exchange between the wall and the indoor air; \(P_{ac}\) is the rated power of the TCL; \(T_{w}\) represents the wall temperature; \(T_{amb}\) is the outdoor ambient temperature; \(R_{eq}\) is the equivalent thermal resistance of the house envelope; \(R_{wr}\) is the thermal resistance between the wall inner surface and the indoor air; \(R_{wa}\) is the thermal resistance between the wall outer surface and the ambient.

Previous works [13, 20] have shown that the complexity of the model has a significant impact on the accuracy of the estimated cooling energy consumption. The thermal dynamic model in (1)–(6) can be linearized to simplify the calculation of the indoor temperature variation. Each dispatch time interval \(\Delta t\) is divided into K steps. Provided that K is sufficiently large, the temperatures of the ambient, the walls and the indoor air can be assumed constant within any time step, so the change in temperature can be represented by the temperature difference between two adjacent time steps. The thermal dynamic model can therefore be linearized as (7)–(10).

$$T_{r} (k) = \left( {1 - \frac{1}{{M_{a} \cdot Cp_{a} \cdot R_{eq} }}} \right)T_{r,init} + \frac{1}{{M_{a} \cdot Cp_{a} \cdot R_{eq} }} \cdot T_{amb,init} + \frac{{T_{w,init} - T_{r,init} }}{{M_{a} \cdot Cp_{a} \cdot R_{wr} }} - S_{ac,init} \frac{{Q_{ac} }}{{M_{a} \cdot Cp_{a} }}\quad\quad\quad k = 1$$
(7)
$$T_{r} (k) = \left( {1 - \frac{1}{{M_{a} \cdot Cp_{a} \cdot R_{eq} }}} \right)T_{r} (k - 1) + \frac{1}{{M_{a} \cdot Cp_{a} \cdot R_{eq} }}T_{amb} \left( {k - 1} \right) + \frac{{T_{w} \left( {k - 1} \right) - T_{r} \left( {k - 1} \right)}}{{M_{a} \cdot Cp_{a} \cdot R_{wr} }} - S_{ac} (k)\frac{{Q_{ac} \left( {k - 1} \right)}}{{M_{a} \cdot Cp_{a} }}\quad \forall k \in [2,K]$$
(8)
$$T_{w} \left( k \right) = T_{w,init} + \frac{{T_{amb,init} - T_{w,init} }}{{M_{w} \cdot Cp_{w} \cdot R_{wa} }} + \frac{{T_{r,init} - T_{w,init} }}{{M_{w} \cdot Cp_{w} \cdot R_{wr} }}\quad k = 1$$
(9)
$$T_{w} \left( k \right) = T_{w} \left( {k - 1} \right) + \frac{{T_{amb} \left( {k - 1} \right) - T_{w} \left( {k - 1} \right)}}{{M_{w} \cdot Cp_{w} \cdot R_{wa} }} + \frac{{T_{r} \left( {k - 1} \right) - T_{w} \left( {k - 1} \right)}}{{M_{w} \cdot Cp_{w} \cdot R_{wr} }}\quad \forall k \in [2,K]$$
(10)

where \(M_{a}\) and \(M_{w}\) are the masses of the indoor air and the walls, respectively; \(T_{r,init}\), \(T_{w,init}\) and \(T_{amb,init}\) are the initial indoor, wall and ambient temperatures; \(S_{ac} (k)\) represents the state of the TCL at time step k (0-OFF, 1-ON), with initial value \(S_{ac,init}\); \(Cp_{w}\) is the heat capacity of the wall.
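As an illustration, the recursion in (7)–(10) can be sketched in a few lines of Python. This is a minimal sketch rather than the authors' implementation: the function name, the lumped capacities `c_a` and `c_w`, the constant cooling rate `q_ac` and all numeric values are assumptions made for the example, and a unit time step is implied, as in the equations.

```python
def simulate_two_parameter_model(t_r_init, t_w_init, t_amb, s_ac, q_ac,
                                 c_a, c_w, r_eq, r_wr, r_wa):
    """Iterate the linearized room/wall recursion over len(s_ac) steps.

    c_a = M_a * Cp_a and c_w = M_w * Cp_w are the lumped heat capacities.
    All numeric values passed in are illustrative, not taken from the paper.
    """
    t_r, t_w = [t_r_init], [t_w_init]
    for k in range(len(s_ac)):
        tr, tw, ta = t_r[-1], t_w[-1], t_amb[k]
        # Room air: envelope gain + wall exchange - cooling when the TCL is ON.
        t_r.append(tr + ((ta - tr) / r_eq + (tw - tr) / r_wr - s_ac[k] * q_ac) / c_a)
        # Wall: gain from the ambient + exchange with the room air.
        t_w.append(tw + ((ta - tw) / r_wa + (tr - tw) / r_wr) / c_w)
    return t_r, t_w


# Hypothetical parameters in consistent (arbitrary) units.
room, wall = simulate_two_parameter_model(
    t_r_init=25.0, t_w_init=25.0, t_amb=[25.0, 25.0, 25.0], s_ac=[1, 0, 0],
    q_ac=3.0, c_a=10.0, c_w=100.0, r_eq=2.0, r_wr=1.0, r_wa=1.0)
print(room, wall)
```

In this example the room cools while the TCL is ON and then drifts back toward the wall and ambient temperatures once it is switched OFF, while the wall temperature changes much more slowly because of its larger heat capacity.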

2.2 Simplified thermal comfort model

Generally, the ISO 7730 thermal comfort model [18] establishes the analytical determination and interpretation of human’s thermal comfort degree by two indices: predicted mean vote (PMV) and predicted percentage of dissatisfied (PPD). PMV predicts the mean value of votes of a large group of people on the ISO thermal sensation scale. Based on the PMV value, PPD predicts the percentage of a large group of people likely to feel discomfort. Details of the ISO 7730 model can be found in [11, 18].

There are six stochastic parameters in the ISO 7730 model: four environmental factors (air temperature, air relative humidity, air velocity, and mean radiant temperature) and two individual factors (activity level and clothing insulation). Although the effectiveness of the ISO 7730 model has been verified by both laboratory experiments and field measurements [21], the number of stochastic parameters it contains limits its practical application in a grid-level, day-ahead DLC scheme. To reduce the number of required parameters, Barrati et al. [22] proposed a simplified thermal comfort model that uses only the air temperature and relative humidity as inputs. The reliability of the simplified model was validated over a wide range of clothing conditions and metabolic rates. The simplified thermal comfort model, expressed in (11), is well suited to day-ahead DLC scheduling because it involves fewer stochastic variables.

$$PMV = aT_{a} + bP_{v} - c$$
(11)
$$P_{v} = rh \cdot 10 \cdot {\text{e}}^{{16.6536 - 4030.183/(T_{a} + 235)}}$$
(12)

where \(rh\) is the relative humidity (%); \(T_{a}\) is the air temperature (°C); \(P_{v}\) is the water vapor pressure. The values of the coefficients \(a\), \(b\) and \(c\) are determined by the third parameter, i.e. the clothing condition (\(I_{cl}\)), as shown in Table 1. Based on the PMV value, the PPD value is calculated by (13), in the same way as in the ISO 7730 model.

Table 1 Coefficients of simplified thermal comfort model [22]
$$PPD = 100 - 95 \cdot \exp ( - 0.03353 \cdot PMV^{4} - 0.2179 \cdot PMV^{2} )$$
(13)
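The chain from air temperature and relative humidity to PMV and PPD can be sketched as follows. This is an illustrative sketch, not the authors' code: the function names are hypothetical, the vapor pressure is assumed to follow the ISO 7730 form \(\exp (16.6536 - 4030.183/(T_{a} + 235))\), and the coefficients `a_demo`, `b_demo` and `c_demo` are placeholders that do not come from Table 1.

```python
import math


def vapor_pressure(rh_percent, t_air):
    """Water vapor pressure in Pa from relative humidity (%) and air temperature (deg C)."""
    return rh_percent * 10.0 * math.exp(16.6536 - 4030.183 / (t_air + 235.0))


def simplified_pmv(t_air, rh_percent, a, b, c):
    """PMV = a*T_a + b*P_v - c, with (a, b, c) chosen by clothing condition."""
    return a * t_air + b * vapor_pressure(rh_percent, t_air) - c


def ppd(pmv):
    """Predicted percentage of dissatisfied (%) as a function of PMV, as in (13)."""
    return 100.0 - 95.0 * math.exp(-0.03353 * pmv ** 4 - 0.2179 * pmv ** 2)


# Placeholder coefficients for illustration only; use the Table 1 values in practice.
a_demo, b_demo, c_demo = 0.2, 0.0003, 5.0
pmv_demo = simplified_pmv(t_air=26.0, rh_percent=60.0, a=a_demo, b=b_demo, c=c_demo)
print(round(pmv_demo, 3), round(ppd(pmv_demo), 1))
```

Note that (13) gives a PPD of 5% at PMV = 0 and is symmetric in PMV, so a PPD limit translates into a symmetric band of admissible PMV values.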

3 Day-ahead TCL scheduling framework

There are three stages in the proposed day-ahead TCL scheduling framework: interruptible TCL capacity estimation, day-ahead system dispatch on the system operator side, and day-ahead TCL dispatch on the TCL aggregator side. The schematic of the framework is depicted in Fig. 2, and the models of the three stages are presented below.

Fig. 2 Schematic of the day-ahead TCL dispatch framework

It is worth noting that, for the peak load shaving application, TCLs are controlled at a fine time resolution (often one or several minutes), whereas the system is dispatched at a coarser resolution (often 15–60 minutes, depending on the electricity market structure). Therefore, we use the notations \(t\) and \(t'\) to denote the time indices of the system dispatch interval and the TCL control interval, respectively, and the notations \(T\) and \(T'\) to denote the total numbers of system dispatch intervals and TCL control intervals, respectively.

3.1 Interruptible capacity estimation for TCL aggregator

When participating in the day-ahead scheduling program, a TCL aggregator needs to estimate the maximum interruptible TCL capacity at each dispatch interval. Since the TCL control actions in one time interval affect the indoor temperature trajectories, and hence the control actions available in the following time intervals, the aggregator solves the following model to maximize the interruptible TCL capacity over the whole scheduling horizon.

$$\hbox{max} \, F_{1} = \sum\limits_{t' = 1}^{T'} {P_{a}^{LS} (t ' )}$$
(14)
$$P_{a}^{LS} (t ' ) { = }\sum\limits_{i = 1}^{{NG_{a}^{{}} }} {\left( { 1- \varvec{s}_{a ,i}^{{}} (t ' )} \right) \cdot PG_{a ,i}^{rate} }$$
(15)
$$PG_{a ,i}^{rate} { = }\sum\limits_{j = 1}^{{N_{a ,i}^{TCL} }} {P_{a ,i ,j}^{TCL ,rate} }$$
(16)

where \(P_{a}^{LS} (t')\) is the total shed power of aggregator a at time \(t'\) (kW); \(NG_{a}\) is the number of TCL groups managed by TCL aggregator a; \(\varvec{s}_{a,i} (t')\) represents the state of the ith TCL group of the ath aggregator at time \(t'\) (0-OFF, 1-ON); \(PG_{a,i}^{rate}\) represents the aggregated rated power of the ith TCL group of the ath TCL aggregator; \(N_{a,i}^{TCL}\) is the number of TCLs in the ith TCL group of the ath aggregator; \(P_{a,i,j}^{TCL,rate}\) represents the rated power of the jth TCL of the ith TCL group of aggregator a (kW). Model (14)–(16) is subject to the following constraints.

  1) TCL group state constraint

    $$\varvec{s}_{a,i}^{{}} (t') \in \{ 0,1\} \quad\quad \forall a = 1:A,\, i = 1:NG_{a}^{{}} ,\, t' = 1:T'$$
    (17)

  2) Thermal comfort constraint. It strictly constrains the mean PPD trajectory of each TCL group to stay below a pre-specified PPD threshold, where the function \(\text{PPD} \left( \cdot \right)\) represents model (13).

    $$\overline{{PPD_{a ,i} }} (t ' )\le PPD^{\text{limit}} \, \quad \, \forall a = 1 :A, \, i = 1 :NG_{a}^{{}} , \, t' = 1 :T'$$
    (18)
    $$\overline{{PPD_{a ,i} }} (t ') = PPD\left( {\overline{{PMV_{a ,i} }} (t ')} \right)$$
    (19)
    $$\overline{{PMV_{a ,i} }} (t ') = \frac{{\sum\limits_{j = 1}^{{N_{a ,i}^{TCL} }} {PMV_{a ,i ,j} (t ' )} }}{{N_{a ,i}^{TCL} }}$$
    (20)

    where a and A are the index and the number of TCL aggregators; \(\overline{{PPD_{a ,i} }} (t ' )\) and \(\overline{{PMV_{a ,i} }} (t ')\) are the mean values of PPD and PMV of the ith TCL group of the ath aggregator at time \(t'\); \(PMV_{a ,i ,j} (t ' )\) is the PMV value of the jth TCL of the ith TCL group of aggregator a at time \(t'\).

  3) Minimum online time constraint. It is applied to avoid mechanical wear of the TCL caused by frequently switching compressors on and off.

    $$\tau_{a,i}^{on} (t') \ge \tau_{\hbox{min} }^{on} \quad\quad\quad \forall a = 1:A,\, i = 1:NG_{a}^{{}} ,\, t' = 1:T'$$
    (21)
    $$\tau_{a,i}^{on} (t '){ = }\left( {\tau_{a,i}^{on} (t '- 1){ + }\varvec{s}_{a ,i}^{{}} (t ') \cdot \Delta t '} \right) \cdot \varvec{s}_{a ,i}^{{}} (t ')$$
    (22)

    where \(\tau_{a ,i}^{on} (t ')\) is the accumulated online time of the ith TCL group of the ath aggregator at time \(t'\); \(\tau_{\hbox{min} }^{on}\) is the minimum required online time.
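The accumulated online time in (22) is a simple recursion that resets whenever a group is OFF. The sketch below shows one way to evaluate it and to flag violations; it assumes that the constraint is meant to forbid switching a group OFF before it has been ON for at least \(\tau_{\hbox{min} }^{on}\), which is an interpretation rather than something the text states, and the function names are hypothetical.

```python
def accumulated_on_time(states, dt, tau_init=0.0):
    """Apply tau(t') = (tau(t'-1) + s(t') * dt) * s(t') along a state sequence."""
    taus, tau = [], tau_init
    for s in states:
        tau = (tau + s * dt) * s  # resets to 0 whenever the group is OFF
        taus.append(tau)
    return taus


def violates_min_on_time(states, dt, tau_min, tau_init=0.0):
    """Return indices where a group is switched OFF before tau_min has elapsed."""
    taus = accumulated_on_time(states, dt, tau_init)
    prev_tau, prev_s, bad = tau_init, (1 if tau_init > 0 else 0), []
    for k, s in enumerate(states):
        if prev_s == 1 and s == 0 and prev_tau < tau_min:
            bad.append(k)
        prev_tau, prev_s = taus[k], s
    return bad


# Hypothetical schedule with a 1-minute control step and a 2-minute minimum.
demo_states = [1, 1, 0, 1, 0]
print(accumulated_on_time(demo_states, dt=1.0))
print(violates_min_on_time(demo_states, dt=1.0, tau_min=2.0))
```

Here the second switch-off is flagged because the group had been ON for only one control interval.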

Since TCL control and system dispatch use different time intervals, each TCL aggregator calculates the final load shedding capacity of a system dispatch interval by averaging the capacities of the TCL control intervals it contains. To account for uncertainty, the averaged capacity is multiplied by the dispatch margin factor \(\gamma\), with \(\gamma \in (0, 1]\).

$$EP_{a} (t) = \frac{\gamma }{\Delta t}\sum\limits_{t' = (t - 1)\Delta t + 1}^{t\Delta t} {P_{a}^{LS} (t')}$$
(23)

where \(EP_{a} (t)\) is the estimated interruptible TCL power capacity of aggregator a at time t, and \(\Delta t\) is the duration of a system dispatch interval, expressed as a number of TCL control intervals.
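The shed power in (15) and the averaging with the margin factor can be illustrated with a short Python sketch. The function names, the two-group aggregator and all numbers are assumptions chosen for the example; `steps_per_dispatch` stands for the number of TCL control intervals in one system dispatch interval.

```python
def shed_power(states, group_rated_power):
    """P_LS(t') = sum_i (1 - s_i(t')) * PG_i_rate for one control interval."""
    return sum((1 - s) * pg for s, pg in zip(states, group_rated_power))


def estimated_capacity(shed_profile, steps_per_dispatch, gamma):
    """Average P_LS over each dispatch interval and scale by the margin factor."""
    if not 0.0 < gamma <= 1.0:
        raise ValueError("gamma must lie in (0, 1]")
    caps = []
    for start in range(0, len(shed_profile), steps_per_dispatch):
        window = shed_profile[start:start + steps_per_dispatch]
        caps.append(gamma * sum(window) / len(window))
    return caps


# Hypothetical data: 2 groups (rated 100 kW and 50 kW), 4 control intervals.
rated = [100.0, 50.0]
schedule = [[1, 0], [0, 0], [1, 1], [0, 1]]      # s_i(t') per control interval
profile = [shed_power(s, rated) for s in schedule]
print(profile, estimated_capacity(profile, steps_per_dispatch=2, gamma=0.9))
```

A margin factor below 1 makes the aggregator bid less than its average interruptible capacity, leaving headroom for forecast errors.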

3.2 Day-ahead dispatch model for system operator

After receiving bids from load aggregators, the system operator solves model (24) to determine the load reduction amount of each aggregator at each system dispatch time interval. The objective of the system dispatch is to minimize the load shedding cost.

$$\hbox{min} \, F_{2} { = }\sum\limits_{t = 1}^{T} {\sum\limits_{a = 1}^{A} {pr_{{}}^{clc} (t )\cdot LS_{a} (t )} }$$
(24)

where \(LS_{a}^{{}} (t )\) is the load shedding instruction for aggregator a at time t; \(pr_{{}}^{clc} (t )\) is the market clearing price at time t.

Model (24) is subject to the following constraints:

$$0 \le LS_{a} (t )\le EP_{a}^{{}} (t )\quad\quad\quad t = 1 :T$$
(25)
$$\sum\limits_{a = 1}^{A} {LS_{a} (t )} \ge P_{{}}^{sys} (t )\quad\quad\quad t = 1 :T$$
(26)

where \(P^{sys} (t)\) is the total required shed power of the system at time t. Note that, in an actual day-ahead power market, the system operator’s day-ahead dispatch task also includes accepting bids from the generation side and allocating power output among the units. In this paper, however, we do not consider the scheduling of the generation side. We focus instead on the application scenario in which the system operator allocates the load reduction among different load aggregators so as to satisfy the system peak load reduction requirement.
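Because objective (24) applies the same clearing price to every aggregator within an interval, any feasible allocation that sheds exactly \(P^{sys} (t)\) attains the minimum cost for that interval, provided the price is non-negative. The sketch below exploits this to build one such allocation without a solver. It is an illustration only: the function names and data are hypothetical, and a general linear programming solver would be needed if prices differed between aggregators.

```python
def dispatch_load_shedding(required, capacities):
    """Split the required shed power among aggregators for each dispatch interval.

    required[t]      : P_sys(t)
    capacities[a][t] : EP_a(t)
    Returns ls[a][t] with 0 <= ls[a][t] <= EP_a(t) and sum_a ls[a][t] == P_sys(t).
    """
    n_agg, n_t = len(capacities), len(required)
    ls = [[0.0] * n_t for _ in range(n_agg)]
    for t in range(n_t):
        remaining = required[t]
        for a in range(n_agg):
            ls[a][t] = min(capacities[a][t], remaining)
            remaining -= ls[a][t]
        if remaining > 1e-9:
            raise ValueError(f"interval {t}: capacity short by {remaining:.3f}")
    return ls


def shedding_cost(ls, price):
    """Objective (24): sum over t and a of pr_clc(t) * LS_a(t)."""
    return sum(price[t] * ls_a[t] for ls_a in ls for t in range(len(price)))


# Hypothetical case: 2 aggregators, 2 dispatch intervals.
instructions = dispatch_load_shedding(
    required=[120.0, 60.0], capacities=[[100.0, 100.0], [50.0, 50.0]])
print(instructions, shedding_cost(instructions, price=[0.1, 0.2]))
```

The allocation order here simply follows the list order of the aggregators; a different tie-breaking rule would give the same cost under a uniform price.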

3.3 Day-ahead scheduling model for TCL aggregator

After solving model (24), the system operator sends the dispatch results to each TCL aggregator. Deviations between the actual and instructed load shedding incur penalty costs for the load aggregator (e.g. the imbalance energy prices in some market structures). Each aggregator therefore solves a day-ahead dispatch model to schedule the ON/OFF states of its TCLs so that the total load shedding deviation over the DLC period is minimized:

$$\hbox{min} \, F_{3} = \sum\limits_{t = 1}^{T} {\sum\limits_{t' = (t - 1)\Delta t + 1}^{t\Delta t} {\left| {P_{a}^{LS} (t') - LS_{a} (t)} \right|} }$$
(27)

where \(\Delta t\) is the duration of a system dispatch interval; \(P_{a}^{LS} (t')\) is the total shed power of aggregator a at time \(t'\). The model is subject to the following constraints:

  1) TCL group state constraint

    $$\varvec{s}_{a,i}^{{}} (t') \in \{ 0,{\kern 1pt} 1\} \quad\quad\quad \forall a = 1:A,\, i = 1:NG_{a}^{{}} ,\, t' = 1:T'$$
    (28)

  2) Thermal comfort constraint

    $$\overline{{PPD_{a,i} }} (t') \le PPD^{limit} \quad\quad\quad \forall a = 1:A,\, i = 1:NG_{a}^{{}} ,\, t' = 1:T'$$
    (29)

  3) Minimum online time constraint

    $$\tau_{a,i}^{on} (t') \ge \tau_{\hbox{min} }^{on} \quad\quad\quad \forall a = 1:A,\, i = 1:NG_{a}^{{}} ,\, t' = 1:T'$$
    (30)
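For a given schedule, objective (27) is straightforward to evaluate. The following sketch computes it for one aggregator; the function name and data are hypothetical, and `steps_per_dispatch` again denotes the number of TCL control intervals per system dispatch interval.

```python
def shedding_deviation(shed_profile, instructions, steps_per_dispatch):
    """F3: sum over control intervals of |P_LS(t') - LS(t)|."""
    if len(shed_profile) != len(instructions) * steps_per_dispatch:
        raise ValueError("profile length must equal T * steps_per_dispatch")
    total = 0.0
    for k, p in enumerate(shed_profile):
        # Control interval k belongs to dispatch interval k // steps_per_dispatch.
        total += abs(p - instructions[k // steps_per_dispatch])
    return total


# Hypothetical shed profile (kW) against two dispatch instructions (kW).
f3 = shedding_deviation([50.0, 150.0, 0.0, 100.0], [90.0, 45.0], steps_per_dispatch=2)
print(f3)
```

An optimizer would call such a function on each candidate schedule after the constraints above have been enforced.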

4 Approach to solve the models

In the proposed day-ahead TCL scheduling framework, three optimization models need to be solved. On the system operator side, the day-ahead load shedding dispatch model (24) is a convex optimization problem with continuous variables and linear constraints, so it can be solved directly by linear programming. In this study, the commercial optimization software AMPL [23] is employed to solve model (24).

For the TCL aggregator, both the interruptible TCL capacity estimation model (14) and the day-ahead TCL dispatch model (27) are non-linear, complex combinatorial optimization problems. The control actions of the TCL groups at a given time interval depend on the control actions at other time intervals. With \(G\) TCL groups and \(T'\) TCL control intervals, the number of search paths for global optimization is of the order of \(2^{G \times T'}\) for both models. Searching this space exhaustively would be a tremendous, if not impossible, task. To search effectively for global or near-global solutions within an affordable time, we employ a heuristic-based optimization method proposed in [24], the history driven differential evolutionary (HDDE) algorithm, to solve the two models.
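To give a feel for the size of this search space, the short sketch below evaluates \(2^{G \times T'}\) for a hypothetical aggregator. The 120 control intervals correspond to, for example, a 2-hour horizon at 1-minute resolution; the group count of 10 is an assumed value chosen only for illustration.

```python
def search_space_size(n_groups, n_control_intervals):
    """Number of distinct ON/OFF schedules for binary group states."""
    return 2 ** (n_groups * n_control_intervals)


# Assumed example: 10 TCL groups, 120 one-minute control intervals.
size = search_space_size(10, 120)
print(f"about 10^{len(str(size)) - 1} candidate schedules")
```

Even this modest case is far beyond exhaustive enumeration, which motivates the use of a population-based heuristic.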

4.1 Introduction of HDDE

HDDE is based on the differential evolutionary (DE) algorithm, which was proposed by Storn and Price in 1997 [25] and has been applied in many industrial applications. In the original DE, most of the solutions generated in the search process are discarded, and only the best solution is memorized. The motivation of HDDE is that the discarded solutions can actually provide useful information to guide the search. In HDDE, a binary partitioning (BP) tree, referred to as the BP fitness tree, is constructed to record all the historically generated solutions. HDDE evolves the population using the same mutation and crossover mechanisms as DE. Two important steps distinguish HDDE from DE: BP fitness tree updating and BP fitness tree guided search.

In the BP fitness tree updating process, each newly generated solution will be inserted into the tree. By using a certain distance metric, the entire search space of the problem is divided into multiple disjoint sub-spaces. In the BP fitness tree guided search process, the mutated individual will be generated by using the pseudo-gradient and global search operators, which use certain selection algorithms to select the solutions stored in the BP fitness tree to generate the mutant. The pseudo-gradient search operator is used to guide the search toward the local optimum to speed up the convergence of the algorithm; the global search operator is used to improve the global search capability of the algorithm. Detailed principles of HDDE can be found in [24].
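For readers unfamiliar with DE, the sketch below shows the mutation and crossover of one common variant (DE/rand/1/bin). It is only a reminder of the classical operators that HDDE builds on: the choice of variant is an assumption, the function name is hypothetical, and the BP fitness tree, the pseudo-gradient operator and the global search operator of [24] are not reproduced here.

```python
import random


def de_trial_vector(population, target_idx, f=0.7, cr=0.8, rng=random):
    """Build one DE/rand/1/bin trial vector for population[target_idx]."""
    candidates = [i for i in range(len(population)) if i != target_idx]
    r1, r2, r3 = rng.sample(candidates, 3)
    target = population[target_idx]
    dim = len(target)
    forced = rng.randrange(dim)  # guarantees at least one mutated component
    trial = []
    for d in range(dim):
        if d == forced or rng.random() < cr:
            # Mutation: base vector plus a scaled difference of two others.
            mutant = population[r1][d] + f * (population[r2][d] - population[r3][d])
            trial.append(mutant)
        else:
            # Crossover keeps the target's component.
            trial.append(target[d])
    return trial


# Hypothetical population of 5 real-valued vectors with 4 dimensions each.
demo_rng = random.Random(0)
demo_pop = [[demo_rng.random() for _ in range(4)] for _ in range(5)]
print(de_trial_vector(demo_pop, target_idx=0, rng=demo_rng))
```

For a binary ON/OFF encoding, the real-valued trial vector would additionally have to be mapped back to 0/1 states, for instance by thresholding; how HDDE handles this is described in [24], not here.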

4.2 HDDE-based approach for TCL day-ahead dispatch

When HDDE is used to solve models (14) and (27), each individual represents a TCL dispatch solution and is coded as a vector with \(NG_{a} \cdot T'\) dimensions. The value of the dth dimension represents the ON/OFF state of the \(\bmod (d,NG_{a})\)th TCL group at time \(floor(d/NG_{a})\), where \(\bmod ( \cdot )\) is the modulus operation and \(floor( \cdot )\) is the round-down operation. The whole population can be expressed as follows.

(31)

Models (14) and (27) share the same constraints. For each model, the constraint handling procedure for an individual is outlined below.

  • Step 1: Set the dimension index j=0;

  • Step 2: If \(p_{ij} = 1\), skip to Step 4; otherwise, check whether the minimum online time constraint is satisfied. If “yes”, go to Step 3; if “no”, set \(p_{ij} = 1\) and go to Step 4;

  • Step 3: Calculate the mean indoor air temperature at time \(floor(j/NG_{a})\) by using the thermal inertia model in Section 2.1, and calculate the PPD value by using the simplified thermal comfort model in Section 2.2. Check whether the thermal comfort constraint is satisfied. If “no”, set \(p_{ij} = 1\);

  • Step 4: Set j=j+1. If \(j = NG_{a} T' - 1\), then terminate the constraint handling process; otherwise, go to Step 2.
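The constraint handling steps above can be sketched as a single pass over the dimensions of an individual. This is a simplified illustration under stated assumptions: the two constraint checks are passed in as hypothetical callbacks rather than computed from the thermal models, dimensions are indexed from zero, and the loop visits every dimension of the vector.

```python
def repair_individual(bits, n_groups, min_on_ok, comfort_ok):
    """Repair a flat ON/OFF vector in place, one dimension at a time.

    bits[j] is the state (0-OFF, 1-ON) of group j % n_groups at control
    interval j // n_groups. min_on_ok and comfort_ok are hypothetical
    callbacks (bits, group, interval) -> bool supplied by the caller.
    """
    for j in range(len(bits)):
        if bits[j] == 1:
            continue                      # ON states are never repaired
        group, interval = j % n_groups, j // n_groups
        if not min_on_ok(bits, group, interval):
            bits[j] = 1                   # switching OFF too early: force ON
        elif not comfort_ok(bits, group, interval):
            bits[j] = 1                   # comfort limit exceeded: force ON
    return bits


# Hypothetical checks: comfort is violated for every group at interval 1.
repaired = repair_individual(
    [0, 0, 0, 0], n_groups=2,
    min_on_ok=lambda b, g, t: True,
    comfort_ok=lambda b, g, t: t != 1)
print(repaired)
```

Repairing only toward the ON state keeps every individual feasible at the cost of some shed capacity, which matches the direction of the steps listed above.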

The HDDE-based approach for models (14) and (27), which combines the constraint handling strategy with the HDDE search mechanism, is summarized in Table 2. Firstly, HDDE randomly generates p individuals to form a population, and the constraint handling algorithm is applied to each individual (Lines 2–3). The population is then evaluated and each individual is inserted into the BP fitness tree (Lines 4–9). In each generation, the mutant is generated by the BP fitness tree guided search, and the BP fitness tree is updated with the newly generated individuals (Lines 11–19). Finally, the optimal solution is output (Line 21).

Table 2 HDDE based TCL dispatch

5 Simulation study

The proposed models are implemented in MATLAB. The AMPL solver is used to solve model (24). The MATLAB programs invoke the AMPL/IPOPT solver through the external command interface to perform the upper-layer day-ahead dispatch. The results of AMPL/IPOPT are written to a text file, which is read by the MATLAB programs for further optimization.

All simulations are run on a four-core, 64-bit Dell workstation with an Intel® Core™ i5-2400 CPU and 4 GB of RAM.

5.1 Sampling and grouping of units

In this paper, Monte-Carlo simulation is employed to generate samples of the units in which TCLs are located. We assume that the occupants are in a moderate-activity environment and thus set the value of \(M\) to 1.2. This assumption covers many typical building scenarios, such as dwellings, offices and classrooms [22]. Several parameters of the thermal comfort estimation module are treated as stochastic parameters, as shown in Table 3. The Monte-Carlo simulation then repeatedly samples the stochastic parameters to generate heterogeneous unit scenarios.

Table 3 Parameter settings of Monte-Carlo simulation

After the Monte-Carlo simulation, the C-means clustering method [25] and its MATLAB implementation are employed to cluster the unit samples based on parameter similarity. The samples are then divided into multiple groups, where each group contains the samples belonging to the same cluster.

5.2 Simulation setup

A total of 5558 TCL samples are generated. Four TCL aggregators are set up (denoted as Agg A, B, C and D), each managing multiple TCL groups. Agg A and Agg B are set to have larger capacities than Agg C and Agg D. The general information of the TCL groups is shown in Table 4.

Table 4 TCL aggregator information

The DLC is assumed to be performed by the grid operator for two hours, from 13:00 to 15:00, with a dispatch interval of 0.25 hours. The temperature and relative humidity profiles obtained from the Guangzhou Central Meteorological Observatory, China [26] are used for the simulation and are shown in Fig. 3. The value of \({\text{PPD}}^{\text{limit}}\) is set to 20%, and the value of \(\tau_{\hbox{min} }^{on}\) is set to 1/12 hours (i.e. 5 minutes). For HDDE, the control parameters were set after several trials: \(F = 0.7\) and \(Cr = 0.8\); the population size and the maximum number of iterations are set to 200 and 800, respectively.

Fig. 3 One-day air temperature and humidity profiles

5.3 Numerical analysis

The day-ahead capacity estimation results of the four aggregators are shown in Table 5. A general trend can be observed: a high outdoor temperature leads to less controllable TCL capacity (the first several intervals). After estimating the interruptible TCL capacities at each dispatch time interval, the TCL aggregators submit their bids to the system operator. Load aggregators would consider many factors when determining bid prices, such as their risk aversion level and the behavior of competitors. The optimal bidding strategy is worth studying in its own right. Since it is not the focus of this paper, we assume that all bidding strategies are based on the load aggregator’s marginal cost curve. As a demonstration, Fig. 4 depicts the bidding curve of Agg A.

Table 5 Estimated capacities of TCL aggregators

Fig. 4 Bidding curve of the aggregator A

After the bids of the aggregators are received, model (24) is solved by AMPL/IPOPT to determine the load shedding instructions. The results are shown in Fig. 5. Agg D is required to shed the least load of the four aggregators. In total, 114356 kWh of energy is shed during the 2-hour DLC horizon, and the total load shedding cost for the system operator is $481.56.

Fig. 5 Day-ahead TCL planning results of the 4 aggregators

Based on the load shedding instructions issued by the system operator, the load aggregators solve model (27) to produce the day-ahead dispatch plan of the TCLs on a 1-minute basis. The results are also shown in Fig. 5. Some unavoidable small fluctuations can be seen between the load shedding instructions at the system dispatch intervals and the minute-based TCL control plans. Figure 6 shows the mean indoor temperature variations and the corresponding mean PPD profiles of a representative TCL group under the optimized ON/OFF control actions. The mean indoor temperature and PPD profiles are calculated by the thermal inertia model and the thermal comfort model, respectively. It can be seen that the proposed method keeps the mean PPD values of the group under the pre-set threshold.

Fig. 6 Profiles of the mean indoor temperature and mean PPD of a representative TCL group

Figure 7 shows a scatter plot of the PPD values of all TCLs managed by a representative aggregator at a randomly selected time interval. The PPD values of most TCLs are under the threshold; only a small number of TCLs (24 out of 959, about 2.5%) exceed the PPD threshold (the red points).

Fig. 7 PPD scatters of all the TCLs of a representative aggregator

6 Conclusion and future works

This paper proposed a 3-stage day-ahead scheduling framework for TCLs under DLC. An advanced thermal inertia model and a simplified thermal comfort model are employed in the proposed framework to preserve the thermal comfort of the occupants.

In the first stage, the load aggregators solve an optimization model to estimate their interruptible TCL capacities at each system dispatch interval. In the second stage, the system operator determines the load shedding instructions. In the third stage, based on these instructions, the aggregators solve a day-ahead TCL dispatch model to schedule the control plans of the TCL groups while accounting for the occupants’ thermal comfort. The HDDE algorithm and the AMPL software are employed to solve the optimization models, and the simulation results validate the proposed framework.

Although PMV and PPD models have been widely used in HVAC systems, the concept of the “adaptive model” has been proposed in building environment science in the last several years [27]. The adaptive model is an extension of PMV and PPD that considers not only the impact of indoor conditions on people’s thermal comfort, but also people’s adaptive behaviors in response to the indoor environment. In the future, the authors will study the impact of the adaptive model on TCL control and demand side management.