1 Introduction

As power supply becomes increasingly strained, it becomes more important to balance power supply and demand effectively while improving the safety, reliability and economics of the power system. Since conventional energy resources are depleting at an alarming rate, the penetration of renewable energy is increasing greatly [1]. However, most renewable sources are random and intermittent, which degrades power quality and reliability. To counteract these negative effects, methods have been proposed to forecast the output of photovoltaic stations or wind farms [2–4], and solutions have been proposed to balance power fluctuation with storage devices [5]. In [6], energy storage technologies were analyzed and summarized in terms of maturity, efficiency, scale, lifespan, cost and applications, taking into consideration their impact on the whole power system. However, the high cost of storage devices hinders their wide application. By contrast, demand response (DR) is a promising alternative that takes advantage of all types of flexible demand and thus needs much less investment. Therefore, it is very attractive to use DR to integrate more renewable energy.

Demand side management (DSM) [7] has become an important measure to reduce power consumption, alleviate electricity shortages, and reduce power supply cost without increasing power supply capacity. DR programs have been extensively investigated in recent years to provide multiple types of ancillary services [8], including primary frequency regulation [9–11], spinning reserve [12–14] and system security improvements [15–17]. Generally, there are two typical categories of DR approaches: indirect load control [17–24] and direct load control [11, 25]. Thermostatically controlled appliances (TCAs) [20] are ideal demand resources for direct load control because their large thermal storage capability allows load to be shifted from peak hours to off-peak hours [21]. The cooperative control of TCAs can constitute a highly controllable distributed energy storage system that provides ancillary services.

To deal with the various problems caused by renewable energy, a variety of DR control strategies can be applied. A multi-objective day-ahead optimal scheduling model for wind farm integrated power systems was proposed in [16], introducing DR into the traditional unit commitment (UC) strategy. A centralized DR control algorithm was proposed in [18, 19] to balance power fluctuation while considering customer comfort constraints. A resilient strategy for optimal DR control based on the management of highly distributed electric loads was presented in [20]. In [11], a temperature priority list and a dispatch algorithm based on a state-queuing model were put forward to optimize the control sequence of TCAs, which reduced the communication and computation burdens between the centralized controller and individual devices. In addition, a regulation strategy based on power flow tracing and a comfort-constrained DR strategy was demonstrated in [23] to balance the fluctuation of wind farm output.

However, previous research has mostly focused on centralized control strategies, which rely on large-scale information exchange between the centralized controller and DR devices. On the customer side, and even on the power system side, low-cost communication techniques are usually preferred, which may incur serious packet loss and bit errors during data transmission. In addition, both the customers’ private information and the control decisions may be intercepted during communication, which may result in security and privacy problems [26].

To overcome these deficiencies of centralized control, this paper develops a hierarchical and distributed DR control strategy. It is a nearly center-free algorithm, and there is no need to collect information from all DR devices or to send control signals to them. Instead, all participants are divided into regions according to their geographical positions, and one aggregator is set up in each region. Each region is regarded as a virtual power plant (VPP), and the VPPs are connected to the upstream power system. Therefore, the power system only needs to exchange total power information with the VPPs. The amount of communication data decreases, and so does the occurrence of packet loss and bit errors in signal transmission.

In addition, the regulation capacities of the VPPs are taken into account so that the control targets can be assigned in a self-regulating way, making the most of the DR resources in the control regions. Moreover, an improved version of the optimal centralized control strategy OTR-O [23], referred to as OTR-I, is proposed in this paper. It considers model prediction and a customers’ responsive behavior model to further reduce the amount of communication and improve the control performance.

The remainder of the paper is organized as follows. The equivalent thermal parameter (ETP) model, the index model of a heat pump and the VPP model are presented in Sect. 2. The optimization and control strategies are introduced in Sect. 3. The simulation results are discussed in Sect. 4. The conclusions and future work are summarized in Sect. 5.

2 Modeling methodologies

The ETP model of a heat pump proposed in [23] is used as the simulation model, and the index model in [25] is used as the prediction model in this paper. A VPP model based on the ETP model is also proposed in this section.

2.1 ETP model

In the ETP model, the thermal dynamics of a single heat pump can be described as follows:

$$\varvec{A} = \left[ {\begin{array}{cc} { - \left( {\frac{1}{{R_{m} C_{a} }} + \frac{1}{{R_{a} C_{a} }}} \right)} & {\frac{1}{{R_{m} C_{a} }}} \\ {\frac{1}{{R_{m} C_{m} }}} & { - \frac{1}{{R_{m} C_{m} }}} \\ \end{array} } \right]$$
$$\varvec{B} = \left[ {\begin{array}{cc} {\frac{1}{{R_{a} C_{a} }}} & {\frac{1}{{C_{a} }}} \\ 0 & 0 \\ \end{array} } \right]$$
$$\varvec{T} = \left[ {\begin{array}{c} {T_{a\_E} } \\ {T_{m\_E} } \\ \end{array} } \right]$$
$$\varvec{C} = \left[ {\begin{array}{c} {T_{o\_E} } \\ {K} \\ \end{array} } \right]$$
$$\dot{\varvec{T}} = {\varvec{AT}} + {\varvec{BC}}$$

where \(C_{a}\) is the air heat capacity; \(C_{m}\) is the mass heat capacity; \(R_{a}\) is the air thermal resistance; \(R_{m}\) is the mass thermal resistance; \(T_{o\_E}\) is the outdoor temperature; \(T_{a\_E}\) and \(T_{m\_E}\) are the indoor air temperature and mass temperature, respectively; K is the electric operation rate.

Equations (1)–(5) can be discretized by:

$${\varvec{T}}^{k} = {\varvec{T}}^{k - 1} +\Delta t({\varvec{AT}}^{k - 1} + {\varvec{BC}}^{k - 1} )$$

The superscripts k and k − 1 indicate the values of a variable at steps k and k − 1. Since parameters such as \(C_{a}\), \(C_{m}\), \(R_{a}\) and \(R_{m}\) usually vary from customer to customer, the normal distribution function N(a, σ) is used in this paper to model the load diversity of the heat pump population.
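
The discretized update (6) can be sketched in a few lines of Python. This is a minimal illustration rather than the simulation code used in this paper: the function names, the NumPy dependency and the numeric values in any call are assumptions, and `sample_parameter` only illustrates drawing one thermal parameter from a normal distribution.

```python
import numpy as np

def etp_matrices(Ra, Rm, Ca, Cm):
    """Build A and B of the ETP model from thermal resistances and capacities."""
    A = np.array([
        [-(1.0 / (Rm * Ca) + 1.0 / (Ra * Ca)), 1.0 / (Rm * Ca)],
        [1.0 / (Rm * Cm), -1.0 / (Rm * Cm)],
    ])
    B = np.array([
        [1.0 / (Ra * Ca), 1.0 / Ca],
        [0.0, 0.0],
    ])
    return A, B

def etp_step(T_prev, T_out, K, A, B, dt):
    """Forward-Euler update (6): T^k = T^{k-1} + dt * (A T^{k-1} + B C^{k-1})."""
    C = np.array([T_out, K])  # input vector [outdoor temperature, operation rate]
    return T_prev + dt * (A @ T_prev + B @ C)

def sample_parameter(mean, std, n, seed=0):
    """Draw one thermal parameter for n units from N(mean, std) (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    return rng.normal(mean, std, size=n)
```

With the indoor air, the mass and the outdoor air at the same temperature and K = 0, the update leaves the state unchanged, which is a quick sanity check of the signs in A and B.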

According to (1)–(6), the temperature and power consumption have a one-to-one relationship. For a heat pump numbered i, the relationship can be described as follows:

$$Q_{i}^{k} = Z_{E,i}^{k} Q_{op} = Z_{E,i}^{k} \frac{{P_{rated}^{i} }}{{\eta_{AC} }}$$
$$Z_{E,i}^{k} = \left\{ {\begin{array}{*{20}l} 1 \hfill & {\quad T_{a\_E,i}^{k - 1} \le T_{ - \_E,i}^{k} = T_{s\_E,i}^{k} - \frac{\delta }{2}} \hfill \\ 0 \hfill & {\quad T_{a\_E,i}^{k - 1} \ge T_{ + \_E,i}^{k} = T_{s\_E,i}^{k} + \frac{\delta }{2}} \hfill \\ {Z_{E,i}^{k - 1} } \hfill & {\quad \text{otherwise}} \hfill \\ \end{array} } \right.$$

where \(Q_{op}\) is the rated heat rate; \(Z_{E,i}^{k}\) is the off/on state (0 for ‘off’, 1 for ‘on’) of the ith heat pump at step k; \(T_{a\_E,i}^{k - 1}\) is the indoor air temperature at step k − 1; \(P_{rated}^{i}\) is the rated power of the ith heat pump; \(T_{s\_E,i}^{k}\), \(T_{ + \_E,i}^{k}\) and \(T_{ - \_E,i}^{k}\) are the temperature set point, upper limit and lower limit of the ith heat pump at step k, respectively; δ is the temperature range. Note that (7) and (8) give the basic control logic of a heat pump, i.e. the way its operating status changes.
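
The switching logic of (7) and (8) is a simple hysteresis rule, sketched below for the heating case. The function and argument names are illustrative assumptions; `eta_ac` stands for the symbol \(\eta_{AC}\) in (7).

```python
def thermostat_state(T_prev, Z_prev, T_set, delta):
    """Off/on state from (8): on below the lower limit, off above the upper limit."""
    if T_prev <= T_set - delta / 2:
        return 1
    if T_prev >= T_set + delta / 2:
        return 0
    return Z_prev  # inside the band the previous state is kept

def heat_rate(Z, P_rated, eta_ac):
    """Heat rate from (7): Q = Z * P_rated / eta_AC."""
    return Z * P_rated / eta_ac
```

Inside the band the state is simply carried over, so a unit that is on stays on until it reaches the upper limit and a unit that is off stays off until it reaches the lower limit.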

2.2 Index model

Like the ETP model, the index model [25] describes the thermal dynamics of a heat pump, but it is much simpler and easier to compute. Therefore, it is used as the prediction model in this paper to forecast the state evolution of the heat pump population. According to the index model, the indoor temperature rises and falls according to the following rules:

$$\begin{aligned} T_{a\_I,i}^{k} & = T_{o\_I,i}^{k} + Z_{I,i}^{k} QR \\ & \quad - \,\left( {T_{o\_I,i}^{k} + Z_{I,i}^{k} QR - T_{a\_I,i}^{k - 1} } \right)\text{e}^{{ - \frac{1}{RC}}} \\ \end{aligned}$$
$$Z_{I,i}^{k} = \left\{ {\begin{array}{*{20}l} 1 \hfill & {\quad T_{a\_I,i}^{k - 1} \le T_{ - \_I}^{k} = T_{s\_I}^{k} - \frac{\delta }{2}} \hfill \\ 0 \hfill & {\quad T_{a\_I,i}^{k - 1} \ge T_{ + \_I}^{k} = T_{s\_I}^{k} + \frac{\delta }{2}} \hfill \\ {Z_{I,i}^{k - 1} } \hfill & {\quad {\text{otherwise}}} \hfill \\ \end{array} } \right.$$

where \(T_{s\_I}^{k}\), \(T_{ + \_I}^{k}\) and \(T_{ - \_I}^{k}\) are the temperature set point, upper limit and lower limit at step k, respectively; \(T_{a\_I,i}^{k - 1}\) is the measured room temperature of the ith heat pump at step k − 1 and \(T_{o\_I,i}^{k}\) is the outdoor air temperature at step k; C is the indoor air heat capacity; R is the mean envelope thermal resistance; Q is the operational heat rate; \(Z_{I,i}^{k}\) is the off/on state of the ith heat pump at step k.
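
A minimal sketch of (9) and (10) follows. It assumes one time step per update, as implied by the exponent −1/(RC) in (9); the function names and the rollout helper `simulate` are illustrative and not part of [25].

```python
import math

def index_model_step(T_prev, T_out, Z, Q, R, C):
    """Indoor temperature update (9) over one time step."""
    T_steady = T_out + Z * Q * R      # temperature approached if Z stayed fixed
    decay = math.exp(-1.0 / (R * C))
    return T_steady - (T_steady - T_prev) * decay

def simulate(T0, Z0, T_out, T_set, delta, Q, R, C, steps):
    """Roll (9) and (10) forward and return the temperature trajectory."""
    T, Z, trajectory = T0, Z0, []
    for _ in range(steps):
        if T <= T_set - delta / 2:    # switching rule (10)
            Z = 1
        elif T >= T_set + delta / 2:
            Z = 0
        T = index_model_step(T, T_out, Z, Q, R, C)
        trajectory.append(T)
    return trajectory
```

Each update is a weighted average of the previous temperature and the steady-state value, which is why the index model is cheap enough to run for every unit between corrections.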

2.3 VPP consisting of heat pumps

A population of heat pumps can be modeled as a VPP in which each heat pump is described by the ETP model. The upper and lower limits of the VPP power output, \(P_{upper}\) and \(P_{lower}\), are time-varying parameters that depend on the temperature state and rated power of every heat pump in the population.

At each step, the temperatures of all heat pumps are collected, and \(P_{upper}^{k}\) and \(P_{lower}^{k}\) are calculated from them by:

$$\left\{ \begin{aligned} P_{upper}^{k} = \sum_{i = 1}^{N} {P_{rated}^{i} } Z_{i}^{k} \quad T_{ + ,i}^{k} < T_{i}^{k} < T_{ + ,i}^{k} + T_{buffer} \hfill \\ P_{lower}^{k} = \sum_{i = 1}^{N} {P_{rated}^{i} } Z_{i}^{k} \quad T_{ - ,i}^{k} - T_{buffer} < T_{i}^{k} < T_{ - ,i}^{k} \hfill \\ \end{aligned} \right.$$

where \(T_{buffer}\) is a temperature margin that was defined in [23] to avoid the violation of temperature limits; N is the number of heat pumps; \(T_{i}^{k}\), \(T_{ + ,i}^{k}\), \(T_{ - ,i}^{k}\) and \(Z_{i}^{k}\) are the room temperature, upper limit, lower limit and off/on state of the ith heat pump at step k, respectively.

\(P_{{VPP}}^{k}\) is the actual output power of the VPP at step k, which can be described by:

$$P_{{VPP}}^{k} = \sum\limits_{i = 1}^{N} {P_{rated}^{i} } Z_{i}^{k} \quad T_{ - }^{k} < T_{i}^{k} < T_{ + }^{k}$$

As shown in Fig. 1, the blue dotted lines represent the output limits of VPP, and the dark point stands for the actual output power of the VPP at step k.

Fig. 1 VPP model

In addition, based on the basic parameters such as states of heat pumps, \(P_{rated}^{i}\) and so on, two indicators can be defined to evaluate the performance of the control strategy:

$$\left\{ \begin{aligned} \rho_{r}^{k} = P_{upper}^{k} - P_{{VPP}}^{k} \hfill \\ \rho_{{f}}^{k} = P_{{VPP}}^{k} - P_{lower}^{k} \hfill \\ \end{aligned} \right.$$

where \(\rho_{r}^{k}\) is the power rising capacity and \(\rho_{{f}}^{k}\) is the power falling capacity. Together, \(\rho_{r}^{k}\) and \(\rho_{{f}}^{k}\) represent the feasible region of the power output of the VPP, that is, the feasible region of energy storage or energy release at step k. \(\rho_{r}\) and \(\rho_{{f}}\) are two of the most important indicators for evaluating the effects of a control strategy.

3 Optimization and strategies

A schematic drawing of the control framework is shown in Fig. 2. The proposed hierarchical and distributed demand response control strategy is discussed in this section.

Fig. 2 Schematic drawing of control framework

As shown in Fig. 2, the hierarchical control consists of two levels. At the upper level, a power target \(P_{T}\) is generated for the whole community to balance the power fluctuation of the tie line caused by renewable energy such as wind power. The target \(P_{T}\) is passed to the aggregator of each VPP to start the process of distributed control. Given \(P_{T}\) and the total heat pump power \(P_{HP\_i}\) of all the other VPPs, each aggregator runs a target assignment algorithm based on OTR-I (which considers the model prediction and customers’ responsive behavior). As a result, the heat pumps of each VPP receive their optimal temperature set-point change u to follow. Through the joint efforts of all the heat pumps in the community, the overall goal of balancing the power fluctuation of the tie line is fulfilled.

3.1 OTR-O of the aggregated heat pumps

For a large population of heat pumps, a robust centralized control strategy can be applied that regulates the load demand profile through a common perturbation u of the temperature set-point [18, 20, 23].

To describe the power consumption of the heat pumps that have the same temperature state, two temperature-power factors are defined as follows: \(\varphi_{1}^{k}\) for the ‘on’ state and \(\varphi_{0}^{k}\) for the ‘off’ state. These functions describe the amount of power at a given air temperature \(T_{x}\):

$$\varphi_{1}^{k} (T_{x} ) = \sum\limits_{i = 1}^{N} {P_{rated}^{i} Z_{i}^{k} n\left( {T_{i}^{k} ,T_{x} } \right)}$$
$$\varphi_{0}^{k} (T_{x} ) = \sum\limits_{i = 1}^{N} {P_{rated}^{i} \left( {1 - Z_{i}^{k} } \right)n\left( {T_{i}^{k} ,T_{x} } \right)}$$
$$Z_{i}^{k} = \left\{ {\begin{array}{*{20}l} 1 \hfill & {\quad T_{i}^{k} \le T_{ - ,i}^{k} + u^{k} } \hfill \\ 0 \hfill & {\quad T_{i}^{k} \ge T_{ + ,i}^{k} + u^{k} } \hfill \\ {Z_{i}^{k - 1} } \hfill & {\quad \text{otherwise}} \hfill \\ \end{array} } \right.$$
$$n\left( {T_{i}^{k} ,T_{x} } \right) = \left\{ {\begin{array}{*{20}l} 1 \hfill & {T_{i}^{k} = T_{x} } \hfill \\ 0 \hfill & {T_{i}^{k} \ne T_{x} } \hfill \\ \end{array} } \right.$$

where \(n(T_{i}^{k} ,T_{x} )\) is an indicator factor that ensures that only the devices at temperature \(T_{x}\) are counted.

The total power consumption of the heat pump population \(P_{HP}^{k}\) can be described by:

$$P_{HP}^{k} (u^{k} ) = \sum\limits_{{T_{x} = - \infty }}^{{T_{ - }^{k} + u^{k} }} {\varphi_{0}^{k} (T_{x} )}\Delta \theta + \sum\limits_{{T_{x} = - \infty }}^{{T_{ + }^{k} + u^{k} }} {\varphi_{1}^{k} (T_{x} )}\Delta \theta$$

where Δθ is the increment of temperature in discrete integration.

If the target power \(P_{T}\) is known for the next step, the optimal temperature set-point change u can be solved with the following constrained convex programming:

$$\left\{ \begin{array}{l} \hbox{min} \,\,F = \left( {P_{T}^{k} - P_{HP}^{k} (u^{k} )} \right)^{2} \hfill \\ {\text{s}\text{.t}\text{.}}\,\,T_{{{\rm min}}} \le T_{ - }^{k} + u^{k} < T_{ + }^{k} + u^{k} \le T_{{\rm max} } \hfill \\ \end{array} \right.$$

where \(T_{\max}\) and \(T_{\min}\) are the acceptable temperature limits for customers; \(P_{T}^{k}\) is the total target given by the central controller at step k.

The solution method for this optimization problem can be found in [20]. In this centralized control strategy, all the heat pumps that participate in the calculation receive an identical u for the regulation.
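
To make (18) and (19) concrete, the sketch below evaluates the aggregated power for a given u by summing rated powers unit by unit, and then picks u by a brute-force grid search. The grid search is a stand-in chosen for readability and is not the solution method of [20]; the function names, the common limits `T_low` and `T_high`, and the grid size are assumptions.

```python
import numpy as np

def aggregate_power(u, T, Z, P_rated, T_low, T_high):
    """P_HP(u) as in (18): 'off' units at or below T_low + u switch on,
    'on' units below T_high + u stay on."""
    T, Z, P_rated = np.asarray(T), np.asarray(Z), np.asarray(P_rated)
    turn_on = (Z == 0) & (T <= T_low + u)
    stay_on = (Z == 1) & (T < T_high + u)
    return float(P_rated[turn_on | stay_on].sum())

def solve_setpoint_change(P_target, T, Z, P_rated, T_low, T_high,
                          T_min, T_max, n_grid=201):
    """Grid search for the u that minimizes (P_target - P_HP(u))^2 under (19)."""
    # The constraint of (19) bounds u between T_min - T_low and T_max - T_high.
    candidates = np.linspace(T_min - T_low, T_max - T_high, n_grid)
    costs = [(P_target - aggregate_power(u, T, Z, P_rated, T_low, T_high)) ** 2
             for u in candidates]
    return float(candidates[int(np.argmin(costs))])
```

Because the aggregated power is a step function of u, several values of u can give the same cost; this sketch simply returns the smallest one on the grid.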

3.2 Balancing algorithm for power fluctuation of tie line

Considering a distribution network with wind power generation and heat pump groups, the wind power \(P_{W}\) and two load types are defined: the nominal load \(P_{N}\) (all electrical demand other than heat pumps) and the heat pump load \(P_{HP}\).

The idealized total net load \(P_{L}\) at step k is:

$$P_{L}^{k} = P_{N}^{k} + P_{HP}^{k} - P_{W}^{k}$$

Here, we assume that the nominal load and wind power output at step k + 1 are known (or can be predicted based on a wind power prediction method [27] and historical demand data). In order to balance the fluctuations of the wind power injection, the power target of the heat pumps can be calculated by:

$$P_{T}^{k + 1} = P_{W}^{k + 1} - P_{N}^{k + 1} + \frac{1}{M}\sum\limits_{j = 1}^{M} {P_{L}^{{k - j{ + }1}} }$$

That is, the heat pump power target is set so that the total net load at step k + 1 equals the average total load over the M previous sampling intervals, \(\frac{1}{M}\sum\limits_{j = 1}^{M} {P_{L}^{{k - j{ + }1}}}\), and this average value is used to balance the fluctuation. This control method is called the M-average control method and was proposed in [28].
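
The target calculation (21) follows from (20) by requiring the net load at step k + 1 to equal the M-step average and solving for the heat pump power. A minimal sketch, with illustrative function and variable names, is:

```python
def net_load(P_N, P_HP, P_W):
    """Total net load (20): nominal load plus heat pump load minus wind power."""
    return P_N + P_HP - P_W

def heat_pump_target(P_L_history, P_W_next, P_N_next, M):
    """Heat pump target (21) from the average of the last M net-load samples."""
    recent = P_L_history[-M:]
    if len(recent) < M:
        raise ValueError("need at least M past net-load samples")
    return P_W_next - P_N_next + sum(recent) / M
```

If the heat pumps meet this target exactly, `net_load(P_N_next, target, P_W_next)` returns the M-step average, so the tie-line power follows a smoothed profile.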

3.3 Target assignment and compensation strategy

The configuration of the distributed control strategy is shown in Fig. 3. The only signal needed from the central controller is the total power target \(P_{T}\). Each regional aggregator evaluates a preliminary target assignment using the heat pump power consumption of the other VPPs as follows:

$$\left\{ \begin{aligned} P_{T\_i}^{k + 1} = \frac{{P_{HP\_i}^{k} }}{{P_{HP\_total}^{k} }}P_{T}^{k} \hfill \\ P_{HP\_total}^{k} = \sum\limits_{i = 1}^{m} {P_{HP\_i}^{k} } \hfill \\ \end{aligned} \right.$$

where \(P_{T\_i}^{k + 1}\) is the preliminary power target of the ith VPP at step k + 1; \(P_{HP\_i}^{k}\) is the power consumption of the heat pumps of the ith VPP at step k; \(P_{HP\_total}^{k}\) is the total power consumption of the heat pumps of all controlled VPPs at step k; m is the number of VPPs.
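
The preliminary assignment (22) is a proportional split of the total target. A short sketch, with an illustrative function name, is:

```python
def preliminary_targets(P_T, P_HP):
    """Split the total target P_T among VPPs in proportion to their current
    heat pump power, as in (22)."""
    total = sum(P_HP)
    if total == 0:
        raise ValueError("total heat pump power is zero; shares are undefined")
    return [P_T * p / total for p in P_HP]
```

For example, with VPP powers of 10, 20 and 30 and a total target of 60, the preliminary targets are 10, 20 and 30, and they always sum to the total target.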

Fig. 3 Configuration of the distributed control strategy

To make the target assignment more accurate, the regulating capacity of each VPP is considered in this paper. Similar to the power rising and falling capacities (\(\rho_{r}\) and \(\rho_{{f}}\)), the regulation capacities \(\eta_{up}\) and \(\eta_{down}\) are defined in terms of the optimal temperature set-point change u and are used to amend the assignment, as shown in (23).

$$\left\{ \begin{array}{l} \eta_{up} = \frac{{\left| {u_{up} - u^{k} } \right|}}{{\delta_{u\_up} }} \times 100\% \hfill \\ \eta_{down} = \frac{{\left| {u_{down} - u^{k} } \right|}}{{\delta_{u\_down} }} \times 100\% \hfill \\ \end{array} \right.$$

where \(u_{up}\) is the upper limit of u; \(u_{down}\) is the lower limit of u; \(\eta_{up}\) is the up regulating capacity; \(\eta_{down}\) is the down regulating capacity; \(\delta_{u\_up}\) and \(\delta_{u\_down}\) are the up and down deadbands of u.

Then \(\eta_{up}\) and \(\eta_{down}\) are used to compensate the target assignment as follows:

$$P_{T\_i}^{k + 1*} = \left\{ {\begin{array}{*{20}l} {(1 - \eta_{up} )P_{T\_i}^{k + 1} } \hfill & {\quad \eta_{up} \le \xi } \hfill \\ {P_{T\_i}^{k + 1} } \hfill & {\quad \xi < \eta_{up} \le 100} \hfill \\ \end{array} } \right.$$
$$P_{T\_i}^{k + 1*} = \left\{ {\begin{array}{*{20}l} {(1 + \eta_{down} )P_{T\_i}^{k + 1} } \hfill & {\quad \eta_{down} \le \xi } \hfill \\ {P_{T\_i}^{k + 1} } \hfill & {\quad \xi < \eta_{down} \le 100} \hfill \\ \end{array} } \right.$$

where \(P_{T\_i}^{k + 1*}\) is the corrected target assignment of the ith VPP; ξ is the range of η that defines the limiting capacity area.

As shown in Fig. 4, when the optimal temperature set-point change u enters the limiting capacity area, the correction is applied to compensate the power target of the corresponding VPP.

Fig. 4 Capacity of regulation defined by temperature set-point change u

After the correction for the VPPs whose u has entered the limiting capacity area, the differences between \(P_{T\_i}^{k + 1}\) and \(P_{T\_i}^{k + 1*}\) are shared equally among the other VPPs.
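
The correction of (23)–(25) and the equal sharing of the differences can be sketched as follows. This is one possible reading of the procedure under stated assumptions: η and ξ are treated as fractions rather than percentages, all VPPs share the same limits and deadbands of u, the up-direction check takes precedence, and the "other VPPs" are those whose targets were not corrected. All function and argument names are illustrative.

```python
def regulation_capacity(u, u_limit, deadband):
    """Regulation capacity of (23), expressed as a fraction instead of a percentage."""
    return abs(u_limit - u) / deadband

def compensate_targets(P_pre, u, u_up, u_down, db_up, db_down, xi):
    """Apply (24)-(25) to each VPP, then share the difference equally
    among the VPPs that were not corrected."""
    corrected = list(P_pre)
    limited = [False] * len(P_pre)
    for i, (p, u_i) in enumerate(zip(P_pre, u)):
        eta_up = regulation_capacity(u_i, u_up, db_up)
        eta_down = regulation_capacity(u_i, u_down, db_down)
        if eta_up <= xi:            # u is close to its upper limit, (24)
            corrected[i] = (1 - eta_up) * p
            limited[i] = True
        elif eta_down <= xi:        # u is close to its lower limit, (25)
            corrected[i] = (1 + eta_down) * p
            limited[i] = True
    difference = sum(P_pre) - sum(corrected)
    free = [i for i, flag in enumerate(limited) if not flag]
    for i in free:                  # equal sharing; skipped if no VPP is free
        corrected[i] += difference / len(free)
    return corrected
```

In this reading the total target is preserved whenever at least one VPP is outside the limiting capacity area.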

Because the regulation capacities of the VPPs are considered in the target assignment and correction, the target assignment becomes self-regulating and lets each VPP exploit its full regulating capability, which yields better control effects. In the centralized strategy, all the controlled heat pumps receive a single temperature set-point change u to follow the target. Because of the discrete integration in OTR-O, all the heat pumps are treated as a whole, much as if every member of a team were given the same average task regardless of personal ability. In the new distributed control strategy, by contrast, heat pumps are divided into regions, and the heat pumps of each region are aggregated as a VPP that has its own u according to its own target, much as if team members were assigned tasks according to their abilities.

3.4 Model prediction strategy in OTR-I

To further reduce the amount of data transmission, a model prediction strategy is proposed. The prediction model is integrated into OTR-O, which thereby evolves into OTR-I. The strategy sets a correction interval according to the required control accuracy. Within the correction interval, the index model is used in OTR-I to predict the equipment states. When the correction interval is reached, the regional aggregator collects the customers’ real data to correct the prediction model. The model prediction strategy is described by (26) and (27).

If \(k \ne n\Delta t_{cor}\):

$$\left\{ \begin{array}{l} T_{x} = T_{a\_I} \hfill \\ T_{s} = T_{s\_I} \hfill \\ T_{ + } = T_{ + \_I} \hfill \\ T_{ - } = T_{ - \_I} \hfill \\ Z = Z_{I} \hfill \\ \end{array} \right.$$

If \(k = n\Delta t_{cor}\):

$$\left\{ \begin{array}{l} T_{x} = T_{a\_E} \hfill \\ T_{s} = T_{s\_E} \hfill \\ T_{ + } = T_{ + \_E} \hfill \\ T_{ - } = T_{ - \_E} \hfill \\ Z = Z_{E} \hfill \\ \end{array} \right.$$

where \(\Delta t_{cor}\) is the correction interval; n is a natural number; \(T_{x}\), \(T_{s}\), \(T_{ + }\), \(T_{ - }\) and Z are the parameters passed to the OTR-I algorithm.
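
The switching between (26) and (27) can be sketched as a loop that uses the prediction between corrections and replaces it with measured data at every multiple of the correction interval. The callables `measure` and `predict` are hypothetical placeholders for the collected ETP-side data and the index-model update, respectively.

```python
def run_with_corrections(T0, measure, predict, steps, dt_cor):
    """Track a state with the prediction model and correct it every dt_cor steps.

    Returns the state history and the number of data uploads that were needed.
    """
    T_est, uploads, history = T0, 0, []
    for k in range(1, steps + 1):
        if k % dt_cor == 0:
            T_est = measure(k)      # (27): collect real data from the customer side
            uploads += 1
        else:
            T_est = predict(T_est)  # (26): rely on the index-model prediction
        history.append(T_est)
    return history, uploads
```

Over a horizon of `steps` samples, only `steps // dt_cor` uploads are required instead of one per step, which is the source of the communication saving; the prediction error accumulated in between is discarded at each correction.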

3.5 Customers’ responsive behavior model in OTR-I

Customers’ responsive behavior strongly affects the performance of the control strategy. An algorithm is proposed in this section to describe the DR behavior of power customers given real-time prices and their preferences.

Each power customer takes part in the electricity market and prefers to turn heat pumps on at lower prices and turn them off at higher prices. Since customers differ in their properties, the proposed algorithm takes customers’ preferences into consideration. According to OTR-O, all the factors of customers’ responsive behavior can be reflected in the adjustment range of the optimal solution for the temperature set-point change u, as shown in Fig. 5.

Fig. 5 Schematic diagram of customers’ responsive behavior effects

The proportion of effects p caused by customers’ responsive behavior and adjustment range D for u can be described in (28)–(31).

If \(\varepsilon_{realtime} > \varepsilon_{base}\):

$${{p}} = \lambda \left( {\text{e}^{{\frac{{\varepsilon_{base} }}{{\varepsilon_{max} }} - 1}} - \text{e}^{{\frac{{\varepsilon_{realtime} }}{{\varepsilon_{max} }} - 1}} } \right) + N(\beta ,0.01)$$

If \(\varepsilon_{realtime} < \varepsilon_{base}\):

$${{p}} = \lambda \left( {\text{e}^{{\frac{{\varepsilon_{base} }}{{\varepsilon_{max} }} - 1}} - \text{e}^{{\frac{{\varepsilon_{realtime} }}{{\varepsilon_{max} }} - 1}} } \right) - N(\beta ,0.01)$$
$$\beta \propto \frac{l}{{{{I}}\upsilon }}$$
$$D = pT_{range}$$

where \(\varepsilon_{base}\) is the base price at which customers participate fully in DR; \(\varepsilon_{max}\) is the maximum real-time price of the previous day; \(\varepsilon_{realtime}\) is the real-time price; λ is the weight of the price impact; \(T_{range}\) is the temperature range; β is the customers’ preference [29], which can be set by the customer according to the equipment switching loss l, the DR incentive I and the family income υ. We use the normal distribution function to randomize the customers’ preferences and ensure their diversity (from a statistical point of view, customer preferences may well cluster around a particular value). Considering the customers’ responsive behavior, (19) becomes (32), which can be solved with the same solution method [20].

$$\left\{ \begin{array}{l} \hbox{min} \,F = \left( {P_{T}^{k} - P_{HP}^{k} (u^{k} )} \right)^{2} \hfill \\ {\text{s}\text{.t}\text{.}} \, \,\,T_{{\rm min} } + D \le T_{ - }^{k} + u^{k} < T_{ + }^{k} + u^{k} \le T_{{\rm max} } + D \hfill \\ \end{array} \right.$$
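
The behavior model (28)–(31) and its effect on the constraint of (32) can be sketched as follows. Several points are assumptions made only for this illustration: the second argument of N(β, 0.01) is read as a variance (standard deviation 0.1), the case of equal prices, which (28) and (29) do not cover, is given no random term, and the function names are hypothetical.

```python
import math
import random

def participation_effect(eps_rt, eps_base, eps_max, lam, beta, rng, sigma=0.1):
    """Proportion of effects p from (28)-(29)."""
    price_term = lam * (math.exp(eps_base / eps_max - 1)
                        - math.exp(eps_rt / eps_max - 1))
    noise = rng.gauss(beta, sigma)   # randomized customer preference
    if eps_rt > eps_base:
        return price_term + noise    # (28)
    if eps_rt < eps_base:
        return price_term - noise    # (29)
    return price_term                # equal prices: assumption, not in (28)-(29)

def adjustment_range(p, T_range):
    """Adjustment range (31): D = p * T_range."""
    return p * T_range

def u_bounds(T_min, T_max, T_low, T_high, D):
    """Feasible interval of u implied by the constraint of (32)."""
    return T_min + D - T_low, T_max + D - T_high
```

A call such as `participation_effect(40.0, 20.0, 100.0, 0.8, 0.0, random.Random(0))` gives one random draw of p; the resulting D shifts both ends of the feasible interval of u by the same amount.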

3.6 Comparison of centralized and distributed control strategy

Different from traditional centralized control strategies, in the hierarchical and distributed DR control strategy proposed in this paper, the central controller and regional aggregators make the DR control ‘hierarchical’, and the heat pumps being divided into different VPPs makes it ‘distributed’.

We also improve OTR-O by designing a load model prediction strategy and a customers’ responsive behavior model and integrating them into it. In each regional aggregator, OTR-I calculates the region’s optimal temperature set-point change u from the status information S’, which includes the heat pump status information S (such as indoor temperatures and switch states, the same as in OTR-O), the real-time price D and the customers’ preference β, as shown in Fig. 6.

Fig. 6 Comparison of centralized and distributed control strategy

4 Simulation results and analysis

A typical geographic distribution graph is used in this section, as shown in Fig. 7. The regions surrounded by dotted lines contain the customers who are willing to participate in DR, and each such region can be regarded as a VPP. The shaded regions are inactive areas, whose customers do not take part in this DR strategy.

Fig. 7 Actual graph and result of communication network optimization

First, the target tracking control effects of the traditional centralized and the new distributed control strategies are compared in case 1. Then the distributed control for balancing the power fluctuation of the tie line is presented in case 2. Finally, the effects of model prediction and customers’ responsive behavior in OTR-I on the control strategy are analyzed in case 3 and case 4.

The simulation tool used in this paper is MATLAB. In all cases, the solution time of the proposed strategy is 0.15928 s on average, which is short enough for real-time control.

4.1 Case 1: Distributed control strategy for target tracking

Assume that the residents of a certain area use electric heat pumps and that all the heat pump users agree to participate in the DR control.

The parameter configuration is shown in Table 1. The distribution of active heat pumps among the VPPs is given in Table 2, corresponding to the locations in Fig. 7. In this case, the customers’ preference β is assumed to be set by the customers as shown in Table 2.

Table 1 Simulation parameters
Table 2 Active heat pumps of VPPs

Simulation results of the distributed strategy with given target are demonstrated in Figs. 8 and 9.

Fig. 8 Control effects of distributed control strategy in VPPs

Fig. 9 Control effects of distributed and centralized control strategies

To show the difference between the centralized and distributed control effects more clearly, we define the control error \(e_{c}^{k}\) at step k as follows:

$$e_{c}^{k} = P_{HP\_total}^{k} - P_{T}^{k}$$

The control errors of the distributed and centralized control strategies are compared in Fig. 10. The figure clearly shows that the distributed control strategy achieves better control effects.

Fig. 10 Control error of centralized control and distributed control effects

Figure 8 illustrates the simulation results of distributed control strategy in different VPPs and Fig. 9 shows the comparison of control effects for the traditional centralized and new distributed control strategy (all VPPs are aggregated).

As can be seen, with the same heat pump resources and total power target, the distributed control strategy has two obvious advantages. First, \(\rho_{r}\) and \(\rho_{{f}}\) of the VPPs are larger, which means that customer comfort is better guaranteed. Second, in adverse situations, the new distributed control strategy can still follow the target where the centralized control strategy cannot (as shown in the marked area of Fig. 9). When u reaches its limits, a population of heat pumps loses its adjustment ability (there is one population in the centralized strategy and nine populations in the distributed control strategy). Taking the marked region in Fig. 9 as an example, Fig. 11 shows the simulation results of u. It is clear that in the shaded area, u in the centralized strategy stays at the boundary, so the population loses its adjustment ability. In the distributed control strategy, however, although some of the VPPs reach the boundary, the others do not reach it at the same time. This is why the distributed control strategy achieves better control effects.

Fig. 11 Optimal temperature set-point change u of centralized and distributed control strategies

4.2 Case 2: Distributed control strategy for power fluctuation of tie line

To balance the power fluctuation of the tie line caused by renewable energy, the balancing algorithm of Sect. 3.2 is used. \(P_{N}\) comes from normalized typical residential home data [30] with a standard deviation equal to 10% of the off-peak levels. \(P_{W}\) is generated using a typical turbine power curve and environmental wind speed data [24], which can be forecast by existing algorithms. M is set to 15 in this case.

The simulation results are shown in Fig. 12. The wind power and nominal power are shown in Fig. 12a, the total heat pump power in the controlled and uncontrolled cases is shown in Fig. 12b, and the power fluctuation balancing curves of the tie line power are shown in Fig. 12c. Fig. 12 shows that the distributed control strategy and the balancing algorithm can effectively smooth the power fluctuation of the tie line caused by wind power.

Fig. 12 Simulation results of power fluctuation

4.3 Case 3: Effects analysis of model prediction in OTR-I

On the basis of the above control effects, the model prediction strategy is introduced. The control effects with different correction intervals are shown in Fig. 13. The simulation results show that the control effect is best when the correction interval is 1 min and weakens gradually as the correction interval increases. However, whenever the correction interval is reached, the control strategy collects real-time customer-side information to correct the prediction model. Thus, the actual response curve can re-fit the target curve, which keeps the control strategy effective while reducing the amount of signal transmission as far as possible.

Fig. 13 Control effects of target following with different correction intervals

4.4 Case 4: Customers’ responsive behavior effects on distributed control strategy

The real-time price in New York [31] is utilized in this case. We set \(\varepsilon_{base}\) to 20 $/MWh (the price at which customers fully take part in DR), λ = 0.8 and \(T_{range}\) = 2 °C; δ is given in Table 2.

Since the number of VPPs in this study is relatively large and the properties of the VPPs are similar, we only give the simulation results of some representative VPPs. The real-time price, the deviation between \(\varepsilon_{realtime}\) and \(\varepsilon_{base}\), the control effects and the changing trend of u for some typical VPPs are shown in Fig. 14 to illustrate the effect of customers’ responsive behavior on control. Figure 15 compares the aggregated load control effects of the distributed control strategy with and without considering customers’ responsive behavior.

Fig. 14 Control effects and the changing trend of u for some typical VPPs

Fig. 15 Aggregated load control effects in distributed control strategy with or without considering customers’ responsive behavior

Because customers no longer participate fully in DR, their responsive behavior has some adverse effects on control. When the deviation between \(\varepsilon_{realtime}\) and \(\varepsilon_{base}\) becomes large, the regulating capacity of the VPPs (\(\rho_{r}\) or \(\rho_{{f}}\)) is reduced because the adjustment range of the optimal solution for u changes.

Similarly, customers’ responsive behavior weakens the fluctuation balancing effects, as shown in Fig. 16.

Fig. 16 Simulation results of power fluctuations considering customers’ responsive behavior

Under the influence of the price, the customers’ behavior changes, which inevitably leads to corresponding changes in the electricity load. In effect, this process amounts to unconscious energy storage and release.

5 Conclusion

A hierarchical and distributed control strategy for TCAs is established, which balances the power fluctuation of the tie line caused by renewable energy by controlling heat pumps. Target assignment and compensation algorithms are introduced to achieve maximum utilization of DR resources in the controlled regions. In addition, the nearly center-free hierarchical and distributed control strategy decreases the amount of communication data. We also improve OTR-O to OTR-I by integrating model prediction and a customers’ responsive behavior model. Combined analysis is used to investigate the performance of the proposed method. The results indicate that VPPs composed of heat pumps can follow the given target and thus balance the power fluctuation of the tie line caused by renewable energy, and that they do so better than the centralized control strategy. Moreover, because customers do not participate fully in demand response, their responsive behavior has some negative influence on control effects. However, considering the customers’ responsive behavior is much closer to the actual situation and is more customer-friendly.

Future work will study the possibility of using the prediction model in the DR strategy to further reduce the amount of information transmitted; the influence of packet loss, bit errors and time delay in the communication system on DR control performance; the influence of tariff changes on the VPP charging–discharging mode at the aggregator level; and the interaction between the pricing of the compensation strategy and the control strategy of the VPP charging–discharging mode.