A data-driven method for the optimal control of centralized cooling station in an office park

An effective way to reduce the energy consumption of a building is to optimize the control strategy of the HVAC system. Load prediction has been proposed and used to match the supply and demand of air conditioning and achieve energy savings. However, a gap still exists between load prediction models and real-time optimal control of HVAC systems. Hence, this paper proposes an optimization method for dynamically determining the best setpoints of chillers and chilled water pumps under a specific load. The energy consumption model of each piece of equipment in the centralized cooling station is established and validated using operational data. An optimization problem is then defined to find the setpoints of each piece of equipment that minimize energy consumption under a given load. To verify the validity of the proposed method, a period of real operational data from an office park is used. The proposed method is applied to one centralized cooling station in the office park and results in about 4% lower overall energy consumption than the existing intelligent control strategies in the park. This method provides a feasible direction and reference for realizing overall optimal control of the whole HVAC system in the future.


Introduction
Building energy accounts for a large part of total energy consumption. The energy consumption of public buildings in China has doubled during the past decade, with its total amount and growth rate exceeding those of other building types (Tsinghua University Building Energy Research Center, 2022). To ensure a healthy, productive and energy-efficient office environment, air-conditioning systems are widely used in office buildings, so that HVAC accounts for over 40% of the total energy consumption of office buildings (Li et al., 2022). However, HVAC system control in the majority of office buildings depends heavily on the maintenance staff's experience or on a few basic rules, which results in energy waste and an uncomfortable indoor environment. Thus, it is crucial to optimize HVAC control strategies in office buildings to reduce energy consumption and enhance indoor environment quality.
Most HVAC control systems in office buildings adopt a feedback-control-based operation approach, i.e., they only take action after indoor environment problems have already occurred (such as increasing the cooling supply when the indoor temperature has exceeded the upper bound of the comfort range in summer), which leads to a mismatch between the supply and demand of air conditioning. Therefore, various studies have proposed methods for predicting the HVAC system load (Bourdeau et al., 2019). However, HVAC load prediction is mostly used in thermal energy storage (TES) systems to reduce the operation cost (Cox et al., 2019; Kang et al., 2022). The use of load prediction in real-time optimal control of HVAC equipment is also mentioned in several works (Fan & Ding, 2019; Gao et al., 2022), but the effect of the optimal control strategies in actual buildings remains unvalidated. In general, load prediction models are still insufficiently used in real-time optimization of HVAC system setpoints, which makes load prediction less useful in actual buildings and leaves a large amount of building operational data unused.
On the other hand, the optimal control of HVAC systems has been studied more extensively. In addition to setting reasonable on-off schedules for each piece of equipment, finding the best setpoints of the control variables in the HVAC system (i.e., the temperature and flow rate of chilled water) is a main approach to achieving real-time optimal operation and energy saving. Establishing a model between the control variables and the energy usage of each HVAC component is essential in this process. There are two major types of modelling techniques: software-based (Huang et al., 2017; Ling et al., 2018) and theoretical-derivation-based (Belic et al., 2021; Ma & Wang, 2011). Both methods require a considerable amount of detailed information about the modelled equipment, which is difficult and time-consuming to gather. Additionally, simulation software features only a limited selection of typical equipment models and performance curves. For buildings that do not employ such typical equipment, the accuracy of the energy models cannot be ensured. With the development of sensors and data collection technologies in recent years, building operational data has become easier to acquire and can be a trustworthy basis for the energy consumption model. Therefore, historical building operation data is increasingly used in energy modelling as an important part of grey-box and black-box models, making the energy models simpler and more flexible (Kim et al., 2022).
Existing studies have demonstrated that chillers and chilled water pumps can account for up to 70% of the total energy consumption of air-conditioning systems (Li, 2021). Therefore, optimizing the control of the centralized cooling station is crucial for optimizing the operation of HVAC systems. This study proposes an intuitive approach for optimizing centralized cooling station operation using historical building data, which can be used together with load prediction to lower the computational cost and avoid wasting data. The rest of the paper is organized as follows.
Section 2 presents the methodology of this paper, including the overall workflow of the proposed method, the method for establishing the energy consumption models of the equipment, and the method for determining the best setpoints. In the proposed method, an optimization problem is defined in which the total energy consumption of the centralized cooling station serves as the objective function.
Section 3 presents a case study conducted in an office park to validate the effectiveness of the proposed method. Real operational data from a centralized cooling station in this office park is collected and the energy consumption models are established. Subsequently, the optimization problem of the centralized cooling station is defined based on the energy consumption models.
In Sect. 4, a one-week test period is selected, during which the optimal setpoints for the chillers and chilled water pumps are determined using the proposed method with two different algorithms. The actual energy consumption of the centralized cooling station during the test period, recorded by the meter, is then compared with the energy consumption under the optimal setpoints determined by the algorithms. Moreover, the optimal setpoints are compared with the original setpoints established by the existing control strategy.
Section 5 provides the conclusions of this paper.

Methodology
The workflow of the proposed method is shown in Fig. 1. Building operational data is first gathered over a period of time and then used to create energy models based on empirical formulas and regression techniques. The forms of the models can vary for different buildings. The model is then updated with the results of load prediction, and an optimization problem is established. The optimal values (setpoints) of the control variables are determined by solving this optimization problem. The setpoints are then updated and the controller adjusts the chillers and pumps accordingly. Moreover, a supervisory controller can be installed before the last stage to decide whether the centralized cooling station will operate with the optimal setpoints. For instance, the setpoints are not updated if the new optimal setpoints are very close to the current setpoints.
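The supervisory check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the dictionary keys and the tolerance values are hypothetical and would have to be chosen for the actual plant.

```python
def should_update_setpoints(current, proposed, tolerances):
    """Return True if any proposed setpoint differs from the current one
    by more than its tolerance (a simple deadband check)."""
    return any(
        abs(proposed[name] - current[name]) > tolerances[name]
        for name in tolerances
    )


# Hypothetical values: chilled water supply temperature (deg C) and pump frequency (Hz).
current = {"t_chws": 9.0, "pump_hz": 40.0}
tolerances = {"t_chws": 0.5, "pump_hz": 1.0}

near = {"t_chws": 9.2, "pump_hz": 40.5}  # inside the deadband: keep the current setpoints
far = {"t_chws": 11.0, "pump_hz": 35.0}  # outside the deadband: send the new setpoints

update_for_near = should_update_setpoints(current, near, tolerances)
update_for_far = should_update_setpoints(current, far, tolerances)
```

A deadband of this kind avoids frequent small adjustments of the chillers and pumps when the optimizer returns almost the same setpoints in consecutive time steps.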

Energy consumption models of refrigeration equipment
Building an energy consumption model for each component of the centralized cooling station is the first and most essential step of the proposed methodology. A centralized cooling station typically includes chillers and chilled water pumps. The chilled water supply temperature and the frequency of the pumps need to be adjusted to provide the required cooling or heating supply. Thus, it is necessary to determine the relationships between these control variables and the energy consumption of the relevant equipment. Different empirical models are used for different equipment, and multiple linear regression (MLR) is applied to the building operational data to fit the energy consumption models and determine the required parameters.
For chillers, the relationship between energy consumption and chilled water temperature needs to be found. The ASHRAE handbook provides an empirical formula for determining the optimal cooling load to be carried by each chiller (ASHRAE, 2019). In this formula, the chiller's power is calculated from the cooling load, the chilled water supply temperature and the cooling water return temperature. Since the cooling load can be predicted by a load prediction model, the ASHRAE formula can be used to establish the chiller energy model. In this model, P_ch represents the chiller's power; t_cwr is the cooling water return temperature; t_1 represents the temperature difference between the outdoor air wet-bulb temperature and the cooling water supply temperature, generally set as 3 °C; t_2 represents the temperature difference between the entrance and the exit of the condenser, often set as 6 °C; t_chws is the chilled water supply temperature; and a_0 to a_5 are regression coefficients fitted to the building's historical operational data.
For pumps, the calculation methods differ between fixed-frequency and variable-frequency devices. For fixed-frequency pumps, the pump power can be estimated as the rated power. For variable-frequency pumps, the energy consumption depends on the flow rate and the frequency, and can vary under different pipeline resistance conditions. Usually, the energy consumption model for variable-frequency pumps can be simplified as a quadratic or cubic equation of the flow rate (Yan et al., 2015; Zeng, 2014), but during the operation of the HVAC system the pipeline resistance changes with the running state of the air-conditioning terminals, and the energy consumption is hard to describe with a fixed equation. However, the frequency needed to achieve the required flow rate indirectly reflects the pipeline resistance. So, in this study, the energy consumption of a variable-speed pump is simplified as a cubic equation of the flow rate and frequency, where P_p is the power of the pump, f is the frequency, L is the flow rate, and b_0 to b_9 are regression coefficients. Furthermore, forward stepwise regression is used to filter the regression variables and simplify the model. As one of the most widely used variable selection methods, forward stepwise regression can retain the appropriate variables at low computational cost. Validation of the model against actual building operational data showed that the improvement in the fitted R² contributed by the cubic terms (b_8 and b_9) was too small, so these terms were omitted from the final model used in this paper in order to simplify it.
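The variable selection step described above can be illustrated with a minimal forward stepwise regression in Python with NumPy. This is a sketch under stated assumptions and not the paper's final pump model: the data is synthetic, the "true" relationship is invented for the example, and the ordering of the candidate terms as well as the stopping threshold `min_gain` are hypothetical.

```python
import numpy as np


def r_squared(y, y_hat):
    """Coefficient of determination of a fit."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot


def fit_r2(columns, y):
    """Least-squares fit with an intercept (b_0); returns the R^2 of the fit."""
    A = np.column_stack([np.ones(len(y))] + columns)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return r_squared(y, A @ coef)


def forward_stepwise(terms, y, min_gain=1e-4):
    """Greedily add the term that raises R^2 most; stop when the gain is below min_gain."""
    selected, best_r2 = [], 0.0
    remaining = list(terms)
    while remaining:
        scores = {
            name: fit_r2([terms[n] for n in selected + [name]], y)
            for name in remaining
        }
        name = max(scores, key=scores.get)
        if scores[name] - best_r2 < min_gain:
            break
        selected.append(name)
        remaining.remove(name)
        best_r2 = scores[name]
    return selected, best_r2


# Synthetic data standing in for operational records (not the paper's data).
rng = np.random.default_rng(0)
f = rng.uniform(35.0, 50.0, 500)      # pump frequency, Hz
L = rng.uniform(300.0, 800.0, 500)    # flow rate, m3/h
power = 2.0 + 0.5 * f + 0.03 * f * L  # invented relationship, kW

# Candidate terms up to third order in f and L (ordering is an assumption).
terms = {
    "f": f, "L": L, "f^2": f**2, "f*L": f * L, "L^2": L**2,
    "f^3": f**3, "f^2*L": f**2 * L, "f*L^2": f * L**2, "L^3": L**3,
}
selected, r2 = forward_stepwise(terms, power)
```

Because each step only compares single-term additions, the order in which terms enter can influence which terms are finally retained, which is the limitation noted later in the discussion.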

Definition of the optimization problem
The optimal setpoints of the control variables are found by solving the optimization problem. The objective is to minimize the total power of the centralized cooling station, i.e., the sum of the power of all equipment. Based on the models described in Sect. 2.1, the total power can be expressed as a function of the control variables. The decision variables of this optimization problem are the chilled water supply temperature, the frequency of the chilled water pumps and the flow rate of the pumps.
Fig. 1 The overall workflow of the proposed method
The constraints of the optimization problem can be divided into two categories. The first category comprises the physical principles that the control variables should follow, including:
(1) The required cooling or heating supply must be met, where t_chwr is the return temperature of chilled water.
(2) The constraint of pipeline resistance. Under the same pipeline resistance, the flow rate and frequency should be proportional. In this study, the most recently recorded historical data is used to determine the pipeline resistance, where f_pump,last and L_pump,last are the most recently recorded values of the frequency and flow rate of the pump.
(3) The constraint of equal pump head when multiple pumps are running in parallel, where H_1 to H_n are the heads of the pumps.
The second category of the constraints is the range of the control variables to ensure the regular operation of the system.The constraints include the pumps' minimum frequency, minimum permitted flow rates, the temperature difference of chilled water, and the lower bound of the chilled water supply temperature.The ranges of the control variables are determined according to the building's actual condition.
With the objective function and the constraints defined, the optimization problem can be solved to find the optimal setpoints of the control variables.
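To make the structure of the problem concrete, the following Python sketch sets up a toy version of it with `scipy.optimize.minimize` and the SLSQP method. Everything numerical here is an assumption made for illustration: the chiller and pump power functions are placeholders rather than the fitted models, the load, bounds and last recorded pump state are hypothetical, and the flow rate is eliminated through the proportionality constraint instead of being kept as a separate decision variable.

```python
from scipy.optimize import minimize

C_WATER = 4.2                 # kJ/(kg*degC), specific heat capacity of water
Q_LOAD = 3000.0               # kW, cooling load to be met (hypothetical)
F_LAST, L_LAST = 40.0, 600.0  # last recorded pump frequency (Hz) and flow rate (m3/h), hypothetical
DT_MAX = 5.0                  # degC, largest allowed supply-return difference (hypothetical)


def flow_rate(f):
    """Pipeline-resistance constraint: flow rate proportional to frequency."""
    return L_LAST * f / F_LAST


def delta_t(f):
    """Supply-return temperature difference needed to deliver Q_LOAD at frequency f."""
    mass_flow = flow_rate(f) * 1000.0 / 3600.0  # kg/s, assuming 1000 kg/m3
    return Q_LOAD / (C_WATER * mass_flow)


def total_power(x):
    """Objective: chiller power plus pump power (both placeholder models)."""
    t_chws, f = x
    chiller_kw = Q_LOAD / (4.0 + 0.15 * (t_chws - 7.0))  # COP rises with supply temperature
    pump_kw = 0.001 * f ** 3
    return chiller_kw + pump_kw


result = minimize(
    total_power,
    x0=[9.0, 45.0],
    method="SLSQP",
    bounds=[(7.0, 12.0), (35.0, 50.0)],  # ranges of t_chws (degC) and f (Hz)
    constraints=[{"type": "ineq", "fun": lambda x: DT_MAX - delta_t(x[1])}],
)
t_opt, f_opt = result.x
```

With these placeholder models the solver moves towards the warmest permitted supply temperature and the lowest permitted pump frequency, as long as the temperature-difference limit is respected; with the fitted models the trade-off between chiller and pump power decides where the optimum lies.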

Optimization algorithms
The algorithms for solving optimization problems mainly include traditional optimization algorithms and heuristic optimization algorithms. Traditional optimization algorithms include the gradient descent method, Newton's method, the conjugate gradient method, etc. They have fixed structures and parameters and are generally deterministic. Heuristic optimization algorithms include genetic algorithms (GA), particle swarm optimization (PSO), etc. They can adaptively adjust the search direction and have better global search capability (Wang et al., 2021). In this paper, sequential least squares programming (SLSQP) and GA are used as representatives of the two types of optimization algorithms.
SLSQP uses a Taylor expansion to transform the constrained problem into a sequence of quadratic programming subproblems. SLSQP has the advantages of good convergence and high computational efficiency. GA is a computational model that searches for optimal solutions by simulating the processes of natural evolution and genetics. GA tends to obtain better results when solving more complex optimization problems and is more likely to find the global optimal solution. The optimal results found by these two algorithms are compared in Sect. 4.
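For comparison with a gradient-based solver, the idea behind GA can be sketched with a small real-coded genetic algorithm in pure Python. This is an illustrative toy and not the implementation used in the study: the population size, mutation settings and the placeholder power function `toy_station_power` are all assumptions.

```python
import random


def genetic_minimize(cost, bounds, pop_size=30, generations=60,
                     mutation_rate=0.2, seed=0):
    """Minimize cost(x) inside box bounds with a simple elitist, real-coded GA."""
    rng = random.Random(seed)

    def clip(x):
        return [min(max(v, lo), hi) for v, (lo, hi) in zip(x, bounds)]

    # Initial population: the midpoint of the box plus random individuals.
    population = [[(lo + hi) / 2 for lo, hi in bounds]]
    population += [[rng.uniform(lo, hi) for lo, hi in bounds]
                   for _ in range(pop_size - 1)]

    for _ in range(generations):
        population.sort(key=cost)
        elite = population[: pop_size // 5]  # best 20 % survive unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            # Parents are drawn from the better half of the population.
            p1, p2 = rng.sample(population[: pop_size // 2], 2)
            w = rng.random()
            child = [w * a + (1 - w) * b for a, b in zip(p1, p2)]  # blend crossover
            for i, (lo, hi) in enumerate(bounds):
                if rng.random() < mutation_rate:
                    child[i] += rng.gauss(0, 0.1 * (hi - lo))  # Gaussian mutation
            children.append(clip(child))
        population = elite + children
    return min(population, key=cost)


def toy_station_power(x):
    """Placeholder objective: chiller plus pump power in kW (not the fitted models)."""
    t_chws, f = x
    return 3000.0 / (4.0 + 0.15 * (t_chws - 7.0)) + 0.001 * f ** 3


bounds = [(7.0, 12.0), (35.0, 50.0)]  # hypothetical ranges of t_chws (degC) and f (Hz)
best = genetic_minimize(toy_station_power, bounds)
```

Constraints other than simple bounds are not handled in this sketch; a common choice, assumed here rather than taken from the paper, is to add a penalty term to the cost for constraint violations.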

Case information
The selected case is an office park located in Hangzhou, China (Fig. 2), which has four centralized cooling stations, one of which is selected for study. The selected station covers the HVAC load of three office buildings with a total construction area of 137,000 m² and a total air-conditioned area of 101,000 m². The three buildings mainly consist of open office areas, with about 20% of the area consisting of several small conference rooms and a meeting area. The office park received LEED certification in 2013, and the building equipment is generally in good operating condition, with a series of intelligent control strategies that operate the equipment according to the outdoor weather and indoor temperature.
Figure 3 presents an overview diagram of the selected cooling station. The station includes four chillers and five chilled water pumps, and uses fan coil units (FCU) as terminals. The number of running pumps corresponds to the number of running chillers. The operational data is collected by the smart management platform installed in the office park. The time step of the data is set to one hour to avoid short cycling of the chillers. The rated parameters of each piece of equipment are shown in Table 1.
The operation of the HVAC system in this office park adopts the intelligent strategies developed by an in-park company. The strategies include the first start of the cooling stations each day, switching chillers on and off, setpoint adjustment strategies, a fresh air operation strategy, etc., as shown in Table 2.
A load prediction model of this system was established in a previous study (Li, 2023). Four categories of input variables, including time labels, indoor black-bulb temperatures, occupancy numbers and outdoor weather parameters, were used in the load prediction model, and the model has a mean absolute percentage error (MAPE) of 13.7% on the collected data. However, the error of load prediction can affect the optimal setpoints found by the algorithm and can also influence the indoor environment control effect. Therefore, to validate the effectiveness of the proposed method, the actual air-conditioning load and the metered energy are used in this paper as a baseline. The test period is from June 27, 2022 to July 1, 2022. Operational data such as the on-off status of each piece of equipment, the flow rate of chilled water, water temperatures, and metered energy is collected during this time.
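For reference, the MAPE metric quoted above can be computed as in the following Python sketch. The load values are hypothetical and are not taken from the case study.

```python
def mape(actual, predicted):
    """Mean absolute percentage error in percent; actual values must be non-zero."""
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical hourly cooling loads in kW (not the paper's data).
actual_kw = [4000.0, 5000.0, 6000.0]
predicted_kw = [4400.0, 4500.0, 6600.0]
error_pct = mape(actual_kw, predicted_kw)  # each hour is off by 10 %, so the MAPE is about 10 %
```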

Chillers
Usually two chillers run together during the day in the case system. However, the electricity meter can only record the total energy consumption of all chillers. Therefore, the running chillers are considered as a whole, and the average chilled water temperature is used to fit the equation. The cooling water return temperature is set as 3 °C above the wet-bulb temperature of the outdoor air. To expand the data source for fitting the model, operational data from June to August is used, and a total of 8626 valid data points were obtained after excluding missing values and outliers. These data cover four operation conditions, defined by the number of running chillers. In the actual control process, the operation condition is determined by the total cooling load and the cooling capacity of the chillers. The cooling load in this study is calculated from the flow rate and the supply-return temperature difference of the chilled water, i.e., as c · m · (t_chwr − t_chws), where c represents the specific heat capacity of water, 4.2 kJ/(kg·°C), and m represents the mass flow rate of chilled water.
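As a worked example of this cooling load calculation, the following Python sketch converts a volume flow rate to a mass flow rate and applies the heat balance on the chilled water. The water density of 1000 kg/m³ and the numerical inputs are assumptions made for the example.

```python
C_WATER = 4.2           # kJ/(kg*degC), specific heat capacity of water
WATER_DENSITY = 1000.0  # kg/m3, assumed constant


def mass_flow_kg_s(volume_flow_m3_h):
    """Convert a chilled water volume flow rate from m3/h to kg/s."""
    return volume_flow_m3_h * WATER_DENSITY / 3600.0


def cooling_load_kw(mass_flow, t_return, t_supply):
    """Cooling load in kW from mass flow (kg/s) and return/supply temperatures (degC)."""
    return C_WATER * mass_flow * (t_return - t_supply)


# Hypothetical operating point: 360 m3/h, 12 degC return, 7 degC supply.
m = mass_flow_kg_s(360.0)               # 100 kg/s
load = cooling_load_kw(m, 12.0, 7.0)    # 4.2 * 100 * 5, about 2100 kW
```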
Figure 4 shows the fitting result and error distribution of the four operation conditions. The chiller power calculated by the fitted model is compared with the measured power to evaluate the accuracy of the model. The average R² of the four conditions is 0.84, and the error of the calculated power is mostly below 100 kW, indicating that the model is reliable for estimating the power of the chillers. The operation conditions with less data have relatively smaller R² values, indicating that a larger amount of data benefits model accuracy. The values of the regression coefficients are listed in Table 3.
Fig. 2 General appearance of the office park
Fig. 3 HVAC system form of the office park

Pumps
In Sect. 2, a simplified cubic equation is proposed for calculating the power of the pumps. On-site tests of the centralized cooling station show that, at a fixed frequency, the relationship between the flow rate and the power of the pump is linear, as shown in Fig. 5. The operational data shows the same trend. Therefore, the power model of the pump can be further simplified by omitting the quadratic term of the flow rate, as shown in Equation 10.
The fitting result in Fig. 6 shows that the pump power model is accurate enough for optimization. The overall R² is 0.753, and the absolute error is mostly within 25 kW. The coefficients are listed in Table 4.

The ranges of the control variables
The upper and lower bounds of each control variable need to be determined in order to ensure system operation and to restrict the search space of the optimization problem. Based on interviews with the maintenance staff of the case building, the bounds are set according to the staff's experience to ensure the proper operation of the system and to prevent the indoor terminals (FCU) from condensing. The lower and upper bounds of the variables are shown in Table 5.
Fig. 4 The a) fitting result and b) error distribution of the power of chillers in four operation conditions

Table 2 The current intelligent operation strategies in the office park
Strategy types | Operation strategies
First-time starting of cooling station | Determine the number of chillers, chilled water temperature and pump frequency based on weekday/weekend, the forecast precipitation and the maximum outdoor temperature of the day
Cooling station startup time | 1. When the indoor temperature is higher than the setpoint, the pre-cooling time is calculated according to the temperature difference between indoor and outdoor. 2. If the indoor temperature is higher than the setpoint and it is already working time, the cooling station is switched on immediately
Chillers on/off | 1. Switch the chillers on/off when the indoor temperature continuously exceeds the setpoint range (22 °C to 26 °C). 2. When the load ratios of all the chillers exceed 110%, add another chiller. 3. If the difference between the chilled water supply and return temperatures exceeds 5 °C or the chilled water supply temperature exceeds 16 °C, add another chiller
Chilled water temperature and pump frequencies | 1. Adjustment is made when the indoor temperature exceeds the limit. 2. Calculate the increase or decrease in cooling load, then calculate the adjustment needed when only the chilled water supply temperature is changed and when only the pump frequency is changed, then calculate the resulting energy consumption of each option and select the one with lower energy consumption
Cooling water return temperature | 3 °C above the wet-bulb temperature of outdoor air
Fresh air unit operation | 1. Keep the FAU running during working time. 2. When the enthalpy of outdoor air is higher than that of indoor air, the fresh air volume is adjusted according to the indoor CO2 concentration limit. 3. When the enthalpy of outdoor air is lower than that of indoor air, the FAU runs at maximum volume
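Rules 2 and 3 of the "Chillers on/off" strategy in Table 2 are simple threshold checks, which can be written down as the following Python sketch. The function name and the representation of load ratios as fractions (1.10 for 110%) are assumptions; rule 1, which depends on how long the indoor temperature stays outside the setpoint range, is not covered.

```python
def should_add_chiller(load_ratios, t_supply, t_return):
    """Return True if rule 2 or rule 3 of the 'Chillers on/off' strategy is met.

    load_ratios: load ratio of each running chiller as a fraction (1.10 = 110 %)
    t_supply, t_return: chilled water supply and return temperatures in degC
    """
    all_overloaded = bool(load_ratios) and all(r > 1.10 for r in load_ratios)
    large_delta_t = (t_return - t_supply) > 5.0
    supply_too_warm = t_supply > 16.0
    return all_overloaded or large_delta_t or supply_too_warm


# Hypothetical operating points.
overloaded = should_add_chiller([1.12, 1.15], t_supply=8.0, t_return=12.0)
normal = should_add_chiller([0.80, 0.90], t_supply=8.0, t_return=12.0)
wide_delta_t = should_add_chiller([0.80], t_supply=9.0, t_return=15.0)
```

Rule-based staging of this kind serves as the baseline against which the optimized setpoints are compared in the following section.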

Total energy consumption
The energy consumption model established in Sect. 3 is put into the optimization problem, and the two selected algorithms mentioned in Sect. 2 are used to solve the problem and find the optimal setpoints of chilled water supply temperature and chilled water pump frequency.The corresponding optimized energy consumption is compared with the baseline.
Figure 7 shows that the two algorithms achieve similar energy savings. The optimization model saves 4.14% (3408 kWh) with SLSQP and 4.01% (3297 kWh) with GA. Both algorithms yield a solution with higher chiller energy consumption but a larger reduction in pump energy. Figure 8 shows that the energy saving effect varies with the cooling load. The highest energy savings occur when the cooling load is lower than 4000 kW, in which case only one chiller is running. When the cooling load is near its maximum (beyond 7000 kW), i.e., when the centralized cooling station needs to operate at full load, the energy saving effect becomes negligible, and the optimized energy consumption may be higher than the baseline. Likewise, when the cooling load is between 5000 and 5800 kW, where two chillers need to be turned on but run at low load, the optimized energy usage becomes higher than the baseline. This is because there are fewer data samples within this range, since the load is typically rising or declining rapidly, making it hard to guarantee the accuracy of the energy model. As Friday is close to the weekend, the cooling load in the building remains at a low level throughout the day, so the optimization result on Friday is affected by the inaccurate model, causing the energy consumption to exceed the pre-optimization value at some times.
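As a quick consistency check on the reported figures, the baseline energy consumption implied by each pair of absolute and percentage savings can be back-calculated. The sketch below uses only the four numbers quoted above; the implied baselines are derived values, not figures reported in the paper, and they differ slightly because the percentages are rounded.

```python
def implied_baseline_kwh(saved_kwh, saving_percent):
    """Baseline energy implied by an absolute saving and its percentage of the baseline."""
    return saved_kwh / (saving_percent / 100.0)


baseline_slsqp = implied_baseline_kwh(3408.0, 4.14)  # roughly 82,300 kWh
baseline_ga = implied_baseline_kwh(3297.0, 4.01)     # roughly 82,200 kWh

# Relative gap between the two implied baselines; a small value suggests
# that both savings refer to the same baseline consumption.
relative_gap = abs(baseline_slsqp - baseline_ga) / baseline_slsqp
```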

The optimal setpoints of control variables
The two control variables (chilled water supply temperature and pump frequency) are compared before and after optimization, taking June 29 as an example, as shown in Fig. 9. The optimal t_chws is 2 to 3 °C higher than before optimization, while the optimal pump frequency stays at the lowest permitted value of 35 Hz, so that the COP of the chillers is higher and the lower flow rate results in lower pump energy. However, this means that the optimal solution implies a higher chilled water return temperature, which may not be feasible in some actual operation scenarios, since the cooling capacity of the terminals may be reduced. Therefore, in practical applications, it is suggested to determine the permitted range of the variables according to the performance of the terminals.
Additionally, both optimization solutions suggest a chilled water temperature of 16 °C in the last hour before the chillers are completely shut down, together with an increased pump frequency. This means that the chilled water is supplied at a high temperature and the flow rate is increased to meet the cooling demand in the last hour. In practice, this indicates that the chillers can be shut down earlier while the pumps keep running, and the chilled water remaining in the system can cover the cooling load until the air-conditioning demand disappears, which represents further energy saving potential.

Discussion
Overall, the two optimization algorithms perform similarly in the current case. In most situations the optimization achieves a certain energy saving effect, but at specific load levels the result can be undesirable because of the limited accuracy of the energy model. The optimized results suggest a higher chilled water temperature and a lower pump frequency than the baseline strategy, reducing the pump energy remarkably. Combined with an earlier shutdown of the system, the optimal strategy can bring further energy savings.
In this case, GA, as a heuristic optimization algorithm, did not show its expected advantage of finding the global optimal solution more easily. This may be because only the chilled-water-side equipment is taken into account. The objective function will become more complex as outdoor weather and cooling-water-side equipment are added to the model, and the benefits of using a heuristic method may then become clearer. Additionally, the accuracy of the equipment energy model also has a potential impact on the optimization results. In this study, forward stepwise regression is used to fit the equipment power. The variables are introduced into the model one by one. Therefore, the order in which variables are introduced may affect which variables are finally retained, and model accuracy may be influenced, which is a potential limitation. The accuracy of the energy consumption model can also be influenced by systemic causes. The instability during the system startup period can produce data that does not comply with general equipment operation patterns, resulting in poor fitting results; the lack of data for certain operating conditions can result in less accurate fitting; additionally, chillers and chilled water pumps may only operate within a narrow range of load ratio and water temperature in practice, so that the data does not cover the full range of operating conditions.
Fig. 7 The a) total, b) chiller, and c) pump energy consumption of the system before and after optimization
The energy consumption model established in this paper is reasonably accurate, but with further improvement of the model accuracy, the optimization results will be closer to the real situation. More experiments in real buildings will help to clarify the impact of energy model accuracy and to investigate what level of accuracy should be achieved to make the optimization results reliable.

Conclusions
Combining load prediction with real-time optimal control of the HVAC system is very important for achieving energy efficiency in office buildings. In this paper, a method for dynamic data-driven optimization of a centralized cooling station at a given load is proposed and validated on a real building dataset. Two representative optimization algorithms (SLSQP and GA) are selected for comparison, and both are found to provide about 4% energy savings compared to the existing intelligent control strategy. However, the energy saving effect can be negatively influenced by insufficient accuracy of the energy models. The optimal control strategies generated by the two algorithms are also compared. The optimization solutions tend to lower the pump energy to achieve energy savings, while the chilled water temperature is increased at the same time.
However, the influence of the setpoints of HVAC terminals on HVAC system performance and energy consumption needs further verification. To prevent the decline in terminal performance caused by excessively high water temperature, the terminals in the HVAC system could be considered in the optimization process. In this study, the cooling water return temperature is kept in line with the current operating strategy. The possible influence of the cooling water return temperature still needs further investigation, which means that cooling towers need to be included in the energy consumption models. With more equipment included in the model, the feasibility and reliability of the optimization method can be enhanced and overall optimal control of the HVAC system can be realized. Additionally, methods to clarify the influence of model accuracy on the optimization effect and to improve the accuracy of the energy model also need further investigation.
This study is supported by the National Natural Science Foundation of China (Grant No. 52130803, 52161135201, 51825802), Tsinghua University Henglong Real Estate Research Center Program, and Alibaba Innovative Research (AIR) Program.

Availability of data and materials
The data that support the findings of this study are available from Alibaba Group, but restrictions apply to the availability of these data, which were used under license for the current study and so are not publicly available. Data are, however, available from the authors upon reasonable request and with permission of Alibaba Group.

Declarations
Ethics approval and consent to participate: Not applicable.

Fig. 5 The flow rate and power of the pump in a) on-site test result, b) history operational data
Fig. 8 Scatter plot of the cooling load and the energy saving effect
Fig. 9 The optimal a) chilled water supply temperature and b) average pump frequency before and after optimization

Table 1 The rated parameters of a) chillers, b) pumps
Table 3 The regression coefficients of the chiller energy consumption models
Table 4 The regression coefficients of the pump energy consumption model
Table 5 The ranges of each control variable and other parameters