Benchmarking of data predictive control in a real-life apartment during heating season

Model Predictive Control is an energy efficient climate control strategy in buildings. However, the effort associated with physics-based modelling seems to prevent widespread application in residential buildings. Applying machine-learning algorithms on historical data promises efficient generation of predictive models for control. In a recent experimental study, Data Predictive Control based on random forests and linear models outperformed a baseline controller during cooling season. In this paper, the approach is benchmarked against hysteresis control and conventional Model Predictive Control based on an RC-network model during heating season. Data Predictive Control shows promising results in terms of energy consumption and thermal comfort.


Introduction
Model Predictive Control (MPC) is a well-established optimal control strategy. Several studies have demonstrated its potential to reduce energy consumption without sacrificing thermal comfort in buildings [1,2]. However, MPC is still not widely used in residential buildings, partly because of the effort associated with the development of physics-based models [2]. Data-driven models could be an alternative to physics-based models as operational building data becomes more available thanks to the digitalization of the building sector. To be suitable for MPC, data-driven models need to achieve reasonable prediction accuracy while being simple enough to be used in an online optimization scheme. MPC with data-driven models is often called Data Predictive Control (DPC).
Artificial Neural Networks (ANN) have been used for various applications including MPC in buildings [3]. However, as ANN are generally non-linear in their nature, they have two main disadvantages. Foremost, there is no guarantee that an optimal solution is found within a feasible time when solving the MPC optimization problem. Furthermore, the computational cost to solve non-convex programs is significantly higher than that of convex programs. In [4], DPC based on Input Convex Neural Networks (ICNN) was deployed. Convexity in ICNN models is achieved by constraining the network parameters, structure, and activation functions. In [5], an approach based on regression trees (RT) and linear regression was introduced to solve a demand-response problem with DPC. By combining RT and linear regression, the model stays linear for a given disturbance with respect to the decision variables. In [6], the approach was further developed by using random forests (RF) instead of single RT which allows one to reduce the variance in the predictions. In [7], we adapted the DPC approach based on RF and linear models and deployed it in a real-life apartment during cooling season.
Most studies about DPC are simulation-based while the study in [7] is conducted during the cooling season and does not compare with MPC. In this paper, DPC based on RF combined with linear models is applied in the real-life apartment Urban Mining & Recycling (UMAR) of the NEST demonstrator at Empa Dübendorf, Switzerland, during the heating season. Using the same models as in [7], we demonstrate the flexibility of the approach to be applicable in the heating and cooling season. Furthermore, in four experiments we benchmark the DPC with a hysteresis controller (HC) and an MPC based on an RC-network model.
The remainder of the article is structured as follows. In Section 2, the concept of MPC and the applied modelling techniques are explained. In Section 3, the residential apartment used for the experiments is introduced and the controller setup is defined. In Section 4, the performance of the DPC is compared to HC and conventional MPC in terms of comfort constraint violations and energy usage. In Section 5, the study is concluded and an outlook on future research directions is given.

Model Predictive Control
MPC is an optimal control strategy in which a sequence of optimal control inputs $u^*$ over a prediction horizon $N$ is calculated by minimizing a cost function subject to the system dynamics and to state and input constraints. The optimization is executed repeatedly at discrete time steps $k$, and only the first optimal control input $u^*_k$ is applied to the system. Such an optimization scheme, equation (1), comprises the cost function (1a), the system dynamics (1b), and the state and input constraints, where $x$, $u$, and $d$ are the states, inputs, and disturbances, respectively. The slack variable $\epsilon$, which relaxes the state constraints, keeps the optimization problem feasible at all times. To have theoretical guarantees for obtaining an optimal solution, the optimization problem in equation (1) must be convex, i.e. the cost function and constraint sets must be convex. The system modelling methods described in the following two sections fulfill this convexity requirement.
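A generic form of the optimization problem, consistent with the references to the cost function (1a) and the system dynamics (1b) in this text, can be sketched as follows. The box-constraint notation ($x_{\min}$, $x_{\max}$, $\mathcal{U}$), the symbol $\epsilon$ for the slack variable, and its nonnegativity are assumptions made for illustration and are not taken from the source.

```latex
\begin{align}
\min_{u,\,\epsilon} \quad & \sum_{j=0}^{N-1} J\left(u_{k+j}, \epsilon_{k+j+1}\right) \tag{1a} \\
\text{s.t.} \quad & x_{k+j+1} = f\left(x_{k+j}, u_{k+j}, d_{k+j}\right) \tag{1b} \\
& x_{\min,k+j+1} - \epsilon_{k+j+1} \le x_{k+j+1} \le x_{\max,k+j+1} + \epsilon_{k+j+1} \\
& u_{k+j} \in \mathcal{U}, \quad \epsilon_{k+j+1} \ge 0, \quad j = 0, \dots, N-1
\end{align}
```

With a convex stage cost $J$, a model $f$ that is linear in $u$, and convex sets, this problem is convex, which is the requirement stated above.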

Random forest and linear regression model
For DPC, we model the system dynamics in equation (1b) with an approach based on RF and linear regression, which was adapted in [7] from the original work of [6]. The modelling process is separated into two steps. First, the RF is trained on the non-controllable inputs $X_d$ (e.g. weather conditions, past and current room temperatures, time variables). Afterwards, a linear model is fitted for every leaf of all $n_{tree}$ trees in the RF based on the corresponding data sets. As a result, the model $Y = f_{RF}(X_d, X_c)$ is a linear function of the controllable inputs $X_c$ for a given disturbance $X_d$. For application in the MPC problem of equation (1), a model is trained for each step of the prediction horizon $N$. The online optimization is also separated into two steps. First, the non-controllable inputs $X_d$ are fed to the RF to obtain the corresponding leaves and linear functions. Subsequently, the linear models are included in the optimization problem as equation (1b). For more details about the approach, the reader is referred to the original sources [6,7].
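The two-step idea can be illustrated with a heavily simplified sketch. It is not the implementation of [6,7]: each "tree" is a single hand-picked threshold on one scalar disturbance instead of a trained regression tree, there is one scalar controllable input, and all function names and the synthetic data are hypothetical.

```python
from statistics import mean

def fit_line(us, ys):
    """Ordinary least squares for y = a + b*u with one controllable input u."""
    u_bar, y_bar = mean(us), mean(ys)
    var = sum((u - u_bar) ** 2 for u in us)
    cov = sum((u - u_bar) * (y - y_bar) for u, y in zip(us, ys))
    b = cov / var if var else 0.0
    return y_bar - b * u_bar, b

def train(samples, thresholds):
    """Offline step. samples holds (x_d, x_c, y) tuples.
    Assumption: every threshold stands in for one tree that splits on the
    disturbance x_d only; a line in x_c is then fitted in each leaf."""
    forest = []
    for t in thresholds:
        leaves = {}
        for side in (False, True):
            rows = [(xc, y) for xd, xc, y in samples if (xd > t) == side]
            leaves[side] = fit_line([r[0] for r in rows], [r[1] for r in rows])
        forest.append((t, leaves))
    return forest

def affine_model(forest, x_d):
    """Online step 1: find the leaf of every tree for the given disturbance
    and average the leaf models. The result (a, b) means y = a + b*x_c."""
    coeffs = [leaves[x_d > t] for t, leaves in forest]
    return mean(a for a, _ in coeffs), mean(b for _, b in coeffs)

# Synthetic data (assumption): y depends on the disturbance and the input.
samples = [(float(xd), xc, 20.0 + 0.5 * xd + 2.0 * xc)
           for xd in range(10) for xc in (0.0, 0.5, 1.0)]
forest = train(samples, thresholds=[2.5, 4.5, 6.5])
a, b = affine_model(forest, x_d=8.0)
print(f"for x_d = 8: y = {a:.2f} + {b:.2f} * x_c")
```

Because the returned model is affine in $X_c$ once $X_d$ is fixed, it can enter the optimization problem as a linear equality constraint, which keeps the problem convex.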

Resistor-capacitor network model
The linear state-space model used in MPC is derived from the first law of thermodynamics. The dynamics of the temperature $T$ of a single room can be described by
\[ m c_p \frac{dT}{dt} = \sum \dot{Q} \tag{2} \]
where $m$ is the mass, $c_p$ the specific heat capacity, and $\dot{Q}$ the in- and outgoing heat flows. Here, we take into account heat flows between the room and the ambient $\dot{Q}_{amb}$, between the room and adjacent rooms $\dot{Q}_{adj}$, gains from solar irradiation through the window $\dot{Q}_{sol}$, and gains from the heating system $\dot{Q}_{heat}$. Internal gains caused by occupants and appliances are assumed to be negligible. $\dot{Q}_{amb}$ and $\dot{Q}_{adj}$ can be expressed by a linear function of the form $\dot{Q}_{amb/adj} = (T_{amb/adj} - T)/R_{amb/adj}$, with $R$ being the thermal resistance between the two reservoirs.
The solar gains $\dot{Q}_{sol}$ are computed from the window area $A_{win}$, the azimuth angle $\alpha$ and the elevation angle $\beta$ of the sun, the orientation of the window $\alpha_0$, and the global irradiation $I_{hor}$ measured on a horizontal surface. Gains from the heating system are calculated by $\dot{Q}_{heat} = \dot{m} \cdot c_p \cdot (T_s - T_r)$ with the mass flow $\dot{m}$ and the supply and return temperatures $T_s$ and $T_r$, respectively. By substituting these equations into (2) and applying time discretization, we obtain the system model to be used in equation (1b), whose coefficients $\Theta$ are estimated with least-squares regression on historical building data.
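The estimation step can be sketched for a reduced version of the model. As an assumption made for brevity, only the ambient and heating terms are kept, so that a forward-Euler discretization gives $T_{k+1} = T_k + \Theta_1 (T_{amb,k} - T_k) + \Theta_2 \dot{Q}_{heat,k}$; the adjacent-room and solar terms would enter as further regressors in the same way. The function names, the discretization scheme, and the synthetic data are hypothetical.

```python
import math

def simulate(theta, t0, t_amb, q_heat):
    """Roll out the reduced, forward-Euler discretized single-room model."""
    temps = [t0]
    for ta, q in zip(t_amb, q_heat):
        t = temps[-1]
        temps.append(t + theta[0] * (ta - t) + theta[1] * q)
    return temps

def fit_theta(temps, t_amb, q_heat):
    """Least-squares estimate of (theta_1, theta_2) from the 2x2 normal equations."""
    s11 = s12 = s22 = r1 = r2 = 0.0
    for k, (ta, q) in enumerate(zip(t_amb, q_heat)):
        f1 = ta - temps[k]            # regressor of theta_1
        f2 = q                        # regressor of theta_2
        y = temps[k + 1] - temps[k]   # observed temperature change
        s11 += f1 * f1
        s12 += f1 * f2
        s22 += f2 * f2
        r1 += f1 * y
        r2 += f2 * y
    det = s11 * s22 - s12 * s12
    return (s22 * r1 - s12 * r2) / det, (s11 * r2 - s12 * r1) / det

# Synthetic, noise-free "historical data" (assumption, for illustration only).
THETA_TRUE = (0.05, 0.002)
t_amb = [5.0 + 3.0 * math.sin(k / 6.0) for k in range(96)]
q_heat = [500.0 if k % 8 < 3 else 0.0 for k in range(96)]
temps = simulate(THETA_TRUE, 22.0, t_amb, q_heat)
theta_hat = fit_theta(temps, t_amb, q_heat)
print(theta_hat)
```

On noise-free data the estimate reproduces the coefficients used in the simulation up to rounding; with measured data the same normal equations return the least-squares fit.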

Experiment setup
The residential apartment UMAR in the NEST demonstrator offers a real-life test environment for new building technologies [8]. Figure 1 illustrates the floor plan of UMAR with its southeast-facing panoramic windows and the adjacent office SolAce. UMAR is equipped with water-based, radiant ceiling panels that can be used for heating and cooling. The mid- and low-temperature network of NEST supplies UMAR with the corresponding thermal energy via heat exchangers. The heat distribution to the designated rooms is controlled with binary valves. Information about the building operation is collected through various sensors (e.g. temperature and humidity). These historical data and the weather forecasts provided by MeteoSwiss are accessible from an SQL database via a REST API. The actuators in the building (e.g. valves) can be overridden for research purposes via an OPC UA client. In our case study, we consider the two bedrooms only. Both bedrooms have identical floor area, window size and orientation, furniture, and heating equipment, which allows for comparing different control strategies with limited systematic error sources. The only significant difference between the two bedrooms is the adjacent office SolAce next to bedroom II. SolAce acts as additional insulation of the corresponding wall, which reduces the heat demand of bedroom II. To address this issue, each experiment is executed twice, switching the controllers between the rooms after the first run.

Controller setup
Historical data generated by a rule-based controller over 10 months is used to train the RF-based and the RC-network model. As inputs $X_d$ for the RF training we take autoregressive terms and forecasts of the ambient temperature and the solar irradiation, autoregressive terms of the room temperature gradient and the temperature difference to the adjacent rooms, and the time of day and month of the year encoded with sine and cosine functions. Each RF consists of $n_{tree} = 200$ trees with a minimum of 200 data samples per leaf. For both the DPC and MPC, a prediction horizon of 6 hours is applied with a sampling time of 30 minutes for DPC and 10 minutes for MPC, respectively. The sampling times for DPC and MPC were optimized with respect to the prediction accuracy during model validation. For $J$ in equation (1a), the quadratic stage cost function $J = u_{k+j}^\top R u_{k+j} + \lambda \epsilon_{k+j+1}$ with $R = 1$ and $\lambda = 100$ is chosen for DPC and MPC. The lower and upper comfort constraints are set at 23°C and 25°C, respectively, during the day (i.e. from 06:00 to 22:00). During the night the lower comfort bound is relaxed to 22°C. The HC uses the same lower boundary as the DPC, while the upper boundary is 1°C higher than the lower boundary, i.e. a hysteresis of 1°C.
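The comfort bounds and the hysteresis rule described above can be sketched as follows. The function names are hypothetical, and the treatment of the switching instants (06:00 counted as day, 22:00 as night) as well as the behaviour exactly at the thresholds are assumptions that the text does not specify.

```python
def lower_bound(hour):
    """Lower comfort bound in degC: 23 during the day (06:00 to 22:00), 22 at night.
    hour is the time of day in hours since midnight."""
    return 23.0 if 6 <= hour < 22 else 22.0

def hysteresis_step(temp, hour, valve_open, band=1.0):
    """One update of the hysteresis controller for a binary heating valve.
    Opens at or below the lower bound, closes at or above lower bound + band,
    and otherwise keeps the previous valve state."""
    low = lower_bound(hour)
    if temp <= low:
        return True
    if temp >= low + band:
        return False
    return valve_open

# The bound steps from 22 to 23 degC at 06:00. A purely reactive controller
# only responds once the new bound applies.
valve = False
for hour, temp in [(5, 22.4), (6, 22.4)]:
    valve = hysteresis_step(temp, hour, valve)
    print(hour, temp, lower_bound(hour), valve)
```

In this small run the valve stays closed at 05:00 because 22.4°C lies inside the night band, and it opens at 06:00 only after the room is already below the daytime bound. A predictive controller, in contrast, can start heating before the bound changes.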

DPC vs HC
In a first step, the DPC approach based on RF and linear models is compared to HC, which acts as a benchmark. The top plot of Figure 2 shows the temperature trajectories of the DPC controlled room in solid blue and the HC controlled room in dashed orange. The second plot shows the control inputs. The third plot depicts the cumulative energy consumption of each of the two rooms. The bottom plot shows the ambient temperature and solar irradiation over the course of the experiments. Initially, the DPC was implemented in bedroom I and HC in bedroom II (2019-11-06 to 2019-11-11). Subsequently, the arrangement was switched (2019-11-15 to 2019-11-18). The first gray shaded area on 2019-11-07 marks a period with open blinds in bedroom II while the ones in bedroom I were closed. Hence, the temperature of the room controlled by HC increased faster due to higher solar gains. The second gray shaded area on 2019-11-16 again depicts a situation where the blinds were open in bedroom II while the ones in bedroom I were closed, resulting in higher solar gains for the DPC controlled room. During the third gray shaded area on 2019-11-17, the windows were open for a short period in both bedrooms simultaneously, causing a sharp drop in room temperatures. In the first experiment, the room controlled with DPC consumed 2.4% less energy than the room controlled by HC even though bedroom II had higher solar gains. The cumulative constraint violations were 83.1% lower in the DPC controlled room than in the HC controlled one. After the change-over, the DPC controlled room needed 50.7% less heating energy than the HC controlled room while the cumulative constraint violations were 42.4% lower. The constraint violations in the HC controlled rooms are mainly caused by the inability of HC to anticipate changes in the comfort constraints.
The large difference in energy savings between the two experiments is explained by the lower heat demand of bedroom II due to the adjacent office SolAce. However, the energy consumption was reduced with DPC in both experiments while the thermal comfort was even increased.

DPC vs MPC
In a second set of experiments, DPC was compared to MPC, which acts as a benchmark. Figure 3 shows the signals analogous to Figure 2 with the difference that MPC is marked in dashed green. First, DPC was implemented in bedroom II and MPC in bedroom I (2020-01-18 to 2020-01-28), and subsequently the arrangement was switched (2020-01-30 to 2020-02-09). During the experiments, the blinds were open several times in both bedrooms on sunny days, as can be observed by the peaks in room temperature, for example at noon on the three days 2020-01-19 to 2020-01-21. However, since the blinds were open in both rooms simultaneously, the results are not affected by these disturbances. As the control system is not able to influence upper constraint violations directly (e.g. by closing the blinds), only lower constraint violations are considered in the performance evaluation. On 2020-01-24 and 2020-02-04 (marked gray), the communication to the actuators was interrupted. During this time, a backup thermostat controller took over. Consequently, these periods are excluded from the performance evaluation. In the first experiment, the room controlled with DPC consumed 21.4% less heating energy, while the cumulative lower constraint violations were 23.4% lower than in the room controlled by MPC. In the second experiment, the room controlled by MPC required 49.9% less heating energy than the room controlled by DPC. The large difference in energy consumption is again explained by the lower heat demand of bedroom II. However, MPC also saved heating energy by staying closer to the lower bound, as shown in the second experiment on 2020-02-02 and 2020-02-03, for example. Moreover, the MPC model predicts the influence of the heating system more accurately, resulting in a better timing to start heating, which can be observed on 2020-02-01. The lower comfort constraints are met almost equally well by the DPC (0.08 Kh) and MPC (0.74 Kh) over the course of the second experiment.
Altogether, DPC achieved a performance in these two experiments which comes close to the one of MPC in terms of energy consumption and thermal comfort.
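The two performance indicators used in this comparison, cumulative lower constraint violations in Kh and relative energy savings in percent, can be computed as in the following sketch. The text does not state how the violations were integrated; the rectangle rule over equally spaced samples, the function names, and the sample values are assumptions.

```python
def lower_violation_kh(temps, lower_bounds, dt_hours):
    """Cumulative lower comfort violation in kelvin-hours.
    Sums the shortfall below the lower bound over all samples (rectangle rule)."""
    shortfall = sum(max(0.0, lb - t) for t, lb in zip(temps, lower_bounds))
    return shortfall * dt_hours

def relative_saving(energy, energy_ref):
    """Energy saving of one controller relative to a reference, in percent."""
    return 100.0 * (energy_ref - energy) / energy_ref

# Hypothetical room temperatures sampled every 30 minutes, bound at 23 degC.
temps = [22.5, 23.0, 23.5, 22.8]
bounds = [23.0] * len(temps)
violation = lower_violation_kh(temps, bounds, dt_hours=0.5)
saving = relative_saving(energy=80.0, energy_ref=100.0)
print(f"{violation:.2f} Kh, {saving:.1f} % saving")
```

Excluded periods, such as the communication outages marked gray, would simply be dropped from the sample lists before the indicators are evaluated.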

Conclusion
In this paper, we applied DPC based on RF and linear models in a real-life apartment during heating season. This demonstrates that the approach is flexible enough to be used during heating season as well. DPC outperformed HC in terms of heating energy consumption and thermal comfort in two experiments. In another two experiments, DPC achieved a performance close to that of conventional MPC based on an RC-network model. Further studies aim at conducting a large-scale field test with DPC in order to obtain statistical evidence of the promising performance.