
Deep learning methods applied to modeling and policy optimization in large buildings

dc.creator: Naug, Avisek
dc.date.accessioned: 2022-05-19T17:19:11Z
dc.date.available: 2022-05-19T17:19:11Z
dc.date.created: 2022-05
dc.date.issued: 2022-05-16
dc.date.submitted: May 2022
dc.identifier.uri: http://hdl.handle.net/1803/17367
dc.description.abstract: Developing an optimal policy for building energy management is a difficult problem because the system exhibits non-stationary behavior and the target policy needs to evolve with changes in the state transition and reward functions. Non-stationary real-world problems present a set of challenges: the non-stationarities are difficult to detect, and systems with low sampling rates create sample-inefficiency problems for learning algorithms. In addition, the system may have to satisfy safety-critical constraints, and the policy must therefore be learned offline. In this thesis, we develop deep reinforcement learning methods for designing supervisory controllers for building energy management. This process requires a model of the system for planning or training purposes. Deriving accurate physics-based models that generalize across non-stationarities may be computationally infeasible for complex systems because of resource constraints or the lack of detailed domain knowledge. Data-driven models, on the other hand, are relatively easier to develop but need sufficient data covering a variety of building operations and environmental conditions to ensure that the building model is neither under-constrained nor overfitted. Our approach solves the problem of deploying a condition-based lifelong reinforcement learning agent for building energy management. We assume that building systems can be modeled as Lipschitz-continuous non-stationary Markov decision processes (NS-MDPs) with bounded changes in the system dynamics and reward, allowing policies to adapt incrementally to the changing behavior via a relearning procedure that completes within a duration determined by the time constants of the system. The approach involves two loops: an outer loop detects the non-stationarity by tracking the deployment-phase reward and triggers an inner loop called the relearning phase. The relearning process starts by updating the data-driven models with recent system data to adapt to the non-stationarity; we employ elastic weight regularization to prevent overfitting with limited data. The agent policy is then trained by interacting with the updated data-driven models, which simulate the system behavior. To generate a large amount of diverse data, we simulate the system over a long future horizon using forecasts of the exogenous variables. To account for the variance introduced by inaccurate returns from the data-driven models, we train the agent across multiple environments in parallel, bootstrapping experiences to reduce the effects of uncertainty and to collect decorrelated transitions. We demonstrate our proposed approach on a building simulation testbed and benchmark it against state-of-the-art approaches to building supervisory control such as G36, PPO, DDPG, and MPC. We also deploy the approach in a real building on our university campus. Our approach performs significantly better than existing supervisory control strategies and highlights the need for a condition-based offline relearning framework in dynamic systems.
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.subject: Deep Learning, Deep Reinforcement Learning, Building Energy Optimization
dc.title: Deep learning methods applied to modeling and policy optimization in large buildings
dc.type: Thesis
dc.date.updated: 2022-05-19T17:19:11Z
dc.type.material: text
thesis.degree.name: PhD
thesis.degree.level: Doctoral
thesis.degree.discipline: Computer Science
thesis.degree.grantor: Vanderbilt University Graduate School
dc.creator.orcid: 0000-0003-3253-7286
dc.contributor.committeeChair: Biswas, Gautam
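
The abstract describes a two-loop scheme: an outer loop that tracks the deployment-phase reward to detect non-stationarity, and an inner relearning loop that refreshes the data-driven models and retrains the policy against them. The following is a minimal, illustrative Python sketch of that control flow only; every name and constant in it (run_policy_step, update_models, retrain_policy, WINDOW, DRIFT_THRESHOLD, the scalar "policy_gain" stand-in for a policy) is a hypothetical placeholder, not the thesis implementation, which uses deep RL agents and learned building models.

# Hypothetical sketch of the condition-based relearning loop described in the
# abstract. Placeholders stand in for the real plant, models, and RL training.
import random
from collections import deque

WINDOW = 168              # hours of deployment reward tracked (assumed: one week)
DRIFT_THRESHOLD = 0.15    # assumed relative reward drop that triggers relearning


def run_policy_step(policy_gain, state, drift):
    """Placeholder plant interaction: returns (next_state, observed_reward).
    `drift` mimics slowly changing building dynamics (the non-stationarity)."""
    action = policy_gain * state
    reward = -abs(state + drift - action) - 0.05 * random.random()
    return state + random.uniform(-0.1, 0.1), reward


def reward_degraded(recent_rewards, baseline):
    """Outer-loop test: has the tracked deployment-phase reward degraded?"""
    if len(recent_rewards) < WINDOW or baseline == 0:
        return False
    recent = sum(recent_rewards) / WINDOW
    return (baseline - recent) / abs(baseline) > DRIFT_THRESHOLD


def update_models(models, recent_data):
    """Placeholder for refitting the data-driven building models on recent data
    (the thesis regularizes this update to avoid overfitting on limited data)."""
    return models


def retrain_policy(policy_gain, models):
    """Placeholder for retraining the agent against the updated model ensemble
    (several simulated environments in parallel in the thesis)."""
    return policy_gain * 0.99  # trivial stand-in for a learning update


state, policy_gain, models = 1.0, 1.0, [object() for _ in range(4)]
rewards, recent_data, baseline = deque(maxlen=WINDOW), [], None

for hour in range(24 * 30):                        # one simulated month, hourly
    state, r = run_policy_step(policy_gain, state, drift=0.002 * hour)
    rewards.append(r)
    recent_data.append((state, r))
    if baseline is None and len(rewards) == WINDOW:
        baseline = sum(rewards) / WINDOW           # nominal deployment reward
    if reward_degraded(rewards, baseline or 0.0):  # outer loop: drift detected
        models = update_models(models, recent_data)        # inner loop, step 1
        policy_gain = retrain_policy(policy_gain, models)  # inner loop, step 2
        recent_data.clear()
        baseline = sum(rewards) / WINDOW           # reset the reference reward

In this sketch the injected drift eventually pushes the tracked reward below the threshold, which triggers a model update and policy retrain and then resets the reward baseline; the thesis performs the analogous steps with learned building models and a deep RL policy.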

