Lee, Shinwoo (2020-07). Analysis of Support Vector Machine Regression for Building Energy Use Prediction. Master's Thesis.


  • Many inverse modeling methods exist for modeling whole-building energy use. Multiple linear regression (MLR) and change-point linear regression (CPLR) have been among the most common because they are accurate and easy to interpret in the context of building energy modeling. Recently, as machine-learning techniques have become more user-friendly, a growing number of studies have applied them to building energy modeling. However, few studies have compared them in depth with conventional inverse modeling methods using a large sample size. This study conducted an exhaustive comparison centered on the Support Vector Machine (SVM), one of the most widely used machine-learning methods owing to its flexibility and accuracy, with enough samples to draw reasonable conclusions about models generated by conventional methods such as MLR and CPLR versus those generated by SVM. Besides the comparative analysis, this work included a thorough analysis of SVM performance for building energy modeling. It described the SVM implementation in detail and showed its performance as a regression technique for building energy modeling under the influence of different variables. The comparative study focused on modeling whole-building chilled water use (CHW) and heating hot water use (HHW), and analyzed the influence of variables such as the outdoor dry-bulb temperature (OAT), the outdoor dew-point temperature (DPT), the outdoor air enthalpy (OAE), and the operational effective enthalpy (OEE). The numerical experiments were based on 41 whole-year hourly building energy use datasets. These datasets were transformed into daily and monthly datasets. In the comparison between SVM and MLR on the CHW datasets, SVM consistently performed better, by an average of 6.8% on daily models and 2.0% on monthly models.
For the SVM and CPLR performance analysis, four pairs of dependent and independent variables were considered: CHW-OAT, CHW-OAE, CHW-OEE, and HHW-OAT. On the daily models, SVM consistently performed better, although in most cases the advantage was marginal, at less than 1% for all variables used. Despite such marginal gains in mean performance, SVM showed advantages of up to 3% for some datasets. On the monthly models, however, SVM did not exhibit better results for any dependent-independent variable pair.
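
For readers unfamiliar with the conventional baseline named above, the following is a minimal sketch of one common form of change-point linear regression: a three-parameter cooling model in which energy use is flat below a change-point temperature and rises linearly above it. The model form, the grid-search fitting strategy, the function names (`fit_line`, `fit_change_point`), and the synthetic data are all assumptions made for illustration; the record does not state which CPLR formulation or fitting procedure the thesis used.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, sse)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx > 0 else 0.0
    a = my - b * mx
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, sse


def fit_change_point(temps, loads, candidates):
    """Grid-search a 3-parameter cooling model (assumed form):
    load = base + slope * max(T - cp, 0).
    Returns (cp, base, slope, sse) for the candidate with the lowest error."""
    best = None
    for cp in candidates:
        # Degrees above the candidate change point; zero below it.
        xs = [max(t - cp, 0.0) for t in temps]
        base, slope, sse = fit_line(xs, loads)
        if best is None or sse < best[3]:
            best = (cp, base, slope, sse)
    return best


# Hypothetical noise-free daily data: flat at 20 below 15 degrees C,
# then rising by 3 per degree. These are not values from the thesis.
temps = [float(t) for t in range(0, 36)]
loads = [20.0 + 3.0 * max(t - 15.0, 0.0) for t in temps]

cp, base, slope, sse = fit_change_point(temps, loads, candidates=range(5, 30))
_, _, sse_line = fit_line(temps, loads)  # plain single-line fit for comparison

print("change point:", cp)
print("base load:", round(base, 3), "slope:", round(slope, 3))
print("change-point fit beats single line:", sse < sse_line)
```

On data with a clear change point, the piecewise fit has a lower squared error than a single straight line, which is the motivation for CPLR as a baseline. A kernel method such as SVM regression can represent this kind of shape without a change point being specified in advance.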

publication date

  • July 2020