Single Family Residences Seasonal Load Forecast

1. The Case

The purpose of this project is to design an AI model that forecasts, in one shot, the electricity consumption of single-family residences for the last week of each season in a given year. This study is a one-time, on-demand project.
This case study reviews an actual load forecast model for the electricity consumption of areas containing single-family residences. The model is seasonal: it uses each season's data to forecast the last week of that season. The load forecast models are implemented using QR AI Forecaster.

2. The Data

This case study uses seasonal data at hourly resolution for one complete year, with the last week of each season missing. The data is frozen for the year 2001 and is used as historical data without revision.
Input data used by the model
  • Actual data published by ***. We fetch this data all at once.
  • Los Angeles regional weather data published every hour. We used humidity and temperature data.
Output data of the model: load forecast
We used weather data, mostly humidity and temperature, for load forecasting in the different seasons. A calendar listing school-related holidays was also used. The system treats Monday to Friday as working days and Saturday and Sunday as the weekend.

3. Data Exploration

The first step in designing the AI model is to take a deeper look into the data and identify quantifiable behavioral patterns that may not be visible at first glance.
To do that, the QR data science team uses various data analysis and visualization tools to extract information from the raw data. The plot below shows one year of hourly Single Family electricity consumption data for 2001.
Raw Data for The Whole Year - Single Family
As the plot makes clear, electricity consumption behaves differently in different seasons. The rest of the document is therefore structured by season:
  • Spring from March 1st to May 31st
  • Summer from June 1st to Aug 31st
  • Autumn from September 1st to November 30th
  • Winter from December 1st to February 28th
It is important to mention that, by the definition of this project, the last week of each season is missing and must be forecast. To design our models, however, we ran the analysis and model experiments on the second-to-last week of each season, for which actual data exists, so that we could measure the performance of the model and compute the errors. The finalized model is then used to forecast the missing last week.
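The season boundaries and the hold-out scheme described above can be sketched as follows. This is a minimal illustration, not the QR AI Forecaster implementation: the function names `season_of` and `holdout_windows` are hypothetical, and the 7-day window lengths are an assumption derived from the phrase "last week".

```python
from datetime import date, timedelta

# Season boundaries as listed above, as (month, day) pairs.
SEASONS = {
    "Spring": ((3, 1), (5, 31)),
    "Summer": ((6, 1), (8, 31)),
    "Autumn": ((9, 1), (11, 30)),
}

def season_of(d: date) -> str:
    """Return the season of a date; December to February falls through to Winter."""
    for name, (start, end) in SEASONS.items():
        if start <= (d.month, d.day) <= end:
            return name
    return "Winter"

def holdout_windows(season_end: date) -> dict:
    """Split the end of a season into a validation week and the missing forecast week."""
    forecast_start = season_end - timedelta(days=6)         # last 7 days: no actuals
    validation_start = forecast_start - timedelta(days=7)   # previous 7 days: actuals known
    return {
        "experiment_train_end": validation_start - timedelta(days=1),
        "validation": (validation_start, forecast_start - timedelta(days=1)),
        "forecast": (forecast_start, season_end),
    }

windows = holdout_windows(date(2001, 5, 31))
print(season_of(date(2001, 4, 15)), windows["validation"], windows["forecast"])
```

For Spring this gives a validation week of May 18th to May 24th and a forecast week of May 25th to May 31st.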
3.1. Seasonal Behavior of Data - Spring
The plot below takes a closer look at the behavior of the Single Family electricity consumption data in Spring.
Raw data behavior of Spring data for Single families
The plot below displays the hourly consumption averages across the roughly 3 months of Spring training data, from March 1st to May 24th. It shows the daily pattern and the daily peak and off-peak periods.
Hourly Average Behavior of Spring Data for Single Families
Conclusions:
  • There are two peaks in a day for Single Family data.
  • Consumption starts rising at around 4:00 AM, keeps rising until around 9:00 AM, and then starts to decrease.
  • A second rise starts at around 4:00 PM, continues until around 9:00 PM, and then consumption starts to decrease.
  • The peak hours are 7:00 AM to 8:00 AM and 7:00 PM to 8:00 PM.
  • The data is clearly hour dependent. We therefore must use the built-in feature Hour-of-the-day.
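The hourly averaging behind these conclusions can be sketched in a few lines. This is an illustration only: the helper names `hourly_profile` and `peak_hours` are hypothetical, and the readings are synthetic values shaped like a two-peak day, not the case-study data.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def hourly_profile(readings):
    """Average consumption per hour of day from (timestamp, value) pairs."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for ts, value in readings:
        totals[ts.hour] += value
        counts[ts.hour] += 1
    return {hour: totals[hour] / counts[hour] for hour in sorted(totals)}

def peak_hours(profile, top=2):
    """Hours of day with the highest average consumption, highest first."""
    return sorted(profile, key=profile.get, reverse=True)[:top]

# Synthetic week with a morning and an evening peak (illustrative values).
start = datetime(2001, 3, 5)
shape = {7: 2.0, 8: 1.8, 19: 2.2, 20: 2.1}
readings = [
    (start + timedelta(hours=h), shape.get(h % 24, 0.5))
    for h in range(24 * 7)
]
profile = hourly_profile(readings)
print(peak_hours(profile, top=4))
```

Grouping by hour of day is what makes the two daily peaks visible, which is the motivation for the Hour-of-the-day feature.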
We can further refine the above analysis by drilling down to days of the week. The next plot displays the average hourly consumption for each day of the week, computed for the 3-month training data for Spring.
Daily Average Behavior of Spring Data for Single families
The above plot suggests:
  • a) Weekdays share the same behavior, with regular peak and off-peak hours.
  • b) Peak hours on weekdays are at 7:00 AM and from 6:00 PM to 9:00 PM.
  • c) Weekends show a different pattern, with higher consumption and different peak hours.
  • d) Peak hours on weekends are 8:00 AM to 9:00 AM and 7:00 PM to 10:00 PM.
  • e) On weekdays the off-peak consumption in the middle of the day is very low, as children are at school and adults are at work.
  • f) On weekends there is an off-peak period, but it is not as low as on weekdays.
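A weekday versus weekend comparison of this kind can be sketched as below. The function names are hypothetical and the data is synthetic, chosen only so that the midday off-peak is lower on weekdays than on weekends.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def day_type(ts):
    """Monday to Friday -> 'weekday'; Saturday and Sunday -> 'weekend'."""
    return "weekend" if ts.weekday() >= 5 else "weekday"

def profile_by_day_type(readings):
    """Mean consumption keyed by (day type, hour of day)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for ts, value in readings:
        key = (day_type(ts), ts.hour)
        sums[key] += value
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}

def synthetic_load(ts):
    """Illustrative values: a deeper midday dip on weekdays than on weekends."""
    if 10 <= ts.hour < 16:
        return 0.3 if day_type(ts) == "weekday" else 0.8
    return 1.0

start = datetime(2001, 3, 5)  # a Monday
readings = []
for h in range(24 * 14):
    ts = start + timedelta(hours=h)
    readings.append((ts, synthetic_load(ts)))

day_profiles = profile_by_day_type(readings)
print(day_profiles[("weekday", 12)], day_profiles[("weekend", 12)])
```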
These observations lead to the following feature engineering configuration of our AI load forecast model:
  • Hour of the day is a good feature to set in the models to capture the daily behavior of the data.
  • Weekday is used as a built-in feature or predictor of the AI model to indicate the day of the week.
  • The Model Splitter feature of the QR AI Forecaster should be activated to split the data at run time and to create and train 3 different models that forecast different parts of the day.
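The idea behind splitting the data and training one model per part of the day can be sketched as follows. Several things here are assumptions: the half-open hour boundaries follow the three parts of the day given in the AI Model section of this document, the names `segment_of`, `HourMeanModel` and `fit_split_models` are hypothetical, and the per-segment "model" is a deliberately trivial stand-in (the mean load per hour), not what QR AI Forecaster trains.

```python
from collections import defaultdict

# Part-of-day segments as [start_hour, end_hour); boundaries are an assumption.
SEGMENTS = {"night": (0, 5), "morning": (5, 14), "rest": (14, 24)}

def segment_of(hour):
    """Return the name of the part of the day that contains this hour."""
    for name, (lo, hi) in SEGMENTS.items():
        if lo <= hour < hi:
            return name
    raise ValueError(f"hour out of range: {hour}")

class HourMeanModel:
    """Trivial stand-in model: predicts the training mean for each hour."""

    def fit(self, rows):
        sums, counts = defaultdict(float), defaultdict(int)
        for hour, load in rows:
            sums[hour] += load
            counts[hour] += 1
        self.means = {hour: sums[hour] / counts[hour] for hour in sums}
        return self

    def predict(self, hour):
        return self.means[hour]

def fit_split_models(rows):
    """Route each (hour, load) row to its segment and fit one model per segment."""
    buckets = defaultdict(list)
    for hour, load in rows:
        buckets[segment_of(hour)].append((hour, load))
    return {name: HourMeanModel().fit(data) for name, data in buckets.items()}

def predict(models, hour):
    """Dispatch a prediction to the model that owns this hour."""
    return models[segment_of(hour)].predict(hour)

rows = [(h, float(h)) for h in range(24)] * 2  # synthetic training rows
models = fit_split_models(rows)
print(sorted(models), predict(models, 7))
```

Each model only ever sees rows from its own part of the day, so a night-time model is not pulled toward the morning or evening peaks.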
3.2. Seasonal Behavior of Data - Summer
The plot below takes a closer look at the behavior of the Single Family electricity consumption data in Summer.
Raw data behavior of Summer data for Single Family
The plot below displays the hourly consumption averages across the roughly 3 months of Summer training data, from June 1st to Aug 24th. It shows the daily pattern and the daily peak and off-peak periods.
Hourly Average Behavior of Summer Data for Single Families
Conclusions:
  • There are two peaks in a day for Single Family data.
  • Consumption starts rising at around 4:00 AM, keeps rising until around 9:00 AM, and then starts to decrease.
  • A second rise starts at around 4:00 PM, continues until around 9:00 PM, and then consumption starts to decrease.
  • The peak hours are 7:00 AM to 8:00 AM and 7:00 PM to 8:00 PM.
  • The data is clearly hour dependent. We therefore must use the built-in feature Hour-of-the-day.
We can further refine the above analysis by drilling down to days of the week. The next plot displays the average hourly consumption for each day of the week, computed for the 3-month training data for Summer.
Daily Average Behavior of Summer Data for Single Families
The above plot suggests:
  • a) Weekdays share the same behavior, with regular peak and off-peak hours.
  • b) Peak hours on weekdays are at 7:00 AM and from 6:00 PM to 9:00 PM.
  • c) Weekends show a different pattern, with higher consumption and different peak hours.
  • d) Peak hours on weekends are 8:00 AM to 9:00 AM and 7:00 PM to 10:00 PM.
  • e) On weekdays the off-peak consumption in the middle of the day is very low, as children are at school and adults are at work.
  • f) On weekends there is an off-peak period, but it is not as low as on weekdays.
These observations lead to the following feature engineering configuration of our AI load forecast model:
  • Hour of the day is a good feature to set in the models to capture the daily behavior of the data.
  • Weekday is used as a built-in feature or predictor of the AI model to indicate the day of the week.
  • The Model Splitter feature of the QR AI Forecaster should be activated to split the data at run time and to create and train 3 different models that forecast different parts of the day.
3.3. Seasonal Behavior of Data - Autumn
The plot below takes a closer look at the behavior of the Single Family electricity consumption data in Autumn.
Raw data behavior of Autumn data for Single Families
The plot below displays the hourly consumption averages across the roughly 3 months of Autumn training data, from September 1st to November 23rd. It shows the daily pattern and the daily peak and off-peak periods.
Hourly Average Behavior of Autumn Data for Single Family
Conclusions:
  • There are two peaks in a day for Single Family data.
  • Consumption starts rising at around 4:00 AM, keeps rising until around 9:00 AM, and then starts to decrease.
  • A second rise starts at around 4:00 PM, continues until around 9:00 PM, and then consumption starts to decrease.
  • The peak hours are 7:00 AM to 8:00 AM and 7:00 PM to 8:00 PM.
  • The data is clearly hour dependent. We therefore must use the built-in feature Hour-of-the-day.
We can further refine the above analysis by drilling down to days of the week. The next plot displays the average hourly consumption for each day of the week, computed for the 3-month training data for Autumn.
Daily Average Behavior of Autumn Data for Single Families
The above plot suggests:
  • a) Weekdays share the same behavior, with regular peak and off-peak hours.
  • b) Peak hours on weekdays are at 7:00 AM and from 6:00 PM to 9:00 PM.
  • c) Weekends show a different pattern, with higher consumption and different peak hours.
  • d) Peak hours on weekends are 8:00 AM to 9:00 AM and 7:00 PM to 10:00 PM.
  • e) On weekdays the off-peak consumption in the middle of the day is very low, as children are at school and adults are at work.
  • f) On weekends there is an off-peak period, but it is not as low as on weekdays.
These observations lead to the following feature engineering configuration of our AI load forecast model:
  • Hour of the day is a good feature to set in the models to capture the daily behavior of the data.
  • Weekday is used as a built-in feature or predictor of the AI model to indicate the day of the week.
  • The Model Splitter feature of the QR AI Forecaster should be activated to split the data at run time and to create and train 3 different models that forecast different parts of the day.
3.4. Seasonal Behavior of Data - Winter
The plot below takes a closer look at the behavior of the Single Family electricity consumption data in Winter.
Raw data behavior of Winter data for Single Families
The plot below displays the hourly consumption averages across the roughly 3 months of Winter training data: January, February, and December 1st to 24th of the same year. It shows the daily pattern and the daily peak and off-peak periods.
Hourly Average Behavior of Winter Data for Single Families
Conclusions:
  • There are two peaks in a day for Single Family data.
  • Consumption starts rising at around 4:00 AM, keeps rising until around 9:00 AM, and then starts to decrease.
  • A second rise starts at around 4:00 PM, continues until around 9:00 PM, and then consumption starts to decrease.
  • The peak hours are 7:00 AM to 8:00 AM and 7:00 PM to 8:00 PM.
  • The data is clearly hour dependent. We therefore must use the built-in feature Hour-of-the-day.
We can further refine the above analysis by drilling down to days of the week. The next plot displays the average hourly consumption for each day of the week, computed for the 3-month training data for Winter.
Daily Average Behavior of Winter Data for Single Families
The above plot suggests:
  • a) Weekdays share the same behavior, with regular peak and off-peak hours.
  • b) Peak hours on weekdays are at 7:00 AM and from 6:00 PM to 9:00 PM.
  • c) Weekends show a different pattern, with higher consumption and different peak hours.
  • d) Peak hours on weekends are 8:00 AM to 9:00 AM and 7:00 PM to 10:00 PM.
  • e) On weekdays the off-peak consumption in the middle of the day is very low, as children are at school and adults are at work.
  • f) On weekends there is an off-peak period, but it is not as low as on weekdays.
These observations lead to the following feature engineering configuration of our AI load forecast model:
  • Hour of the day is a good feature to set in the models to capture the daily behavior of the data.
  • Weekday is used as a built-in feature or predictor of the AI model to indicate the day of the week.
  • The Model Splitter feature of the QR AI Forecaster should be activated to split the data at run time and to create and train 3 different models that forecast different parts of the day.
4. Feature & Correlation Analysis
To select the best external time series as features or predictors, we need to analyze their correlation with the main time series we are forecasting, Single Family electricity consumption. A good or useful feature must be closely correlated with, and shed light on, a particular behavior of the main time series.
The Single Family electricity consumption data presents several challenges in the 1-year period considered in this Case Study. Our data science team resolves these by configuring the right AI model so as to produce accurate load forecasts for our clients.
Factoring in Temperature
Weather components such as temperature and/or feels-like temperature can affect the consumption pattern, since they directly affect the people who consume the electricity in this case study. Our data science team uncovered this with our data analysis toolbox, running correlation analysis of many external time series against the main Single Family electricity consumption series. As the plot below shows, average consumption in each season varies with average temperature. For example, in summer, with its higher average temperature, we see a lower average consumption (shown in red).
Average Behavior of Single Family Electricity Consumption vs Temperature Separated by Seasons
Conclusions:
Temperature time series should be included as an external feature or predictor in the AI model, to guide the forecast level in different seasons’ trends.
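A correlation screen of candidate external series against the load can be sketched with the Pearson coefficient. This is an assumption about the kind of analysis involved, not a description of the QR data analysis toolbox; the helper names are hypothetical and the numbers are synthetic.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have the same length, at least 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)

def rank_features(target, candidates):
    """Sort candidate series by absolute correlation with the target, strongest first."""
    scores = {name: pearson(series, target) for name, series in candidates.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)

# Synthetic example: load falls as temperature rises; "noise" is unrelated.
load = [5.0, 4.0, 3.0, 2.0, 1.0]
candidates = {
    "noise": [1.0, 3.0, 2.0, 3.0, 1.0],
    "temperature": [10.0, 15.0, 20.0, 25.0, 30.0],
}
ranked = rank_features(load, candidates)
print(ranked)
```

Ranking by absolute value keeps strongly negative correlations, such as the summer temperature effect described above, at the top of the list.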
Factoring in Solar Radiance
As the plot below shows, Solar Radiance data such as DHI (Diffuse Horizontal Irradiance), DNI (Direct Normal Irradiance) and GHI (Global Horizontal Irradiance) are good markers of the relative trend in different seasons. The plot displays the average hourly behavior of the solar data during the day compared with the consumption data.
Hourly Average Behavior of Single Family Data vs Solar Radiance Separated by Seasons
Conclusions:
Solar Radiance time series should be included as an external feature or predictor in the AI model, to guide the forecast to catch different parts of the day as well as the seasonality trend.
Factoring in Humidity
Relative humidity appears to be inversely related to electricity consumption: the plot below suggests that drops in humidity coincide with peaks in consumption.
Average Behavior of Single Family Electricity Consumption vs Humidity Separated by Seasons
Conclusions:
Relative Humidity time series should be included as an external feature or predictor in the AI model, to guide the forecast to adjust with the pattern.
5. The AI Forecast Model
By now the preliminary data analysis work has been done and we are ready to implement the best AI model for this case study, considering the specific features and data discussed previously. We use QR AI Forecaster cloud service. This platform is a no-coding automated AI platform, where modeling is done by dragging and dropping various model and feature engineering components in an intuitive AI dashboard, to automatically assemble and execute the final forecast model.
Following the previous chapter, we are going to have separate forecast models for each season; however, the general structure is the same. Aside from training data as the input of the machine learning model which would be a specific date range for each season, some lagged predictors or internals are also different.
1) Data Preprocessing
We first configure a few standard data processing features, discussed above, in QR AI Forecaster.
  • a) Gap Filling can be accomplished by several methods (Linear Interpolation, Weekly Pattern, Daily Pattern, etc.). We generally had no missing data in this case study, but gap-filling methods such as the Weekly Pattern method were configured to fill any gaps that did occur.
  • b) A Calendar is used by a processor to swap mid-week holiday data with Sundays in the future (during forecasting) and with a regular day in the past (during training), and to label working days and weekends for the AI modeling. In this case study we used the US calendar for 2001 dedicated to the Single Family profile.
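Two of the gap-filling methods named in item a) can be sketched as below. These are simplified interpretations, assuming hourly data in a list with `None` marking a gap; the function names are hypothetical and the platform's own methods may differ in detail.

```python
def fill_weekly_pattern(values, period=168):
    """Fill None gaps with the value observed one period earlier (168 hours = 1 week)."""
    filled = list(values)
    for i, value in enumerate(filled):
        if value is None and i >= period and filled[i - period] is not None:
            filled[i] = filled[i - period]
    return filled

def fill_linear(values):
    """Fill interior None gaps by linear interpolation between known neighbours."""
    filled = list(values)
    known = [i for i, value in enumerate(filled) if value is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled

# A short period is used here only to keep the example readable.
print(fill_weekly_pattern([1.0, 2.0, 3.0, None, 5.0, 6.0], period=3))
print(fill_linear([1.0, None, None, 4.0]))
```

The weekly variant preserves the weekday and hour-of-day shape of the load, while linear interpolation suits short gaps; neither fills a gap at the very start of the series.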
2) Feature Engineering
We configure the following features in the QR AI Forecaster dashboard. As mentioned above, depending on the model designed for each season, we used 1, 2, or more previous points/weeks, or applied an experimentally chosen n-hour simple moving average.
Predictor types and their features:
  • Lagged Predictors: n Previous Point
  • Built-in Predictors: Hour, Weekday, Month, Week Number
  • External Time Series Predictors: Temperature, Humidity, Radiance
QR Forecaster Feature Engineering Dashboard Load Forecast Model for Single Family Consumer Type for Spring
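The lagged and built-in predictors can be pictured as columns derived from the hourly series. The sketch below is an illustration under assumptions: the lag set (1, 24 and 168 hours) and the 3-hour moving average window are example values, not the configuration used for any season, and `build_features` is a hypothetical helper.

```python
from datetime import datetime, timedelta

def build_features(timestamps, load, lags=(1, 24, 168), sma_window=3):
    """Build one feature row per hour from values observed before that hour."""
    rows = []
    first = max(max(lags), sma_window)  # skip hours without enough history
    for i in range(first, len(load)):
        ts = timestamps[i]
        row = {
            "hour": ts.hour,
            "weekday": ts.weekday(),          # Monday = 0 ... Sunday = 6
            "month": ts.month,
            "week_number": ts.isocalendar()[1],
            "target": load[i],
        }
        for lag in lags:
            row[f"lag_{lag}"] = load[i - lag]
        # Simple moving average over the hours strictly before hour i.
        row[f"sma_{sma_window}"] = sum(load[i - sma_window:i]) / sma_window
        rows.append(row)
    return rows

timestamps = [datetime(2001, 3, 1) + timedelta(hours=h) for h in range(200)]
load = [float(h) for h in range(200)]  # synthetic ramp, for illustration only
rows = build_features(timestamps, load)
print(rows[0])
```

Both the lags and the moving average look only at earlier hours, so a row never contains the value it is meant to predict.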
3) AI Model
We configure the following AI model specifications in QR AI Forecaster dashboard:
  • a) Model Splitter:
    Recall that this feature allows the AI machine to split the data and to define and train multiple models at run time, each forecasting a specific profile. Given the structure of the data discussed in the previous chapter, we model specific parts of the day with three models in total: midnight until 5:00 AM, 5:00 AM until 2:00 PM, and the rest of the day.
  • b) Model Optimization:
    We configured a range of AI models for this case study, each with its own advantages. We present here 3 machine learning models: NGBoost, LightGBM and XGBoost. These can be configured to perform with nearly equal accuracy levels. For example, with XGBoost we configure the following 12 hyper-parameters:
booster
number of estimators
gamma
max depth
min child weight
max delta step
subsample
l1 regularization coefficient
l2 regularization coefficient
base score
evaluation metric
objective
The classical pitfalls of over-fitting and under-fitting must be avoided. If we choose 5 values for each of the 12 hyper-parameters, there are 5^12 (about 244 million) combinations to try. This is an impossible manual task. QR AI Forecaster has an auto-ML toolbox that fine-tunes and optimizes the hyper-parameters. This toolbox runs the equivalent of thousands of scenarios in under 30 minutes.
  • c) Model Execution:
    Once the model is finalized, training one of the machine learning models on about 3 months of hourly Single Family electricity consumption data for each season, and executing a forecast, takes about 1 minute.
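The size of the search space in item b) follows from simple arithmetic: 12 hyper-parameters with 5 candidate values each give 5 to the power 12 combinations. The sketch below computes that count and shows random sampling of configurations as one common way to avoid enumerating the full grid. Random search is used here purely as an illustration; the source does not say which strategy the auto-ML toolbox uses, and `sample_configurations` is a hypothetical helper that returns value indices rather than real settings.

```python
import random

# The 12 hyper-parameters listed above, as plain labels.
HYPER_PARAMETERS = [
    "booster", "number of estimators", "gamma", "max depth",
    "min child weight", "max delta step", "subsample",
    "l1 regularization coefficient", "l2 regularization coefficient",
    "base score", "evaluation metric", "objective",
]
VALUES_PER_PARAMETER = 5

# Full grid: one of 5 values chosen independently for each parameter.
grid_size = VALUES_PER_PARAMETER ** len(HYPER_PARAMETERS)
print(grid_size)  # 244140625

def sample_configurations(n_samples, seed=0):
    """Draw random configurations as {parameter: value index} dictionaries."""
    rng = random.Random(seed)
    return [
        {name: rng.randrange(VALUES_PER_PARAMETER) for name in HYPER_PARAMETERS}
        for _ in range(n_samples)
    ]

configs = sample_configurations(1000)
print(len(configs), "sampled out of", grid_size)
```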
6. Forecast and Accuracy Analysis
1) The Forecast Display Dashboard
QR AI Forecaster has several data visualization dashboards that gather in one screen:
  • a) Forecast error, computed as MAPE (mean absolute percentage error) and MAE (mean absolute error), listed in table format and in gauges.
  • b) A table of time series (actual, forecast and error for each data point).
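For reference, the two error measures named in item a) can be computed as follows. These are the standard textbook definitions; the function names and the sample numbers are illustrative and do not come from the case-study data.

```python
def mae(actual, forecast):
    """Mean absolute error, in the units of the data."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

def mape(actual, forecast):
    """Mean absolute percentage error, in percent; undefined if an actual value is 0."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)

actual = [100.0, 200.0, 400.0]
forecast = [110.0, 180.0, 400.0]
print(mae(actual, forecast), round(mape(actual, forecast), 2))
```

MAPE is scale-free, which makes it convenient for comparing seasons with different consumption levels, while MAE reports the error in the units of the load itself.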
The accuracy analysis dashboards below show the results of our forecasts for two intervals: the second-to-last week of each season, for which actual data exists and errors can be computed, and the last week of each season, for which there is no actual data and which is the interval this case study sets out to forecast.
Spring
QR Forecaster Data Visualization Dashboard Actual, Forecast and Accuracy Hourly Data. May 18th - May 31st.
Summer
QR Forecaster Data Visualization Dashboard Actual, Forecast and Accuracy Hourly Data. Aug 18th - Aug 31st.
Autumn
QR Forecaster Data Visualization Dashboard Actual, Forecast and Accuracy Hourly Data. Nov. 17th - Nov. 30th.
Winter
QR Forecaster Data Visualization Dashboard Actual, Forecast and Accuracy Hourly Data. Dec. 18th - Dec. 31st.
2) Accuracy Analysis
The Single Family Electricity Consumption forecast has a MAPE of:
    • 7.20% for Spring
    • 12.2% for Summer
    • 13.7% for Autumn
    • 9.64% for Winter
As you can see in the other Case Studies we have published for other industries, electricity load forecasting in ***, using the same machine learning model with different parameters and features, results in a MAPE of ***. This is mainly because:
    • The dynamics of day-to-day changes in consumption level can differ greatly across profiles. For example, offices have more regular consumption behavior throughout the year than schools or hotels.

Next Step

We look forward to exploring the range of options for your projects. Please write to us and one of our project managers will get back to you at once.