1. The case
- The California ISO provides open and non-discriminatory access to the bulk of the state’s wholesale transmission grid, supported by a competitive energy market and comprehensive infrastructure planning efforts.
- The California ISO (CAISO) prices energy using Locational Marginal Prices (LMP), which incorporate the cost of producing and delivering energy at individual nodes, the locations on the grid where transmission lines and generation interconnect. CAISO also publishes the System Marginal Price (SMP), which is the LMP at a given node minus its system loss and system congestion components.
- CAISO allows trading at the trading hubs THSP15, THNP15 and THZP26, in an intraday 15-minute market (FMM) as well as an hourly day-ahead (DA) market.
- This case study reviews a day-ahead (DA) price forecast model for the THSP15 LMP at an hourly resolution. The price forecast model is implemented using QR AI Forecaster.
- The DA market opens for bids and schedules seven days before the trade date and closes at around noon on the day prior to it. We therefore execute and publish our DA forecast at 9:00 a.m. for the next day, to allow ample time for traders to plan their DA trading strategies. The QR AI Forecaster output serves as the key input to QR Trading Optimizer, which optimizes the bidding of your assets.
The data for this case study comes from QuantRisk's free and publicly available forecast web portal, where you can also review the performance of our price forecast models for multiple markets.
2. The Data
- Actual THSP15 LMP price data published by CAISO. This is the price we are going to forecast.
- Demand side data: this is our own DA System demand forecast for THSP15.
- Supply side data: fuel prices by region within CAISO, DA solar and wind generation forecasts, and total outage data (hydro, renewable and thermal), all published by CAISO.
Output data. The output of this AI forecast model is an hourly DA price forecast for THSP15, executed and published at 9:00 a.m. California time for the next day.
3. Data Exploration
The first step in designing the AI model is to take a deeper look into the data and identify quantifiable behavioral patterns that may not be visible at first glance.
To do that, the QR data science team uses various data analysis and visualization tools to extract information from raw data. You can see below the 6 months of hourly THSP15 DA prices used for training.
The plot above displays the hourly THSP15 price averages across the 6 months of training data, from February 5th to August 5th. It reveals the daily pattern and the daily peak and off-peak hours.
● The THSP15 price peaks at 6:00 a.m. and 7:00 p.m., and bottoms out at 12:00 noon. The forecast model must capture these turning points.
● This plot suggests splitting the day into 3 periods (midnight to 6 a.m., 6 a.m. to 7 p.m., and 7 p.m. to midnight) and training the model on each data set. This can be done automatically by checking the Model Splitter feature of QR AI Forecaster.
● The price is clearly hour dependent. We therefore must use the built-in feature Hour-of-the-day.
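The hour-of-day averaging behind these observations can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `hourly_profile` and `local_extrema` are hypothetical, the price shape is synthetic (chosen to mimic the peaks and trough described above, not taken from CAISO data), and QR AI Forecaster's own tooling is not shown.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def hourly_profile(records):
    """Average price for each hour of the day (0-23).

    `records` is an iterable of (timestamp, price) pairs.
    """
    by_hour = defaultdict(list)
    for ts, price in records:
        by_hour[ts.hour].append(price)
    return {hour: mean(prices) for hour, prices in sorted(by_hour.items())}


def local_extrema(profile):
    """Return (peaks, troughs): hours whose average lies above or below both neighbors."""
    hours = sorted(profile)
    peaks, troughs = [], []
    for prev, cur, nxt in zip(hours, hours[1:], hours[2:]):
        if profile[cur] > profile[prev] and profile[cur] > profile[nxt]:
            peaks.append(cur)
        elif profile[cur] < profile[prev] and profile[cur] < profile[nxt]:
            troughs.append(cur)
    return peaks, troughs


# Synthetic daily shape for illustration only (not CAISO data).
shape = [30, 31, 32, 34, 38, 45, 55, 48, 40, 30, 22, 15,
         10, 14, 20, 28, 38, 50, 62, 70, 60, 48, 40, 34]
start = datetime(2023, 2, 5)
records = [(start + timedelta(hours=h), shape[h % 24]) for h in range(24 * 7)]

profile = hourly_profile(records)
peaks, troughs = local_extrema(profile)
print("peak hours:", peaks, "trough hours:", troughs)
```

On real data the averages are noisier, so a smoothing step or a minimum prominence threshold would probably be needed before reading off peak hours.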
The plot of average THSP15 prices by day of the week suggests:
a) Saturday and Sunday THSP15 prices are lower than the rest of the week and even the first peak pattern is different.
b) Thursday THSP15 prices are higher but the pattern is the same as on other weekdays.
c) Other days are all similar in pattern and in levels of prices.
These observations lead to the following feature engineering configuration of our AI Price Forecast model:
● Weekday as a built-in feature or predictor to the AI Model to indicate the days of the week.
● The Model Splitter feature of the QR AI Forecaster should be activated to split data at run time and create and train 3 different Models to forecast different days of the week: Monday – Friday, Saturday, Sunday.
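As a rough illustration of what such a split involves, the sketch below partitions hourly records into the three day groups named above. The helper names `day_group` and `split_by_day_group` are hypothetical, and this is not how QR AI Forecaster's Model Splitter is implemented; it only shows the grouping logic, after which one model would be trained per subset.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def day_group(ts):
    """Map a timestamp to one of the three day groups used for model splitting."""
    weekday = ts.weekday()  # Monday = 0 ... Sunday = 6
    if weekday == 5:
        return "Saturday"
    if weekday == 6:
        return "Sunday"
    return "Monday-Friday"


def split_by_day_group(records):
    """Partition (timestamp, price) pairs into one list per day group."""
    by_group = defaultdict(list)
    for ts, price in records:
        by_group[day_group(ts)].append((ts, price))
    return dict(by_group)


# Two weeks of hourly placeholder rows, starting on a Monday.
start = datetime(2023, 2, 6)
records = [(start + timedelta(hours=h), 0.0) for h in range(24 * 14)]

subsets = split_by_day_group(records)
for name, rows in subsets.items():
    print(name, len(rows))
```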
4. Feature & Correlation Analysis
To select the best external time series as features or predictors, we need to analyze their correlation with the main price time series THSP15 we are forecasting. A good or useful feature must be closely correlated to and shed light on a particular behavior of the main time series.
The CAISO DA price data, including THSP15, present several challenges in the 6-month period considered in this case study. Our data science team resolves these by configuring the AI model appropriately, so as to produce accurate price forecasts for our clients.
Factoring in gas price effects
In the first plot below you can see that THSP15 DA prices are very high at the end of December 2022. Weather impact could not explain this. As seen in the second plot below, gas prices in CAISO were causing these price spikes in the DA market. Our data science team uncovered this with our data analysis toolbox, by running a correlation analysis of many time series published by CAISO against the main THSP15 DA price.
The CAISO gas price time series should therefore be included as an external feature or predictor in the AI model, to guide the forecast in predicting such irregularities.
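A correlation screen of this kind can be sketched as follows. The text does not say which correlation measure the team used, so Pearson correlation is an assumption here; the function names and the sample numbers are hypothetical and purely illustrative.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no linear signal
    return cov / sqrt(var_x * var_y)


def rank_features(target, candidates):
    """Sort candidate series by absolute correlation with the target, strongest first."""
    scores = {name: pearson(series, target) for name, series in candidates.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Made-up daily values for illustration only (not CAISO data).
price = [40, 42, 55, 90, 120, 85, 50]
candidates = {
    "gas_price": [3.0, 3.1, 4.0, 7.5, 9.8, 6.9, 3.8],
    "temperature": [15, 18, 14, 17, 15, 18, 16],
}

ranking = rank_features(price, candidates)
for name, score in ranking:
    print(f"{name}: {score:+.2f}")
```

A high correlation only flags a candidate predictor; whether it improves the forecast still has to be checked by training with and without it.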
Factoring in renewable generation effects
As can be seen in the plot below, there are significant occurrences of negative prices between May and June. Weather impact could not explain this. As seen in the second plot below, this is due to high solar generation penetration in CAISO. As above, this was uncovered by our data science team using our data analysis toolbox.
● The Outlier Treatment feature of QR AI Forecaster needs to be enforced for this THSP15 DA price forecast model.
● CAISO Solar Generation time series should be included as an external feature or predictor in the AI model, to guide the forecast in predicting such irregularities.
5. The AI Forecast Model
By now the preliminary data analysis work has been done and we are ready to implement the best AI model for this case study, considering the specific features and data discussed previously.
We use QR AI Forecaster cloud service. This platform is a no-coding automated AI platform, where modeling is done by dragging and dropping various model and feature engineering components in an intuitive AI dashboard, to automatically assemble and execute the final forecast model.
1) Data Preprocessing
- a) Gap Filling can be accomplished by several methods (Linear Interpolation, Weekly Pattern, Daily Pattern, etc.). For price data we use Linear Interpolation.
- b) Outlier Detection can be accomplished by several methods. For the current price forecast case study, we use a standard deviation method, removing data beyond 1.6 standard deviations (SD) and replacing it with a local average.
- c) Calendar is used by a processor that swaps mid-week holiday data with Sunday data for future dates (during forecasting) and with regular-day data for past dates (during training). It also labels working days and weekends for the AI modeling.
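The first two preprocessing steps can be sketched as below. The details are assumptions, since the text gives only the method names and the 1.6 SD threshold: here the threshold is measured from the mean of the whole series, and the "local average" is the mean of non-outlier neighbors within two positions. The function names are hypothetical and the numbers are made up.

```python
from statistics import mean, pstdev


def fill_gaps_linear(values):
    """Fill None gaps by linear interpolation between the nearest known points.

    Gaps at the very start or end of the series are left as None.
    """
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled


def replace_outliers(values, n_sd=1.6, window=2):
    """Replace points farther than n_sd standard deviations from the series mean
    with the average of the non-outlier neighbors within `window` positions."""
    mu, sd = mean(values), pstdev(values)
    is_outlier = [abs(v - mu) > n_sd * sd for v in values]
    cleaned = list(values)
    for i, flagged in enumerate(is_outlier):
        if not flagged:
            continue
        lo, hi = max(0, i - window), min(len(values), i + window + 1)
        neighbors = [values[j] for j in range(lo, hi) if not is_outlier[j]]
        if neighbors:  # leave the point untouched if every neighbor is an outlier
            cleaned[i] = mean(neighbors)
    return cleaned


# Made-up hourly prices with two gaps and one spike (illustration only).
raw = [30.0, None, 34.0, 36.0, 400.0, 38.0, None, None, 44.0, 46.0]
filled = fill_gaps_linear(raw)
cleaned = replace_outliers(filled)
print(filled)
print(cleaned)
```

Gap filling runs first so that the outlier step sees a complete series; the reverse order would be an equally defensible design choice.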
2) Feature Engineering
We configure the following features in the QR AI Forecaster dashboard:
Previous Point (which here is previous hour)
Hour, Calendar Weekday, Week Number
External Time Series Predictors
Gas price forecast, Demand forecast, Wind & Solar forecast, Outages, and a QR-designed formula combining some of these predictors
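To make the feature list concrete, here is a minimal sketch of how such a feature row could be assembled for each hour. The function name `build_features`, the dictionary keys, and the use of ISO week numbers are assumptions for illustration; the QR-designed formula is not public and is not reproduced, and the sample values are made up.

```python
from datetime import datetime, timedelta


def build_features(timestamps, prices, externals):
    """Build one feature dict per hour, starting from the second point
    (the first point has no previous value).

    `externals` maps a predictor name to a series aligned with `timestamps`.
    """
    rows = []
    for i in range(1, len(timestamps)):
        ts = timestamps[i]
        row = {
            "previous_point": prices[i - 1],     # previous hour's price
            "hour": ts.hour,                     # 0-23
            "weekday": ts.weekday(),             # Monday = 0 ... Sunday = 6
            "week_number": ts.isocalendar()[1],  # ISO week number (assumption)
        }
        for name, series in externals.items():
            row[name] = series[i]
        rows.append(row)
    return rows


# Made-up values for illustration only.
start = datetime(2023, 2, 6)  # a Monday
timestamps = [start + timedelta(hours=h) for h in range(4)]
prices = [41.0, 39.5, 38.0, 40.2]
externals = {
    "gas_price_forecast": [3.1, 3.1, 3.2, 3.2],
    "solar_forecast": [0.0, 0.0, 0.0, 0.0],
}

rows = build_features(timestamps, prices, externals)
print(rows[0])
```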
3) AI Model
- a) Model Splitter:
- Recall that this feature allows the AI machine to split the data and define and train multiple models at run-time to forecast specific profiles. In this case considering the structure of the data, discussed in the previous chapter, we want to model specific days of the week, with a total of three models forecasting for Monday – Friday, Saturday and Sunday.
- b) Model Optimization:
- We configured a range of deep learning AI models for this case study, each with its own advantages. We present here 3 machine learning models: NGBoost, LightGBM and XGBoost. These can be configured to perform with nearly equal accuracy levels.
For example, with XGBoost we configure 12 hyper-parameters, including:
- booster
- number of estimators
- gamma
- min child weight
- max delta step
- l1 regularization coefficient
- l2 regularization coefficient
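For readers working with the open-source XGBoost library directly, the hyper-parameter names listed above correspond to keyword arguments of its scikit-learn style estimators such as `xgboost.XGBRegressor`. The sketch below shows that mapping. The helper `to_xgb_kwargs` is hypothetical, the sample values are placeholders rather than the tuned values of this case study, and it is an assumption that QR AI Forecaster exposes the parameters in this way.

```python
# Names from the list above mapped to xgboost.XGBRegressor keyword arguments.
XGB_PARAM_NAMES = {
    "booster": "booster",
    "number of estimators": "n_estimators",
    "gamma": "gamma",
    "min child weight": "min_child_weight",
    "max delta step": "max_delta_step",
    "l1 regularization coefficient": "reg_alpha",
    "l2 regularization coefficient": "reg_lambda",
}


def to_xgb_kwargs(settings):
    """Translate {listed name: value} into {XGBoost keyword: value}."""
    unknown = set(settings) - set(XGB_PARAM_NAMES)
    if unknown:
        raise KeyError(f"unknown hyper-parameter names: {sorted(unknown)}")
    return {XGB_PARAM_NAMES[name]: value for name, value in settings.items()}


# Placeholder values for illustration only.
kwargs = to_xgb_kwargs({
    "booster": "gbtree",
    "number of estimators": 200,
    "l1 regularization coefficient": 0.1,
})
print(kwargs)
# With xgboost installed, these could be passed on as xgboost.XGBRegressor(**kwargs).
```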
- c) Model Execution:
- Once the model is finalized, training one of the machine learning models over 6 months of hourly SP15 DA price data, and executing a DA forecast takes about 1 minute.
6. Forecast and Accuracy Analysis
1) The Forecast Display Dashboard
QR AI Forecaster has several data visualization dashboards that gather in one screen:
- a) Our forecast SP15 DA price data, computed and published at 9:00 a.m. California time.
- b) The actual SP15 DA price data, published around 1:00 p.m., after the DA market has closed.
- c) Forecast error is computed by MAPE (mean absolute percentage error), and MAE (mean absolute error). These are listed in table format and gauges.
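The two error metrics can be computed as in the sketch below. The function names and sample numbers are illustrative. Skipping hours with a zero actual price in MAPE is an assumption made here to avoid division by zero; the text does not say how the dashboard treats such hours, which matters in a market with near-zero and negative prices.

```python
def mae(actual, forecast):
    """Mean absolute error, in the same unit as the prices."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percentage error, in percent.

    Hours with a zero actual price are skipped (an assumption; see lead-in).
    """
    if len(actual) != len(forecast):
        raise ValueError("series must be of equal length")
    terms = [abs((a - f) / a) for a, f in zip(actual, forecast) if a != 0]
    if not terms:
        raise ValueError("MAPE is undefined when all actual values are zero")
    return 100.0 * sum(terms) / len(terms)


# Made-up hourly prices for illustration only.
actual = [50.0, 40.0, 20.0, 80.0]
forecast = [45.0, 44.0, 22.0, 72.0]

print(f"MAE:  {mae(actual, forecast):.2f}")
print(f"MAPE: {mape(actual, forecast):.1f}%")
```

Because MAPE divides by the actual price, hours with prices close to zero can dominate the average, which is one reason to report MAE alongside it.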
2) Accuracy Analysis
- The CAISO DA trading hub SP15 price forecast has a MAPE of 9.7%.
- DA price forecasts for the other CAISO trading hubs, using the same machine learning model, have a similar accuracy.
- In the other case studies we have published for DA price forecasting in MISO and PJM, the same machine learning model, with different parameters and features, achieves a MAPE of 6 to 7%. The difference is mainly due to the fact that:
- The dynamic of price level change from one day to the next can be very different across markets. For example, CAISO DA prices are much more strongly correlated with fuel prices.
- Each market publishes different sets of supply and renewable data. Some are better correlated to DA prices.