16

Short-Term Load and Price Forecasting with Artificial Neural Networks*

Alireza Khotanzad

Southern Methodist University

16.1    Artificial Neural Networks

Error Back-Propagation Learning Rule • Adaptive Update of the Weights during Online Forecasting

16.2    Short-Term Load Forecasting

ANNSTLF Architecture • Humidity and Wind Speed • Holidays and Special Days • Performance

16.3    Short-Term Price Forecasting

Architecture of Price Forecaster • Performance

References

16.1  Artificial Neural Networks

Artificial neural networks (ANN) are systems inspired by research into how the brain works. An ANN consists of a collection of arithmetic computing units (nodes or neurons) connected together in a network of interconnected layers. A typical node of an ANN is shown in Figure 16.1. At the input side there are a number of so-called connections, each with an associated weight Wij. The input denoted by Xi is multiplied by Wij before reaching node j via the respective connection. Inside the neuron, all the individual inputs are first summed. The sum is passed through a nonlinear single-input, single-output function "S" to produce the output of the neuron. This output is in turn propagated to other neurons via the corresponding connections.

While there are a number of different ANN architectures, the most widely used one (especially in practical applications) is the multilayer feed-forward ANN, also known as a multilayer perceptron (MLP), shown in Figure 16.2. An MLP consists of n input nodes, h so-called "hidden layer" nodes (so named because they are not directly accessible from either the input or the output side), and m output nodes connected in a feed-forward fashion. The input layer nodes are simple data distributors, whereas neurons in the hidden and output layers have an S-shaped nonlinear transfer function known as the "sigmoid activation function," $f(z) = 1/(1 + e^{-z})$, where z is the sum of the inputs.

For hidden layer nodes, the output is

$$H_j = \frac{1}{1 + \exp\!\left(-\sum_{i=1}^{n} W_{ij}\,X_i\right)}$$


FIGURE 16.1  Model of one node of an ANN.


FIGURE 16.2  An example of an MLP with three input, three hidden, and two output nodes.

where Hj is the output of the jth hidden layer node, j = 1,…,h, and Xi represents the ith input connected to this hidden node via Wij with i = 1,…, n.

The output of the kth output node is given by

$$Y_k = \frac{1}{1 + \exp\!\left(-\sum_{j=1}^{h} W_{jk}\,H_j\right)}$$

where Yk is the output of the kth output layer node, k = 1,…, m, and Wjk is the connection weight from the jth hidden node to the kth output node.
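As an illustration of these two equations, the following minimal NumPy sketch computes the forward pass of a single-hidden-layer MLP. The function and variable names are illustrative assumptions, not part of any particular ANNSTLF implementation.

```python
import numpy as np

def sigmoid(z):
    """Sigmoid activation f(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def mlp_forward(x, W_ih, W_ho):
    """Forward pass of a single-hidden-layer MLP.

    x    : input vector of length n
    W_ih : (n, h) input-to-hidden weights W_ij
    W_ho : (h, m) hidden-to-output weights W_jk
    """
    H = sigmoid(x @ W_ih)   # hidden-layer outputs H_j
    Y = sigmoid(H @ W_ho)   # output-layer outputs Y_k
    return Y
```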

One of the main properties of ANNs is the ability to model complex and nonlinear relationships between input and output vectors through a learning process with “examples.” During learning, known input-output examples, called the training set, are applied to the ANN. The ANN learns by adjusting or adapting the connection weights through comparing the output of the ANN to the expected output. Once the ANN is trained, the extracted knowledge from the process resides in the resulting connection weights in a distributed manner.

A trained ANN can generalize (i.e., produce the expected output) if the input is not exactly the same as any of those in the training set. This property is ideal for forecasting applications where some historical data exists but the forecast indicators (inputs) may not match up exactly with those in the history.

16.1.1  Error Back-Propagation Learning Rule

The MLP must be trained with historical data to find the appropriate values for Wij and the number of required neurons in the hidden layer. The learning algorithm employed is the well-known error back-propagation (BP) rule (Rumelhart and McClelland, 1986). In BP, learning takes place by adjusting Wij. The output produced by the ANN in response to inputs is repeatedly compared with the correct answer. Each time, the Wij values are adjusted slightly in the direction of the correct answers by back-propagating the error at the output layer through the ANN according to a gradient descent algorithm.

To avoid overtraining, the cross-validation method is used. The training set is divided into two sets. For instance, if three years of data is available, it is divided into a two-year and a one-year set. The first set is used to train the MLP and the second set is used to test the trained model after every few hundred passes over the training data. The error on the validation set is examined. Typically this error decreases as the number of passes over the training set is increased until the ANN is overtrained, as signified by a rise in this error. Therefore, the training is stopped when the error on the validation set starts to increase. This procedure yields the appropriate number of epochs over the training set. The entire three years of data is then used to retrain the MLP using this number of epochs.

In a forecasting application, the number of input and output nodes equals the number of utilized forecast indicators and the number of desired outputs, respectively. However, there is no theoretical way to calculate the appropriate number of hidden layer nodes. This number is determined with an approach similar to the one used for the number of training epochs: the error over a validation set is examined for a varying number of hidden layer nodes, and the number yielding the smallest error is selected.
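The selection of the epoch count and the hidden-layer size described above can be sketched as follows. This is a minimal outline under stated assumptions: `make_mlp(h)` is a hypothetical factory returning an untrained MLP with `h` hidden nodes that exposes `train_epochs(X, y, n)` and `predict(X)`; the candidate sizes, checking interval, and patience are illustrative.

```python
import numpy as np

def validation_error(model, X_val, y_val):
    """Mean squared error on the held-out validation set."""
    return float(np.mean((model.predict(X_val) - y_val) ** 2))

def select_structure(make_mlp, X_tr, y_tr, X_val, y_val,
                     hidden_sizes=(5, 10, 20), check_every=200,
                     max_epochs=10_000, patience=3):
    """Pick the hidden-layer size and epoch count that minimize validation error."""
    best_h, best_epochs, best_overall = None, None, np.inf
    for h in hidden_sizes:
        model = make_mlp(h)
        epochs, worse, best_err, best_e = 0, 0, np.inf, 0
        while epochs < max_epochs and worse < patience:
            model.train_epochs(X_tr, y_tr, check_every)   # a few hundred passes
            epochs += check_every
            err = validation_error(model, X_val, y_val)
            if err < best_err:
                best_err, best_e, worse = err, epochs, 0
            else:
                worse += 1        # rising validation error signals overtraining
        if best_err < best_overall:
            best_h, best_epochs, best_overall = h, best_e, best_err
    # The full data set would then be used to retrain an MLP with best_h hidden
    # nodes for best_epochs passes, as described in the text.
    return best_h, best_epochs
```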

16.1.2  Adaptive Update of the Weights during Online Forecasting

A unique aspect of the MLPs used in the forecasting systems described in this section is the adaptive update of the weights during online operation. In a typical usage of an MLP, it is trained with the historical data and the weights of the trained MLP are then treated as fixed parameters. This is an acceptable procedure for many applications. However, if the modeled process is a nonstationary one that can go through rapid changes, e.g., variations of electric load due to weather swings or seasonal changes, a tracking mechanism with sensitivity to the recent trends in the data can aid in producing better results.

To address this issue, an adaptive weight adjustment strategy is employed during online operation. The MLP is initially trained using the BP algorithm; however, the trained weights are not treated as static parameters. During online operation, these weights are adaptively updated on a sample-by-sample basis. Before forecasting the next instance, the forecasts of the past few samples are compared to the actual outcomes (assuming the actual outcomes for those forecasts have become available) and a small-scale error BP operation is performed with this data. This mini-training with the most recent data results in a slight adjustment of the weights, biasing them toward the recent trend in the data.
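A minimal sketch of this online adaptation step is given below, assuming a model object with a hypothetical `backprop_step(x, y, learning_rate)` method; the window length, learning rate, and number of passes are illustrative and would be tuned in practice.

```python
def adaptive_update(model, recent_inputs, recent_actuals, lr=1e-3, passes=5):
    """Slightly re-train the MLP on the most recent samples before forecasting.

    recent_inputs  : the last few input vectors used for forecasting
    recent_actuals : the actual outcomes that have since become available for them
    """
    for _ in range(passes):                              # small-scale BP operation
        for x, y in zip(recent_inputs, recent_actuals):
            model.backprop_step(x, y, learning_rate=lr)  # nudge weights toward recent trend
    return model
```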

16.2  Short-Term Load Forecasting

The daily operation and planning activities of an electric utility require the prediction of the electrical demand of its customers. In general, the required load forecasts can be categorized into short-term, mid-term, and long-term forecasts. Short-term forecasts refer to hourly predictions of the load for a lead time ranging from 1 h to several days out. Mid-term forecasts can be either hourly or peak-load forecasts for a horizon of one to several months ahead. Finally, long-term forecasts refer to forecasts made for one to several years into the future.

The quality of short-term hourly load forecasts has a significant impact on the economic operation of the electric utility since many decisions based on these forecasts have significant economic consequences. These decisions include economic scheduling of generating capacity, scheduling of fuel purchases, system security assessment, and planning for energy transactions. The importance of accurate load forecasts will increase in the future because of the dramatic changes occurring in the structure of the utility industry due to deregulation and competition. This environment compels the utilities to operate at the highest possible efficiency, which, as indicated above, requires accurate load forecasts. Moreover, the advent of open access to transmission and distribution systems calls for new actions such as posting the available transmission capacity (ATC), which will depend on the load forecasts.

In the deregulated environment, utilities are not the only entities that need load forecasts. Power marketers, load aggregators, and independent system operators (ISO) will all need to generate load forecasts as an integral part of their operation.

This section describes the third generation of an ANN hourly load forecaster known as Artificial Neural Network Short-Term Load Forecaster (ANNSTLF). ANNSTLF, developed by Southern Methodist University and PRT, Inc. under the sponsorship of the Electric Power Research Institute (EPRI), has received wide acceptance by the electric utility industry and is presently being used by over 40 utilities across the U.S. and Canada.

Application of the ANN technology to the load forecasting problem has received much attention in recent years (Dillon et al., 1991; Park et al., 1991; Ho et al., 1992; Lee et al., 1992; Lu et al., 1993; Peng et al., 1993; Papalexopolos et al., 1994; Khotanzad et al., 1995, 1996, 1997, 1998; Mohammed et al., 1995; Bakirtzis et al., 1996). The function learning property of ANNs enables them to model the correlations between the load and such factors as climatic conditions, past usage pattern, the day of the week, and the time of the day, from historical load and weather data. Among the ANN-based load forecasters discussed in published literature, ANNSTLF is the only one that is implemented at several sites and thoroughly tested under various real-world conditions.

A noteworthy aspect of ANNSTLF is that a single architecture with the same input-output structure is used for modeling hourly loads of various size utilities in different regions of the country. The only customization required is the determination of some parameters of the ANN models. No other aspects of the models need to be altered.

16.2.1  ANNSTLF Architecture

ANNSTLF consists of three modules: two ANN load forecasters and an adaptive combiner (Khotanzad et al., 1998). Both load forecasters receive the same set of inputs and produce a load forecast for the same day, but they utilize different strategies to do so. The function of the combiner module is to mix the two forecasts to generate the final forecast.

Both of the ANN load forecasters have the same topology with the following inputs:

•  24 hourly loads of the previous day

•  24 hourly weather parameters of the previous day (temperatures or effective temperatures, as discussed later)

•  24 hourly weather parameters forecasts for the coming day

•  Day type indices

The difference between the two ANNs is in their outputs. The first forecaster is trained to predict the regular (base) load of the next day, i.e., the 24 outputs are the forecasts of the hourly loads of the next day. This ANN will be referred to as the “Regular Load Forecaster (RLF).”

On the other hand, the second ANN forecaster predicts the change in hourly load from yesterday to today. This forecaster is named the “Delta Load Forecaster (DLF).”

The two ANN forecasters complement each other because the RLF emphasizes regular load patterns whereas the DLF puts stronger emphasis on yesterday’s load. Combining these two separate forecasts results in improved accuracy. This is especially true for cases of sudden load change caused by weather fronts. The RLF has a tendency to respond slowly to rapid changes in load. On the other hand, since the DLF takes yesterday’s load as the basis and predicts the changes in that load, it has a faster response to a changing situation.
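To make the distinction concrete, the following sketch shows how training targets for the two forecasters might be formed from a history of hourly loads; the (num_days, 24) array layout is an assumption made for illustration.

```python
import numpy as np

def build_targets(hourly_load, day):
    """Training targets for the two ANN forecasters for a given historical day.

    hourly_load : array of shape (num_days, 24) of historical hourly loads
    day         : index of "yesterday" in that array
    """
    today = hourly_load[day]
    tomorrow = hourly_load[day + 1]
    rlf_target = tomorrow            # RLF learns the regular (base) load of the next day
    dlf_target = tomorrow - today    # DLF learns the hour-by-hour change from yesterday
    return rlf_target, dlf_target
```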


FIGURE 16.3  Block diagram of ANNSTLF.

To take advantage of the complementary performance of the two modules, their forecasts are adaptively combined using the recursive least squares (RLS) algorithm (Proakis et al., 1992). The final forecast for each hour is obtained by a linear combination of the RLF and DLF forecasts as

$$\hat{L}_{k+1}(i) = \alpha_B(i)\,\hat{L}^{\mathrm{RLF}}_{k+1}(i) + \alpha_C(i)\,\hat{L}^{\mathrm{DLF}}_{k+1}(i), \qquad i = 1, \ldots, 24$$

The αB(i) and αC(i) coefficients are computed using the RLS algorithm. This algorithm produces coefficients that minimize the weighted sum of squared errors of the past forecasts denoted by J,

$$J = \sum_{k=1}^{N} \beta^{\,N-k}\left[L_k(i) - \hat{L}_k(i)\right]^2$$

where Lk(i) is the actual load at hour i, N is the number of previous days for which load forecasts have been made, and β is a weighting factor in the range of 0 < β ≤ 1 whose effect is to de-emphasize (forget) old data.
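For illustration, the sketch below implements a standard exponentially weighted RLS recursion (Proakis et al., 1992) for the two combining coefficients of a single hour. It is a generic textbook RLS update, not the exact ANNSTLF implementation, and the class and parameter names are assumptions.

```python
import numpy as np

class RlsCombiner:
    """Per-hour RLS estimate of the combining weights [alpha_B, alpha_C]."""

    def __init__(self, beta=0.95, delta=100.0):
        self.beta = beta                 # forgetting factor, 0 < beta <= 1
        self.w = np.zeros(2)             # [alpha_B, alpha_C]
        self.P = delta * np.eye(2)       # inverse correlation matrix

    def update(self, rlf_forecast, dlf_forecast, actual_load):
        """Incorporate one more day once the actual load for this hour is known."""
        x = np.array([rlf_forecast, dlf_forecast])
        k = self.P @ x / (self.beta + x @ self.P @ x)   # gain vector
        e = actual_load - self.w @ x                    # a priori combination error
        self.w = self.w + k * e
        self.P = (self.P - np.outer(k, x @ self.P)) / self.beta

    def combine(self, rlf_forecast, dlf_forecast):
        """Final forecast = alpha_B * RLF forecast + alpha_C * DLF forecast."""
        return self.w @ np.array([rlf_forecast, dlf_forecast])
```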

The block diagram of the overall system is shown in Figure 16.3.

16.2.2  Humidity and Wind Speed

Although temperature (T) is the primary weather variable affecting the load, other weather parameters, such as relative humidity (H) and wind speed (W), also have a noticeable impact on the load. The effects of these variables are taken into account by transforming the temperature value into an effective temperature, T_eff, using the following transformations:

$$T_{\mathrm{eff}} = T + \alpha H$$

$$T_{\mathrm{eff}} = T - \frac{W\,(65° - T)}{100}$$

where the first transformation accounts for the humidity effect, with α a weighting coefficient, and the second accounts for the wind-speed (wind-chill) effect.
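A minimal sketch of this transformation is shown below, assuming Fahrenheit temperatures and treating the choice between the humidity and wind-chill adjustments as a simple threshold on temperature. The threshold of 65° and the value of alpha are illustrative assumptions, not the calibrated ANNSTLF settings.

```python
def effective_temperature(T, H, W, alpha=0.1, threshold=65.0):
    """Map raw temperature T (in °F) to an effective temperature T_eff.

    H is relative humidity and W is wind speed. The split at `threshold`
    and the value of `alpha` are illustrative, not taken from ANNSTLF.
    """
    if T >= threshold:
        return T + alpha * H                       # humid warm weather feels warmer
    return T - W * (threshold - T) / 100.0         # wind makes cold weather feel colder
```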

16.2.3  Holidays and Special Days

Holidays and special days pose a challenge to any load forecasting program since the load of these days can be quite different from a regular workday. The difficulty is the small number of holidays in the historical data compared to the typical days. For instance, there would be three instances of Christmas Day in a training set of 3 years. The unusual behavior of the load for these days cannot be learned adequately by the ANNs since they are not shown many instances of these days.

It was observed that, in most cases, the profile of the load forecast generated by the ANNs when the holiday is designated as a weekend day does resemble the actual load. However, there usually is a significant error in predicting the peak load of the day. The ANNSTLF package includes a function that enables the user to reshape the forecast of the entire day if the peak load forecast is changed by the user. Thus, the emphasis is placed on producing a better peak load forecast for holidays and reshaping the entire day's forecast based on it.

The holiday peak forecasting algorithm uses a novel weighted interpolation scheme. This algorithm will be referred to as “Reza algorithm” after the author who developed it (Khotanzad et al., 1998). The general idea behind the Reza algorithm is to first find the “close” holidays to the upcoming one in the historical data. The closeness criterion is the temperature at the peak-load hour. Then, the peak load of the upcoming holiday is computed by a novel weighted interpolation function described in the following.

The idea is best illustrated by an example. Let us assume that there are only three holidays in the historical data. The peak loads are first adjusted for any possible load growths. Let (ti, pi) designate the i-th peak-load hour temperature and peak load, respectively. Figure 16.4 shows the plot of pi vs. ti for an example case.

Now assume that th represents the peak-load hour temperature of the upcoming holiday. th falls in between t1 and t2 with the implication that the corresponding peak load, ph, would possibly lie in the range of [p1, p2] = R1 + R2. But, at the same time, th is also between t1 and t3 implying that ph would lie in [p1, p3] = R1. Based on this logic, ph can lie in either R1 or R1 + R2. However, note that R1 is common in both ranges. The idea is to give twice as much weight to the R1 range for estimating ph since this range appears twice in pair-wise selection of the historical data points.

The next step is to estimate ph for each nonoverlapping interval, R1 and R2, on the y axis, i.e., [p1, p3] and [p3, p2].

For R1 = [p1, p3] interval:

$$\hat{p}_h^{\,1} = \frac{p_3 - p_1}{t_3 - t_1}\,(t_h - t_1) + p_1$$


FIGURE 16.4  Example of peak load vs. temperature at peak load for a three-holiday database.

For R2 = [p3, p2] interval:

$$\hat{p}_h^{\,2} = \frac{p_2 - p_3}{t_2 - t_3}\,(t_h - t_3) + p_3$$

If either of the above interpolations results in a value that falls outside its respective range, Ri, the closest pi, i.e., the maximum or minimum of the interval, is used instead.

The final estimate of ph is a weighted average of p^h1 and p^h2 with the weights decided by the number of overlaps that each pair-wise selection of historical datapoints creates. In this case, since R1 is visited twice, it receives a weighting of two whereas the interval R2 only gets a weighting coefficient of one.

$$\hat{p}_h = \frac{w_1\,\hat{p}_h^{\,1} + w_2\,\hat{p}_h^{\,2}}{w_1 + w_2} = \frac{2\,\hat{p}_h^{\,1} + 1\cdot\hat{p}_h^{\,2}}{2 + 1}$$
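A sketch of this weighted interpolation for the three-holiday example is given below. It hard-codes the two-interval case described above rather than the general Reza algorithm, and the function name and argument layout are assumptions.

```python
def holiday_peak_estimate(t_h, points):
    """Weighted interpolation of the holiday peak load for the three-holiday example.

    t_h    : peak-hour temperature forecast for the upcoming holiday
    points : [(t1, p1), (t2, p2), (t3, p3)] historical peak-hour temperatures and
             peak loads, already adjusted for load growth, with t1 < t3 < t2.
    """
    (t1, p1), (t2, p2), (t3, p3) = points

    def interpolate(ta, pa, tb, pb):
        est = (pb - pa) / (tb - ta) * (t_h - ta) + pa
        lo, hi = min(pa, pb), max(pa, pb)
        return min(max(est, lo), hi)     # clamp to the interval if est falls outside

    p_h1 = interpolate(t1, p1, t3, p3)   # estimate over R1 = [p1, p3]
    p_h2 = interpolate(t3, p3, t2, p2)   # estimate over R2 = [p3, p2]
    w1, w2 = 2, 1                        # R1 appears in both pair-wise ranges, R2 in one
    return (w1 * p_h1 + w2 * p_h2) / (w1 + w2)
```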

16.2.4  Performance

The performance of ANNSTLF is tested on real data from ten different utilities in various geographical regions. Information about the general location of these utilities and the length of the testing period are provided in Table 16.1.

In all cases, 3 years of historical data are used to train ANNSTLF. Actual weather data is used so that the effect of weather forecast errors does not alter the modeling error. The testing is performed in a blind fashion, meaning that the test data is completely independent of the training set and is not shown to the model during its training.

One-to-seven-day-ahead forecasts are generated for each test set. To extend the forecast horizon beyond one day ahead, the forecast load of the previous day is used in place of the actual load to obtain the next day’s load forecast.
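A sketch of this recursive extension of the forecast horizon is shown below; the `forecast_next_day` callable and its argument layout are illustrative assumptions.

```python
def multi_day_forecast(forecast_next_day, prev_day_load, weather_by_day, horizon=7):
    """Roll a one-day-ahead forecaster forward over several days.

    forecast_next_day(prev_load, prev_weather, next_weather) -> 24 hourly loads
    weather_by_day : list of daily 24-hour weather vectors, index 0 = "yesterday"
    Beyond day one, yesterday's *forecast* load replaces the actual load.
    """
    forecasts, load = [], prev_day_load
    for d in range(horizon):
        load = forecast_next_day(load, weather_by_day[d], weather_by_day[d + 1])
        forecasts.append(load)
    return forecasts
```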

The forecasting results are presented in Table 16.2 in terms of mean absolute percentage error (MAPE) defined as

$$\mathrm{MAPE} = \frac{100}{N}\sum_{i=1}^{N}\frac{\left|\mathrm{Actual}(i) - \mathrm{Forecast}(i)\right|}{\mathrm{Actual}(i)}$$

with N being the number of observations. Note that the average MAPEs over the ten utilities, reported in the last row of Table 16.2, indicate that the third-generation engine is quite accurate in forecasting both hourly and peak loads. For hourly load, this average remains below 3% for the entire forecast horizon of 7 days ahead, and for peak load it reaches 3% on the seventh day. A pictorial example of one-to-seven-day-ahead load forecasts for utility 2 is shown in Figure 16.5.
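A direct NumPy transcription of this error measure might look like the following sketch.

```python
import numpy as np

def mape(actual, forecast):
    """Mean absolute percentage error over N observations."""
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    return 100.0 / len(actual) * np.sum(np.abs(actual - forecast) / actual)
```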

TABLE 16.1 Utility Information for Performance Study

Utility    No. Days in Testing Period    Weather Variable    Location
 1         141                           T                   Canada
 2         131                           T                   South
 3         365                           T, H, W             Northeast
 4         365                           T                   East Coast
 5         134                           T                   Midwest
 6         365                           T                   West Coast
 7         365                           T, H                Southwest
 8         365                           T, H                South
 9         174                           T                   North
10         275                           T, W                Midwest

TABLE 16.2 Summary of Performance Results in Terms of MAPE


TABLE 16.3 Training and Test Periods for the Price Forecaster Performance Study

Database    Training Period              Test Period                 MAE of Day-Ahead Hourly Price Forecasts ($)
CALPX       Apr 23, 1998–Dec 31, 1998    Jan 1, 1999–Mar 3, 1999     1.73
PJM         Apr 2, 1997–Dec 31, 1997     Jan 2, 1998–Mar 31, 1998    3.23


FIGURE 16.5  An example of a one-to-seven-day-ahead load forecast.

As pointed out earlier, all the weather variables (T or T_eff) used in these studies are the actual data. In online usage of the model, weather forecasts are used. The quality of these weather forecasts varies greatly from one site to another. In our experience, in most cases the weather forecast errors introduce approximately 1% of additional error for load forecasts one to two days out. The larger increase in error for longer-range forecasts is mostly due to less accurate weather forecasts for three or more days out.

16.3  Short-Term Price Forecasting

Another forecasting function needed in a deregulated and competitive electricity market is prediction of future electricity prices. Such forecasts are needed by a number of entities such as generation and power system operators, wholesale power traders, retail market and risk managers, etc. Accurate price forecasts enable these entities to refine their market decisions and energy transactions leading to significant economic advantages. Both long-term and short-term price forecasts are of importance to the industry. The long-term forecasts are used for decisions on transmission augmentation, generation expansion, and distribution planning whereas the short-term forecasts are needed for daily operations and energy trading decisions. In this work, the emphasis will be on short-term hourly price forecasting with a horizon extending up to the next 24 h.

In general, energy prices are tied to a number of parameters such as future demand, weather conditions, available generation, planned outages, system reserves, transmission constraints, market perception, etc. These relationships are nonlinear and complex and conventional modeling techniques cannot capture them accurately. In a similar manner to load forecasting, ANNs could be utilized to “learn” the appropriate relationships. Application of ANN technology to electricity price forecasting is relatively new and there are few published studies on this subject (Szkuta et al., 1999).

The adaptive BP MLP forecaster described in the previous section is used here to model the relationship of hourly price to relevant forecast indicators. The system is tested on data from two power pools with good performance.

16.3.1  Architecture of Price Forecaster

The price forecaster consists of a single adaptive BP MLP with the following inputs:

•  Previous day’s hourly prices

•  Previous day’s hourly loads

•  Next day’s hourly load forecasts

•  Next day’s expected system status for each hour

The expected system status input is an indicator used to provide the system with information about unusual operating conditions such as transmission constraints, outages, or other subjective matters. A bi-level indicator is used to represent typical vs. atypical conditions. This input allows the user to inject intuition about system conditions and helps the ANN better interpret sudden jumps in the price data that occur due to system constraints.

The outputs of the forecaster are the next day’s 24 hourly price forecasts.
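As an illustration, the sketch below assembles the input and output vectors for this MLP; the array layout and the 0/1 encoding of the hourly system-status indicator are assumptions made for the example.

```python
import numpy as np

def build_price_example(prev_prices, prev_loads, next_load_forecast, next_status,
                        next_prices=None):
    """Assemble one example for the adaptive BP MLP price forecaster.

    prev_prices        : 24 hourly prices of the previous day
    prev_loads         : 24 hourly loads of the previous day
    next_load_forecast : 24 hourly load forecasts for the next day
    next_status        : 24 flags, 0 = typical, 1 = atypical system conditions
    next_prices        : 24 actual hourly prices of the next day (training only)
    """
    x = np.concatenate([prev_prices, prev_loads, next_load_forecast, next_status])
    return (x, np.asarray(next_prices)) if next_prices is not None else x
```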

16.3.2  Performance

The performance of the hourly price forecaster is tested on data collected from two sources, the California Power Exchange (CALPX) and the Pennsylvania-New Jersey-Maryland ISO (PJM). The considered price data are the Unconstrained Market Clearing Price (UMCP) for CALPX and the Market Clearing Price (MCP) for PJM; for PJM, the average of the Locational Marginal Prices (LMP) is used as a single MCP. The training and test periods for each database are listed in Table 16.3. Testing is performed in a blind fashion, meaning that the test data is completely independent of the training set and is not shown to the model during its training. Also, actual load data is used in place of load forecasts.

TABLE 16.4 Results of Performance Study for the Test Period

Database    MAE of Day-Ahead Hourly Price Forecasts ($)    Sample Mean of Actual Hourly Prices ($)    Sample Standard Deviation of Actual Hourly Prices ($)
CALPX       1.73                                           19.98                                      5.45
PJM         3.23                                           17.44                                      7.67

The day-ahead forecast results are presented in the first column of Table 16.4 in terms of mean absolute error (MAE) expressed in dollars. This measure is defined as

$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|\mathrm{ActualPrice}(i) - \mathrm{ForecastPrice}(i)\right|$$

with N being the total number of hours in the test period.

To put these results in perspective, the sample mean and standard deviation of hourly prices in the test period are also listed in Table 16.4. Note the correspondence between MAE and the standard deviation of data, i.e., the smaller standard deviation results in a lower MAE and vice versa.

Figures 16.6 and 16.7 show a representative example of the performance for each of the databases. It can be seen that the forecasts closely follow the actual data.


FIGURE 16.6  An example of the ANN price forecaster performance for CALPX price data.


FIGURE 16.7  An example of the price forecaster performance for PJM price data.

References

Bakirtzis, A.G. et al., A neural network short term load forecasting model for the Greek power system, IEEE Trans. Power Syst., 11, 2, 858–863, May 1996.

Dillon, T.S., Sestito, S., and Leung, S., Short term load forecasting using an adaptive neural network, Int. J. Electr. Power Energy Syst., 13, 4, 186–192, Aug 1991.

Ho, K., Hsu, Y., and Yang, C., Short term load forecasting using a multi-layer neural network with an adaptive learning algorithm, IEEE Trans. Power Syst., 7, 1, 141–149, Feb 1992.

Khotanzad, A., Afkhami-Rohani, R., Lu, T.L., Davis, M.H., Abaye, A., and Maratukulam, D.J., ANNSTLF—A neural network-based electric load forecasting system, IEEE Trans. Neural Netw., 8, 4, 835–846, July 1997.

Khotanzad, A., Afkhami-Rohani, R., and Maratukulam, D., ANNSTLF—Artificial neural network short-term load forecaster-generation three, IEEE Trans. Power Syst., 13, 4, 1413–1422, Nov 1998.

Khotanzad, A., Davis, M.H., Abaye, A., and Maratukulam, D.J., An artificial neural network hourly temperature forecaster with applications in load forecasting, IEEE Trans. Power Syst., 11, 2, 870–876, May 1996.

Khotanzad, A., Hwang, R.C., Abaye, A., and Maratukulam, D., An adaptive modular artificial neural network hourly load forecaster and its implementation at electric utilities, IEEE Trans. Power Syst., 10, 3, 1716–1722, Aug 1995.

Lee, K.Y., Cha, Y.T., and Park, J.H., Short-term load forecasting using an artificial neural network, IEEE Trans. Power Syst., 7, 1, 124–132, Feb 1992.

Lu, C.N., Wu, N.T., and Vemuri, S., Neural network based short term load forecasting, IEEE Trans. Power Syst., 8, 1, 336–342, Feb 1993.

Mohammed, O. et al., Practical experiences with an adaptive neural network short-term load forecasting system, IEEE Trans. Power Syst., 10, 1, 254–265, Feb 1995.

Papalexopolos, A.D., Hao, S., and Peng, T.M., An implementation of a neural network based load forecasting model for the EMS, IEEE Trans. Power Syst., 9, 4, 1956–1962, Nov 1994.

Park, D.C., El-Sharkawi, M.A., Marks, R.J., Atlas, L.E., and Damborg, M.J., Electric load forecasting using an artificial neural network, IEEE Trans. Power Syst., 442–449, May 1991.

Peng, T.M., Hubele, N.F., and Karady, G.G., Advancement in the application of neural networks for short-term load forecasting, IEEE Trans. Power Syst., 8, 3, 1195–1202, Feb 1993.

Proakis, J.G., Rader, C.M., Ling, F., and Nikias, C.L., Advanced Digital Signal Processing, Macmillan Publishing Company, New York, 1992, pp. 351–358.

Rumelhart, D.E. and McClelland, J.L., Parallel Distributed Processing, Vol. 1, MIT Press, Cambridge, MA, 1986.

Szkuta, B.R., Sanabria, L.A., and Dillon, T.S., Electricity price short-term forecasting using artificial neural networks, IEEE Trans. Power Syst., 14, 3, 851–857, Aug 1999.

*  This work was supported in part by the Electric Power Research Institute and 1997 Advanced Technology Program of the State of Texas.
