Engineering, Technology & Applied Science Research Vol. 9, No. 4, 2019, 4548-4553 4548  
  

www.etasr.com Dung & Phuong: Short-Term Electric Load Forecasting Using Standardized Load Profile (SLP) and … 

 
Short-Term Electric Load Forecasting Using Standardized Load Profile (SLP) and Support Vector Regression (SVR)
 

Nguyen Tuan Dung 

Planning Department, 
EVNHCMC Power Company, 

Ho Chi Minh, Vietnam 

dp1526@gmail.com 

Nguyen Thanh Phuong 

Institute of Engineering, 
Hutech University of Technology, 

Ho Chi Minh, Vietnam 

nt.phuong@hutech.edu.vn 
 

Abstract—Short-term load forecasting (STLF) plays an important
role in building business strategy and in ensuring the reliable
and safe operation of any electrical system. Many methods are
used for short-term forecasting, including regression models,
time series, neural networks, expert systems, fuzzy logic,
machine learning, and statistical algorithms. The practical
requirement is to minimize forecast errors, avoid waste, prevent
shortages, and limit risks in the electricity market. This paper
proposes an STLF method that constructs a standardized load
profile (SLP) from past electrical load data and uses the Support
Vector Regression (SVR) machine learning algorithm to improve the
accuracy of short-term forecasts.

Keywords-short-term load forecast; regression model; 
standardized load profile; support vector regression 

I. INTRODUCTION  

Load forecasting in electrical systems is a topic that has
been studied extensively. There are two main approaches in
this area: traditional statistical methods, which model the
relationship between the load and load-affecting factors (time
series, regression analysis, etc.), and machine learning methods
(a branch of artificial intelligence). Statistical methods assume
that the load data follow a pattern and try to forecast future
load values using different time series analysis techniques.
Intelligent systems are derived from mathematical expressions
of human behavior and experience. Since the early 1990s in
particular, neural networks have been considered one of the most
commonly used techniques in electrical load forecasting, because
they assume that a nonlinear function relates historical values
and some external variables to the future values of the output
[1]. The approximation ability of neural networks has made their
applications popular. In recent years, Support Vector Machines
(SVM), an intelligent computation method, have been widely used
in load forecasting. Authors in [2] used the Support Vector
Regression (SVR) technique to solve an electrical load prediction
problem (forecasting the maximum daily load for the next 31
days). This was a competition organized by EUNITE (European
Network on Intelligent Technologies for Smart Adaptive Systems).
The provided information included the demand data of the past
two years, the daily temperature of the past four years, and
local holiday events. The data were divided into two parts: one
used for training (about 80%-90%) and the rest used for testing
the algorithm (about 10%-20%). The set of training inputs
included data of the previous day, previous hour, previous week,
and the average of the previous week. Since then, several studies
have explored different techniques for optimizing SVR for load
forecasting [3-10]. The main reason for using SVM in load
forecasting is that it can easily model the load curve and the
relationship between the load and the dynamics of changing load
demand. However, some problems are encountered when the above
algorithms are applied to real situations:

• Climate conditions always play an important role in load
forecasting, because load demand depends on them. When
forecasting beyond the test period, however, it is very
difficult to forecast the weather and climate values used as
inputs to the algorithm, and these values are often not
available.

• Electrical load samples include hidden elements, which
tend to be similar to the previous load pattern. However, this
leads to a false forecast for the following days if the day
type differs from that of the previous day or if an event
affects the load. Therefore, using a dataset whose training
inputs are the data of the previous day, the previous hour, the
previous week, and the average of the previous week carries
many risks if the load patterns are not identical.

• If the forecast horizon is longer than the window of past
data (more than 7 days), there will be a lack of inputs to run
the algorithm.

• In addition, in Asian countries that use the lunar calendar
(such as Vietnam), events such as the Lunar New Year (usually
in late January or early February) are difficult to handle.
The solar calendar and the lunar calendar deviate from each
other, so the load patterns on the same solar dates are not
identical from year to year. Therefore, the forecast results
of the algorithm for this period often have large errors.

This paper proposes a solution to build a Standardized Load 
Profile (SLP) based on the historical load dataset as a training 

Corresponding author: Nguyen Tuan Dung



 
dataset. This input dataset is combined with the SVR algorithm
to improve the accuracy of short-term forecast results, solve the
problem of deviation between the solar and the lunar calendar,
and overcome the limited window of input data. The SLP will be
built for all 365 days and 8,760 hourly cycles in a year. The SLP
will be an important dataset during the training, testing and
forecasting process. It standardizes load patterns by hours,
days, seasons, and special day types (including lunar dates).
Therefore, the SLP helps to solve the above-mentioned
difficulties and improve the quality of electrical load
forecasting.

II. METHODOLOGY 

Observing the load profiles of February in Ho Chi Minh City
over the years (Figure 1), we can see a huge fluctuation in the
shape of the chart from year to year. Using historical data to
forecast this period is therefore extremely complicated.

 
Fig. 1.  The load profiles of February over the years: (a) 2016, (b) 2017, (c) 2018

In practice, the algorithms used for forecasting in Vietnam
have to go through an intermediate stage in which the affected
months are converted into regular months (without holidays and
Lunar New Year). Afterwards, either the forecast result is
converted back or the result is accepted with a large error.
This is a common problem in forecasting software supplied from
abroad.

A. Standardized Load Profiles (SLP) 

Observing the load profiles of the days of the week and of
some special holidays of the year in Ho Chi Minh City (Figure
2), we see that the weekdays from Tuesday to Friday differ
little and share the same load chart. The load profile on
Monday differs from normal days between 0:00 and 9:00, due to
the demand carried over from Sunday. The load profile on
Saturday changes only slightly compared to normal days: mainly,
the load demand decreases in the evening as the weekend starts.
The load profile on Sunday is completely different from normal
days (the demand for electricity is low).

 
Fig. 2.  Typical load profiles on some days in a year

The load charts of the New Year and the Lunar New Year are
completely different: the graphs are almost flat and the load
demand is quite low because these are holidays. The load demand
is lowest on Lunar New Year, because this is the longest holiday
of the year (from 6 to 9 days). SLPs are built by dividing the
capacity recorded in each 60-minute period by the corresponding
maximum capacity. We need to build SLPs for all 365 days of the
year. Some typical SLPs are shown in Figure 3. Based on the
SLP of each cycle of the past dataset, we can build the SLP
dataset for future forecast periods. This should be accurate for
each cycle, each type of day (weekdays, working days,
holidays, etc.), each week, and each month. Therefore, the SLP
is a special feature and an important input parameter of the
SVR (NN) training process for rebuilding the load curves, from
which we can also estimate data that were lost or not recorded
during the measurement process.
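The normalization described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "maximum capacity" is taken to be the peak of the same day, the function names `build_slp` and `rebuild_load` are hypothetical, and the hourly values are made-up numbers rather than EVNHCMC data.

```python
def build_slp(hourly_load):
    """Normalize 24 hourly load values by the peak of the same day."""
    if len(hourly_load) != 24:
        raise ValueError("expected 24 hourly values")
    peak = max(hourly_load)
    if peak <= 0:
        raise ValueError("peak load must be positive")
    return [value / peak for value in hourly_load]


def rebuild_load(slp, peak):
    """Rebuild an hourly load curve from an SLP and a peak value."""
    return [share * peak for share in slp]


# Illustrative hourly loads in MW (made-up numbers, not measured data).
day_load = [2100, 2000, 1950, 1900, 1950, 2100, 2400, 2800,
            3300, 3600, 3750, 3700, 3500, 3650, 3800, 3750,
            3600, 3400, 3500, 3450, 3200, 2900, 2600, 2300]

slp = build_slp(day_load)                    # dimensionless shares in (0, 1]
rebuilt = rebuild_load(slp, max(day_load))   # scales the profile back to MW
```

Because the profile is dimensionless, the same shape can be rescaled with a forecast peak for a future day of the same type, which is how a missing or unrecorded day could be approximated.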

B. Support Vector Regression (SVR) 

A key feature of SVR is that it provides a sparse solution.
That is, to build the regression function, we do not need to
use all the data points in the training set. The points that
contribute to the construction of the regression function are
called support vectors. The prediction for a new data point
depends only on the support vectors [5–6].

 
Fig. 3.  SLP of some days in a year: (a) Sunday, (b) Lunar New Year, (c) Saturday, and (d) a normal day.

The regression function has the form:

$$y = f(x) = w^{T}\Phi(x) + b \tag{1}$$
 

Thus, the goal of SVR training is to find w and b [7-10] for
the training set $\{(x_1, t_1), (x_2, t_2), \ldots, (x_N, t_N)\} \subset \mathbb{R}^{n} \times \mathbb{R}$.
For a simple regression problem, w and b are found by
minimizing the regularized error function:

$$\frac{1}{2}\sum_{n=1}^{N}\{y_n - t_n\}^{2} + \frac{\lambda}{2}\|w\|^{2} \tag{2}$$

where λ is a regularization constant.

To get a sparse solution, we replace the above error
function with the ε-insensitive error function. The characteristic
of this error function is that if the absolute value of the
difference between the predicted value y(x) and the target value
is less than ε (with ε>0), then the error is considered zero. Now,
we must minimize the regularized error function:

$$C\sum_{n=1}^{N}E_{\varepsilon}(y(x_n) - t_n) + \frac{1}{2}\|w\|^{2} \tag{3}$$

with $y(x_n) = f(x_n) = w^{T}\Phi(x_n) + b$. C is a regularization
constant like λ, but it multiplies the error term instead of $\|w\|^{2}$.
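As a small illustration of the ε-insensitive error, the sketch below uses its standard form, which is zero inside the tube and grows linearly with the distance to the tube outside it. The function names and the numbers are hypothetical and chosen only for this example.

```python
def eps_insensitive_error(y_pred, target, eps):
    """Return 0 inside the eps-tube, otherwise the distance to the tube."""
    return max(0.0, abs(y_pred - target) - eps)


def regularized_error(y_preds, targets, w, C, eps):
    """Evaluate C * sum(E_eps) + 0.5 * ||w||^2 for given predictions."""
    data_term = sum(eps_insensitive_error(y, t, eps)
                    for y, t in zip(y_preds, targets))
    return C * data_term + 0.5 * sum(wi * wi for wi in w)


# First point lies inside the tube (|10.0 - 10.3| < 0.5), second lies outside.
inside = eps_insensitive_error(10.0, 10.3, eps=0.5)
outside = eps_insensitive_error(10.0, 11.0, eps=0.5)
total = regularized_error([10.0, 10.0], [10.3, 11.0], w=[1.0, 2.0], C=2.0, eps=0.5)
```

Only the second point contributes to the data term, which is the mechanism that makes the resulting solution sparse.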

To allow some points to lie outside the ε-tube, we add slack
variables. For each data point $x_n$, we need two slack variables
$\xi_n \ge 0$ and $\hat{\xi}_n \ge 0$, with $\xi_n > 0$ corresponding to a
point for which $t_n > y(x_n) + \varepsilon$ (outside and above the tube)
and $\hat{\xi}_n > 0$ corresponding to a point for which
$t_n < y(x_n) - \varepsilon$ (outside and below the tube).

 
Fig. 4.   Illustration of the slack variables ξn

The condition for a target point to lie inside the tube is
$y_n - \varepsilon \le t_n \le y_n + \varepsilon$ with $y_n = y(x_n)$. Using slack
variables, we allow target points outside the tube
(corresponding to slack variables > 0), and thus the condition
becomes:

$$t_n \le y_n + \varepsilon + \xi_n$$

$$t_n \ge y_n - \varepsilon - \hat{\xi}_n$$

Thus, the error function for SVR is
$C\sum_{n=1}^{N}(\xi_n + \hat{\xi}_n) + \frac{1}{2}\|w\|^{2}$. Our goal is to
minimize this error function subject to the constraints:

$$\xi_n \ge 0, \quad \hat{\xi}_n \ge 0$$

$$t_n \le y_n + \varepsilon + \xi_n$$

$$t_n \ge y_n - \varepsilon - \hat{\xi}_n$$
  
Using the Lagrange function and the Karush-Kuhn-Tucker
conditions, we have the equivalent optimization problem:

$$-\frac{1}{2}\sum_{n=1}^{N}\sum_{m=1}^{N}(a_n - \hat{a}_n)(a_m - \hat{a}_m)\,k(x_n, x_m) - \varepsilon\sum_{n=1}^{N}(a_n + \hat{a}_n) + \sum_{n=1}^{N}(a_n - \hat{a}_n)\,t_n \tag{4}$$



 
where k is the kernel function: $k(x, x') = \Phi(x)^{T}\Phi(x')$.
Expression (4) is maximized subject to the constraints:

$$0 \le a_n \le C, \quad 0 \le \hat{a}_n \le C, \quad \sum_{n=1}^{N}(a_n - \hat{a}_n) = 0 \tag{5}$$

From here, we have the regression function of SVR:

$$y(x) = \sum_{n=1}^{N}(a_n - \hat{a}_n)\,k(x, x_n) + b \tag{6}$$

Thus, an SVR that uses the ε-insensitive error function and
the Gaussian kernel function has three parameters: the
regularization coefficient C, the parameter γ of the Gaussian
kernel function, and the width ε of the tube [7]. These
parameters affect the forecast accuracy of the model and need
to be selected carefully.

• If C is too large, priority is given to the training error.
This leads to a complex model that easily overfits. If C is too
small, priority is given to the simplicity of the model. This
leads to an overly simple model and reduces forecast accuracy.

• The effect of ε is similar. If it is too large, there will be
fewer support vectors, making the model too simple. On the
other hand, if ε is too small, there will be many support
vectors, leading to a complex model that is more likely to
overfit.

• The γ parameter reflects the correlation between the support
vectors and also affects the forecast accuracy of the model.
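To make the role of the support vectors and of γ concrete, the following sketch evaluates the regression function (6) for already trained coefficients. It assumes the commonly used Gaussian kernel form exp(-γ‖x - x'‖²), which the text does not spell out, and every name and number in it is hypothetical.

```python
import math


def gaussian_kernel(x, z, gamma):
    """Gaussian (RBF) kernel, assumed here as exp(-gamma * ||x - z||^2)."""
    squared_distance = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * squared_distance)


def svr_predict(x, support_vectors, dual_diffs, b, gamma):
    """Evaluate y(x) = sum_n (a_n - a_hat_n) * k(x, x_n) + b.

    dual_diffs holds the differences (a_n - a_hat_n); points with a zero
    difference are not support vectors and can simply be left out.
    """
    return sum(d * gaussian_kernel(x, sv, gamma)
               for sv, d in zip(support_vectors, dual_diffs)) + b


# Made-up coefficients for two support vectors in a 2-D input space.
support_vectors = [[0.0, 0.0], [1.0, 1.0]]
dual_diffs = [0.5, -0.25]
prediction = svr_predict([0.0, 0.0], support_vectors, dual_diffs, b=2.0, gamma=0.5)
```

A larger γ makes the kernel decay faster with distance, so each support vector influences a smaller neighborhood of the input space.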

C. Research Models 

The flowchart of the SLP-SVR forecasting algorithm is 
given in Figure 5. 

 
Fig. 5.  Flowchart of forecasting algorithm by SLP – SVR 

Processed historical data (power consumption, capacity and
temperature recorded in 24 cycles of 60 minutes each), together
with the SLP, are fed into modules that build regression
functions with the SVR and Neural Network (NN) algorithms. We
then use the above dataset to check and evaluate the error of
the regression functions. After that, we choose the regression
function with the smallest error as the regression function for
the next forecast phase. The SLP dataset in 24 cycles of the
forecast period (including holidays, etc.) and the forecasted
temperature in 24 cycles of the corresponding period are the
inputs to the selected regression function, which outputs
forecast results in 24 cycles for a period of 7-30 days.

III. RESULTS AND DISCUSSION 

A. Input Data

The article uses EVNHCMC data from January 1st, 2015, to
November 17th, 2018, to run the test models. After
preprocessing, the dataset is divided into two parts, a training
set and a testing set, in which the testing set is the last 30
days of the dataset. Alternatively, the dataset is divided into
phases to test the forecast results in different time periods.
Input data for training the algorithms include: capacity
(Pmax/Pmin) in 60-minute cycles, temperature (max/min) in
60-minute cycles, standardized load profiles of the 24 hours of
the day, and a list of holidays and Lunar New Year dates in the
forecast year. The mean absolute percentage error (MAPE) is
used to evaluate the error of the models.

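$$\mathrm{MAPE} = \frac{100}{n}\sum_{t=1}^{n}\left|\frac{Y_t - Y_t^{f}}{Y_t}\right| \tag{7}$$

where $Y_t$ is the actual value, $Y_t^{f}$ is the forecast value, and
n is the number of forecast points.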
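Equation (7) translates directly into code. The sketch below is a plain-Python illustration with a hypothetical function name and made-up numbers, and it assumes that no actual value is zero.

```python
def mape(actual, forecast):
    """Mean absolute percentage error in percent, following equation (7)."""
    if not actual or len(actual) != len(forecast):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Two made-up points, each with a 10% absolute error.
example_error = mape([100.0, 200.0], [110.0, 180.0])
```

Because every term is divided by the actual value, the measure is scale-free, which makes days with very different load levels comparable.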

The algorithms are programmed in Matlab and the results 
are exported to Excel files for data exploitation. 
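The hold-out split described in this subsection (the last 30 days reserved for testing) can be sketched as follows. Python is used here purely for illustration, since the original implementation is in Matlab; the function name and the synthetic series are hypothetical.

```python
def split_last_days(hourly_series, test_days=30, cycles_per_day=24):
    """Split an hourly series so that the last test_days form the testing set."""
    n_test = test_days * cycles_per_day
    if n_test >= len(hourly_series):
        raise ValueError("series is too short for the requested testing set")
    return hourly_series[:-n_test], hourly_series[-n_test:]


# Synthetic stand-in for 100 days of hourly observations.
series = list(range(100 * 24))
train, test = split_last_days(series)
```

Splitting by time rather than at random keeps the testing set strictly after the training set, which mirrors how the forecast is used in practice.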

B. SVR Models 

The input parameters of the SVR models must be selected
correctly: the regularization coefficient C, the tube width ε,
and the kernel function. All models use the same input dataset.
Some typical proposed SVR model parameters are shown in Table I.

TABLE I.  SVR MODEL PARAMETERS

| Model | C | ε | Kernel Function |
|---|---|---|---|
| SVR 1 | 93.42 | 32.5 | Polynomial |
| SVR 2 | 500.32 | 0.01 | Gaussian |
| SVR 3 | 1 | 50.03 | Linear |
| SVR 4 | 100 | 0.01 | Linear |

 
C. Random Forest Regression (RFR) Models

An ensemble of regression trees, each with a different set of
rules, is used to perform a non-linear regression. The algorithm
builds a total of 20 trees, with a minimum leaf size of 20. The
number of leaves is kept smaller than or equal to the size of
the tree to control overfitting and achieve high performance
[13-14]. The same input dataset is used as for the other models.

D. Neural Network Models 

We used Feedforward Neural Network models with the
above-mentioned input variables and training dataset. A
single-hidden-layer network architecture with a layer size of 10
and a sigmoid activation function was used. In addition, a
neural network with a three-hidden-layer architecture was used,
in which the first hidden layer has 10 nodes, the second hidden
layer has 8 nodes, and the third hidden layer has 5 nodes.

E. Results and Analysis 

1) Regression Models Test 

We ran the forecast for February of 2018 (the month of the
Lunar New Year) to assess the error of the models. The models
took as inputs the data of the previous day, previous hour,
previous week and the previous-week average. Processed
historical data (power consumption, capacity, temperature
recorded in 24 cycles of 1 hour) with the SLP were fed into
modules that build regression functions with the SVR, Neural
Network and Random Forest algorithms.

 
Fig. 6.  Regression models test 

TABLE II.  TEST ERRORS OF THE REGRESSION MODELS (%)

| Date | Ytr | Yts1 | Yts2 | Yts3 | Yts4 | YtNN | Ytfeed | YtRF |
|---|---|---|---|---|---|---|---|---|
| 1/23/18 | 9.71 | 4.05 | 5.02 | 6.35 | 4.19 | 6.09 | 4.55 | 2.91 |
| 1/24/18 | 8.30 | 3.65 | 2.61 | 7.00 | 4.25 | 0.65 | 4.76 | 4.19 |
| 1/25/18 | 7.17 | 4.35 | 3.57 | 7.42 | 4.21 | 4.58 | 5.84 | 4.63 |
| 1/26/18 | 7.10 | 6.20 | 6.77 | 7.48 | 6.39 | 6.58 | 5.82 | 6.44 |
| 1/27/18 | 9.22 | 1.37 | 0.44 | 3.27 | 1.33 | 0.56 | 1.91 | 1.06 |
| 1/28/18 | 9.68 | 2.16 | 3.28 | 7.12 | 0.32 | 25.51 | 5.89 | 3.93 |
| 1/29/18 | 9.15 | 5.30 | 6.17 | 6.92 | 4.91 | 5.71 | 5.96 | 5.67 |
 
We chose the regression function with the smallest error to 
be used for the next forecast phase. The Yts4 model was 
selected as a forecasting model. 

2) Forecast Results for February of 2018 

Considering the model forecast results for February, we see
a big difference between forecast and reality (Figure 7). The
reason is that we used the historical data of January of 2018
(7-14-30 days before the forecasting date) as the input for the
training model.

3) Results of Testing SVR Models 

The results are shown in Figure 8 and Table III.

4) Results of Testing Machine Learning Models 

The results are shown in Figure 9 and Table IV.

 
Fig. 7.  Forecast results for the next 30 days 

 
Fig. 8.  SVR models test 

TABLE III.  TEST ERRORS OF THE SVR MODELS (%)

| Date | Yts1 | Yts2 | Yts3 | Yts4 |
|---|---|---|---|---|
| 1/23/18 | 1.15 | 0.64 | 2.22 | 3.87 |
| 1/24/18 | 1.70 | 2.12 | 2.95 | 6.19 |
| 1/25/18 | 3.03 | 3.30 | 3.38 | 6.68 |
| 1/26/18 | 1.35 | 1.04 | 1.76 | 2.76 |
| 1/27/18 | 6.77 | 4.56 | 6.42 | 1.56 |
| 1/28/18 | 4.18 | 5.09 | 1.81 | 0.76 |
| 1/29/18 | 0.24 | 0.12 | 2.69 | 2.14 |
| MAPE | 2.63 | 2.41 | 3.03 | 3.42 |

 
Fig. 9.  Machine learning models test 



 
TABLE IV.  TEST ERRORS OF THE MACHINE LEARNING MODELS (%)

| Date | YtNN | YtFeed | YtRF |
|---|---|---|---|
| 1/23/18 | 1.25 | 1.61 | 1.70 |
| 1/24/18 | 2.14 | 2.90 | 3.36 |
| 1/25/18 | 0.99 | 5.55 | 3.89 |
| 1/26/18 | 3.16 | 1.84 | 2.26 |
| 1/27/18 | 4.81 | 1.56 | 1.92 |
| 1/28/18 | 7.51 | 5.85 | 4.68 |
| 1/29/18 | 4.41 | 2.05 | 0.43 |
| MAPE | 3.47 | 3.05 | 2.60 |

 
5) Results of Testing Regression Models

The results are shown in Figure 10 and Table V.

 
Fig. 10.  Regression test models 

TABLE V.  TEST ERRORS OF ALL REGRESSION MODELS (%)

| Date | Ytr | Yts1 | Yts2 | Yts3 | Yts4 | YtNN | Ytfeed | YtRF |
|---|---|---|---|---|---|---|---|---|
| 1/23/18 | 9.71 | 1.15 | 0.64 | 2.22 | 3.87 | 1.25 | 1.61 | 1.70 |
| 1/24/18 | 8.30 | 1.70 | 2.12 | 2.95 | 6.19 | 2.14 | 2.90 | 3.36 |
| 1/25/18 | 7.17 | 3.03 | 3.30 | 3.38 | 6.68 | 0.99 | 5.55 | 3.89 |
| 1/26/18 | 7.10 | 1.35 | 1.04 | 1.76 | 2.76 | 3.16 | 1.84 | 2.26 |
| 1/27/18 | 9.22 | 6.77 | 4.56 | 6.42 | 1.56 | 4.81 | 1.56 | 1.92 |
| 1/28/18 | 9.68 | 4.18 | 5.09 | 1.81 | 0.76 | 7.51 | 5.85 | 4.68 |
| 1/29/18 | 9.15 | 0.24 | 0.12 | 2.69 | 2.14 | 4.41 | 2.05 | 0.43 |
| MAPE | 8.62 | 2.63 | 2.41 | 3.03 | 3.42 | 3.47 | 3.05 | 2.60 |

 
We choose the regression function with the smallest error to 
be used as the regression function for the next forecast phase. 
The model Yts2 is selected to be the forecasting model. 
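The selection step can be reproduced from the daily errors listed in Table V. The sketch below averages each column and picks the model with the smallest mean error; the variable names are hypothetical, and the values are copied from the table.

```python
# Daily test errors (%) per model, copied from Table V (1/23/18 to 1/29/18).
table_v_errors = {
    "Ytr":    [9.71, 8.30, 7.17, 7.10, 9.22, 9.68, 9.15],
    "Yts1":   [1.15, 1.70, 3.03, 1.35, 6.77, 4.18, 0.24],
    "Yts2":   [0.64, 2.12, 3.30, 1.04, 4.56, 5.09, 0.12],
    "Yts3":   [2.22, 2.95, 3.38, 1.76, 6.42, 1.81, 2.69],
    "Yts4":   [3.87, 6.19, 6.68, 2.76, 1.56, 0.76, 2.14],
    "YtNN":   [1.25, 2.14, 0.99, 3.16, 4.81, 7.51, 4.41],
    "Ytfeed": [1.61, 2.90, 5.55, 1.84, 1.56, 5.85, 2.05],
    "YtRF":   [1.70, 3.36, 3.89, 2.26, 1.92, 4.68, 0.43],
}

# Mean error per model over the seven test days.
mean_error = {name: sum(errors) / len(errors)
              for name, errors in table_v_errors.items()}

# The model with the smallest mean error is kept for the forecast phase.
best_model = min(mean_error, key=mean_error.get)
```

With these values the column means agree with the MAPE row of Table V, and `best_model` evaluates to `"Yts2"`, consistent with the choice stated above.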

6) Forecast Results for February of 2018 

The results are shown in Figure 11, where a definite
improvement is observed.

IV. CONCLUSION 

Observing the experimental results on the testing datasets
(load datasets of the previous day, previous week, previous
month and the SLP dataset), we saw that the results of the
SLP-SVR models are close to the actual values of February of
2018, while the results of the old model deviate considerably.
Thus, using the SLP as the input dataset for the modules that
build the forecasting regression function is effective and
gives forecasting results with low error. It solves the problem
of deviation between the solar and the lunar dates, especially
in the months of the Lunar New Year.

 
Fig. 11.  Forecast results for the next 30 days 

REFERENCES 

[1] M. H. M. R. Shyamali Dilhani, C. Jeenanunt, “Daily electric load forecasting: Case of Thailand”, 7th International Conference on Information Communication Technology for Embedded Systems, Bangkok, Thailand, March 20-22, 2016

[2] J. Huo, T. Shi, J. Chang, “Comparison of Random Forest and SVM for Electrical Short-term Load Forecast with Different Data Sources”, 7th IEEE International Conference on Software Engineering and Service Science, Beijing, China, March 23, 2017

[3] L. C. P. Velasco, C. R. Villezas, P. N. C. Phalang, J. A. A. Dagaang, “Next Day Electric Load Forecasting Using Artificial Neural Networks”, Cebu City, Philippines, December 9-12, 2015

[4] D. Willingham, “Electricity Load Forecasting for the Australian Market Case Study”, available at: https://ww2.mathworks.cn/matlabcentral/fileexchange/31877-electricity-load-forecasting-for-the-australian-market-case-study?s_tid=FX_rc1_behav, 2016

[5] N. T. Dung, T. T. Ha, N. T. Phuong, “Comparative Study of Short-term Electric Load Forecasting: Case Study EVNHCMC”, 4th International Conference on Green Technology and Sustainable Development, Ho Chi Minh City, Vietnam, November 23-24, 2018

[6] E. Ceperic, V. Ceperic, A. Baric, “A strategy for short-term load forecasting by support vector regression machines”, IEEE Transactions on Power Systems, Vol. 28, No. 4, pp. 4356-4364, 2013

[7] V. Vapnik, The Nature of Statistical Learning Theory, Springer, 1995

[8] S. Gunn, Support Vector Machines for Classification and Regression, Technical Report, University of Southampton, 1995

[9] V. Cherkassky, Y. Ma, “Selection of Meta-parameters for Support Vector Regression”, International Conference on Artificial Neural Networks, Madrid, Spain, August 28-30, 2002

[10] D. Basak, S. Pal, D. C. Patranabis, “Support Vector Regression”, Neural Information Processing – Letters and Reviews, Vol. 11, No. 10, pp. 203–224, 2007

[11] A. J. Smola, B. Scholkopf, “A Tutorial on Support Vector Regression”, Statistics and Computing, Vol. 14, No. 3, pp. 199–222, 2004

[12] Understanding Support Vector Machine Regression and Support Vector Machine Regression, available at: https://www.mathworks.com/help/stats/understanding-support-vector-machine-regression.html

[13] L. Breiman, “Random Forests”, Machine Learning, Vol. 45, No. 1, pp. 5-32, 2001

[14] L. Breiman, J. H. Friedman, R. A. Olshen, C. J. Stone, Classification and Regression Trees, Chapman & Hall, 1984