

International Journal of Energy Economics and Policy | Vol 11 • Issue 5 • 202166

International Journal of Energy Economics and 
Policy

ISSN: 2146-4553

available at http://www.econjournals.com

International Journal of Energy Economics and Policy, 2021, 11(5), 66-77.

Electricity Price Fundamentals in Hydrothermal Power 
Generation Markets Using Machine Learning and Quantile 
Regression Analysis

Andrés Oviedo-Gómez1*, Sandra Milena Londoño-Hernández1, Diego Fernando Manotas-Duque2

1School of Electrical and Electronic Engineering, Universidad del Valle, Cali, Colombia, 2School of Industrial Engineering, 
Universidad del Valle, Cali, Colombia. *Email: oviedo.andres@correounivalle.edu.co

Received: 01 March 2021 Accepted: 10 June 2021 DOI: https://doi.org/10.32479/ijeep.11346

ABSTRACT

A hydrothermal power generation market is characterized by a strong dependence on water reservoir capacity and fossil fuel sources, which causes 
differences in generation marginal costs and high variability of the electricity spot price. Therefore, this study proposes an empirical approach to 
identify the price determinants and their effects on price dynamics. This paper presents two methodologies: a machine learning approach and a quantile 
regression analysis. The first method is used to validate the price determinants through a prediction process, and the second, the quantile regression, 
to identify the non-linear effects. The most important factors observed are total market demand, water reservoirs capacity for generation, and fossil 
fuel consumption. The results offer a new perspective about the market structure and spot price volatility.

Keywords: Electricity Prices, Hydrothermal Power Generation Markets, Machine Learning, Quantile Regression, Gaussian Process Regression 
JEL Classifications: C22, Q41, Q43, Q47

1. INTRODUCTION

The different reforms in electricity markets defined electricity as 
a commodity that can be sold, bought, and traded in a market 
(Berrie and Hoyle, 1985). However, its storage limitations 
give the market price characteristics such as seasonal 
patterns, high volatility, mean reversion, and price spikes, among others 
(Girish and Vijayalakshmi, 2013; Huisman and Mahieu, 2003). 
Besides, modeling the price dynamics requires understanding its 
asymmetric distribution, high dispersion, and serial correlation 
(Ciarreta et al., 2011). Therefore, analyzing and predicting spot 
prices is a challenge for academics and market agents.

On the other hand, the market structure and generation technologies 
are fundamental factors in price formation. A hydrothermal power 
generation market presents: (i) significant differences in the 
marginal costs of the generation sector; (ii) a small renewable 
generation capacity; (iii) a strong dependence on exogenous variables 
such as fossil fuel prices and climatological factors; and (iv) higher 
risk and uncertainty for market agents. These features have been 
observed to further increase price variability (Mosquera-López et 
al., 2017a; Fernández-Blanco et al., 2017; Cotia et al., 2019). 
Hence, it is relevant to recognize the determinants that explain 
the electricity price behavior in this market structure.

For this reason, the objective of this study was to identify the 
economic and technological fundamentals in the hydrothermal 
power generation market and to evaluate their effects on spot 
price dynamics. For the empirical analysis, the Colombian 
electricity market was selected. The methodology was divided into 
two parts: a machine learning approach and a quantile regression 
analysis. First, a Gaussian process regression (GPR) model was trained to 

This Journal is licensed under a Creative Commons Attribution 4.0 International License



validate the determinants and compute the spot price prediction 
for the next 6 months of the dataset. This method identifies 
complex patterns in a large volume of data and reviews the data 
to predict future behavior (Castelli et al., 2020; Díaz et al., 2019; 
Gonzalez-Briones et al., 2019; Imani et al., 2020; Ribeiro et al., 
2020). Second, a quantile regression model was fitted because it 
allows modeling the seasonality of electricity prices and quantifying 
the non-linear effects of the determinants (Ma and Koenker, 2006; 
Maciejowska, 2020; Mosquera-López et al., 2017b; Uribe and 
Guillen, 2020).

According to Aggarwal et al. (2009) and Girish and Vijayalakshmi 
(2013), the spot price determinants can be grouped into five 
categories: (i) market characteristics, (ii) fundamental factors, (iii) 
operational factors, (iv) strategic factors, and (v) historical factors. 
The first group includes variables such as energy supply and 
demand, electricity exports/imports, market-clearing quantity, and 
energy policy (Deng and Oren, 2006; Mandal et al., 2007; 
Mosquera-López and Nursimulu, 2019; Zhang et al., 2008). In the second 
group, the fundamental factors considered are price volatility, fuel 
price, weather factors, and hydrological conditions. By contrast, 
operational factors cover fundamentals such as the system load rate, 
electricity production (deficit/surplus), energy sources (nuclear, 
hydric, or thermal), line status and limits, and power transmission 
costs (He et al., 2010; Rodriguez and Anders, 2004; Zhang et al., 
2008). Meanwhile, strategic factors correspond to energy purchasing 
agreements, bilateral contracts, bidding strategy, and market design 
(Crespo-Cuaresma et al., 2004; Kian and Keyhani, 2001; Rodriguez 
and Anders, 2004). Finally, in the fifth group, past observations of 
variables such as demand and supply, hydric reserves, and electricity 
price have been found to affect the present spot price dynamics 
(Ciarreta et al., 2011; Mandal et al., 2007).

However, for the power generation structure selected, the results 
of the empirical application indicated that total market demand, 
water reservoir capacity for generation, and fossil fuel 
consumption are the most relevant determinants of the spot 
price. This paper also contributes a market structure analysis 
and a new perspective on the spot price distribution.

The remainder of the paper is structured as follows: section 2 
describes the structure of the Colombian electricity market. 
Section 3 presents the empirical methodologies, and section 4 
describes the dataset. Section 5 reports the results, 
and section 6 presents the conclusions.

2. COLOMBIAN ELECTRICITY MARKET

Since 1990, the Colombian energy sector has undergone relevant 
reforms. García et al. (2011) described that the liberalization 
process improved the electricity market by introducing competition 
in different sectors and, hence, abolishing the limitations of the 
vertical structure. Besides, the wholesale energy market (WEM) was 
created under a regulatory framework, and it operates through a 
spot trading structure. However, the electricity sector presents 
limitations such as a low generation capacity and high demand, 
which prevent a competitive market from forming, so electricity 
prices cannot capture the relationship between supply and demand 
(Barrientos et al., 2012).

On the other hand, Colombia is part of a region with abundant 
hydric sources. According to International Energy Agency (IEA) 
statistics, in 2018, approximately 86% of power generation in 
Central and South America came from hydric and thermal 
generation. Colombia is one of these hydrothermal generation 
systems: hydroelectric power generation represents 68% and 
thermal power generation (gas, coal, and liquid fuels) 31% 
(Figure 1), while renewable sources do not have a representative 
share in the power generation matrix (0.21%).

Due to its dependence on hydrothermal power generation, the Colombian 
electricity sector is highly vulnerable to two exogenous 
factors: El Niño–Southern Oscillation (ENSO) and fossil fuel 
price fluctuations. Figure 2 shows the daily spot price dynamics 
for the period 2000-2019. Significant effects of ENSO were 
observed in four periods between 2003 and 2014; however, the most 
important shock was observed between 2015 and 2016, when 
the price reached its maximum peak and gas prices increased 
considerably. Besides, the thermal generation sector did not have 
an economic guarantee to cover the demand1; hence, the state 
intervened in the market to avoid rationing (Botero-Duque et al., 
2016; Montes, 2018).

1 Thermal generation is a backup source for hydropower generation 
in two specific moments: high demand or low water reservoir 
levels.

Figure 1: Power generation net capacity by technology for January 
2020

Source: XM information system.



According to Castaño and Sierra (2012), Díaz-Contreras et al. (2014), 
Lira et al. (2009), and Quintero-Quintero and Isaza-Cuervo (2013), 
the spot price is related to weather changes, the fossil fuels used for 
thermal power generation, and electricity demand and supply. Likewise, power 
transmission failures, energy policy, and agent strategies are significant. 
Finally, Mosquera-López et al. (2017b) described that differences in 
marginal costs are passed through to the spot price dynamics and, 
consequently, increase the risk in agents' decision making.

3. METHODOLOGY

Two approaches were considered to analyze the fundamentals 
of the electricity spot price in a hydrothermal power generation 
market. First, a machine learning approach, based on a GPR, was 
used to fit a multivariable model that predicts the daily electricity 
price and validates the importance of the variables considered; second, a 
quantile regression model was fitted to evaluate the effects of 
these predictors on the electricity price dynamics.

3.1. Gaussian Process Regression Models
According to Rasmussen and Williams (2006) and The Mathworks 
(2020), GPR models are nonparametric, kernel-based supervised 
learning models used for regression analysis and probabilistic 
classification. These models capture uncertainty and allow 
predictions when the data have unknown distributions. Besides, 
GPR is a powerful method for Bayesian inference, and it is 
particularly useful when data availability is limited (Aye 
and Heyns, 2017; Gonzalez-Briones et al., 2019).

A training set is defined as $\{(x_i, y_i);\ i = 1, 2, \ldots, n\}$, where 
$x_i \in \mathbb{R}^d$ and $y_i \in \mathbb{R}$, drawn from an unknown 
distribution. Building on a linear regression model, a GPR model predicts 
the response variable by introducing latent variables, 
$f(x_i),\ i = 1, 2, \ldots, n$, from a Gaussian process (GP), and explicit 
basis functions, $h$.

A GP is defined by its mean function, $m(x)$, and covariance 
function, $k(x, x')$. If $\{f(x),\ x \in \mathbb{R}^d\}$ is a GP, then 
$E[f(x)] = m(x)$ and 
$\mathrm{cov}[f(x), f(x')] = E[\{f(x) - m(x)\}\{f(x') - m(x')\}] = k(x, x')$. 
Therefore, the following model is considered:

$$h(x)^T \beta + f(x), \tag{1}$$

where $f(x) \sim GP(0, k(x, x'))$, i.e., $f(x)$ is a zero-mean GP with covariance 
function $k(x, x')$. Besides, $h(x)$ is a set of basis functions that project 
the input $x$ into a new $p$-dimensional feature space ($\mathbb{R}^p$), and 
$\beta$ is a $p \times 1$ vector of basis function coefficients. This 
is a representation of the GPR model, and the response variable can 
be described as:

$$P(y_i \mid f(x_i), x_i) \sim N\left(y_i \mid h(x_i)^T \beta + f(x_i),\ \sigma^2\right). \tag{2}$$

Therefore, a GPR model is a probabilistic model. Furthermore, the 
GPR model is nonparametric because each observation $x_i$ 
has its own latent variable $f(x_i)$.

The joint distribution of the latent variables $f(x_1), f(x_2), \ldots, f(x_n)$ in the 
GPR model is $P(f \mid X) \sim N(f \mid 0, K(X, X))$, close to a linear regression model, 
where $K(X, X)$ is the covariance matrix built from the covariance function, which can 
be parametrized by a set of kernel parameters, $\theta$. Hence, $k(x, x')$ is written as 
$k(x, x' \mid \theta)$ to explicitly indicate the dependence on the kernel parameters.

3.1.1. Kernel function options
The kernel parameters are based on the signal standard deviation $\sigma_f$ 
and the characteristic length scale $\sigma_l$. The characteristic length scale 
defines how far apart the input values $x_i$ can be before the response values 
become uncorrelated. Both parameters must be greater than zero, which is 
enforced through the unconstrained parametrization $\theta_1 = \log \sigma_l$ and 
$\theta_2 = \log \sigma_f$.

The following four built-in kernel functions with the same length 
scale were considered:
• Rational quadratic kernel

$$k(x_i, x_j \mid \theta) = \sigma_f^2 \left(1 + \frac{r^2}{2 \alpha \sigma_l^2}\right)^{-\alpha}, \tag{3}$$

where $\sigma_l$ is the characteristic length scale, $\alpha$ is the positive-valued 
scale-mixture parameter, and $r = \sqrt{(x_i - x_j)^T (x_i - x_j)}$ is the 
Euclidean distance between $x_i$ and $x_j$.

Figure 2: Electricity spot price dynamic for the period 2000-2019

Source: XM information system.



• Squared exponential kernel

$$k(x_i, x_j \mid \theta) = \sigma_f^2 \exp\left[-\frac{1}{2} \frac{(x_i - x_j)^T (x_i - x_j)}{\sigma_l^2}\right], \tag{4}$$

where $\sigma_l$ is the characteristic length scale and $\sigma_f$ is the signal 
standard deviation.

• Matern 5/2

$$k(x_i, x_j \mid \theta) = \sigma_f^2 \left(1 + \frac{\sqrt{5}\, r}{\sigma_l} + \frac{5 r^2}{3 \sigma_l^2}\right) \exp\left(-\frac{\sqrt{5}\, r}{\sigma_l}\right) \tag{5}$$

• Exponential

$$k(x_i, x_j \mid \theta) = \sigma_f^2 \exp\left(-\frac{r}{\sigma_l}\right), \tag{6}$$

where $\sigma_l$ is the characteristic length scale and $r$ is the Euclidean 
distance between $x_i$ and $x_j$.
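The four kernels in equations (3)-(6) can be written as a minimal sketch in plain Python. The function names and argument order are illustrative choices, not taken from the paper or from any library; each function evaluates the kernel for a single pair of feature vectors.

```python
import math


def euclidean_distance(xi, xj):
    """Euclidean distance r between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xi, xj)))


def rational_quadratic(xi, xj, sigma_f, sigma_l, alpha):
    """Equation (3): rational quadratic kernel."""
    r = euclidean_distance(xi, xj)
    return sigma_f ** 2 * (1.0 + r ** 2 / (2.0 * alpha * sigma_l ** 2)) ** (-alpha)


def squared_exponential(xi, xj, sigma_f, sigma_l):
    """Equation (4): squared exponential kernel."""
    r = euclidean_distance(xi, xj)
    return sigma_f ** 2 * math.exp(-0.5 * r ** 2 / sigma_l ** 2)


def matern52(xi, xj, sigma_f, sigma_l):
    """Equation (5): Matern 5/2 kernel."""
    s = math.sqrt(5.0) * euclidean_distance(xi, xj) / sigma_l
    # s**2 / 3 equals 5 r^2 / (3 sigma_l^2)
    return sigma_f ** 2 * (1.0 + s + s ** 2 / 3.0) * math.exp(-s)


def exponential(xi, xj, sigma_f, sigma_l):
    """Equation (6): exponential kernel."""
    r = euclidean_distance(xi, xj)
    return sigma_f ** 2 * math.exp(-r / sigma_l)
```

All four functions return the signal variance $\sigma_f^2$ when the two inputs coincide and decay toward zero as the distance grows; they differ only in how fast the correlation falls off.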

3.1.2. Parameter estimation
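To estimate the parameters $\beta$, $\theta$, and $\sigma^2$ of a GPR model, the 
likelihood $P(y \mid X)$ must be maximized as a function of the parameters:

$$\hat{\beta}, \hat{\theta}, \hat{\sigma}^2 = \underset{\beta, \theta, \sigma^2}{\arg\max}\ \log P(y \mid X, \beta, \theta, \sigma^2). \tag{7}$$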

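Because $P(y \mid X, \beta, \theta, \sigma^2) = N(y \mid H\beta,\ K(X, X \mid \theta) + \sigma^2 I_n)$, the marginal 
log-likelihood function is as follows:

$$\log P(y \mid X, \beta, \theta, \sigma^2) = -\frac{1}{2} (y - H\beta)^T \left[K(X, X \mid \theta) + \sigma^2 I_n\right]^{-1} (y - H\beta) - \frac{n}{2} \log 2\pi - \frac{1}{2} \log \left|K(X, X \mid \theta) + \sigma^2 I_n\right|, \tag{8}$$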

where $H$ is the matrix of explicit basis functions and $K(X, X \mid \theta)$ 
is the covariance matrix. To estimate the parameters, first, 
$\hat{\beta}(\theta, \sigma^2)$ is determined, and this estimate is used to compute the 
$\beta$-profiled likelihood. Second, the $\beta$-profiled log-likelihood, given 
by $\log P(y \mid X, \hat{\beta}(\theta, \sigma^2), \theta, \sigma^2)$, is maximized 
over $\theta$ and $\sigma^2$ to find their estimates.
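The two-step estimation can be sketched with NumPy as follows. This is an illustration under stated assumptions, not the authors' implementation: the profiled coefficient $\hat{\beta}(\theta, \sigma^2)$ is computed with the standard generalized least-squares closed form (the paper does not spell out this step), the basis is a constant column, and the exponential kernel of equation (6) is used. All function names are hypothetical.

```python
import numpy as np


def exponential_kernel(XA, XB, sigma_f, sigma_l):
    """Exponential kernel of equation (6) between the rows of XA and XB."""
    diff = XA[:, None, :] - XB[None, :, :]
    r = np.sqrt((diff ** 2).sum(axis=-1))
    return sigma_f ** 2 * np.exp(-r / sigma_l)


def profiled_beta(H, A, y):
    """Generalized least-squares beta for fixed A = K + sigma^2 I (assumed closed form)."""
    Ainv_H = np.linalg.solve(A, H)
    Ainv_y = np.linalg.solve(A, y)
    return np.linalg.solve(H.T @ Ainv_H, H.T @ Ainv_y)


def log_marginal_likelihood(y, H, beta, K, sigma2):
    """Marginal log-likelihood of equation (8)."""
    n = y.shape[0]
    A = K + sigma2 * np.eye(n)
    resid = y - H @ beta
    _, logdet = np.linalg.slogdet(A)
    quad = resid @ np.linalg.solve(A, resid)
    return -0.5 * quad - 0.5 * n * np.log(2.0 * np.pi) - 0.5 * logdet


def profiled_log_likelihood(X, y, sigma_f, sigma_l, sigma2):
    """Beta-profiled log-likelihood for one candidate (theta, sigma^2)."""
    n = X.shape[0]
    H = np.ones((n, 1))  # constant basis function: an assumption
    K = exponential_kernel(X, X, sigma_f, sigma_l)
    beta_hat = profiled_beta(H, K + sigma2 * np.eye(n), y)
    return log_marginal_likelihood(y, H, beta_hat, K, sigma2), beta_hat
```

The outer step, maximizing `profiled_log_likelihood` over the kernel parameters and the noise variance, would be handed to a numerical optimizer and is not shown here.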

Finally, during the estimation process, principal component 
analysis (PCA) was applied to avoid multicollinearity and 
dimensionality problems.

3.1.3. Response variable forecast
To predict a value of the response variable $y_{new}$, given a new 
input vector $x_{new}$ and the training data, the density 
$P(y_{new} \mid y, X, x_{new})$ is defined through conditional probabilities:

$$P(y_{new} \mid y, X, x_{new}) = \frac{P(y_{new}, y \mid X, x_{new})}{P(y \mid X, x_{new})}. \tag{9}$$

To find the joint density in the numerator, it is necessary to 
introduce the latent variables $f_{new}$ and $f$ corresponding to $y_{new}$ and 
$y$, respectively. Thus, it is possible to use the joint distribution of 
$y_{new}$, $y$, $f_{new}$, and $f$ to compute (9). GP models assume that each 
response $y_i$ only depends on the corresponding latent variable $f_i$ and 
the feature vector $x_i$.

After the density $P(y_{new} \mid y, X, x_{new})$ is found, the expected value of 
the prediction $y_{new}$ at a new point $x_{new}$, given $y$, $X$, and the parameters 
$\beta$, $\theta$, and $\sigma^2$, is:

$$E(y_{new} \mid y, X, x_{new}, \beta, \theta, \sigma^2) = h(x_{new})^T \beta + K(x_{new}^T, X \mid \theta)\, \alpha, \tag{10}$$

where $\alpha = \left(K(X, X \mid \theta) + \sigma^2 I_n\right)^{-1} (y - H\beta)$.
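Equation (10) translates almost line by line into NumPy. The sketch below assumes a constant basis function and the exponential kernel of equation (6); both are illustrative choices, and `gpr_predict_mean` is a hypothetical name rather than a library call.

```python
import numpy as np


def exponential_kernel(XA, XB, sigma_f, sigma_l):
    """Exponential kernel of equation (6) between the rows of XA and XB."""
    diff = XA[:, None, :] - XB[None, :, :]
    r = np.sqrt((diff ** 2).sum(axis=-1))
    return sigma_f ** 2 * np.exp(-r / sigma_l)


def gpr_predict_mean(X, y, X_new, beta, sigma_f, sigma_l, sigma2):
    """Predictive mean of equation (10) with a constant basis (assumption)."""
    n = X.shape[0]
    H = np.ones((n, 1))
    H_new = np.ones((X_new.shape[0], 1))
    K = exponential_kernel(X, X, sigma_f, sigma_l)
    # alpha = (K + sigma^2 I)^{-1} (y - H beta)
    alpha = np.linalg.solve(K + sigma2 * np.eye(n), y - H @ beta)
    K_new = exponential_kernel(X_new, X, sigma_f, sigma_l)
    return H_new @ beta + K_new @ alpha
```

Two limiting cases help to read the formula: with almost no noise the prediction at a training input reproduces the observed response, and far away from every training input the kernel term vanishes, so the prediction falls back to the basis term $h(x_{new})^T \beta$.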

3.1.4. Performance indicators
To check the GPR model performance, different calibration metrics 
were used: root mean square error (RMSE), R-squared ($R^2$), 
and mean absolute error (MAE). These metrics are defined as follows:

• RMSE

$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}, \tag{11}$$

• $R^2$

$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}, \tag{12}$$

• MAE

$$MAE = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|, \tag{13}$$

where $n$ is the number of observations, $y_i$ is the $i$-th observed value, 
$\hat{y}_i$ is the $i$-th predicted value, and $\bar{y}$ is the mean of the observed 
values. For RMSE and MAE, lower values are desired, and for $R^2$, a value 
closer to one indicates better performance. Besides, the performance metrics 
of the estimated GPR model were compared with two supervised learning models: 
support vector machines (SVM) and tree-based methods. The 
performance metrics are reported in the results and discussion 
section.
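The three indicators in equations (11)-(13) can be computed with a few lines of plain Python. This is a minimal sketch with illustrative function names; it assumes the observed and predicted series have the same length and that the observed values are not all identical (otherwise the denominator of $R^2$ is zero).

```python
import math


def rmse(y, y_hat):
    """Root mean square error, equation (11)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y))


def r_squared(y, y_hat):
    """Coefficient of determination, equation (12)."""
    y_bar = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - y_bar) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


def mae(y, y_hat):
    """Mean absolute error, equation (13)."""
    return sum(abs(a - b) for a, b in zip(y, y_hat)) / len(y)
```

Because RMSE squares the errors before averaging, a single large miss (such as an unpredicted price spike) raises RMSE much more than MAE, which is why the two metrics can rank models differently.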

3.2. Quantile Regression Approach
Quantile regression is a semi-parametric approach with 
high flexibility that captures the stochastic relationship between 
variables, allows consistent estimation in non-Gaussian 
environments, and requires minimal distributional assumptions 
on the data generating process (Koenker, 2004; Ma and Koenker, 
2006; Uribe and Guillen, 2020). To describe the quantile regression 
model, a linear regression model was assumed, where the response 
variable $Y_{i,t}$ represents the electricity spot prices and is related to 
a set of explanatory variables or fundamentals in a matrix $X_{i,t}$. 
Following Koenker and Bassett (1978), the quantile regression 
model can be written as:



$$Q_q(Y_{i,t} \mid X_{i,t}) = X_{i,t}' \beta_i^q, \tag{17}$$

where $Y_{i,t}$ is a $(T \times 1)$ vector, with $T$ denoting the number of 
observations ($t = 1, 2, 3, \ldots, T$). Besides, the matrix $X_{i,t}'$, of dimensions 
$(T \times d)$, has $(d - 1)$ predictors and also includes a constant, and $\beta_i^q$ is 
a $(d \times 1)$ vector of unknown parameters for each quantile $q$, $q \in (0, 1)$. 
The regression coefficients $\hat{\beta}_i^q$ of the quantile $q$ were estimated as 
a solution to the following minimization problem:

 
$$\min_{\beta_i^q} \frac{1}{T} \sum_{t=1}^{T} \left(q - I(Y_{i,t} < X_{i,t}' \beta_i^q)\right) \left[Y_{i,t} - X_{i,t}' \beta_i^q\right], \tag{18}$$

where

$$I(Y_{i,t} < X_{i,t}' \beta_i^q) = \begin{cases} 1, & Y_{i,t} < X_{i,t}' \beta_i^q, \\ 0, & \text{otherwise}. \end{cases} \tag{19}$$

$Y_{i,t}$ is defined as in equation (17), and the model must be estimated in 
separate regressions for each $i$, $i = 1, \ldots, N$. According to Mosquera-López 
et al. (2017b) and Uribe and Guillen (2020), quantile regression generalizes 
the least absolute deviation (LAD) estimator, which corresponds to the 
median ($q = 0.5$), and allows robust estimation when the data present heavy 
tails, as electricity spot prices do.
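The objective in equation (18) is often called the check (or pinball) loss. As a minimal sketch, the plain-Python functions below evaluate that loss and, for the simplest possible model (a constant and no predictors), pick the candidate value that minimizes it. The names `check_loss` and `best_constant` are illustrative, and the grid search stands in for the linear-programming solvers normally used to fit a full quantile regression.

```python
def check_loss(q, y, fitted):
    """Average check loss of equation (18) for quantile q."""
    total = 0.0
    for y_t, f_t in zip(y, fitted):
        u = y_t - f_t
        indicator = 1.0 if u < 0 else 0.0  # equation (19)
        total += (q - indicator) * u
    return total / len(y)


def best_constant(q, y, candidates):
    """Candidate constant with the smallest check loss (intercept-only model)."""
    return min(candidates, key=lambda c: check_loss(q, y, [c] * len(y)))
```

For the sample `[1, 2, 3, 4, 100]`, the minimizer at `q = 0.5` is the median, 3, regardless of how extreme the last observation is, which illustrates the robustness to heavy tails mentioned above; at `q = 0.5` the check loss is simply half the mean absolute error.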

4. DATA

The fundamentals of the spot price are determined by the generation 
technologies. For example, in Central and South America, 
generation is based principally on hydroelectric and thermal 
power sources. In these cases, different studies have described the 
following determinants: demand, hydrological changes, fossil fuel 
price variation, investment decision making, the structure of the 
transmission system, and agent strategies (Barria and Rudnick, 
2011; Barrientos-Marín and Toro-Martínez, 2017; Blazsek and 
Hernández, 2018; Samudio-Carter et al., 2019; Vaca et al., 2019; 
Xavier et al., 2016). Therefore, the first database contained 
variables such as (i) total demand: real, commercial, and National 
Interconnected System (NIS); (ii) reservoir levels: daily volume 
in percentage and generation capacity; (iii) climatological factors 
such as the quantity of water that feeds the reservoirs; and (iv) fossil 
fuel consumption: gas, coal, fuel oil, and kerosene. On the other hand, 
variables such as the bilateral bidding price, electricity imports/exports, 
or the price regulatory policies were not selected because the spot 
price is contained in their structures or because missing observations 
were identified.

From the variables described, correlation analysis was then used 
to select the spot price determinants. Besides, 
considering the generation capacity (Figure 1), the volume 
of water available in the reservoirs and the consumption of 
fossil fuels from two of the most important sources, gas and 
coal, were selected. Also, NIS demand was chosen because this 
variable is calculated based on the net generation of the plants. 
These variables were chosen because they allow structuring a 
parsimonious model that describes a classic supply 
and demand model. The dataset applied in this research represents 
the market structure and seeks to explain the spot price dynamics. 
Table 1 shows the variables, specifying data source and units.

In summary, the database is a balanced panel composed of daily 
data that starts in August 2009 and ends in December 2019. The 
period was chosen because data are available with 
no methodological changes and because it includes the current supply 
scheme for the generation sector (CREG 051 of 2009, article 10). 
Likewise, 2020 data were not selected because regulated and non-regulated 
demand decreased by 4.2% and 12.9%, respectively, 
during the first quarter due to the SARS-CoV-2 (COVID-19) 
pandemic (Vidal et al., 2020).2

Table 2 reports summary statistics and the unit root test (augmented 
Dickey-Fuller, ADF) of the variables, and Figure 3 describes 
their dynamics during the sample period. The spot price presents 
high variability and dispersion, especially in the last quartile, 
due to ENSO effects during 2015 and 2016, when the price 
increased to 1943 COP$/kWh. The demand, in turn, shows a 
growing trend and a correlation of 0.26 with the price, 
which is positive and weak even though demand is a significant 
price determinant. Regarding water volume, high variability due to 
seasonal patterns and a negative correlation with price were observed. 
Likewise, gas and coal are sources used to supply the 
demand when the hydropower system presents any limitation. 
Hence, these variables have a high dispersion in the last quartiles 

2. COP is the representative sign of the Colombian peso.

Table 1: Data description

| Variable | Description | Units | Source |
|---|---|---|---|
| Spot price | Daily electricity spot price | COP$/kWh (see footnote 2) | XM information system |
| Demand | Total demand with energy losses | MW/h | XM information system |
| Water volume | Reservoirs capacity for hydropower generation | Percentage or GW/h | XM information system |
| Gas | Gas quantity consumption | MBTU | XM information system |
| Coal | Coal quantity consumption | MBTU | XM information system |

Source: Authors' construction

Table 2: Summary and ADF test for selected variables

| Statistical parameters | Spot price | Demand | Water volume (GW/h) | Gas | Coal |
|---|---|---|---|---|---|
| Mean | 184.47 | 173571 | 10527 | 231712 | 121932 |
| Std. Dev. | 166.96 | 18334.57 | 2122.958 | 94968.62 | 71163.86 |
| Minimum | 35.36 | 115438 | 5777 | 63336 | 0 |
| 25th percentile | 97.93 | 160645 | 9022 | 155753 | 61913 |
| 50th percentile | 146.88 | 173729 | 10712 | 211287 | 117501 |
| 75th percentile | 194.68 | 188247 | 12303 | 289651 | 174210 |
| Maximum | 1942.69 | 217021 | 14502 | 543258 | 356137 |
| Spot price correlation | - | 0.23 | −0.26 | 0.45 | 0.60 |
| t-ADF | −4.63*** | −8.53*** | −3.70** | −4.10*** | −6.06*** |

** and *** indicate that the null hypothesis of a unit root is rejected at the 5% and 1% level, 
respectively.

Source: Authors’ analysis



and a significant and positive correlation with the price. Finally, 
the ADF test was computed, and the results show evidence against 
the presence of a unit root in the variables at the 1% and 5% 
significance levels. Therefore, the variables do not require a 
stationarity transformation before the estimation.

4.1. Determining, Training and Testing Set for 
Machine Learning Approach
Figure 4 summarizes the machine learning methodology applied to 
the variable set described. First, the dataset was imported from the XM 
Information System, explored, and processed to compute descriptive 
statistics and identify its characteristics. In general, the variables 
were not transformed, except for the spot prices, due to the outliers observed 
during the 2015-2016 period. Spot price outliers were replaced using 
the Piecewise Cubic Hermite Interpolating Polynomial (PCHIP) to 
avoid their effects on the prediction process and possible overfitting.

Second, a training set is used to train the model, a validation 
set is used to evaluate how the model performs through 
the performance indicators, and a final test set is used to confirm 
the model specification and identify overfitting. Therefore, the hold-out 
method was used to divide the dataset into three parts: train 
(65%), validation (15%), and test (20%) sets3. At this stage, 

3 For train, validation, and test sets, the period August 2009-July 
2019 was used.

the response variable and its predictors were defined. 
The GPR model described in equation (2) can be written 
in vector form:

$$P(y \mid f, X) \sim N\left(y \mid H\beta + f,\ \sigma^2 I\right), \tag{20}$$

where the response variable $y$ is the spot price and the matrix $X$ 
contains the fundamentals: demand, water volume, and gas and coal 
consumption.

Third, the best models were identified through performance 
indicators and the prediction of the daily spot price for the period 
August 2019-December 2019 was implemented.
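The 65/15/20 hold-out partition described in this subsection can be sketched as follows. The paper does not state whether the observations were shuffled before splitting, so the random shuffle, the seed, and the function name `hold_out_split` are assumptions made only for illustration.

```python
import random


def hold_out_split(records, train=0.65, validation=0.15, seed=0):
    """Split records into train, validation, and test lists (hold-out method)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # random assignment: an assumption
    n_train = round(train * len(shuffled))
    n_val = round(validation * len(shuffled))
    train_set = shuffled[:n_train]
    validation_set = shuffled[n_train:n_train + n_val]
    test_set = shuffled[n_train + n_val:]  # remaining share, here 20%
    return train_set, validation_set, test_set
```

Whatever assignment rule is used, the three sets must not overlap, and the August 2019-December 2019 observations used for the final prediction stay outside all of them.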

4.2. Determining Quantile Regression Model
Based on equation (17), the linear quantile regression model 
can be written as a function of the response variable and their 
predictors:

$$Q_q(P_{i,t}) = \beta_{i,1}^q + \beta_{i,2}^q D_t + \beta_{i,3}^q W_t + \beta_{i,4}^q C_t, \tag{21}$$

where $P_{i,t}$ is the response variable (spot price), $D_t$ is the demand, 
$W_t$ is the water volume, and $C_t$ is the total gas and coal consumption. 
For estimating the quantile regression model, the period August 

Figure 4: Summary for machine learning methodology

Source: Authors’ analysis.

Figure 3: Evolution of fundamental variables for August 2009-December 2019. (a) The National Interconnected System (NIS) demand in MW/h; 
(b) water volume or reservoir capacity in GW/h; (c) Gas consumption for generating in MBTU; (d) Coal consumption for generating in MBTU.

Source: Author’s construction.




2009-December 2019 was used, and natural logarithms of the variables were 
taken so that the coefficients can be interpreted as elasticities.
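As a short sketch of why logarithms yield elasticities, and assuming (the text does not say so explicitly) that the logarithm is applied to the price and to every predictor, equation (21) becomes a log-log model whose slope coefficients measure the percentage response of the $q$-th conditional price quantile to a 1% change in a fundamental:

```latex
Q_q(\ln P_{i,t}) = \beta_{i,1}^{q} + \beta_{i,2}^{q} \ln D_t + \beta_{i,3}^{q} \ln W_t + \beta_{i,4}^{q} \ln C_t
\quad \Longrightarrow \quad
\frac{\partial\, Q_q(\ln P_{i,t})}{\partial \ln D_t} = \beta_{i,2}^{q}.
```

For instance, a hypothetical coefficient $\beta_{i,2}^{q} = 0.5$ would mean that a 1% increase in demand is associated with an increase of about 0.5% in the $q$-th quantile of the spot price, holding water volume and fuel consumption fixed. Because the coefficient is indexed by $q$, the elasticity is allowed to differ between low-price and high-price regimes.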

5. EMPIRICAL RESULTS AND DISCUSSION

The main findings are presented below for the variables and 
timespan selected. First, the machine learning training results 
and performance metrics are described, followed by the daily spot 
price prediction. Second, the results of the 
quantile regression analysis are described to identify the effects 
of the main determinants on the spot price.

5.1. Machine Learning Results
The performance of the GPR model fit is assessed using the 
RMSE, R2, and MAE metrics4. Besides, the GPR model was 
compared with the support vector machine, a supervised learning 
method used for regression and 
classification. This method is based on determining hyperplanes 
that maximize the margin between classes (Gao et al., 2008). The 
following SVM kernels were used:
•	 Quadratic
•	 Cubic
•	 Gaussian: fine, medium, and coarse.

On the other hand, tree-based methods were considered because 
they are fast to fit and predict, use little memory, and are easy 
to interpret. The regression tree models used were fine, medium, 
and coarse.

Besides, PCA was applied during the training process. The 
models were estimated with the first two 
principal components because these components explained 98% of the 
variation of the selected determinants.
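The selection of principal components by explained variance can be sketched with NumPy. The paper does not say whether the predictors were standardized before PCA; the sketch assumes they were, since the fundamentals are measured in very different units. The function names are illustrative.

```python
import numpy as np


def pca_explained_variance(X):
    """Explained-variance ratios and component scores via SVD."""
    # Standardize each column (assumption: not stated in the paper)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    ratios = s ** 2 / np.sum(s ** 2)
    scores = Z @ Vt.T  # columns are the principal component scores
    return ratios, scores


def n_components_for(ratios, threshold):
    """Smallest number of leading components whose cumulative share reaches threshold."""
    return int(np.searchsorted(np.cumsum(ratios), threshold) + 1)
```

With four predictors (demand, water volume, gas, and coal), keeping two components that explain 98% of the variance means the models are trained on `scores[:, :2]` instead of the original, correlated columns.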

Tables 3-5 report the performance metrics for the different fitted 
models and the kernels selected. Based on all performance metrics, 
the results show that the GPR Exponential model performs best. In 
general, good performance was observed for the GPR models 
because the metrics for the three sets were similar, in contrast 
to the SVM models, which present a significant difference in the RMSE 
between the train set and the other two sets. This suggests 
possible overfitting in the SVM models. 
However, the SVM models presented similar MAE values in the 
validation and test sets, which could suggest 
that they still have good predictive ability. 
Some differences were also observed in the tree regression metrics, although the 
Medium and Coarse models presented similar RMSE and MAE values 
across the train, validation, and test sets. Finally, the R2 shows the 
percentage of the variation of the dependent variable that is explained by the 
model, but some of these models are not linear, so the use of this 
indicator may be subject to criticism (Díaz et al., 2019).

According to Barrientos-Marín and Toro-Martínez (2017), another 
performance indicator is the mean absolute percentage error 
(MAPE). This metric describes the relative absolute deviation in 

4  RMSE and MAE metrics value in COP$/MWh.

per unit value. For each of the GPR models, the MAPE is 21%; 
for the SVM models, the lowest MAPE is 21%, obtained with the fine and medium 
Gaussian kernels on the test dataset. Likewise, the Coarse 
regression tree has a MAPE of 22%.
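The paper does not print the MAPE formula; the sketch below assumes the usual definition, the average of the absolute errors divided by the observed values, expressed as a percentage. It requires every observed value to be nonzero, which holds for the spot price series (its minimum in Table 2 is 35.36).

```python
def mape(y, y_hat):
    """Mean absolute percentage error, in percent (usual definition, assumed)."""
    return 100.0 * sum(abs((a - b) / a) for a, b in zip(y, y_hat)) / len(y)
```

As a worked example, observed prices of 100 and 200 with predictions of 110 and 180 both miss by 10%, so `mape([100, 200], [110, 180])` is 10. Unlike RMSE and MAE, this metric is unit-free, which makes values such as the 21% reported above comparable across price levels.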

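Table 4: Metrics performance for SVM models

| Kernel | Model’s stage | RMSE | R2 | MAE |
|---|---|---|---|---|
| Quadratic | Train | 58.19 | 0.67 | 39.55 |
| Quadratic | Validation | 54.66 | 0.71 | 37.13 |
| Quadratic | Test | 54.26 | 0.67 | 36.54 |
| Cubic | Train | 53.72 | 0.72 | 36.33 |
| Cubic | Validation | 48.82 | 0.76 | 33.50 |
| Cubic | Test | 50.32 | 0.72 | 34.29 |
| Gaussian fine | Train | 49.12 | 0.76 | 30.38 |
| Gaussian fine | Validation | 45.07 | 0.80 | 29.76 |
| Gaussian fine | Test | 44.94 | 0.78 | 28.82 |
| Gaussian medium | Train | 51.09 | 0.74 | 33.67 |
| Gaussian medium | Validation | 45.87 | 0.79 | 31.14 |
| Gaussian medium | Test | 46.71 | 0.76 | 30.95 |
| Gaussian coarse | Train | 61.33 | 0.63 | 39.78 |
| Gaussian coarse | Validation | 59.12 | 0.65 | 38.26 |
| Gaussian coarse | Test | 56.96 | 0.64 | 36.39 |

Source: Authors’ analysis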

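Table 3: Metrics performance for GPR models

| Kernel | Model’s stage | RMSE | R2 | MAE |
|---|---|---|---|---|
| Rational quadratic | Train | 47.57 | 0.78 | 32.21 |
| Rational quadratic | Validation | 43.42 | 0.81 | 30.11 |
| Rational quadratic | Test | 43.40 | 0.79 | 29.48 |
| Squared exponential | Train | 48.20 | 0.77 | 32.21 |
| Squared exponential | Validation | 43.52 | 0.81 | 30.16 |
| Squared exponential | Test | 43.51 | 0.79 | 29.55 |
| Matern 5/2 | Train | 47.82 | 0.78 | 32.40 |
| Matern 5/2 | Validation | 43.43 | 0.81 | 30.10 |
| Matern 5/2 | Test | 43.22 | 0.79 | 29.39 |
| Exponential | Train | 44.45 | 0.81 | 30.16 |
| Exponential | Validation | 43.17 | 0.82 | 29.79 |
| Exponential | Test | 42.55 | 0.80 | 28.94 |

Source: Authors’ analysis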

Table 5: Metrics performance for tree regressions

| Model | Model’s stage | RMSE | R2 | MAE |
|---|---|---|---|---|
| Fine | Train | 36.32 | 0.87 | 22.96 |
| Fine | Validation | 51.97 | 0.73 | 34.76 |
| Fine | Test | 50.37 | 0.72 | 33.40 |
| Medium | Train | 43.24 | 0.82 | 28.66 |
| Medium | Validation | 43.69 | 0.79 | 31.17 |
| Medium | Test | 44.76 | 0.78 | 30.82 |
| Coarse | Train | 46.62 | 0.79 | 31.19 |
| Coarse | Validation | 43.69 | 0.81 | 29.65 |
| Coarse | Test | 45.99 | 0.77 | 30.76 |

Source: Authors’ analysis



In summary, the GPR models showed the best performance metrics,
especially the GPR model with the exponential kernel. These
models provide predictions for a given spectrum and a predictive
distribution that allows computing its first two moments, reported
here as the mean and the standard deviation. Likewise, the kernels
can provide rankings of the input variables or an estimate of the
variance of the data noise. Hence, the GPR models offer an
alternative for analyzing a variable that presents mean reversion,
spikes, and seasonal patterns.
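
The predictive mean and standard deviation mentioned above can be illustrated with a minimal one-dimensional sketch. It assumes a zero-mean Gaussian process with an exponential kernel of the form sigma_f^2 * exp(-|a - b| / length); the hyperparameter values, the noise level, and the toy data are hypothetical and are not fitted, so this is not the authors' trained model.

```python
import math

def exp_kernel(a, b, sigma_f=1.0, length=1.0):
    """Exponential kernel for scalar inputs (hyperparameters are assumed)."""
    return sigma_f ** 2 * math.exp(-abs(a - b) / length)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def solve_lower(L, b):
    """Solve L z = b by forward substitution."""
    z = []
    for i in range(len(b)):
        z.append((b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i])
    return z

def solve_upper_t(L, z):
    """Solve L^T x = z by back substitution."""
    n = len(z)
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(L[k][i] * x[k] for k in range(i + 1, n))
        x[i] = (z[i] - s) / L[i][i]
    return x

def gpr_predict(x_train, y_train, x_new, noise_sd=0.1):
    """Posterior mean and standard deviation of the latent function at x_new."""
    n = len(x_train)
    K = [[exp_kernel(x_train[i], x_train[j]) + (noise_sd ** 2 if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    L = cholesky(K)
    alpha = solve_upper_t(L, solve_lower(L, y_train))  # (K + noise) ^ -1 y
    k_star = [exp_kernel(x_new, xi) for xi in x_train]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    var = exp_kernel(x_new, x_new) - sum(vi * vi for vi in v)
    return mean, math.sqrt(max(var, 0.0))

# Toy data, for illustration only.
x_train = [0.0, 1.0, 2.0, 3.0]
y_train = [0.1, 0.8, 0.9, 0.2]
mean, sd = gpr_predict(x_train, y_train, 1.5)
lower, upper = mean - 1.96 * sd, mean + 1.96 * sd  # approximate 95% band
print(f"mean={mean:.3f}, sd={sd:.3f}, band=({lower:.3f}, {upper:.3f})")
```

The band computed here covers the latent function only; a prediction interval for a noisy observation would also add the noise variance before taking the square root.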

Therefore, the daily prediction for the period from August 2019 to
December 2019 was computed with this model (Figure 5). The spot
price dynamics generated by the selected predictors show that the
model gives a good approximation for lower prices, i.e., under
250 COP$/kWh. However, the prediction does not reach the true
value for high prices, especially from September to November.
Barrientos-Marín and Toro-Martínez (2017) described, for the
Spanish market, an asymmetric response between high and low
prices. When the price is high, the model does not expect prices
to go higher. Likewise, when the price is low, the model is not
confident that prices will go lower. The authors explained that
their model could capture the behavior of agents, who submit
low-price bids to compete. Nevertheless, Weron (2014) and Ziel
(2016) noted that there is no standard structure for electricity
markets. Hence, the performance metrics of machine learning
approaches cannot be compared directly across markets.

By contrast, the average spot price during July 2019 was 123.57
COP$/kWh, and during October 2019 it reached an average of 390.4
COP$/kWh. A reduction in hydric sources during August and
September could explain this sharp price increase; however, the
water reservoir levels were 74% and 67% in August and September,
respectively. Besides, the water reservoir level in October and
November was approximately 69%. Therefore, a generation
concentration index or an oligopolistic indicator must be
considered, because hydropower generators tend to speculate when
the water sources decrease and, thus, increase the price in the
following months (Aggarwal et al., 2009; Zhang and Luh, 2005).

5.2. Quantile Regression Results
The quantile regression model was estimated with the complete
sample, August 2009-December 2019. The response variable was not
adjusted for outliers because quantile regression models are
robust to such observations and because, according to Uribe and
Guillen (2020), financial time series present crises and booms
with extreme high or low observations.

Figure 6 plots the spot price quantiles against the corresponding
fraction of the data. A low spot price was observed for the lower
quantiles, approximately equal to 35 COP$/kWh, and around 147
COP$/kWh for the median price. From the lower to the higher
quantiles, a smooth increase was identified; however, after the
85% quantile, the price rises sharply, which is related to
exogenous effects during 2015 and 2016.
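
A quantile curve of this kind can be obtained directly from the price series. The sketch below uses the Python standard library on a short hypothetical series; the values are chosen only to mimic a long upper tail and are not the study's data, and the interpolation rule (`method="inclusive"`) is an assumption.

```python
import statistics

# Hypothetical daily spot prices (COP$/kWh), for illustration only.
prices = [35.0, 60.0, 90.0, 120.0, 147.0, 160.0, 190.0, 250.0, 400.0, 1100.0]

# With n=10, statistics.quantiles returns the nine cut points that
# correspond to the 10%, 20%, ..., 90% quantiles.
deciles = statistics.quantiles(prices, n=10, method="inclusive")

for level, value in zip(range(10, 100, 10), deciles):
    print(f"{level}% quantile: {value:.1f} COP$/kWh")
```

Plotting the quantile level against these cut points gives the same type of curve as Figure 6: nearly linear in the center and steep in the upper tail when a few extreme prices are present.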

The linear model described in (21) was estimated for different
percentiles of the distribution of electricity prices, i.e., from
the 10th to the 90th percentile. Furthermore, gas and coal
consumption were added to analyze fossil fuel consumption,
because these two sources represent around 22% of total
generation capacity. The main results are summarized in Figure 7,
and the quantile regression coefficients are presented in
Appendix A.
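
Quantile regression obtains the coefficients for each quantile level tau by minimizing the check (pinball) loss (Koenker and Bassett, 1978). The sketch below illustrates that loss for the simplest case, an intercept-only model, where the minimizer is a sample quantile. It is not an estimation of model (21); the function names and the data are hypothetical.

```python
def check_loss(residual, tau):
    """Check (pinball) loss: tau * u if u >= 0, (tau - 1) * u if u < 0."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))

def total_loss(data, q, tau):
    """Sum of check losses when every observation is predicted by q."""
    return sum(check_loss(y - q, tau) for y in data)

def fit_constant_quantile(data, tau):
    """Intercept-only quantile regression.

    The total loss is piecewise linear and convex with kinks at the
    observations, so a minimizer can be found among the observed values.
    """
    return min(sorted(data), key=lambda q: total_loss(data, q, tau))

# Hypothetical spot prices (COP$/kWh), for illustration only.
prices = [35.0, 60.0, 90.0, 120.0, 147.0, 160.0, 190.0, 250.0, 400.0]
for tau in (0.1, 0.5, 0.9):
    print(f"tau={tau}: fitted constant = {fit_constant_quantile(prices, tau)}")
```

With predictors, the same loss is minimized over the coefficient vector, usually by linear programming, which is why a separate set of coefficients is obtained for each percentile.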

Figure 5: Electricity spot price daily prediction for August 2019-December 2019. The continuous blue line is the real spot price for the sample. The
dotted red line is the spot price prediction, and the dotted black lines are the prediction intervals.

Source: authors’ analysis.

Figure 6: Quantiles for electricity spot price for August 2009-December 2019.

Source: authors’ analysis.

5.2.1. Effects of the determinant variables of the electricity
spot price for different percentiles
5.2.1.1. Demand effects
The sensitivity to changes in demand is positive and statistically
significant, but its effect varies across the spot price
quantiles. In the 10th percentile, where the price is low, demand
has a high impact, e.g., for a demand variation of 1%, the price
variation is approximately 2%. However, from the 20th to the 50th
percentile, the demand impact decreases significantly. For prices
between the 60th and the 80th percentile, the demand effect tends
to stabilize, but for prices over the 80th percentile, the effect
associated with variation in demand is lower: with a demand
variation of 1%, the price variation is approximately 1.4%. Given
the inverse relationship between prices and demand, its impact is
smaller on the high quantiles.
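
To make the elasticity reading concrete, the sketch below converts a coefficient into a percentage price response. It assumes that model (21) is specified in logarithms (log-log), so that each coefficient is an elasticity; that specification is an assumption here, and the function name is illustrative. The demand coefficients are copied from Table A.I.

```python
# Demand coefficients by quantile level, copied from Table A.I.
DEMAND_COEF = {0.1: 1.988, 0.2: 1.941, 0.3: 1.719, 0.4: 1.657, 0.5: 1.491,
               0.6: 1.478, 0.7: 1.479, 0.8: 1.439, 0.9: 1.355}

def price_response_pct(elasticity, change_pct):
    """Exact percentage price change under an assumed log-log model:
    price ratio = (1 + change) ** elasticity."""
    return ((1.0 + change_pct / 100.0) ** elasticity - 1.0) * 100.0

for tau, beta in DEMAND_COEF.items():
    response = price_response_pct(beta, 1.0)
    print(f"tau={tau:.1f}: +1% demand -> {response:+.2f}% price")
```

For small changes the response is close to the coefficient itself, which is why a coefficient of 1.988 is read as a price variation of approximately 2% for a 1% demand variation.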

According to Barrientos et al. (2012), Barrientos-Marín and
Toro-Martínez (2017), and García et al. (2011), demand is one
of the most relevant spot price determinants. It is concluded that
a positive demand shock produces a positive trend in the future
price; however, the effect is higher in the short term. Besides,
the price captures the complex effects of supply and demand
activity through the influence of the operational determinants:
technological and organizational (Díaz-Contreras et al., 2014;
Girish and Vijayalakshmi, 2013).

Figure 7: Fundamental variables effects on the electricity spot price for different percentiles. The vertical axis in each subplot corresponds to the
spot price response to the predictors, while the horizontal axis corresponds to quantiles, from the 10th to the 90th percentile. The dotted black
lines represent the quantile regression coefficients and the gray area is the 95% confidence interval. The continuous red line is the linear regression
coefficient estimated by OLS, and the discontinuous red lines are the 95% confidence intervals. The variables are defined as follows: (a) effects
of the intercept; (b) effects of the demand; (c) effects of the water volume or reservoir capacity; (d) effects of the gas and coal consumption for
generating

Source: Authors’ analysis.

5.2.1.2. Water volume effects
The elasticity of the water volume is negative and statistically
significant across all quantiles, i.e., a high water reservoir
capacity always leads to a reduction in electricity prices. An
increased impact of water volume sensitivity was observed on the
lower and higher quantiles. In the first quantile, the price is
low because of a high water volume; a water volume variation of
1% causes a price variation equal to −0.41%. In the last
quantiles, the impact is higher because water volume becomes the
most important source and an alternative to reduce the spot price
when the thermal plants are on. In the 20th-70th percentiles, the
effects measured by quantile regression are similar to the median
effects.

Hydraulic technology presents lower generation costs than thermal
technology. However, hydric sources introduce high uncertainty
for energy and market reliability. Given the seasonal patterns in
hydric sources, the electricity spot prices are lower in the rainy
season and higher in the dry season (García et al., 2011).
According to Barrientos-Marín and Toro-Martínez (2017), a positive
effect on the available hydric capacity has a negative effect on
the real price. Likewise, hydropower generation depends on the
future (unobservable) situation; hence, this sector tries to
influence the spot prices.

5.2.1.3. Fossil fuel consumption effects
Positive and significant elasticities were observed for fossil fuel
consumption. Around the 10th percentile, the effects are minor, but
for prices over the 40th percentile, the effects become higher.
This means that the thermal plants must turn on because of a
decrease in water volume or an increase in demand and, as a
result, the generation costs and spot prices increase. In the 90th
percentile, the price variation is approximately 1.25% when the
fossil fuel consumption varies by 1%.

According to Mosquera-López et al. (2017a), when the thermal
generation plants are on, they present marginal costs up to 300%
higher than hydropower plants. Therefore, the marginal generation
costs show a relevant difference between the two most important
generation technologies, which explains the price fluctuations.

6. CONCLUSIONS

Considering the Colombian power generation market structure,
where hydropower generation is the most relevant source, followed
by thermal power technology, a set of market fundamentals was
validated through a price prediction using a trained machine
learning model. Besides, by using quantile regression, the
non-linear effects of these variables on the spot price were
measured. In the sensitivity analyses for the different variables
across the price distribution, it was observed how the demand,
the water reservoir capacity, and the fossil fuel consumption
influence the price.

Therefore, positive changes were observed in the spot price
through demand variations. When the electricity consumption
increases, all generation technologies must produce to meet
demand. However, if the demand is not covered, the thermal power
generation plants must turn on, affecting the price. By contrast,
the elasticity of the water volume or reservoir capacity is
negative, with increased impact on lower and higher quantiles.
That is, seasonal patterns of reservoirs cause strong price
fluctuations, e.g., each rainy season, the spot price decreases
significantly. An important aspect is the generation sector’s
influence on the price through speculation on future water
volume; for this reason, a fundamental that captures the
oligopoly structure must be added.

Positive elasticities were found for fossil fuel consumption. Gas
and coal increased the price significantly in the last quantiles.
Exogenous effects, such as dry seasons or demand changes,
increase the spot price through generation costs.

Therefore, this study has described how the magnitude of changes
in the fundamental variables of a hydrothermal power market
explains the electricity spot price. The effect of reservoir
changes represents the main risk factor for generators. Besides,
the generation sector faces risk from fossil fuel price
fluctuations; hence, generators cannot recover these costs
through electricity price increases. Likewise, this study
identified the importance of renewable energy sources, because
they can smooth price volatility and prevent extreme price
changes caused by exogenous effects.

Finally, improving the model prediction will require the
inclusion of a generation concentration index or agent
strategies. However, the model can serve as a point of reference,
given the characteristics of the hydrothermal generation sector
and the exogenous factors that explain the electricity price
dynamics.

REFERENCES

Aggarwal, S.K., Saini, L.M., Kumar, A. (2009), Electricity price 
forecasting in deregulated markets: A review and evaluation. 
International Journal of Electrical Power and Energy Systems, 
31(1), 13-22.

Aye, S.A., Heyns, P.S. (2017), An integrated Gaussian process regression 
for prediction of remaining useful life of slow speed bearings based 
on acoustic emission. Mechanical Systems and Signal Processing, 
84, 485-498.

Barria, C., Rudnick, H. (2011), Investment under uncertainty in power 
generation: Integrated electricity prices modeling and real options 
approach. IEEE Latin America Transactions, 9(5), 785-792.

Barrientos, J., Rodas, E., Velilla, E. (2012), Modelo para el pronóstico del 
precio de la energía eléctrica en Colombia. Lecturas de Economía, 
77, 91-127.

Barrientos-Marín, J., Toro-Martínez, M. (2017), Análisis de los 
fundamentales del precio de la energía eléctrica: Evidencia empírica 
para Colombia. Revista de Economía del Caribe, 19, 34-63.

Berrie, T.W., Hoyle, M. (1985), Treating energy as a commodity. Energy 
Policy, 13(6), 506-510.

Blazsek, S., Hernández, H. (2018), Analysis of electricity prices for 
Central American countries using dynamic conditional score models. 
Empirical Economics, 55(4), 1807-1848.

Botero-Duque, J.P., García, J.J., Velásquez, H. (2016), Efectos del cargo 
por confiabilidad sobre el precio spot de la energía eléctrica en 
Colombia. Cuadernos de Economía, 35(68), 491-519.

Castaño, E., Sierra, J. (2012), Sobre la existencia de una raíz unitaria en 
la serie de tiempo mensual del precio de la electricidad en Colombia. 
Lecturas de Economía, 76, 259-291.



Castelli, M., Groznik, A., Popovič, A. (2020), Forecasting electricity 
prices: A machine learning approach. Algorithms, 13(5), 119.

Ciarreta, A., Lagullón, M., Zarraga, A. (2011), Modelación de los 
precios en el mercado eléctrico español. Cuadernos de Economía, 
30, 227-250.

Cotia, B.P., Borges, C.L.T., Diniz, A.L. (2019), Optimization of wind 
power generation to minimize operation costs in the daily scheduling 
of hydrothermal systems. International Journal of Electrical Power 
and Energy Systems, 113, 539-548.

Crespo-Cuaresma, J., Hlouskova, J., Kossmeier, S., Obersteiner, M. 
(2004), Forecasting electricity spot-prices using linear univariate 
time-series models. Applied Energy, 77(1), 87-106.

Deng, S., Oren, S. (2006), Electricity derivatives and risk management. 
Energy, 31(6-7), 940-953.

Díaz, G., Coto, J., Gómez-Aleixandre, J. (2019), Prediction and explanation 
of the formation of the Spanish day-ahead electricity price through 
machine learning regression. Applied Energy, 239, 610-625.

Díaz-Contreras, J.A., Macías-Villalba, G.I., Luna-González, E. (2014), 
Estrategia de cobertura con productos derivados para el mercado 
energético colombiano. Estudios Gerenciales, 30(130), 55-64.

Fernández-Blanco, R., Kavvadias, K., Hidalgo González, I. (2017), 
Quantifying the water-power linkage on hydrothermal power 
systems: A Greek case study. Applied Energy, 203, 240-253.

Gao, C., Bompard, E., Napoli, R., Wan, Q., Zhou, J. (2008), Bidding 
strategy with forecast technology based on support vector machine 
in the electricity market. Physica A: Statistical Mechanics and Its 
Applications, 387(15), 3874-3881.

García, J., Gaviria, A., Salazar, L. (2011), Determinantes del precio de la 
energía eléctrica en el mercado no regulado en Colombia. Ciencias 
Estratégicas, 19, 225-246.

Girish, G.P., Vijayalakshmi, S. (2013), Determinants of electricity price 
in competitive power market. International Journal of Business and 
Management, 8(21), 70-75.

Gonzalez-Briones, A., Hernandez, G., Corchado, J.M., Omatu, S., 
Mohamad, M.S. (2019), Machine Learning models for electricity 
consumption forecasting: A review. In: 2019 2nd International 
Conference on Computer Applications and Information Security 
(ICCAIS). p1-6.

He, Y.X., Zhang, S.L., Yang, L.Y., Wang, Y.J., Wang, J. (2010), Economic 
analysis of coal price-electricity price adjustment in China based on 
the CGE model. Energy Policy, 38(11), 6629-6637.

Huisman, R., Mahieu, R. (2003), Regime jumps in electricity prices. 
Energy Economics, 10, 425-434.

Imani, M.H., Bompard, E., Colella, P., Huang, T. (2020), Predictive 
methods of electricity price: An application to the Italian electricity 
market. In: 2020 IEEE International Conference on Environment and 
Electrical Engineering and 2020 IEEE Industrial and Commercial 
Power Systems Europe (EEEIC/I and CPS Europe). p1-6.

Kian, A., Keyhani, A. (2001), Stochastic price modeling of electricity 
in deregulated energy markets. In: Proceedings of the 34th Annual 
Hawaii International Conference on System Sciences. p7.

Koenker, R. (2004), Quantile regression for longitudinal data. Journal of 
Multivariate Analysis, 91(1), 74-89.

Koenker, R., Bassett, G. (1978), Regression quantiles. Econometrica, 
46(1), 33.

Lira, F., Muñoz, C., Núñez, F., Cipriano, A. (2009), Short-term forecasting 
of electricity prices in the Colombian electricity market. IET 
Generation, Transmission and Distribution, 3(11), 980-986.

Ma, L., Koenker, R. (2006), Quantile regression methods for recursive 
structural equation models. Journal of Econometrics, 134(2), 471-506.

Maciejowska, K. (2020), Assessing the impact of renewable energy 
sources on the electricity price level and variability: A quantile 
regression approach. Energy Economics, 85, 104532.

Mandal, P., Senjyu, T., Urasaki, N., Funabashi, T., Srivastava, A.K. (2007), 
A novel approach to forecast electricity price for PJM using neural 
network and similar days method. IEEE Transactions on Power 
Systems, 22(4), 2058-2065.

Montes, C. (2018), La incertidumbre climática y el dilema energético 
colombiano. Revista de la Academia Colombiana de Ciencias 
Exactas, Físicas y Naturales, 42(165), 392-401.

Mosquera-López, S., Manotas-Duque, D.F., Uribe, J.M. (2017a), Risk 
asymmetries in hydrothermal power generation markets. Electric 
Power Systems Research, 147, 154-164.

Mosquera-López, S., Nursimulu, A. (2019), Drivers of electricity price 
dynamics: Comparative analysis of spot and futures markets. Energy 
Policy, 126, 76-87.

Mosquera-López, S., Uribe, J.M., Manotas-Duque, D.F. (2017b), 
Nonlinear empirical pricing in electricity markets using fundamental 
weather factors. Energy, 139, 594-605.

Quintero-Quintero, M.C., Isaza-Cuervo, F. (2013), Dependencia 
hidrológica y regulatoria en la formación de precio de la energía 
en un sistema hidrodominado: Caso sistema eléctrico colombiano. 
Revista Ingenierías Universidad de Medellín, 12(22), 85-95.

Rasmussen, C.E., Williams, C.K.I. (2006), Gaussian Processes for 
Machine Learning. United States: MIT Press.

Ribeiro, M., Stefenon, S., de Lima, J., Nied, A., Mariani, V., Coelho, 
L. (2020), Electricity price forecasting based on self-adaptive 
decomposition and heterogeneous ensemble learning. Energies, 
13(19), 5190.

Rodriguez, C.P., Anders, G.J. (2004), Energy price forecasting in the 
Ontario competitive power system market. IEEE Transactions on 
Power Systems, 19(1), 366-374.

Samudio-Carter, C., Vargas, A., Albarracín-Sánchez, R., Lin, J. (2019), 
Mitigation of price spike in unit commitment: A probabilistic 
approach. Energy Economics, 80, 1041-1049.

The Mathworks, Inc. (2020), Statistics and Machine Learning Toolbox 
User’s Guide. United States: The Mathworks, Inc. Available from: 
https://www.la.mathworks.com/help/pdf_doc/stats/stats.pdf.

Uribe, J.M., Guillen, M. (2020), Quantile Regression for Cross-Sectional 
and Time Series Data: Applications in Energy Markets Using R. 
Berlin: Springer International Publishing.

Vaca, J., Núñez, G., Kido, A. (2019), Análisis multisectorial del 
incremento de precios de la electricidad en la economía de México. 
Problemas del Desarrollo. Revista Latinoamericana de Economía, 
50(196), 167-189.

Vidal, P., Sierra, L., Cerón, J. (2020), Demanda Nacional de Energía 
y Crecimiento Económico en Tiempos de Cuarentena. Colombia: 
Pontificia Universidad Javeriana.

Weron, R. (2014), Electricity price forecasting: A review of the state-
of-the-art with a look into the future. International Journal of 
Forecasting, 30(4), 1030-1081.

Xavier, E.M., Pereira, G.M., Friedrich, L.R., Schneider, L.C., 
Danesi, L.C., Borchardt, M. (2016), Requirements to leverage the 
electricity distributors’ sales and revenues in the Brazilian free market. 
IEEE Latin America Transactions, 14(10), 4293-4303.

Zhang, L., Luh, P.B. (2005), Neural network-based market clearing price 
prediction and confidence interval estimation with an improved 
extended Kalman filter method. IEEE Transactions on Power 
Systems, 20(1), 59-66.

Zhang, Y., Zhou, Q., Sun, C., Lei, S., Liu, Y., Song, Y. (2008), RBF neural 
network and ANFIS-based short-term load forecasting approach in 
real-time price environment. IEEE Transactions on Power Systems, 
23(3), 853-858.

Ziel, F. (2016), Forecasting electricity spot prices using lasso: On 
capturing the autoregressive intraday structure. IEEE Transactions 
on Power Systems, 31, 4977-4987.



APPENDIX A

Table A.I shows the quantile regression coefficients from the 10th to the 90th percentile. All coefficients are statistically significant at the 1% 
significance level.

Table A.I: Quantile regression coefficients for different quantiles
Predictors β0.1 β0.2 β0.3 β0.4 β0.5 β0.6 β0.7 β0.8 β0.9
Intercept −26.857 −27.877 −25.152 −24.142 −22.338 −22.600 −22.521 −21.739 −23.425
Demand 1.988 1.941 1.719 1.657 1.491 1.478 1.479 1.439 1.355
Water volume −0.412 −0.289 −0.254 −0.275 −0.275 −0.273 −0.319 −0.419 −0.358
Fossil fuel consumption 0.899 0.935 0.912 0.912 0.933 0.971 1.005 1.062 1.247
Source: authors’ analysis
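
As a reading aid for Table A.I, the following sketch locates the quantile level at which the water volume and fossil fuel coefficients are largest in absolute value and expresses each coefficient relative to the median coefficient. The coefficients are copied from the table; the helper names are illustrative assumptions.

```python
QUANTILE_LEVELS = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]

# Coefficients copied from Table A.I.
COEFFICIENTS = {
    "Water volume": [-0.412, -0.289, -0.254, -0.275, -0.275,
                     -0.273, -0.319, -0.419, -0.358],
    "Fossil fuel consumption": [0.899, 0.935, 0.912, 0.912, 0.933,
                                0.971, 1.005, 1.062, 1.247],
}

def strongest_effect(predictor):
    """Return (quantile level, coefficient) with the largest absolute value."""
    pairs = zip(QUANTILE_LEVELS, COEFFICIENTS[predictor])
    return max(pairs, key=lambda pair: abs(pair[1]))

def ratio_to_median(predictor):
    """Each coefficient divided by the coefficient at the 50th percentile."""
    values = COEFFICIENTS[predictor]
    median_coef = values[QUANTILE_LEVELS.index(0.5)]
    return [round(v / median_coef, 2) for v in values]

for name in COEFFICIENTS:
    level, coef = strongest_effect(name)
    print(f"{name}: strongest at tau={level} (coefficient {coef})")
    print("  ratio to median:", ratio_to_median(name))
```

The ratios make the pattern described in Section 5.2.1 easy to check: the water volume coefficients are largest in magnitude at both ends of the distribution, while the fossil fuel coefficients grow toward the upper quantiles.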