
 
PREDICTING THE BITCOIN PRICE USING LINEAR REGRESSION 
OPTIMIZED WITH EXPONENTIAL SMOOTHING 

 
Indah Suryani 1*), Hani Harafani 2 

 
Informatika  

Universitas Nusa Mandiri 
www.nusamandiri.ac.id 

indah.ihy@nusamandiri.ac.id 1*), hani.hhf@nusamandiri.ac.id 2 

 
(*) Corresponding Author 

 
Abstrak 

Bitcoin merupakan salah satu mata uang kripto yang paling popular saat ini. Di dalam kondisi pandemic yang 
melanda dunia saat ini akibat Covid-19, maka bitcoin diharapkan dapat dijadikan sebagai sebuah investasi 
ketika tingkat ketidakpastian ekonomi sedang tinggi. Pada penelitian ini, data yang digunakan adalah data 
harga bitcoin yang termasuk ke dalam data deret waktu. Salah satu metode yang umum digunakan untuk 
prediksi dalam deret waktu adalah metode regresi linear. Untuk dapat mengembangkan hasil prediksi 
tersebut, digunakan teknik transformasi data menggunakan metode yang popular yaitu exponential 
smoothing. Pada metode exponential smoothing, dilakukan optimasi parameter alpha untuk dapat 
mendongkrak hasil prediksi dari regresi linear. Dan dari hasil ekperimen yang dilakukan, terbukti bahwa 
optimasi parameter alpha pada exponential smoothing mampu meningkatkan performa prediksi regresi 
linear dengan hasil perbandingan RMSE dengan uji t telah menghasilkan hasil perbedaan yang signifikan.  
 
Kata kunci: Bitcoin; linear regresi; exponential smoothing 
 
 
Abstract 
Bitcoin is one of the most popular cryptocurrencies today. Amid the global Covid-19 pandemic, bitcoin 
is expected to serve as an investment when economic uncertainty is high. This study uses bitcoin price 
data, which is time series data. Linear regression is one of the most commonly used methods for time 
series prediction. To improve its predictions, the data are transformed using a popular method, 
exponential smoothing, whose alpha parameter is optimized to boost the linear regression results. 
The experiments show that optimizing the alpha parameter in exponential smoothing improves the 
prediction performance of linear regression, and a t-test comparing the RMSE values confirms that 
the difference is significant. 
 
Keywords: bitcoin; linear regression; exponential smoothing 
 
 
INTRODUCTION 
 

Bitcoin has received increased levels of 
attention from the media and investors alike in 
recent years (Kalyvas et al., 2020), making it one 
of the most popular cryptocurrencies. Similarly, 
Jareño et al. (2020) state that cryptocurrency 
markets have become much more popular in recent 
years, so cryptocurrencies may have moved into 
the category of investment assets. Much research 
on bitcoin has focused on the price discovery 
process and market efficiency in the bitcoin 
market (Tsang & Yang, 2020). 

There are several bitcoin exchanges and 
the price difference between them is large and 
changes over time (Tsang & Yang, 2020). The global 
COVID-19 pandemic has disrupted normal business 
and affected sustainable economic development in 
many countries. However, it seems that the 
economic uncertainty following the COVID-19 
containment measures is supporting the 
cryptocurrency market signal (Sarkodie et al., 
2021). Similarly, the findings of Kalyvas et al. 
(2020) indicate that bitcoin may possess hedging 

http://creativecommons.org/licenses/by-nc/4.0/


properties against economic uncertainty; therefore, 
it may be beneficial for investors to consider this 
cryptocurrency as an investment when economic 
uncertainty is high. 

The linear regression model represents the 
best-known family of regression models, in which 
the class of hypotheses consists of linear 
functions (Vercellis, 2009). Linear regression is a 
statistical technique that describes a linear 
relationship between two variables, namely the 
dependent variable and the independent variable 
(Aslanyan, 2021; Mondal & Rehena, 2020). Linear 
regression (LR) can be useful not only for 
discovering patterns in experimental data but also 
as a baseline for benchmarking and validating new 
analysis techniques, especially novel or unfamiliar 
ones (Zakeri et al., 2020). Linear regression is also 
a popular machine learning prediction method that 
researchers continue to develop, as in Huang & 
Hsieh (2020), Matiz & Barner (2020), and Patel & 
Kiran (2019). 

A relevant problem often faced by 
practitioners, given the dynamic nature of time 
series, is the selection of a particular exponential 
smoothing model. For example, the choice between 
a local linear trend and simple exponential 
smoothing is usually driven by the detection (or 
absence) of a trend in the data (Sbrana & Silvestrini, 
2014). However, over the course of a business 
cycle, the trend dynamics of a series are sometimes 
not constant and may vary (Sbrana & Silvestrini, 
2013). Data transformation can take the form of 
smoothing, aggregation, generalization, 
normalization, and attribute or feature 
construction. One function of smoothing techniques 
is to remove noise from the data, and exponential 
smoothing is one such technique (Han & Kamber, 
2006). Another advantage of exponential 
smoothing is that it can account for trend and 
seasonal effects in the data while producing 
estimates with simple formulas (Tratar, 2015). In 
addition, exponential smoothing can outperform 
many more advanced methods (Beaumont, 2014). 
Exponential smoothing is therefore widely used to 
develop time series prediction models, as in the 
previous research of Yager (2013), Koehler et al. 
(2012), and Suryani (2015). 

Based on this literature, this study predicts 
the price of bitcoin using a linear regression 
method that is improved by transforming the data 
with exponential smoothing. In a previous study, 
exponential smoothing was used to improve 
performance on a gold price dataset and was 
applied directly to the Neural Network method 
without first comparing it with other machine 
learning methods. In this study, by contrast, 
optimization with exponential smoothing is carried 
out after comparing the RMSE values of three 
machine learning methods and is used to predict 
bitcoin prices. 

 
RESEARCH METHODS 

 
Type of Research 

This study is quantitative research in the 
form of experimental research. 

 
Time and Place of Research 
 This research used secondary data from 
https://www.investing.com/crypto/bitcoin/historical-data. 
The data records were collected from 1 March 2017 
until 5 March 2021. 

 
Procedure 

In this study, the dataset of bitcoin closing 
prices is first processed with pre-processing 
techniques, namely set role, normalize, and 
windowing. Set role is used to define the label and 
id attributes. Normalize rescales the data using the 
binary sigmoid activation function, and windowing 
converts the closing price attribute into windows 
of 5 input values and 1 output value. 

The modeling step in this research is the 
optimization of the alpha parameter in exponential 
smoothing to improve the prediction performance 
of linear regression, as shown in Figure 1. 
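
The normalization and windowing steps can be sketched as follows. The paper does not state the normalization formula, so this sketch assumes a linear rescaling into the range [0.1, 0.9], a range often paired with sigmoid outputs. The function names and the bounds are illustrative and are not taken from the original experiment. The sample values are the ten closing prices printed in Table 1, listed oldest first.

```python
def normalize_sigmoid_range(values, low=0.1, high=0.9):
    """Rescale values linearly into [low, high] (assumed normalization)."""
    v_min, v_max = min(values), max(values)
    span = v_max - v_min
    if span == 0:
        raise ValueError("cannot normalize a constant series")
    return [low + (high - low) * (v - v_min) / span for v in values]


def make_windows(series, window_size=5):
    """Turn a univariate series into (inputs, target) pairs.

    Each example holds `window_size` consecutive values as inputs and the
    value that immediately follows them as the target.
    """
    examples = []
    for start in range(len(series) - window_size):
        inputs = series[start:start + window_size]
        target = series[start + window_size]
        examples.append((inputs, target))
    return examples


# Closing prices as printed in Table 1, reversed so the oldest comes first.
closing = [50.955, 48.075, 53.297, 55.067, 54.456,
           53.006, 56.803, 57.700, 57.016, 56.826]

scaled = normalize_sigmoid_range(closing)
windows = make_windows(scaled, window_size=5)

for inputs, target in windows:
    print([round(x, 3) for x in inputs], "->", round(target, 3))
```

With ten values and a window of five, the sketch yields five examples, each with 5 inputs and 1 output.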

Next, exponential smoothing is applied, and 
its alpha parameter is optimized to improve the 
performance of linear regression. 
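
As an illustration of the role of alpha, here is a minimal sketch of simple exponential smoothing, assuming the standard recursion s_t = alpha * x_t + (1 - alpha) * s_(t-1) with the first observation as the starting value. The paper does not say which variant or initialization was used, so both are assumptions, and the function name is hypothetical. The alpha values are the three tried in the experiments.

```python
def exponential_smoothing(series, alpha):
    """Simple exponential smoothing, initialized with the first observation."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in the interval (0, 1]")
    smoothed = [series[0]]
    for value in series[1:]:
        # Blend the new observation with the previous smoothed value.
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


# Closing prices as printed in Table 1, reversed so the oldest comes first.
closing = [50.955, 48.075, 53.297, 55.067, 54.456,
           53.006, 56.803, 57.700, 57.016, 56.826]

for alpha in (0.5, 0.3, 0.1):
    result = exponential_smoothing(closing, alpha)
    print(alpha, [round(x, 3) for x in result])
```

A smaller alpha gives more weight to the previous smoothed value, so the output follows the raw series more slowly and removes more noise.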

After that, the smoothed data are processed 
by the linear regression method using 4 feature 
selection options: t-test, m5prime, Greedy, and 
iterative t-test. Processing uses 10-fold 
cross-validation, which repeatedly divides the 
data into training and testing sets. The RMSE 
value obtained from each experiment is then 
compared. 
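
The evaluation loop can be sketched in plain Python as below. This is only an illustration under stated assumptions: it fits ordinary least squares through the normal equations, omits the feature selection step, shuffles the examples before splitting them into folds (the paper does not say whether the folds were shuffled), and runs on synthetic data rather than the bitcoin dataset. All function names are hypothetical.

```python
import math
import random


def fit_linear_regression(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y."""
    design = [[1.0] + list(row) for row in rows]  # leading 1.0 models the intercept
    n = len(design[0])
    # Augmented matrix [X^T X | X^T y].
    aug = [
        [sum(r[i] * r[j] for r in design) for j in range(n)]
        + [sum(r[i] * t for r, t in zip(design, targets))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: inputs are collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def predict(weights, row):
    """Intercept plus the weighted sum of the inputs."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], row))


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def cross_validated_rmse(rows, targets, folds=10, seed=0):
    """Mean RMSE over k folds; each fold is held out for testing once."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    scores = []
    for k in range(folds):
        test_idx = indices[k::folds]
        held_out = set(test_idx)
        train_idx = [i for i in indices if i not in held_out]
        weights = fit_linear_regression(
            [rows[i] for i in train_idx], [targets[i] for i in train_idx]
        )
        preds = [predict(weights, rows[i]) for i in test_idx]
        scores.append(rmse([targets[i] for i in test_idx], preds))
    return sum(scores) / len(scores)


# Synthetic stand-in for windowed data: 5 inputs per example, noise-free target.
rng = random.Random(1)
rows = [[rng.random() for _ in range(5)] for _ in range(60)]
targets = [0.5 + 2.0 * r[0] - 1.0 * r[4] for r in rows]

score = cross_validated_rmse(rows, targets, folds=10)
print(f"mean RMSE over 10 folds: {score:.6f}")
```

Because the synthetic target is an exact linear function of the inputs, the mean RMSE here is close to zero; on real price data it would not be.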

 
Figure 1. Proposed Method 
 

Data, Instruments, and Data Collection Techniques 

The collected data are historical bitcoin 
prices with the attributes date, opening price, 
highest price, lowest price, closing price, 
volume_BTC, volume_currency, and weighted price. 
Only the date and closing price attributes are 
processed. The dataset contains 1,170 records. 

 
Data analysis technique 

The data used in this research are time-series 
data in the form of historical bitcoin prices. Of 
the bitcoin price attributes, only the closing 
price is used, as shown in Table 1 below. 

 
Table 1. Samples of Bitcoin Prices Data  

Date Closing Price 
05/03/2021 56.826 
05/02/2021 57.016 
05/01/2021 57.700 
4/30/2021 56.803 
4/29/2021 53.006 
4/28/2021 54.456 
4/27/2021 55.067 
4/26/2021 53.297 
4/25/2021 48.075 
4/24/2021 50.955 

 
  These data have two attributes, namely the 
date and the closing price. The Id and Label 
attributes are then assigned: the date attribute is 
the Id and the closing price attribute is the Label. 
Next, the data are normalized using the binary 
sigmoid activation function. The windowing 
technique is then applied because the data are 
univariate. After that, the data are ready to be 
processed using machine learning. 

In the experiments, the performance of 
several machine learning methods was tested, 
namely k-nearest neighbor, neural network, and 
linear regression. 

Based on the RMSE results of the three 
methods, the method with the lowest average RMSE 
is chosen and then optimized with exponential 
smoothing. 

 
RESULTS AND DISCUSSION 

 
Evaluation 

After pre-processing, the data are used in 
prediction experiments with three methods. The 
average RMSE value produced by each method is 
then evaluated. 

The first experiment used the KNN method. 
In this experiment, the k parameter of KNN was 
optimized over 4 experimental settings. The results 
can be seen in Table 2 below. The average RMSE 
value produced by the KNN method is 0.608. 

 
Table 2. Experiment Results Using KNN 

No. K RMSE 
1 0.7 0.5 
2 0.5 0.48 
3 0.3 0.478 
4 0.1 0.974 

Average 0.608 
 
The next experiment used the neural 
network method. In this experiment, the learning 
rate and momentum parameters were optimized 
over 4 experiments. The resulting average RMSE 
value is 0.497, as shown in Table 3 below. 

 
Table 3. Experiment Results Using Neural Network 

No. LR Mom RMSE 
1 0.01 0.9 0.507 
2 0.001 0.9 0.435 
3 0.01 0.5 0.454 
4 0.001 0.5 0.59 

Average 0.497 

The third method tested is linear 
regression. This experiment was carried out with 4 
different feature selections, as listed in Table 4. The 
resulting average RMSE value was 0.451. This is the 
lowest average RMSE of the three methods, which 
means that linear regression produces the best 
predictive results. 

 
Table 4. Experiment Results Using Linear Regression 

No. Feature selection RMSE 
1 t-test 0.451 
2 m5prime 0.452 
3 Greedy 0.449 
4 iterative-t test 0.451 

Average 0.451 
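
The selection step can be reproduced from the reported values. The short sketch below recomputes the average RMSE of each method from Tables 2, 3, and 4 and picks the lowest. The fourth linear regression value is taken as 0.451, which is consistent with the reported average; the variable names are illustrative.

```python
# RMSE values copied from Tables 2, 3, and 4.
results = {
    "k-nearest neighbor": [0.5, 0.48, 0.478, 0.974],
    "neural network": [0.507, 0.435, 0.454, 0.59],
    "linear regression": [0.451, 0.452, 0.449, 0.451],
}

averages = {name: sum(values) / len(values) for name, values in results.items()}

# Lower RMSE means smaller prediction error, so the minimum is selected.
best = min(averages, key=averages.get)

for name, average in averages.items():
    print(f"{name}: {average:.5f}")
print("selected:", best)
```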
 
The method with the best average RMSE 
value, linear regression, was then improved using 
the exponential smoothing method. The experiment 
was carried out by optimizing the alpha value in 
exponential smoothing for each of the 4 feature 
selections in linear regression. The results of these 
experiments can be seen in Table 5 below. 

 
Table 5. Experiment Results Using Linear Regression + Exponential Smoothing 

No. Alpha Feature selection RMSE 

1 0.5 t-test 0.229 
2 0.3 t-test 0.157 
3 0.1 t-test 0.175 
4 0.5 m5prime 0.229 
5 0.3 m5prime 0.157 
6 0.1 m5prime 0.178 
7 0.5 Greedy 0.229 
8 0.3 Greedy 0.157 
9 0.1 Greedy 0.178 

10 0.5 iterative-t test 0.229 
11 0.3 iterative-t test 0.157 
12 0.1 iterative-t test 0.178 

Average 0.188 
 

Optimizing linear regression with 
exponential smoothing produces a lower RMSE, 
with an average value of 0.188 over 12 
experiments. The choice of feature selection has 
almost no effect on the RMSE value, whereas the 
alpha value in exponential smoothing has a clear 
effect, with alpha = 0.3 giving the lowest RMSE. 
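
The relative influence of alpha and feature selection can be checked directly from Table 5. The sketch below groups the twelve reported RMSE values first by alpha and then by feature selection and compares the spread of the group means. The helper name `mean_by` is illustrative.

```python
from collections import defaultdict

# (alpha, feature selection, RMSE) rows copied from Table 5.
table5 = [
    (0.5, "t-test", 0.229), (0.3, "t-test", 0.157), (0.1, "t-test", 0.175),
    (0.5, "m5prime", 0.229), (0.3, "m5prime", 0.157), (0.1, "m5prime", 0.178),
    (0.5, "Greedy", 0.229), (0.3, "Greedy", 0.157), (0.1, "Greedy", 0.178),
    (0.5, "iterative-t test", 0.229), (0.3, "iterative-t test", 0.157),
    (0.1, "iterative-t test", 0.178),
]


def mean_by(key_index):
    """Average RMSE grouped by the column at key_index (0 = alpha, 1 = feature)."""
    groups = defaultdict(list)
    for row in table5:
        groups[row[key_index]].append(row[2])
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}


by_alpha = mean_by(0)
by_feature = mean_by(1)
overall = sum(row[2] for row in table5) / len(table5)
best_alpha = min(by_alpha, key=by_alpha.get)

# Spread = difference between the worst and best group mean.
alpha_spread = max(by_alpha.values()) - min(by_alpha.values())
feature_spread = max(by_feature.values()) - min(by_feature.values())

print("mean RMSE by alpha:", by_alpha)
print("mean RMSE by feature selection:", by_feature)
print("overall mean:", round(overall, 5), "best alpha:", best_alpha)
print("spread across alpha:", round(alpha_spread, 3))
print("spread across feature selection:", round(feature_spread, 3))
```

The group means differ far more across alpha values than across feature selections, which matches the observation above.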

 
Validation 

To test whether the difference between the 
ordinary linear regression method and the 
proposed method, linear regression optimized with 
exponential smoothing, is significant, the results 
are validated with a T-test. 

 
Table 6. RMSE Comparison Between LR and LR+ES Using T-test 

 Variable 1 Variable 2 
Mean 0.45075 0.18475 
Variance 1.58333E-06 0.00095625 
Observations 4 4 
Pearson Correlation -0.070674182 
Hypothesized Mean Difference 0 
df 3 
t Stat 17.14049415 
P(T<=t) one-tail 0.000216309 
t Critical one-tail 2.353363435 
P(T<=t) two-tail 0.000432618 
t Critical two-tail 3.182446305 

  
 From the results of the T-test in Table 6, the 
t-count value (t Stat) is 17.14049415 and the 
two-tailed t-table value (t Critical) is 3.182446305, 
so the t-count value is greater than the t-table 
value. This means that there is a difference; in 
other words, H1 is accepted and H0 is rejected. The 
difference is also significant, as shown by the 
two-tailed p-value of 0.000432618, which is less 
than 0.05. 
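
As a check on the arithmetic, the t statistic can be recomputed from the summary values in Table 6. The layout of the table (a Pearson correlation and df equal to the number of observations minus one) is consistent with a paired two-sample t-test, so the sketch below assumes that test; the paper itself only calls it a T-test. The variable names are illustrative, and the critical value is copied from the table rather than computed.

```python
import math

# Summary statistics copied from Table 6.
mean_lr, mean_lr_es = 0.45075, 0.18475
var_lr, var_lr_es = 1.58333e-06, 0.00095625
pearson_r = -0.070674182
n = 4
t_critical_two_tail = 3.182446305

# Variance of the paired differences:
# Var(d) = Var(x) + Var(y) - 2 * Cov(x, y), with Cov = r * sd_x * sd_y.
covariance = pearson_r * math.sqrt(var_lr * var_lr_es)
var_diff = var_lr + var_lr_es - 2 * covariance

# Paired t statistic: mean difference divided by its standard error.
t_stat = (mean_lr - mean_lr_es) / math.sqrt(var_diff / n)

print(f"t statistic: {t_stat:.4f}")
print("reject H0:", t_stat > t_critical_two_tail)
```

Under this assumption the recomputed statistic agrees with the reported t Stat to within rounding of the table inputs, and it exceeds the two-tailed critical value.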
 
 
CONCLUSIONS AND SUGGESTIONS 
 
Conclusion 

Based on the experiments conducted, this 
study compares three machine learning methods, 
namely k-Nearest Neighbor, Neural Network, and 
Linear Regression. Of the three methods, linear 
regression shows the lowest average RMSE value, 
0.451. Efforts were then made to improve linear 
regression performance with exponential 
smoothing. Optimizing the alpha parameter in 
exponential smoothing yields an average RMSE 
value of 0.188, a significant improvement over the 
classical linear regression method. In a previous 
study, exponential smoothing was applied directly 
to the Neural Network method without first 
comparing which machine learning method has a 
better RMSE. That study also used another dataset, 
the gold price dataset. Meanwhile, in this study, 
optimization with exponential smoothing was 
carried out after 

identifying the method that produced the best 
RMSE for predicting bitcoin prices. It can be 
concluded that exponential smoothing can 
improve the performance of linear regression in 
predicting bitcoin prices. 

 
Suggestion 

The results of this research show that 
exponential smoothing can improve prediction 
performance with the linear regression method. 
Future research is therefore expected to develop 
exponential smoothing as a data pre-processing 
method to improve the performance of other 
machine learning methods. Further experiments 
with different datasets are also expected. 
 
 
REFERENCES 
 

Aslanyan, T. K. (2021). Fundamentals Of Statistics 
For Data Scientists and Analysts. Towards Data 
Science. 
https://towardsdatascience.com/fundamentals-of-statistics-for-data-scientists-and-data-analysts-69d93a05aae7 

Beaumont, A. N. (2014). Data transforms with 
exponential smoothing methods of 
forecasting. International Journal of 
Forecasting, 30(4), 918–927. 
https://doi.org/10.1016/j.ijforecast.2014.03.013 

Han, J., & Kamber, M. (2006). Mining Stream, 
Time-Series and Sequence Data. In Data Mining: 
Concepts and Techniques (Vol. 54, pp. 468–473). 

Huang, C. H., & Hsieh, S. H. (2020). Predicting BIM 
labor cost with random forest and simple 
linear regression. Automation in Construction, 
118(May), 103280. 
https://doi.org/10.1016/j.autcon.2020.103280 

Jareño, F., González, M. de la O., Tolentino, M., & 
Sierra, K. (2020). Bitcoin and gold price 
returns: A quantile regression and NARDL 
analysis. Resources Policy, 67(February). 
https://doi.org/10.1016/j.resourpol.2020.101666 

Kalyvas, A., Papakyriakou, P., Sakkas, A., & Urquhart, 
A. (2020). What drives Bitcoin’s price crash 
risk? Economics Letters, 191(September 
2011), 1–5. 
https://doi.org/10.1016/j.econlet.2019.108777 

Koehler, A. B., Snyder, R. D., Ord, J. K., & Beaumont, 
A. (2012). A study of outliers in the 
exponential smoothing approach to 
forecasting. International Journal of 
Forecasting, 28(2), 477–484. 
https://doi.org/10.1016/j.ijforecast.2011.05.001 

Matiz, S., & Barner, K. E. (2020). Conformal 
prediction based active learning by linear 
regression optimization. Neurocomputing, 
388, 157–169. 
https://doi.org/10.1016/j.neucom.2020.01.018 

Mondal, M. A., & Rehena, Z. (2020). Road Traffic 
Outlier Detection Technique based on Linear 
Regression. Procedia Computer Science, 
171(2019), 2547–2555. 
https://doi.org/10.1016/j.procs.2020.04.276 

Patel, D. R., & Kiran, M. B. (2019). A non-contact 
approach for surface roughness prediction in 
CNC turning using a linear regression model. 
Materials Today: Proceedings, 26(xxxx), 350–355. 
https://doi.org/10.1016/j.matpr.2019.12.029 

Sarkodie, S. A., Ahmed, M. Y., & Owusu, P. A. (2021). 
COVID-19 pandemic improves market signals 
of cryptocurrencies–evidence from Bitcoin, 
Bitcoin Cash, Ethereum, and Litecoin. Finance 
Research Letters, January, 102049. 
https://doi.org/10.1016/j.frl.2021.102049 

Sbrana, G., & Silvestrini, A. (2013). Forecasting 
aggregate demand: Analytical comparison of 
top-down and bottom-up approaches in a 
multivariate exponential smoothing 
framework. International Journal of 
Production Economics, 146(1), 185–198. 
https://doi.org/10.1016/j.ijpe.2013.06.022 

Sbrana, G., & Silvestrini, A. (2014). Random 
switching exponential smoothing and inventory 
forecasting. 
https://www.bancaditalia.it/pubblicazioni/temi-discussione/2014/2014-0971/en_tema_971.pdf 

Suryani, I. (2015). Penerapan Exponential 
Smoothing untuk Transformasi Data dalam 
Meningkatkan Akurasi Neural Network pada 
Prediksi Harga Emas. Journal of Intelligent 
Systems, 1(2), 67–75. 

Tratar, L. F. (2015). Int . J . Production Economics. 
Intern. Journal of Production Economics, 161, 
64–73. 
https://doi.org/10.1016/j.ijpe.2014.11.019 

Tsang, K. P., & Yang, Z. (2020). Price dispersion in 
bitcoin exchanges. Economics Letters, 194, 
109379. 
https://doi.org/10.1016/j.econlet.2020.109379 

Vercellis, C. (2009). Business Intelligence: Data 
Mining and Optimization for Decision Making. 
In Business Intelligence: Data Mining and 
Optimization for Decision Making. 
https://doi.org/10.1002/9780470753866 

Yager, R. R. (2013). Exponential smoothing with 
credibility weighted observations. 
Information Sciences, 252, 96–105. 
https://doi.org/10.1016/j.ins.2013.07.008 

Zakeri, Z., Mansfield, N., Sunderland, C., & Omurtag, 
A. (2020). Cross-validating models of 
continuous data from simulation and 
experiment by using linear regression and 
artificial neural networks. Informatics in 
Medicine Unlocked, 21(July), 1–6. 
https://doi.org/10.1016/j.imu.2020.100457 

 