Knowledge Engineering and Data Science (KEDS)  pISSN 2597-4602 

Vol 6, No 2, October 2023, pp. 215–230  eISSN 2597-4637 

 
https://doi.org/10.17977/um018v6i22023p215-230 

©2023 Knowledge Engineering and Data Science | W : http://journal2.um.ac.id/index.php/keds | E : keds.journal@um.ac.id 

This is an open access article under the CC BY-SA license (https://creativecommons.org/licenses/by-sa/4.0/) 

Stacked LSTM-GRU Long-Term Forecasting Model for 
Indonesian Islamic Banks  

Yayat Sujatna a,1, Adhitio Satyo Bayangkari Karno b,2, Widi Hastomo c,3,*, Nia Yuningsih b,4, Dody 
Arif d,5, Sri Setya Handayani d,6, Aqwam Rosadi Kardian e,7, Ire Puspa Wardhani e,8, L.M Rasdi 

Rere e,9 

a Department of Accounting, Ahmad Dahlan Institute of Technology and Business 

Jl. Ir H. Juanda No.77, Tangerang Selatan 15419, Indonesia 
b Department of Information System, Faculty of Engineering, Gunadarma University 

Jl. Margonda Raya No. 100, Depok 16424, Indonesia 
c Department of Information Technology, Ahmad Dahlan Institute of Technology and Business 

Jl. Ir H. Juanda No.77, Tangerang Selatan 15419, Indonesia 
d Department of Management, Faculty of Economics, Gunadarma University 

Jl. Margonda Raya No. 100, Depok 16424, Indonesia 
e Department of Computer Systems, STMIK Jakarta STI&K 

Jl. Bri Radio Dalam No.17, Jakarta Selatan 12140, Indonesia 
1 yayatsujatna@gmail.com; 2 adh1t10.2@gmail.com; 3 widie.has@gmail.com*; 4 nia_yuningsih@staff.gunadarma.ac.id;  
5 dodiarif8@gmail.com; 6 srisetyahandayani@yahoo.com; 7 aqwam@staff.jak-stik.ac.id; 8 irepuspa@staff.jak-stik.ac.id; 

9 rasdirere267@gmail.com 
* corresponding author 

 
I. Introduction  

As the country with the world's largest Muslim-majority population, Indonesia has enormous 

potential for the expansion of the Islamic banking financial system in the future, as evidenced by a 

robust network of Islamic banks [1]. These banks operate under Islamic law (Sharia) principles and follow 

ethical and moral criteria [2]. The Indonesian government is aggressively promoting the growth of 

Islamic banking in response to the growing demand for Islamic financial products and services. 

Various regulatory frameworks have been put in place to support the formation and expansion of 

Islamic banks. The Financial Services Authority (OJK) is responsible for managing and regulating 

the operations of Islamic banks in order to maintain Sharia compliance [3]. In addition to converting into 
full-service Islamic banks, some conventional banks have opened Islamic banking branches to 

accommodate the rising demand for Shariah-compliant services. These institutions provide a wide 

ARTICLE INFO A B S T R A C T   

Article history: 

Received 04 September 2023 

Revised 26 September 2023 

Accepted 20 October 2023 

Published online 06 November 2023 

 
The development of the Islamic banking industry in Indonesia has become a 
significant concern in recent years, with rapid growth in the number of banks 
operating based on Sharia principles. To face emerging challenges and opportunities, 
a deep understanding of the long-term financial behavior of Islamic banks is 
becoming increasingly important. This study aims to predict the share price of PT 
Bank Syariah Indonesia Tbk over a 28-day horizon using stacked LSTM-GRU models. The 
experimental stages include importing the dataset, data separation, model variations, 
the training process, output, and evaluation. Experiments were conducted using 10 
model variations of four-layer LSTM and GRU stacks. Each model was trained with 
four epoch settings (200, 500, 750, and 1000). The results show that long-term 
predictions (28 days ahead) using four-layer LSTM-GRU stacks and daily training 
accumulation techniques produce better accuracy than the general method (using 
multiple outputs). For predictions over the next 28 days, the model with the 
LGLG stack arrangement (LSTM-GRU-LSTM-GRU) produces the best accuracy at 
epoch 750, with an MSE of 63.43762863. Future work will aim for even better 
precision, either by adopting a new design or by further improving the current technique. 


Keywords: 

Sharia Principles 

Indonesian Banks                                      

Long-term Forecasts 

GRU 

LSTM         



 Y. Suyatna et al. / Knowledge Engineering and Data Science 2023, 6 (2): 215–230 216 

 
range of Shariah-compliant goods and services, including savings accounts, financing, investment 

instruments, takaful (Islamic insurance), and zakat payments [4][5][6]. 

Islamic banking in Indonesia has experienced rapid growth in the last few decades [7]. This 

growth not only reflects global trends in sharia finance but is also reflected in the economic and 

social development of Indonesia, which has a sizeable Muslim population. Sharia banking provides 

financial access to people previously not served by conventional banking [8]. The system has 

helped drive financial inclusion in Indonesia by providing access to banking products and services 

to groups previously considered "unbankable". The existence of Sharia banking also makes a 

positive contribution to the stability of the Indonesian economy as a whole [9]. Diversifying 

Islamic banking and financing based on Islamic ethics helps reduce systemic risk [10]. Thus, the 

growth of Sharia banking in Indonesia not only reflects high market demand but also creates a 

positive impact by encouraging financial inclusion, sustainable economic development, and the 

development of financial products and services that are in line with Islamic values [11]. It is an 

essential aspect of Indonesia's diverse and dynamic economic and financial development. 

Despite substantial progress in Islamic banking in Indonesia, there are still issues, difficulties, 

and possibilities to be addressed. Evaluating and analyzing the performance, efficiency, and 

competitiveness of Islamic banks in comparison to conventional banks, as well as comprehending 

the dynamics and factors influencing the growth and long-term sustainability of Islamic banking in 

Indonesia, is critical for policymakers, regulators, and market players [12][13][14]. Long-term 

stock forecasting is required for investors and financial institutions to make good long-term 

investment decisions and strategies in the Indonesian market [15][16]. For investors looking to 

improve their investment portfolios, accurate long-term stock prediction estimates from Islamic 

banks are invaluable. However, previous research still relies on traditional financial models [17] or basic 
machine learning algorithms [18], which yield low accuracy [19] and substantial bias, so the results remain far 
from what is expected [20]. 

In recent years, financial markets have seen a considerable surge in the application of Artificial 

Intelligence (AI) and Machine Learning (ML) techniques for stock market prediction 

[21][22][23][24]. These strategies have demonstrated promising results in identifying complicated 

patterns and trends in financial data, supporting investors in making educated decisions. Recurrent 

Neural Networks (RNNs) have attracted much interest among other ML techniques due to their 

ability to handle sequential and temporal connections in data. The Long Short-Term Memory 

(LSTM) network is one form of RNN that has proven efficacy in time series analysis [25]. LSTM 

networks can capture long-term dependencies and mitigate the vanishing gradient problem of standard 

RNNs [26]. In addition, Gated Recurrent Units (GRUs) have emerged as an alternative RNN 

architecture that offers computational efficiency and performance comparable to LSTM [27]. 

Individually, the LSTM and GRU networks have been regularly used to estimate stock prices in 

the context of stock market prediction [28][29][30][31][32][33][34]. However, improved models 

that integrate the capabilities of the two architectures are still required to increase forecast 

accuracy. Despite the growing interest in Islamic banking and the importance of Islamic bank 

shares in Indonesia, there is a significant vacuum in the existing literature on long-term forecasts 

utilizing deep learning techniques. Most studies focus on conventional bank financial 

performance and short-term predictions, with minimal discussion of long-term stock projections in 

the Indonesian setting. This study aims to evaluate the performance of PT Bank Syariah Indonesia 

Tbk's long-term stock prediction model. Two novel approaches are proposed. The first is 

optimizing the model with a separate training process using ten variations of the 4 LSTM-GRU 

stacks. The second approach is the input and target data segmentation technique, adjusted to the 

predictions for the next 1 to 28 days. 

By stacking many models, deep learning models become better and more useful for forecasting 

time series data [35][36][37], particularly for predicting stock values [38][39][40][41]. Several 

experiments on merging several machine learning approaches to predict time series data have been 

conducted [42]. Predicting water prices with an LSTM-GRU model is more accurate than using a standalone 
GRU or a stack with an LSTM-LSTM arrangement [43]. When predicting complicated stock 



 
market data, the hybrid Akima-EMD-LSTM model outperforms the hybrid EMD-LSTM, EEMD-

LSTM, and SEMD-LSTM models [44]. Stock price prediction employs a time-series analysis of 

LSTM and sentiment analysis of the Valence Aware Dictionary and Sentiment Reasoner 

(VADER). Compared to earlier research, this method yields more accuracy [45]. The CNN, RNN, 

LSTM, CNN-RNN, and CNN-LSTM algorithms are used to predict the Shanghai Composite Index 

shares. The CNN-RNN approach outperforms other methods (CNN, RNN, and LSTM) [46]. For 

music data, classic tanh, LSTM, and GRU units are compared, with LSTM and GRU showing advantages over 
standard tanh units [47]. A stacked LSTM model has also been used to detect anomalies in four separate 
datasets. 

II. Methods 

This study was carried out in stages, beginning with data collection, followed by the separation of 
training and test data, the separation of target data for long-term predictions of the following 28 
days, model creation, and evaluation. The research flowchart in Figure 1 describes the general steps 
of this study. The following is a detailed explanation of the experimental process 
flow for predicting Sharia stock prices using the LSTM and GRU stack models, from 
importing the dataset to output: 

• Import Dataset: From 01-07-2020 to 01-07-2023, the stock time series dataset from PT Bank 
Syariah Indonesia Tbk (BRIS) was taken from https://finance.yahoo.com. The data set has 728 

rows (days) and six columns (Open, High, Low, Close, AdjClose, and Volume), with data from 

the "Close" column being used in this study.  

• Data separation is done by taking the last 28 days of the dataset to be used as prediction data 
for the next 28 days. Then, the remaining 700 days of data are divided into training data (600 

days) and test data (100 days). 

• Modeling: 10 model variations are built from four-layer LSTM and GRU stack arrangements, namely 
GGGG, GGGL, GGLL, GLGL, GLLG, LGGL, LGLG, LLGG, LLLG, and LLLL, where G stands for GRU 
and L stands for LSTM. Each model is trained on the training data, which 
includes initializing the model, determining the loss function, selecting the 
optimizer (e.g., Adam), and determining the evaluation metric, the Mean Square Error (MSE). 

• Evaluation: Once training is complete, the model should be evaluated to measure how well it 
predicts stock prices. This evaluation is usually carried out on previously separated test data. 

This experiment uses evaluation metrics such as MSE to assess the quality of model 

predictions. Additionally, visualizations such as graphs comparing predictions with actual data 

can also provide valuable insights. 

• Output: The results are presented as graphs that compare the actual and predicted price 
histories. 

To determine the accuracy of the trained models, the predicted results are compared with 
the actual data using the MSE. 

 
Fig. 1. Research flowchart 

A. LSTM-GRU 

An RNN trained with backpropagation was the first deep learning model able to retain prior data and 
predict data one step ahead [48][49][50][51]. Adding layers can enhance accuracy, but doing so 
with an RNN may lead to vanishing gradients. As a result, the RNN can only handle 
short-term dependencies [52][53]. To address this issue, LSTM cells [54][55] were created, which 
have several gates and can capture long-term dependencies. The GRU, a cell with a simpler gate structure that 
can also capture long-term dependencies, is a further advancement [46][56]. Figure 2 depicts 
these architectural advancements, beginning with the RNN, then the LSTM, and finally the GRU. 

 
Fig. 2. RNN, LSTM and GRU architecture development 

Initialize the hidden state and cell state for each LSTM layer, $H_0^{LSTM_i} = 0$ and 
$C_0^{LSTM_i} = 0$ for each $i$, and the hidden state for each GRU layer, $H_0^{GRU_j} = 0$ 
for each $j$. Iterate through each time step $t$ (usually from $t = 1$ to $T$, where $T$ is the 
length of the input sequence). For each LSTM layer $i$, calculate the hidden state $H_t^{LSTM_i}$ 
and cell state $C_t^{LSTM_i}$ as in (1) to (6), and for each GRU layer $j$, calculate the hidden 
state $H_t^{GRU_j}$ as in (7) to (10). 

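$$f_t^{LSTM_i} = \sigma\left(W_f^{LSTM_i} \cdot [H_{t-1}^{LSTM_i}, X_t] + b_f^{LSTM_i}\right) \tag{1}$$

$$i_t^{LSTM_i} = \sigma\left(W_i^{LSTM_i} \cdot [H_{t-1}^{LSTM_i}, X_t] + b_i^{LSTM_i}\right) \tag{2}$$

$$\hat{C}_t^{LSTM_i} = \tanh\left(W_c^{LSTM_i} \cdot [H_{t-1}^{LSTM_i}, X_t] + b_c^{LSTM_i}\right) \tag{3}$$

$$C_t^{LSTM_i} = f_t^{LSTM_i} \odot C_{t-1}^{LSTM_i} + i_t^{LSTM_i} \odot \hat{C}_t^{LSTM_i} \tag{4}$$

$$o_t^{LSTM_i} = \sigma\left(W_o^{LSTM_i} \cdot [H_{t-1}^{LSTM_i}, X_t] + b_o^{LSTM_i}\right) \tag{5}$$

$$H_t^{LSTM_i} = o_t^{LSTM_i} \odot \tanh\left(C_t^{LSTM_i}\right) \tag{6}$$

$$z_t^{GRU_j} = \sigma\left(W_z^{GRU_j} \cdot [H_{t-1}^{GRU_j}, X_t] + b_z^{GRU_j}\right) \tag{7}$$

$$r_t^{GRU_j} = \sigma\left(W_r^{GRU_j} \cdot [H_{t-1}^{GRU_j}, X_t] + b_r^{GRU_j}\right) \tag{8}$$

$$\hat{H}_t^{GRU_j} = \tanh\left(W_h^{GRU_j} \cdot [r_t^{GRU_j} \odot H_{t-1}^{GRU_j}, X_t] + b_h^{GRU_j}\right) \tag{9}$$

$$H_t^{GRU_j} = \left(1 - z_t^{GRU_j}\right) \odot \hat{H}_t^{GRU_j} + z_t^{GRU_j} \odot H_{t-1}^{GRU_j} \tag{10}$$

Here $\sigma$ denotes the logistic sigmoid function and $\odot$ denotes element-wise multiplication. 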

The output of the last LSTM and GRU layers at the final time step $T$ is the final result of 
the model, as in (11). $X_t$ is the input at time step $t$, $H_t^{LSTM_i}$ is the hidden state of the 
$i$-th LSTM layer at time step $t$, $C_t^{LSTM_i}$ is the cell state of the $i$-th LSTM layer at 
time step $t$, and $H_t^{GRU_j}$ is the hidden state of the $j$-th GRU layer at time step $t$. 

 

 
$$\text{Output} = \left[H_T^{LSTM_{last}}, H_T^{GRU_{last}}\right] \tag{11}$$
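The gate equations can be made concrete with a minimal pure-Python sketch of one LSTM step and one GRU step. It is an illustration only, not the authors' implementation: the helper names (`sigmoid`, `affine`, `zero_params`) and the parameter dictionary layout are assumptions, the input gate is given its own weight matrix `Wi`, and the GRU update in (10) is assumed to interpolate between the candidate state and the previous hidden state $H_{t-1}$, which is the usual GRU form.

```python
import math

def sigmoid(a):
    """Logistic sigmoid used by all gates."""
    return 1.0 / (1.0 + math.exp(-a))

def affine(W, v, b):
    """Compute W . v + b for a weight matrix W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]

def lstm_step(h_prev, c_prev, x_t, p):
    """One LSTM time step following (1) to (6)."""
    v = h_prev + x_t                                              # [H_{t-1}, X_t]
    f = [sigmoid(a) for a in affine(p["Wf"], v, p["bf"])]         # (1) forget gate
    i = [sigmoid(a) for a in affine(p["Wi"], v, p["bi"])]         # (2) input gate
    c_hat = [math.tanh(a) for a in affine(p["Wc"], v, p["bc"])]   # (3) candidate cell
    c = [ft * cp + it * ch
         for ft, cp, it, ch in zip(f, c_prev, i, c_hat)]          # (4) new cell state
    o = [sigmoid(a) for a in affine(p["Wo"], v, p["bo"])]         # (5) output gate
    h = [ot * math.tanh(ct) for ot, ct in zip(o, c)]              # (6) new hidden state
    return h, c

def gru_step(h_prev, x_t, p):
    """One GRU time step following (7) to (10)."""
    v = h_prev + x_t
    z = [sigmoid(a) for a in affine(p["Wz"], v, p["bz"])]         # (7) update gate
    r = [sigmoid(a) for a in affine(p["Wr"], v, p["br"])]         # (8) reset gate
    v_r = [rt * hp for rt, hp in zip(r, h_prev)] + x_t            # [r_t * H_{t-1}, X_t]
    h_hat = [math.tanh(a) for a in affine(p["Wh"], v_r, p["bh"])] # (9) candidate state
    return [(1 - zt) * hh + zt * hp
            for zt, hh, hp in zip(z, h_hat, h_prev)]              # (10) new hidden state

def zero_params(names, hidden, inputs):
    """All-zero weights and biases, used only for a deterministic demonstration."""
    p = {}
    for n in names:
        p["W" + n] = [[0.0] * (hidden + inputs) for _ in range(hidden)]
        p["b" + n] = [0.0] * hidden
    return p

# With zero weights every gate equals 0.5 and every candidate equals 0.
h, c = lstm_step([0.0, 0.0], [1.0, -1.0], [0.5], zero_params("fico", 2, 1))
g = gru_step([1.0, -1.0], [0.5], zero_params("zrh", 2, 1))
print(c, g)  # [0.5, -0.5] [0.5, -0.5]
```

With all weights set to zero, each gate evaluates to 0.5, so the cell state and the GRU hidden state are simply halved, which makes the role of each gate easy to check by hand.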

 
Pseudocode 1 outlines the stacking of LSTM and GRU layers in a recurrent 
neural network (RNN). 

PSEUDOCODE 1. LSTM and GRU Stack 

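# Keras-style sketch. sequence_length, input_size, the hidden sizes, the layer counts,
# output_size, num_epochs, batch_size, train_inputs and target_data are defined beforehand.
from tensorflow.keras.layers import Input, LSTM, GRU, Dense, Concatenate
from tensorflow.keras.models import Model

input_data = Input(shape=(sequence_length, input_size))

# LSTM branch: the first layer reads the input, later layers read the previous layer.
hidden_states_lstm = []
for i in range(num_layers_lstm):
    lstm_input = input_data if i == 0 else hidden_states_lstm[-1]
    # Only the last layer returns a single vector (the hidden state at time step T).
    is_last = (i == num_layers_lstm - 1)
    lstm_layer = LSTM(hidden_size_lstm, return_sequences=not is_last)(lstm_input)
    hidden_states_lstm.append(lstm_layer)

# GRU branch, built in the same way.
hidden_states_gru = []
for j in range(num_layers_gru):
    gru_input = input_data if j == 0 else hidden_states_gru[-1]
    is_last = (j == num_layers_gru - 1)
    gru_layer = GRU(hidden_size_gru, return_sequences=not is_last)(gru_input)
    hidden_states_gru.append(gru_layer)

# Concatenate the final hidden states of both branches, as in (11).
final_lstm_hidden_state = hidden_states_lstm[-1]
final_gru_hidden_state = hidden_states_gru[-1]
combined_hidden_state = Concatenate(axis=-1)([final_lstm_hidden_state, final_gru_hidden_state])

# Linear output and MSE loss, because the target is a price (regression).
output_layer = Dense(output_size, activation='linear')(combined_hidden_state)
model = Model(inputs=input_data, outputs=output_layer)
model.compile(loss='mse', optimizer='adam', metrics=['mse'])
model.fit(train_inputs, target_data, epochs=num_epochs, batch_size=batch_size)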

 
Pseudocode for LSTM-GRU stacks represents a high-level algorithmic outline for constructing 

a deep neural network architecture that combines LSTM and GRU layers. This pseudocode 

specifies the critical steps for building a stacked RNN, starting with the definition of 

hyperparameters and input data placeholders, followed by creating multiple LSTM and GRU layers 

with their respective hidden states. The final hidden states of these layers can be concatenated or 

combined as needed for downstream tasks. By stacking LSTM and GRU units, the model aims to 

capture complex sequential patterns, making it particularly useful for tasks involving sequential 

data analysis.  

Standalone LSTM and GRU models have several limitations compared to model stacks that 
combine LSTM and GRU. The main limitations are as follows. Limited Long-Term Memory [57]: 
although LSTM and GRU are designed to overcome the vanishing gradient problem of RNN models, 
they still have limitations in handling long-term information. These models can remember 
information from several previous time steps, but over very long sequences they may still have 
difficulty. Higher Computational Cost: LSTM and GRU models are relatively computationally 
complex [58], particularly when used in deep or layered networks, which can result in longer 
training times and require greater computing resources [59]. 

Susceptible to Overfitting: LSTM and GRU models are more susceptible to overfitting when 

used on relatively small datasets [60]. Because the number of parameters in these models is 

significant, they can “memorize” existing training data rather than understanding general patterns. 

Not Optimal for Specific Tasks: While LSTM and GRU are reasonable solutions for many tasks in 

time series modeling, there are some specialized tasks, such as text processing (NLP), that require 

more specialized architectures, such as transformers [61]. 



 
To overcome these limitations, a stack of LSTM and GRU models can provide several 
advantages. Richer Representation Capabilities: a stack of LSTM and GRU models applies 
multiple LSTM and GRU layers sequentially [61], allowing the model to represent the 
data better and describe more complex relationships in the time series. Hierarchical Learning: the 
model stack can learn a hierarchy of information. The first layer can capture more basic 
patterns, while subsequent layers can capture increasingly abstract and complex patterns [61].  

Reduced Risk of Overfitting: combined with techniques such as dropout between layers, 
model stacks can help reduce the risk of overfitting, particularly if managed carefully 
[62]. Flexible Architectural Combinations: combining LSTM and GRU in various configurations in 
a model stack allows flexibility in designing the most appropriate architecture for a particular task 
[62]. However, stacked LSTM and GRU models also require careful tuning 
and attention to overfitting. The selection of an appropriate architecture and parameters 
significantly influences the quality of model predictions. 

B. Data Separation  

The dataset is divided into training data (600 days), test data (100 days), and prediction data (28 

days). Figure 3 shows the division of training data and test data as a history graph. The training 

procedure is conducted to create a model. Predictions were performed using training and test data 

to evaluate the performance of the resultant model as shown in Figure 4. 

 
Fig. 3. Separation of training data (green), test data (blue), and predictive data (yellow) 

 
Fig. 4. Predicted results of training data (magenta), and predicted results of test data (cyan) 
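A minimal sketch of the chronological split is given below, following the 600/100/28 division of the 728 daily closing prices stated in the Methods overview. The function name `split_series` and the placeholder series are assumptions made for illustration; the real input would be the "Close" column of the BRIS dataset.

```python
def split_series(close, n_train=600, n_test=100, n_pred=28):
    """Chronological split into training, test, and held-out prediction parts."""
    if len(close) != n_train + n_test + n_pred:
        raise ValueError("series length does not match the requested split")
    train = close[:n_train]
    test = close[n_train:n_train + n_test]
    pred = close[n_train + n_test:]          # the last 28 days, never used in training
    return train, test, pred

# Placeholder series standing in for the 728 daily "Close" values.
close = [float(day) for day in range(728)]
train, test, pred = split_series(close)
print(len(train), len(test), len(pred))  # 600 100 28
```

The split keeps the time order intact, so the test and prediction segments always lie after the training segment.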

 
The prediction data (28 days) is held out and used only to evaluate prediction outcomes; it 
is not included in the training process. To forecast these 28 days, for which no target data is 
available during training, we employ repeated training runs that are carried out 
separately for each prediction horizon from 1 day to 28 days ahead. The input data spans 7 days, 
whereas the target data spans 1 day. The forecast for 
the first day is trained with one day of target data taken one day after the training input window. 
The forecast for the second day uses one day of target data taken two days after the 
input window, and so on, until the forecast for the 28th day uses one day of target data 
taken 28 days after the input window. Each training procedure is repeated ten times, 
each time with a distinct 4-layer LSTM-GRU arrangement [63], to get the best outcomes. Figure 5 
depicts the separation of input and target data for forecasts from one to 28 days. 

  
Fig. 5. Illustration of training and target data separation for predictions ranging from 1 to 28 days 
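The input and target segmentation illustrated in Figure 5 can be sketched as a sliding-window routine with one dataset per forecast horizon. This is a hedged illustration rather than the authors' code: the function name `make_windows` and the toy series are assumptions, and the 7-day input window and single-day target are taken from the text.

```python
def make_windows(series, horizon, window=7):
    """Pair each `window`-day input with the single value `horizon` days after it."""
    inputs, targets = [], []
    last_start = len(series) - window - horizon
    for start in range(last_start + 1):
        end = start + window                      # index just past the input window
        inputs.append(series[start:end])
        targets.append(series[end + horizon - 1])
    return inputs, targets

# Toy series so that every value equals its own day index.
series = list(range(40))
for horizon in (1, 2, 28):
    X, y = make_windows(series, horizon)
    print(horizon, len(X), X[0], y[0])
```

One model would then be trained per horizon (1 to 28), each on its own set of pairs, so no prediction is fed back as an input to a later prediction.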

 
C. Modeling  

Each training procedure is carried out with 10 variations of four-layer LSTM-GRU 
arrangements to find the best-performing model: Var-01: GGGG, Var-02: GGGL, Var-03: GGLL, 
Var-04: GLGL, Var-05: GLLG, Var-06: LGGL, Var-07: LGLG, Var-08: LLGG, Var-09: LLLG, 
and Var-10: LLLL. The letter L represents LSTM, and the letter G represents GRU. Each 
training procedure uses four epoch settings (200, 500, 750, and 1000). The number of epochs 
(passes through the entire training dataset) is an important choice when training a neural 
network model. The reasons for choosing these four epoch settings are as follows: 

• Convergence Requirements: The number of epochs used in model training depends mainly on 
the complexity of the model, the volume of data, and the desired level of convergence. The 

more complex the model, the longer it takes to reach convergence. The number of epochs 

spanning four points (200, 500, 750, and 1000) reflects an attempt to examine how the model 

behaves at various points in training, from early to more advanced stages. 

• Performance Monitoring: During training, it is essential to monitor model performance on 
validation or test datasets to prevent overfitting. By using several different epoch points, we can 

examine how the model behaves over time. Observing whether the model's performance 
continues to improve, reaches a peak, or even degrades at a certain point helps decide when 
to stop training or take other actions, such as reducing the learning rate or adjusting the model 
architecture. 


 
• Probability Map Exploration: Trying several different epoch points also makes it possible to 
explore the range of the model's behavior. For example, at the initial point (200), the 
model may not yet have converged and may be biased towards the training data. At the midpoints 
(500 and 750), the model can approach convergence and begin to fit the validation data. At the 
endpoint (1000), one can see whether the model continues improving in performance or has reached a 
saturation point. 

• Stability Evaluation: The stability of the model can also be assessed through these four epoch 
points. Highly fluctuating behavior at early points in training may 
indicate that the learning rate is too high or that the complexity of the model needs to be adjusted. 
Conversely, if the model shows good stability at specific points, this may indicate that the 
process has found a suitable training configuration. 

• Testing and Generalisation: Once training is complete at the endpoint (1000), the process can 
then test the model on never-before-seen data to measure generalization capabilities. If the 

model can produce good results on the test data, this will indicate that the training has been 

successful. 

The selection of these four epoch points provides a rich perspective on how the model develops 

its performance over time. However, keep in mind that in practice, the choice of the number of 

epochs must also be considered along with other factors such as learning rate, batch size, model 

complexity, and the characteristics of the data used. 
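The idea of inspecting the model at several epoch points can be sketched as a small checkpoint loop. This sketch assumes that a single training run is continued and evaluated at each checkpoint; the text does not state whether the authors did this or trained each epoch setting from scratch. The callables `train_one_epoch` and `evaluate` are hypothetical stand-ins, and the toy error curve is arbitrary.

```python
def evaluate_at_checkpoints(train_one_epoch, evaluate, checkpoints=(200, 500, 750, 1000)):
    """Train continuously and record the evaluation score at each checkpoint."""
    scores = {}
    epoch = 0
    for target in sorted(checkpoints):
        while epoch < target:
            train_one_epoch()
            epoch += 1
        scores[target] = evaluate()
    best = min(scores, key=scores.get)   # checkpoint with the lowest error
    return scores, best

# Toy stand-ins: a counter as the "model" and an arbitrary error curve.
state = {"epochs": 0}

def toy_train():
    state["epochs"] += 1

def toy_eval():
    return (state["epochs"] - 600) ** 2 / 1000.0

scores, best = evaluate_at_checkpoints(toy_train, toy_eval)
print(scores, best)
```

Comparing the recorded scores shows whether the error is still falling, has levelled off, or has started to rise at the later checkpoints.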

The Adam optimizer is used to train the model, with a learning rate of 1.001, 
50 nodes in each layer, and a batch size of 64. Figure 6 depicts the process from input to deep 
learning models with 10 variations, predictions, and MSE values produced for each model variant. 
Adam combines the concepts of momentum (to help handle local minima) and RMSprop (to scale the 
learning rate) in one algorithm. It uses exponential moving averages of the gradient (first moment) 
and of the squared gradient (second moment) to calculate weight updates. The effective step size can 
therefore vary for each parameter based on its gradient history. Both estimates are bias-corrected 
to compensate for their initialization at zero. 
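The Adam update described above can be sketched for a single scalar parameter as follows. The defaults shown (step size 0.001, decay rates 0.9 and 0.999, epsilon 1e-8) are the commonly used Adam defaults and are assumptions here, not the settings reported in this study; the function name `adam_step` is also illustrative.

```python
import math

def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter at step t (t starts at 1)."""
    m = beta1 * m + (1 - beta1) * grad          # moving average of the gradient
    v = beta2 * v + (1 - beta2) * grad ** 2     # moving average of the squared gradient
    m_hat = m / (1 - beta1 ** t)                # bias correction for the zero start
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v

# First step from theta = 0 with gradient 1: the move is close to the step size itself.
theta, m, v = adam_step(0.0, 1.0, 0.0, 0.0, t=1)
print(theta)
```

Because the gradient is divided by the root of its own squared average, the size of the first update is close to the step size regardless of the gradient's magnitude.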

 
Fig. 6. Input, model, prediction results, and performance evaluation using MSE 

 
Learning Rate 1.001: The learning rate is the factor that controls the extent to which the model 

will adjust based on the gradient of the training data. A value of 1.001 is relatively high, and 


 
usually, smaller learning rate values (e.g., 0.001) are used to ensure stable convergence. Hidden 

Layer 50: This refers to the number of nodes (neurons) in each hidden layer in a neural network. 

This value shows the complexity of the model that has been created. The more nodes, the greater 

the model's ability to capture complex patterns in the data, but it can also increase the risk of 

overfitting if the training data is limited. Batch Size 64: This is the number of data samples used in 
each weight update iteration (mini-batch learning iteration). Larger batches can speed up training 
through more efficient use of hardware, but they also require more memory. Too small a batch can 
cause unstable convergence. A batch size of 64 is a commonly used value. 
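The ten stack arrangements and the hyperparameters above can be tied together in a short sketch. It assumes a purely sequential four-layer stack with a single dense output, which is one plausible reading of the variant codes; `layer_plan` and `build_model` are hypothetical names, `build_model` needs TensorFlow installed, and the learning rate is left as a required argument rather than fixed here.

```python
VARIANTS = ["GGGG", "GGGL", "GGLL", "GLGL", "GLLG",
            "LGGL", "LGLG", "LLGG", "LLLG", "LLLL"]

def layer_plan(variant, units=50):
    """Translate a stack code such as "LGLG" into (cell, units, return_sequences) tuples."""
    cells = {"L": "LSTM", "G": "GRU"}
    plan = []
    for pos, code in enumerate(variant):
        # Every layer except the last passes its full output sequence to the next layer.
        plan.append((cells[code], units, pos < len(variant) - 1))
    return plan

def build_model(variant, learning_rate, window=7, units=50):
    """Build the sequential stack with Keras (requires TensorFlow)."""
    from tensorflow import keras  # imported lazily so layer_plan works without TensorFlow
    model = keras.Sequential([keras.Input(shape=(window, 1))])
    for cell, n, return_sequences in layer_plan(variant, units):
        layer_cls = getattr(keras.layers, cell)   # keras.layers.LSTM or keras.layers.GRU
        model.add(layer_cls(n, return_sequences=return_sequences))
    model.add(keras.layers.Dense(1))              # one predicted closing price
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=learning_rate), loss="mse")
    return model

print(layer_plan("LGLG"))
```

A model built this way would be trained with a call such as `model.fit(X, y, epochs=750, batch_size=64)`, where `X` has shape (samples, 7, 1) and `y` holds the single-day targets.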

D. Evaluation Criteria 

To assess model effectiveness, we employ the Mean Square Error 
(MSE). MSE is calculated as the sum of the squared errors between the predicted 
values and the 28 held-out observations (actual data), divided by the 
sample size. A lower MSE value indicates better performance [64]. The formulation for MSE is 
shown in (12), where $p$ denotes the predicted data, $r$ denotes the actual data 
(held-out observations), and $n$ is the number of samples. 

$$MSE = \frac{1}{n} \sum (p - r)^2 \tag{12}$$

A lower MSE value indicates that the model predicts stock prices more 
accurately, meaning that the difference between model predictions and actual stock prices 
tends to be smaller. Conversely, a high MSE value indicates a significant mismatch 
in the predicted stock prices. MSE is a simple and easy-to-understand metric. Because the errors 
are squared, MSE gives high weight to large errors in 
predictions, which is helpful in cases where outliers (large differences between predicted and 
actual values) must be considered. Using MSE to evaluate forecasting models for the next 28 
days helps to measure the quality of model predictions, compare different models, and 
update the model if necessary. 
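As a small worked example of (12), the sketch below computes the MSE for four made-up prediction and observation pairs. The numbers are illustrative only and are not taken from the study.

```python
def mse(predicted, actual):
    """Mean square error between two equally long sequences, as in (12)."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - r) ** 2 for p, r in zip(predicted, actual)) / len(predicted)

p = [10.0, 12.0, 15.0, 11.0]   # illustrative predictions
r = [10.0, 11.0, 13.0, 15.0]   # illustrative observations
print(mse(p, r))  # (0 + 1 + 4 + 16) / 4 = 5.25
```

The single error of 4 contributes 16 of the 21 squared units, which shows how strongly MSE weights the largest deviations.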

III. Results and Discussions 

The training procedure used 10 model variations and 4 epoch settings (200, 500, 750, and 1000), 
resulting in 40 prediction graphs with 120 MSE measures. Because of page limits, we provide only 
one graph of the predicted outcomes (out of 40 graphs) covering the training data, test data, and 
28 days of prediction data (Figure 7). To make the 28-day forecast more visible, 
we enlarged the relevant section (Figure 8). Figure 8 indicates that the 28-day forecast 
fluctuates within an acceptable range until day 28 and continues to follow the pattern of the 
original data. This contrasts with typical long-term prediction approaches, which tend to converge 
towards a specific value and show increasing bias at longer forecast horizons. 

 
Fig. 7. Prediction results for training data, test data, and 28-day prediction data in full size 



 
Fig. 8. Enlarged view of test predictions and 28-day predictions 

 
Tables 1 and 2 list all MSE values for the training, test, and 28-day 
predictions in numerical form, and the same values are also presented graphically. The tables 
show that the best model for predicting training and test data is Var-10 with the 
LSTM-LSTM-LSTM-LSTM (LLLL) stack architecture, with MSE values of 1795.1927 and 1485.7672, 
respectively. Meanwhile, Var-07 with the LSTM-GRU-LSTM-GRU (LGLG) stack architecture is 
the best model for 28-day predictive data, with an MSE of 63.4376. 

Table 1 summarizes the MSE evaluation of all training procedures for epochs 200 and 500. 
Each model is a stack of four sequential layers built from two types of memory cells, 
the GRU and the LSTM. At epoch 200, the lowest MSE for the 28-day prediction is 90.8903961, 
obtained by Var-05 (GLLG). This value indicates a relatively large error, meaning that the 
difference between the predicted stock price and the actual stock price at each time point 
is still considerable. An MSE of 
90.8903 therefore indicates that the GRU-LSTM-LSTM-GRU stack model needs to be refined to 
improve the quality of stock price predictions. Careful evaluation and model adjustment are 
essential to overcome these limitations and achieve more accurate predictions. 

Table 1. The MSE of the whole training procedure in numerical form for epochs 200–500

                 MSE, Epoch-200                     MSE, Epoch-500
Var     Stack    Train      Test       Pred-28      Train      Test       Pred-28
Var-01  G G G G  1857.0976  1541.8996  111.25726    1856.6352  1529.7655  91.91923
Var-02  G G G L  1858.9776  1533.7580  113.43025    1846.4071  1525.7432  103.7180
Var-03  G G L L  1924.8036  1591.7732  130.17904    1867.3014  1534.2615  78.1313
Var-04  G L G L  1839.2664  1519.7423  103.48911    1873.7208  1560.0412  82.2884
Var-05  G L L G  1809.4829  1495.1547  90.8903      1853.8268  1529.8587  95.7873
Var-06  L G G L  1811.1107  1495.3051  115.3705     1854.5001  1528.7287  77.9630
Var-07  L G L G  1854.3511  1534.2149  113.8544     1856.7671  1534.1344  74.1705
Var-08  L L G G  1850.9032  1527.7330  113.4182     1890.1816  1569.1906  70.1206
Var-09  L L L G  1854.5735  1535.9987  116.4838     1865.5825  1539.1219  78.8466
Var-10  L L L L  1795.1926  1485.7672  139.1183     1841.9540  1530.2617  82.5902

G = GRU layer, L = LSTM layer.

 
Table 2 summarizes the results at epochs 750 and 1000. The best prediction for the next 28 days is obtained at epoch 750, with an MSE of 63.4376, by variant 7, the stack of LSTM, GRU, LSTM, and GRU. MSE measures the average of the squared differences between the model predictions and the actual values. An MSE of 63.4376 therefore means that the mean squared difference between the predicted and actual stock values over the next 28 days is approximately 63.44, expressed in squared units of the stock price. A lower MSE indicates a better model because the predictions are, on average, closer to the actual values; in general, an MSE of 63.44 indicates fairly good prediction quality. An epoch is one complete pass through the training dataset. By the 750th epoch, the model has passed through the data many times and has repeatedly adjusted the weights and parameters used to make predictions.
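As a minimal sketch of the metric discussed above, the snippet below computes MSE as the mean of squared differences and converts the reported 28-day MSE into a root mean squared error (RMSE), which is expressed in the same units as the price itself. The helper name `mse` and the four sample prices are made up for illustration and are not data from this study.

```python
import math

def mse(actual, predicted):
    """Mean of squared differences between paired actual and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

# Made-up prices, used only to exercise the function (not data from the paper).
actual = [100.0, 102.0, 101.0, 105.0]
predicted = [98.0, 103.0, 104.0, 105.0]
example_mse = mse(actual, predicted)  # (4 + 1 + 9 + 0) / 4

# The square root brings the reported MSE back to price units.
reported_mse = 63.4376
reported_rmse = math.sqrt(reported_mse)
print(example_mse, round(reported_rmse, 2))
```

Read this way, an MSE of 63.4376 corresponds to a typical error of roughly 8 price units per day over the 28-day horizon.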

Table 2. The MSE of the whole training procedure in numerical form for epochs 750–1000

                 MSE, Epoch-750                     MSE, Epoch-1000
Var     Stack    Train      Test       Pred-28      Train      Test       Pred-28
Var-01  G G G G  1889.6429  1558.6723  215.5718     1844.8476  1521.5133  237.05712
Var-02  G G G L  1831.7365  1503.1840  156.7340     1838.4467  1524.5493  246.4624
Var-03  G G L L  1845.0175  1523.0437  152.2221     1845.1898  1529.3063  202.8599
Var-04  G L G L  1844.9218  1528.9018  122.2034     1843.0441  1522.9162  129.7584
Var-05  G L L G  1841.6457  1524.5890  63.7433      1843.1699  1531.5649  107.1425
Var-06  L G G L  1863.3424  1538.3165  120.0539     1855.0585  1531.2086  146.0880
Var-07  L G L G  1881.2632  1560.3401  63.4376      1845.0715  1529.2144  161.1198
Var-08  L L G G  1847.7620  1529.9814  87.0923      1845.5463  1521.8910  77.7621
Var-09  L L L G  1832.0935  1514.4305  65.8479      1868.4365  1541.8586  85.3185
Var-10  L L L L  1813.4162  1499.3618  81.2523      1835.4334  1518.0367  94.5993

G = GRU layer, L = LSTM layer.
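The model selection implied by Tables 1 and 2 can be reproduced with a short script. The sketch below copies the Pred-28 MSE columns from both tables and picks the variant with the lowest value at each epoch; the names `PRED28_MSE`, `EPOCHS`, and `best_per_epoch` are illustrative and do not come from the authors' code.

```python
# Pred-28 MSE per variant, copied from Tables 1 and 2.
# Tuple order follows EPOCHS: 200, 500, 750, 1000.
EPOCHS = (200, 500, 750, 1000)
PRED28_MSE = {
    "Var-01 GGGG": (111.25726, 91.91923, 215.5718, 237.05712),
    "Var-02 GGGL": (113.43025, 103.7180, 156.7340, 246.4624),
    "Var-03 GGLL": (130.17904, 78.1313, 152.2221, 202.8599),
    "Var-04 GLGL": (103.48911, 82.2884, 122.2034, 129.7584),
    "Var-05 GLLG": (90.8903, 95.7873, 63.7433, 107.1425),
    "Var-06 LGGL": (115.3705, 77.9630, 120.0539, 146.0880),
    "Var-07 LGLG": (113.8544, 74.1705, 63.4376, 161.1198),
    "Var-08 LLGG": (113.4182, 70.1206, 87.0923, 77.7621),
    "Var-09 LLLG": (116.4838, 78.8466, 65.8479, 85.3185),
    "Var-10 LLLL": (139.1183, 82.5902, 81.2523, 94.5993),
}

def best_per_epoch(table):
    """Return {epoch: (variant, mse)} holding the lowest MSE at each epoch."""
    best = {}
    for i, epoch in enumerate(EPOCHS):
        variant = min(table, key=lambda name: table[name][i])
        best[epoch] = (variant, table[variant][i])
    return best

best = best_per_epoch(PRED28_MSE)

# The overall winner is the epoch/variant pair with the smallest MSE.
overall = min(best.items(), key=lambda item: item[1][1])

for epoch, (variant, value) in best.items():
    print(epoch, variant, value)
print("overall:", overall)
```

On these values the best variant changes with the epoch, and the overall minimum is Var-07 (LGLG) at epoch 750, which matches the model reported as best for the 28-day forecast.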

 
The LSTM, GRU, LSTM, GRU stack gives the model the ability to capture complex patterns in time-series data. LSTM can retain information over the long term, while GRU handles short-term information more efficiently, and the combination lets the model draw on the advantages of both. The prediction results for the next 28 days show that the variant 7 model, with the LSTM, GRU, LSTM, GRU stack, has the potential to provide fairly good stock price predictions. However, these predictions must be integrated into a careful investment strategy that pays attention to the risk factors that may influence stock prices. Figure 9 to Figure 11 show the MSE values of the training process for training data, test data, and 28-day data predictions, respectively.
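To make the idea of a four-layer LSTM-GRU-LSTM-GRU stack concrete, the sketch below implements textbook LSTM and GRU forward passes in NumPy and chains them so that each layer consumes the full hidden-state sequence of the layer beneath it. It is an untrained, forward-only illustration under assumed settings: the class names, the hidden size of 8, the window length of 30, and the random weights are all hypothetical and do not reproduce the authors' implementation or hyperparameters.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class LSTMLayer:
    def __init__(self, n_in, n_hidden, rng):
        # One matrix holds the input, forget, output, and candidate gates.
        self.W = rng.normal(0.0, 0.1, (n_in + n_hidden, 4 * n_hidden))
        self.b = np.zeros(4 * n_hidden)
        self.n_hidden = n_hidden

    def __call__(self, seq):
        h = np.zeros(self.n_hidden)
        c = np.zeros(self.n_hidden)
        outputs = []
        for x in seq:
            i, f, o, g = np.split(np.concatenate([x, h]) @ self.W + self.b, 4)
            c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)  # cell state update
            h = sigmoid(o) * np.tanh(c)                   # hidden state
            outputs.append(h)
        return np.array(outputs)  # full sequence, shape (time, n_hidden)

class GRULayer:
    def __init__(self, n_in, n_hidden, rng):
        self.W_zr = rng.normal(0.0, 0.1, (n_in + n_hidden, 2 * n_hidden))
        self.W_h = rng.normal(0.0, 0.1, (n_in + n_hidden, n_hidden))
        self.b_zr = np.zeros(2 * n_hidden)
        self.b_h = np.zeros(n_hidden)
        self.n_hidden = n_hidden

    def __call__(self, seq):
        h = np.zeros(self.n_hidden)
        outputs = []
        for x in seq:
            gates = sigmoid(np.concatenate([x, h]) @ self.W_zr + self.b_zr)
            z, r = np.split(gates, 2)  # update gate, reset gate
            h_new = np.tanh(np.concatenate([x, r * h]) @ self.W_h + self.b_h)
            h = (1.0 - z) * h + z * h_new
            outputs.append(h)
        return np.array(outputs)

def build_stack(pattern, n_features, n_hidden, rng):
    """Build layers from a pattern string such as 'LGLG' (L = LSTM, G = GRU)."""
    layers, n_in = [], n_features
    for kind in pattern:
        if kind not in "LG":
            raise ValueError(f"unknown layer code: {kind!r}")
        cls = LSTMLayer if kind == "L" else GRULayer
        layers.append(cls(n_in, n_hidden, rng))
        n_in = n_hidden
    return layers

def forward(layers, seq):
    for layer in layers:
        seq = layer(seq)
    return seq[-1]  # last hidden state of the top layer

rng = np.random.default_rng(0)
stack = build_stack("LGLG", n_features=1, n_hidden=8, rng=rng)
window = rng.normal(size=(30, 1))  # placeholder input window, not BRIS data
top_state = forward(stack, window)
print(top_state.shape)
```

Changing the pattern string (for example to "LLLL" or "GLLG") yields the other stack variants listed in Tables 1 and 2; in a trained model a final dense layer would map `top_state` to the predicted price.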

 
Fig. 9. The MSE values of the whole training process for training data 

 

 
Fig. 10. The MSE values of the whole training process for test data 

 
Fig. 11. The MSE values of the whole training process for 28-day prediction data 

 
Table 3 presents a performance comparison with existing models. The study in [31] proposes a new model for optimizing stock forecasting that incorporates a range of technical indicators, including investor sentiment indicators and financial data. It reduces the dimension of the many factors influencing the stock price using the LASSO and PCA approaches and forecasts with LSTM and GRU models. It finds that LSTM and GRU models can effectively predict stock prices and that the LASSO dimension reduction method performs better than PCA. In [65], the LSTM, bi-LSTM, GRU, and ordinary neural network (NN) modules are each designed sequentially to forecast the stock price, and the performance of each separate model is compared with that of the suggested hybrid model. The proposed stock price prediction model is implemented on NIFTY-50 stock market data, and the predicted values are reported along with the actual stock opening prices for (a) 100 days, (b) 300 days, (c) 500 days, and (d) 1000 days.

The authors of [66] compared the performance of six deep-learning algorithms for predicting stock closing prices on the Indonesian Stock Exchange and proposed a CNN-LSTM-GRU hybrid algorithm, which outperforms the other methods in terms of accuracy. The study in [67] proposes a trading strategy designed for the Moroccan stock market based on two deep learning models, LSTM and GRU, which predict the close price for the short- and mid-term horizons, respectively. The proposed strategy outperforms benchmark indices in the Moroccan market; future work includes focusing on medium- and long-term predictions.

Table 3. Performance study of present models

Reference        Methods                                                 Results
[31]             Depth learning LASSO and PCA approaches;                MSE 733.8773
                 LSTM and GRU models
[65]             Bi-LSTM and GRU models                                  MSE 0.0018
[66]             CNN-LSTM-GRU hybrid algorithm                           RMSE decreased by 14%, MAE reduced by 13.4%, R2 3.9%
[67]             LSTM and GRU models                                     MSE 0.57
[68]             LSTM and GRU                                            MAPE 97.37%
[69]             Two-layer stacked LSTM (TLS-LSTM);                      MSE 0.0015129
                 correlation analysis between different currency pairs
[70]             Stacked-Bi-LSTM                                         RMSE 0.025
Proposed models  LSTM-GRU-LSTM-GRU stack                                 MSE 63.44

 
The authors of [68] use LSTM and GRU to propose eight new architectural models for stock price forecasting. The models identify joint movement patterns in the stock market by combining the LSTM and GRU algorithms with four neural network block architectures, are applied to grouped time-series data, and are evaluated using three accuracy measures. In [69], a TLS-LSTM neural network was used to forecast the trend of the Australian Dollar against the United States Dollar (AUD/USD) and to conduct a correlation analysis between different currency pairs. TLS-LSTM outperforms other models in Forex trend prediction, and the AUD/USD movement affects EUR/AUD and AUD/JPY. The study in [70] offers the Stacked Bi-LSTM (SBiLSTM) architecture, a modification of the conventional Deep Long-Short Term Memory (TDLM). Two time series from oilfield production are used to test the method, and the performance of the proposed SBiLSTM model is compared with that of multi-layer RNNs, Deep GRU, and Deep LSTM.

IV. Conclusions 

Machine learning can deliver improved long-term prediction performance for PT Bank Syariah Indonesia Tbk (BRIS) shares, which is critical for investors when making stock market decisions. These results may also assist analysts in developing long-term financial strategy indicators. In this paper, we propose a distinct training approach for 1-day to 28-day forecasts utilizing 10 variants of deep learning models built from four-layer LSTM-GRU stacks and tailored input-target data segmentation algorithms. Using BRIS stock history data from 01-07-2020 to 01-07-2023 (728 days), the LSTM-LSTM-LSTM-LSTM (LLLL) stack is the best model for predicting the training and test data. Furthermore, the LSTM-GRU-LSTM-GRU (LGLG) stack model gives the most accurate long-term forecast for the next 28 days.
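The input-target segmentation for 1-day to 28-day forecasts can be pictured with a generic sliding-window sketch. The code below is an assumption-laden illustration, not the authors' algorithm: it pairs each window of past values with the value a fixed number of days ahead and builds one dataset per horizon. The function name `make_windows`, the window length of 30, and the placeholder series are all hypothetical.

```python
import numpy as np

def make_windows(series, n_input, horizon):
    """Pair each run of n_input values with the value `horizon` steps after it."""
    series = np.asarray(series, dtype=float)
    n_samples = len(series) - n_input - horizon + 1
    if n_samples <= 0:
        raise ValueError("series too short for this window and horizon")
    # Each row of X is one input window.
    X = np.stack([series[i:i + n_input] for i in range(n_samples)])
    # The target for row i is the value `horizon` steps past the end of window i.
    start = n_input + horizon - 1
    y = series[start:start + n_samples]
    return X, y

# Placeholder for 728 daily closing prices (not the actual BRIS data).
prices = np.arange(728, dtype=float)

# One input-target dataset per forecast horizon, from 1 day to 28 days ahead.
datasets = {h: make_windows(prices, n_input=30, horizon=h) for h in range(1, 29)}

X1, y1 = datasets[1]
X28, y28 = datasets[28]
print(X1.shape, y1.shape, X28.shape, y28.shape)
```

Training a separate model on each horizon's dataset lets every forecast day be predicted directly from observed data rather than from earlier forecasts.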

The graphs produced with the modified input-target data segmentation approach exhibit fluctuations and correlate well with the observed data. In contrast, long-term forecasts produced by the deep learning approach alone (without input-target data segmentation) do not exhibit significant volatility but tend towards a constant (convergent) value. Long-term predictive research with even better accuracy is still possible, either by applying different methodologies or by extending the techniques and procedures we have developed.

The LSTM-GRU-LSTM-GRU stack model is a complex model that can handle complex time-series data well. However, managing and maintaining such models requires considerable computing resources and a deep understanding of time-series modeling. Overall, the LSTM-GRU-LSTM-GRU stack model can be a handy tool for forecasting long-term stock prices. However, it should be used as one aspect of broader analysis and decision-making when investing in the stock market.

 
Declarations  

Author contribution  

All authors contributed equally as the main contributor of this paper. All authors read and approved the final paper. 

Funding statement  

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit 
sectors.  

Conflict of interest  

The authors declare no known conflict of financial interest or personal relationships that could have appeared to 
influence the work reported in this paper.  

Additional information  

Reprints and permission information are available at http://journal2.um.ac.id/index.php/keds. 

Publisher’s Note: Department of Electrical Engineering and Informatics - Universitas Negeri Malang remains neutral with 

regard to jurisdictional claims and institutional affiliations. 

 
References 

[1] World Bank, Leveraging Islamic Fintech to Improve Financial Inclusion. World Bank, 2020. 

[2] M. A. Khattak and N. A. Khan, “Islamic Finance, Growth, and Volatility: a Fresh Evidence From 82 Countries,” J. 
Islam. Monet. Econ. Financ., vol. 9, no. 1, pp. 39–56, 2023. 

[3] E. Santi, B. Budiharto, and H. Saptono, “Pengawasan Otoritas Jasa Keuangan Terhadap Financial Technology 
(Peraturan Otoritas Jasa Keuangan Nomor 77/POJK.01/2016),” Diponegoro Law Journal, vol. 6, no. 3, pp. 1–20, 
Jul. 2017. 

[4] S. Syarifuddin, R. Muin, and A. Akramunnas, “The Potential of Sharia Fintech in Increasing Micro Small and 
Medium Enterprises (MSMEs) in The Digital Era in Indonesia,” J. Huk. Ekon. Syariah, vol. 4, no. 1, p. 23, 2021. 

[5] R. A. Kasri and M. W. Sosianti, “Determinants of the Intention To Pay Zakat Online: the Case of Indonesia,” J. 
Islam. Monet. Econ. Financ., vol. 9, no. 2, pp. 275–294, 2023. 

[6] H. Hiyanti, L. Nugroho, C. Sukamadilaga, and T. Fitrijanti, “Sharia Fintech (Financial Technology) Opportunities 
and Challenges in Indonesia,” J. Ilm. Ekon. Islam, vol. 5, no. 03, pp. 326–333, 2019. 

[7] M. A. Kurniawan, M. Anwar, and S. R. Nidar, “Developing a Strategy for Islamic Money Market Model to Enhance 
Quality of Islamic Banking Performance during the Pandemic in Indonesia 2021,” Qual. - Access to Success, vol. 
23, no. 190, pp. 261–268, 2022. 

[8] N. Nurdin and K. Yusuf, “Knowledge management lifecycle in Islamic bank: the case of syariah banks in 
Indonesia,” Int. J. Knowl. Manag. Stud., vol. 11, no. 1, pp. 59–80, Jan. 2020. 

[9] S. M. Anwar, J. Junaidi, S. Salju, R. Wicaksono, and M. Mispiyanti, “Islamic bank contribution to Indonesian 
economic growth,” Int. J. Islam. Middle East. Financ. Manag., vol. 13, no. 3, pp. 519–532, Jan. 2020. 

[10] M. H. Ali, M. A. Uddin, M. A. R. Khan, and B. Goud, “Faith-based versus value-based finance: Is there any 
portfolio diversification benefit between responsible and Islamic finance?,” Int. J. Financ. Econ., vol. 26, no. 4, pp. 
5570–5583, Oct. 2021. 

[11] S. Alhammadi, “Expanding financial inclusion in Indonesia through Takaful: opportunities, challenges and 
sustainability,” J. Financ. Report. Account., vol. ahead-of-print, no. ahead-of-print, Jan. 2023. 

[12] A. D. Songer, J. Diekmann, W. Hendrickson, and D. Flushing, “Situational Reengineering: Case Study Analysis,” J. 
Constr. Eng. Manag., vol. 126, no. 3, pp. 185–190, May 2000. 

[13] M. Mursyid, H. Kusuma, A. Tohirin, and J. Sriyana, “Performance Analysis of Islamic Banks in Indonesia: The 
Maqashid Shariah Approach,” J. Asian Financ. Econ. Bus., vol. 8, no. 3, pp. 307–318, 2021. 

[14] X. Ding, R. Haron, and A. Hasan, “The Influence Of Basel III On Islamic Bank Risk,” J. Islam. Monet. Econ. 
Financ., vol. 9, no. 1, pp. 167–198, 2023. 

[15] E. B. Boukherouaa et al., Powering the Digital Economy: Opportunities and Risks of Artificial Intelligence in 
Finance. International Monetary Fund, 2021. 

[16] M. Asutay, P. F. Aziz, B. S. Indrastomo, and Y. Karbhari, “Religiosity and Charitable Giving on Investors’ Trading 
Behaviour in the Indonesian Islamic Stock Market: Islamic vs Market Logic,” J. Bus. Ethics, 2023. 

[17] D. Defrizal, K. Romli, A. Purnomo, and H. A. Subing, “A Sectoral Stock Investment Strategy Model in Indonesia 
Stock Exchange,” J. Asian Financ. Econ. Bus., vol. 8, no. 1, pp. 15–22, 2021. 

[18] A. Thakkar and K. Chaudhari, “A Comprehensive Survey on Portfolio Optimization, Stock Price and Trend 
Prediction Using Particle Swarm Optimization,” Arch. Comput. Methods Eng., vol. 28, no. 4, pp. 2133–2164, 2021. 

[19] E. I. Ardyanta and H. Sari, “A Prediction of Stock Price Movements Using Support Vector Machines in Indonesia,” 
J. Asian Financ. Econ. Bus., vol. 8, no. 8, pp. 399–407, 2021. 

[20] W. Budiharto, “Data science approach to stock prices forecasting in Indonesia during Covid-19 using Long Short-Term 
Memory (LSTM),” J. Big Data, vol. 8, no. 1, p. 47, 2021. 

[21] M. Kunwar, “Artificial Intelligence In Finance Understanding how automation and machine learning is transforming 
the financial industry,” no. August, 2019. 

[22] A. Saranya and R. Anandan, “Stock market prediction using machine learning algorithms,” Int. J. Recent Technol. 
Eng., vol. 8, no. 2 Special Issue 4, pp. 280–283, 2019. 

[23] S. Ahmed, M. M. Alshater, A. El Ammari, and H. Hammami, “Artificial intelligence and machine learning in 
finance: A bibliometric review,” Res. Int. Bus. Financ., vol. 61, p. 101646, 2022. 

[24] C. Milana and A. Ashta, “Artificial intelligence techniques in finance and financial markets: A survey of the 
literature,” Strateg. Chang., vol. 30, no. 3, pp. 189–209, May 2021. 

[25] W. Hastomo, A. S. B. Karno, N. Kalbuana, E. Nisfiani, and L. ETP, “Optimasi Deep Learning untuk Prediksi 
Saham di Masa Pandemi Covid-19,” J. Edukasi dan Penelit. Inform., vol. 7, no. 2, p. 133, Aug. 2021. 

[26] N. Navarin, B. Vincenzi, M. Polato, and A. Sperduti, “LSTM networks for data-aware remaining time prediction of 
business process instances,” in 2017 IEEE Symposium Series on Computational Intelligence (SSCI), 2017, pp. 1–7. 

[27] M. O. Rahman, M. S. Hossain, T.-S. Junaid, M. S. A. Forhad, and M. K. Hossen, “Predicting prices of stock market 
using gated recurrent units (GRUs) neural networks,” Int. J. Comput. Sci. Netw. Secur, vol. 19, no. 1, pp. 213–222, 
2019. 

[28] K. A. Althelaya, E.-S. M. El-Alfy, and S. Mohammed, “Stock Market Forecast Using Multivariate Analysis with 
Bidirectional and Stacked (LSTM, GRU),” in 2018 21st Saudi Computer Society National Computer Conference 
(NCC), 2018, pp. 1–7. 

[29] M. A. I. Sunny, M. M. S. Maswood, and A. G. Alharbi, “Deep Learning-Based Stock Price Prediction Using LSTM 
and Bi-Directional LSTM Model,” in 2020 2nd Novel Intelligent and Leading Emerging Sciences Conference 
(NILES), 2020, pp. 87–92. 

[30] Y. Liu, Z. Wang, and B. Zheng, “Application of Regularized GRU-LSTM Model in Stock Price Prediction,” in 2019 
IEEE 5th International Conference on Computer and Communications (ICCC), 2019, pp. 1886–1890. 

[31] Y. Gao, R. Wang, and E. Zhou, “Stock Prediction Based on Optimized LSTM and GRU Models,” Sci. Program., 
vol. 2021, p. 4055281, 2021. 

[32] M. E. Karim, M. Foysal, and S. Das, “Stock price prediction using Bi-LSTM and GRU-based hybrid deep learning 
approach,” in Proceedings of Third Doctoral Symposium on Computational Intelligence: DoSCI 2022, 2022, pp. 
701–711. 

[33] A. Sethia and P. Raut, “Application of LSTM, GRU and ICA for stock price prediction,” in Information and 
Communication Technology for Intelligent Systems: Proceedings of ICTIS 2018, Volume 2, 2019, pp. 479–487. 

[34] J. Zhao, D. Zeng, S. Liang, H. Kang, and Q. Liu, “Prediction model for stock price trend based on recurrent neural 
network,” J. Ambient Intell. Humaniz. Comput., vol. 12, no. 1, pp. 745–753, 2021. 

[35] K. Wang, X. Qi, and H. Liu, “Photovoltaic power forecasting based LSTM-Convolutional Network,” Energy, vol. 
189, p. 116225, Dec. 2019. 

[36] Z. Karevan and J. A. K. Suykens, “Transductive LSTM for time-series prediction: An application to weather 
forecasting,” Neural Networks, vol. 125, pp. 1–9, May 2020. 

[37] G. Ding and L. Qin, “Study on the prediction of stock price based on the associated network model of LSTM,” Int. 
J. Mach. Learn. Cybern., vol. 11, no. 6, pp. 1307–1317, Jun. 2020. 

[38] S. Chen and L. Ge, “Exploring the attention mechanism in LSTM-based Hong Kong stock price movement 
prediction,” Quant. Financ., vol. 19, no. 9, pp. 1507–1515, Sep. 2019. 

[39] Y. Baek and H. Y. Kim, “ModAugNet: A new forecasting framework for stock market index value with an 
overfitting prevention LSTM module and a prediction LSTM module,” Expert Syst. Appl., vol. 113, pp. 457–480, 
2018. 

[40] X. Liang, Z. Ge, L. Sun, M. He, and H. Chen, “LSTM with Wavelet Transform Based Data Preprocessing for Stock 
Price Prediction,” Math. Probl. Eng., vol. 2019, p. 1340174, 2019. 

[41] P. Xu et al., “Automatic evaluation of facial nerve paralysis by dual-path LSTM with deep differentiated network,” 
Neurocomputing, vol. 388, pp. 70–77, 2020. 

[42] A. U. Muhammad, A. S. Yahaya, S. M. Kamal, J. M. Adam, W. I. Muhammad, and A. Elsafi, “A Hybrid Deep 
Stacked LSTM and GRU for Water Price Prediction,” in 2020 2nd International Conference on Computer and 
Information Sciences (ICCIS), 2020, pp. 1–6. 

[43] M. Ali, D. M. Khan, H. M. Alshanbari, and A. A.-A. H. El-Bagoury, “Prediction of Complex Stock Market Data 
Using an Improved Hybrid EMD-LSTM Model,” Appl. Sci., vol. 13, no. 3, 2023. 

[44] A. Dutta, G. Pooja, N. Jain, R. R. Panda, and N. K. Nagwani, “A Hybrid Deep Learning Approach for Stock Price 
Prediction,” in Machine Learning for Predictive Analysis, 2021, pp. 1–10. 

[45] S. Zaheer et al., “A Multi Parameter Forecasting for Stock Time Series Data Using LSTM and Deep Learning 
Model,” Mathematics, vol. 11, no. 3, 2023. 

[46] J. Chung, C. Gulcehre, K. Cho, and Y. Bengio, “Empirical evaluation of gated recurrent neural networks on 
sequence modeling,” arXiv Prepr. arXiv1412.3555, 2014. 

[47] P. Malhotra, L. Vig, G. Shroff, and P. Agarwal, “Long Short Term Memory networks for anomaly detection in time 
series,” 23rd Eur. Symp. Artif. Neural Networks, Comput. Intell. Mach. Learn. ESANN 2015 - Proc., no. April, pp. 
89–94, 2015. 

[48] J. L. Elman, “Finding structure in time,” Cogn. Sci., vol. 14, no. 2, pp. 179–211, 1990. 

[49] L. Medsker and L. C. Jain, Recurrent neural networks: design and applications. CRC press, 1999. 

[50] P. J. Werbos, “Backpropagation through time: what it does and how to do it,” Proc. IEEE, vol. 78, no. 10, pp. 1550–
1560, 1990. 

[51] J. L. Elman and D. Zipser, “Learning the hidden structure of speech,” J. Acoust. Soc. Am., vol. 83, no. 4, pp. 1615–
1626, Apr. 1988. 

[52] J. T. Connor, R. D. Martin, and L. E. Atlas, “Recurrent neural networks and robust time series prediction,” IEEE 
Trans. Neural Networks, vol. 5, no. 2, pp. 240–254, 1994. 

[53] Y. Bengio, P. Simard, and P. Frasconi, “Learning long-term dependencies with gradient descent is difficult,” IEEE 
Trans. Neural Networks, vol. 5, no. 2, pp. 157–166, 1994. 

[54] J. Brownlee, “How to develop LSTM models for time series forecasting (2018).” 2019. 

[55] S. Hochreiter and J. Schmidhuber, “Long Short-Term Memory,” Neural Comput., vol. 9, no. 8, pp. 1735–1780, Nov. 
1997. 

[56] K. Cho et al., “Learning phrase representations using RNN encoder-decoder for statistical machine translation,” 
arXiv Prepr., 2014. 

[57] S. M. Al-Selwi, M. F. Hassan, S. J. Abdulkadir, and A. Muneer, “LSTM Inefficiency in Long-Term Dependencies 
Regression Problems,” J. Adv. Res. Appl. Sci. Eng. Technol., vol. 30, no. 3, pp. 16–31, 2023. 

[58] C. Hu, S. Martin, and R. Dingreville, “Accelerating phase-field predictions via recurrent neural networks learning 
the microstructure evolution in latent space,” Comput. Methods Appl. Mech. Eng., vol. 397, p. 115128, Jul. 2022. 

[59] M. R. Raza, W. Hussain, and J. M. Merigó, “Cloud Sentiment Accuracy Comparison using RNN, LSTM and GRU,” 
in 2021 Innovations in Intelligent Systems and Applications Conference (ASYU), 2021, pp. 1–5. 

[60] T. Limouni, R. Yaagoubi, K. Bouziane, K. Guissi, and E. H. Baali, “Accurate one step and multistep forecasting of 
very short-term PV power using LSTM-TCN model,” Renew. Energy, vol. 205, pp. 1010–1024, 2023. 

[61] N. Klyuchnikov et al., “NAS-Bench-NLP: Neural Architecture Search Benchmark for Natural Language 
Processing,” IEEE Access, vol. 10, pp. 45736–45747, 2022. 

[62] S. Wang and H. Chen, “A novel deep learning method for the classification of power quality disturbances using 
deep convolutional neural network,” Appl. Energy, vol. 235, pp. 1126–1140, 2019. 

[63] W. Hastomo, N. Aini, A. S. B. Karno, and L. M. R. Rere, “Metode Pembelajaran Mesin untuk Memprediksi Emisi 
Manure Management,” J. Nas. Tek. Elektro dan Teknol. Inf., vol. 11, no. 2, pp. 131–139, 2022. 

[64] W. Hastomo, A. S. Bayangkari Karno, N. Kalbuana, A. Meiriki, and Sutarno, “Characteristic Parameters of Epoch 
Deep Learning to Predict Covid-19 Data in Indonesia,” J. Phys. Conf. Ser., vol. 1933, no. 1, 2021. 

[65] M. E. Karim, M. Foysal, and S. Das, “Stock Price Prediction Using Bi-LSTM and GRU-Based Hybrid Deep 
Learning Approach,” 2023, pp. 701–711. 

[66] B. Sulistio, H. L. H. S. Warnars, F. L. Gaol, and B. Soewito, “Energy Sector Stock Price Prediction Using The 
CNN, GRU & LSTM Hybrid Algorithm,” in 2023 International Conference on Computer Science, Information 
Technology and Engineering (ICCoSITE), 2023, pp. 178–182. 

[67] Y. Touzani and K. Douzi, “An LSTM and GRU based trading strategy adapted to the Moroccan market,” J. Big 
Data, vol. 8, no. 1, p. 126, 2021. 

[68] A. Lawi, H. Mesra, and S. Amir, “Implementation of Long Short-Term Memory and Gated Recurrent Units on 
grouped time-series data to predict stock prices accurately,” J. Big Data, vol. 9, no. 1, p. 89, 2022. 

[69] M. Ayitey Junior, P. Appiahene, and O. Appiah, “Forex market forecasting with two-layer stacked Long Short-Term 
Memory neural network (LSTM) and correlation analysis,” J. Electr. Syst. Inf. Technol., vol. 9, no. 1, p. 14, 2022. 

[70] B. Sirisha, K. K. C. Goud, and B. T. V. S. Rohit, “A Deep Stacked Bidirectional LSTM (SBiLSTM) Model for 
Petroleum Production Forecasting,” Procedia Comput. Sci., vol. 218, pp. 2767–2775, 2023. 

 