Knowledge Engineering and Data Science (KEDS)
pISSN 2597-4602, eISSN 2597-4637
Vol 6, No 2, October 2023, pp. 215–230
https://doi.org/10.17977/um018v6i22023p215-230
©2023 Knowledge Engineering and Data Science | W : http://journal2.um.ac.id/index.php/keds | E : keds.journal@um.ac.id
This is an open access article under the CC BY-SA license (https://creativecommons.org/licenses/by-sa/4.0/)

Stacked LSTM-GRU Long-Term Forecasting Model for Indonesian Islamic Banks

Yayat Sujatna a,1, Adhitio Satyo Bayangkari Karno b,2, Widi Hastomo c,3,*, Nia Yuningsih b,4, Dody Arif d,5, Sri Setya Handayani d,6, Aqwam Rosadi Kardian e,7, Ire Puspa Wardhani e,8, L.M Rasdi Rere e,9

a Department of Accounting, Ahmad Dahlan Institute of Technology and Business, Jl. Ir H. Juanda No.77, Tangerang Selatan 15419, Indonesia
b Department of Information System, Faculty of Engineering, Gunadarma University, Jl. Margonda Raya No. 100, Depok 16424, Indonesia
c Department of Information Technology, Ahmad Dahlan Institute of Technology and Business, Jl. Ir H. Juanda No.77, Tangerang Selatan 15419, Indonesia
d Department of Management, Faculty of Economics, Gunadarma University, Jl. Margonda Raya No. 100, Depok 16424, Indonesia
e Department of Computer Systems, STMIK Jakarta STI&K, Jl. Bri Radio Dalam No.17, Jakarta Selatan 12140, Indonesia

1 yayatsujatna@gmail.com; 2 adh1t10.2@gmail.com; 3 widie.has@gmail.com*; 4 nia_yuningsih@staff.gunadarma.ac.id; 5 dodiarif8@gmail.com; 6 srisetyahandayani@yahoo.com; 7 aqwam@staff.jak-stik.ac.id; 8 irepuspa@staff.jak-stik.ac.id; 9 rasdirere267@gmail.com
* corresponding author

ARTICLE INFO
Article history: Received 04 September 2023; Revised 26 September 2023; Accepted 20 October 2023; Published online 06 November 2023
Keywords: Sharia Principles; Indonesian Banks; Long-term Forecasts; GRU; LSTM

ABSTRACT
The development of the Islamic banking industry in Indonesia has become a significant concern in recent years, with rapid growth in the number of banks operating based on Sharia principles. To face emerging challenges and opportunities, a deep understanding of the long-term financial behavior of Islamic banks is becoming increasingly important. This study aims to predict the share price of PT Bank Syariah Indonesia Tbk over 28 days using an LSTM-GRU stack. The observation stage includes importing the dataset, data separation, model variations, the training process, output, and evaluation. Observations were conducted using 10 model variations of four-layer LSTM and GRU stacks. Each model is trained with four epoch settings (200, 500, 750, and 1000). The results show that long-term predictions (28 days ahead) using four-layer LSTM-GRU stacks and a daily training accumulation technique produce better accuracy than the general method (using multiple outputs). For predictions of the next 28 days, the model with the LGLG stack arrangement (LSTM-GRU-LSTM-GRU) produces the best accuracy at epoch 750, with an MSE of 63.43762863. This study will continue in order to achieve even better precision, either by utilizing a new design or by further improving the technology currently employed.

I. Introduction

As the country with the world's largest Muslim-majority population, Indonesia has enormous potential for the expansion of the Islamic banking financial system in the future, as evidenced by a robust network of Islamic banks [1]. These banks follow Islamic law (Sharia) principles and adhere to ethical and moral criteria [2].

The Indonesian government is aggressively promoting the growth of Islamic banking in response to the growing demand for Islamic financial products and services. Various regulatory frameworks have been put in place to support the formation and expansion of Islamic banks. The Financial Services Authority (OJK) is responsible for managing and regulating the operations of Islamic banks in order to maintain Sharia compliance [3]. In addition to full-service Islamic banks, conventional banks have opened Islamic banking branches to accommodate the rising demand for Shariah-compliant services. These institutions provide a wide range of Shariah-compliant goods and services, including savings accounts, financing, investment instruments, takaful (Islamic insurance), and zakat payments [4][5][6].

Islamic banking in Indonesia has experienced rapid growth in the last few decades [7]. This growth reflects not only global trends in Sharia finance but also the economic and social development of Indonesia, which has a sizeable Muslim population. Sharia banking provides financial access to people previously not served by conventional banking [8]. The system has helped drive financial inclusion in Indonesia by providing access to banking products and services to groups previously considered "unbankable". The existence of Sharia banking also makes a positive contribution to the stability of the Indonesian economy as a whole [9]. Diversifying Islamic banking and financing based on Islamic ethics helps reduce systemic risk [10].
Thus, the growth of Sharia banking in Indonesia not only reflects high market demand but also creates a positive impact by encouraging financial inclusion, sustainable economic development, and the development of financial products and services that are in line with Islamic values [11]. It is an essential aspect of Indonesia's diverse and dynamic economic and financial development.

Despite substantial progress in Islamic banking in Indonesia, there are still issues, difficulties, and opportunities to be addressed. Evaluating and analyzing the performance, efficiency, and competitiveness of Islamic banks in comparison to conventional banks, as well as understanding the dynamics and factors influencing the growth and long-term sustainability of Islamic banking in Indonesia, is critical for policymakers, regulators, and market players [12][13][14].

Long-term stock forecasting is required for investors and financial institutions to make sound long-term investment decisions and strategies in the Indonesian market [15][16]. For investors looking to improve their investment portfolios, accurate long-term stock price predictions for Islamic banks are invaluable. However, previous research still relies on traditional financial models [17] or basic machine learning algorithms [18], with low accuracy [19] and many biases, and its results remain far from what is expected [20].

In recent years, financial markets have seen a considerable surge in the application of Artificial Intelligence (AI) and Machine Learning (ML) techniques for stock market prediction [21][22][23][24]. These techniques have demonstrated promising results in identifying complicated patterns and trends in financial data, supporting investors in making informed decisions. Among ML techniques, Recurrent Neural Networks (RNNs) have attracted much interest due to their ability to handle sequential and temporal dependencies in data.
The Long Short-Term Memory (LSTM) network is one form of RNN that has proven effective in time series analysis [25]. LSTM networks can capture long-term relationships and mitigate the vanishing gradient problem of standard RNNs [26]. In addition, Gated Recurrent Units (GRUs) have emerged as an alternative RNN architecture that offers computational efficiency and performance comparable to LSTM [27]. Individually, LSTM and GRU networks have been regularly used to estimate stock prices in the context of stock market prediction [28][29][30][31][32][33][34]. However, improved models that integrate the capabilities of the two architectures are still required to increase forecast accuracy.

Despite the growing interest in Islamic banking and the importance of Islamic bank shares in Indonesia, there is a significant gap in the existing literature on long-term forecasts using deep learning techniques. Most studies focus on the financial performance of conventional banks and on short-term predictions, with minimal discussion of long-term stock projections in the Indonesian setting.

This study aims to evaluate the performance of a long-term stock prediction model for PT Bank Syariah Indonesia Tbk. Two novel approaches are proposed. The first is optimizing the model with a separate training process using ten variations of four-layer LSTM-GRU stacks. The second is an input and target data segmentation technique, adjusted to the predictions for the next 1 to 28 days.

By stacking many models, deep learning models become better and more useful for forecasting time series data [35][36][37], particularly for predicting stock values [38][39][40][41]. Several experiments on merging machine learning approaches to predict time series data have been conducted [42]. Predicting water prices with an LSTM-GRU model is more accurate than using a GRU or a stack with an LSTM-LSTM arrangement [43]. When predicting complicated stock market data, the hybrid Akima-EMD-LSTM model outperforms the hybrid EMD-LSTM, EEMD-LSTM, and SEMD-LSTM models [44]. Stock price prediction that combines a time-series analysis using LSTM with sentiment analysis using the Valence Aware Dictionary and Sentiment Reasoner (VADER) yields higher accuracy than earlier research [45]. The CNN, RNN, LSTM, CNN-RNN, and CNN-LSTM algorithms have been used to predict the Shanghai Composite Index shares, with the CNN-RNN approach outperforming the CNN, RNN, and LSTM methods [46]. For music data, classic tanh, LSTM, and GRU units have been compared, with LSTM and GRU having benefits over standard tanh units [47]. A stacked LSTM model has been used to detect anomalies in four separate datasets.

II. Methods

This study was carried out in stages, beginning with data collection, then the separation of training and test data, the separation of target data for long-term predictions of the following 28 days, model creation, and assessment. The research flowchart shown in Figure 1 describes the steps of this investigation in general. The following is a detailed explanation of the experimental process flow for predicting Sharia stock prices using the LSTM and GRU stack models, from importing the dataset to output:

• Import Dataset: The stock time series dataset of PT Bank Syariah Indonesia Tbk (BRIS) for the period from 01-07-2020 to 01-07-2023 was taken from https://finance.yahoo.com. The dataset has 728 rows (days) and six columns (Open, High, Low, Close, AdjClose, and Volume); the data from the "Close" column are used in this study.
• Data separation: The last 28 days of the dataset are set aside as prediction data for the next 28 days. The remaining 700 days of data are divided into training data (600 days) and test data (100 days).
• Modeling: Ten model variations are built from four-layer LSTM and GRU stack arrangements, namely GGGG, GGGL, GGLL, GLGL, GLLG, LGGL, LGLG, LLGG, LLLG, and LLLL, where G stands for GRU and L for LSTM. Each model is trained on the training data, which includes initializing the model, determining the loss function, selecting the optimizer (e.g., Adam), and determining the evaluation metric, the Mean Square Error (MSE).
• Evaluation: Once training is complete, the model is evaluated to measure how well it predicts stock prices. This evaluation is usually carried out on previously separated test data. This experiment uses MSE to assess the quality of model predictions. Additionally, visualizations such as graphs comparing predictions with actual data can provide valuable insights.
• Output: The output is a graph showing the history of actual data alongside predicted data. To determine the accuracy of the training results, the difference between the predicted results and the actual data is measured using MSE.

Fig. 1. Research flowchart

A. LSTM-GRU

The RNN trained with backpropagation is the first deep learning model that can recall prior data and predict data one step ahead [48][49][50][51]. Adding layers can enhance accuracy, but doing so with the RNN may result in a vanishing gradient. As a result, the RNN can only handle short-term dependencies [52][53]. Because of this issue, LSTM cells [54][55] were created, which have several gates and can handle long-term dependencies. The GRU, a cell with a simpler gate structure that can also handle long-term dependencies, is a further advancement [46][56]. Figure 2 depicts the architectural development beginning with the RNN, then the LSTM, and finally the GRU.

Fig. 2. RNN, LSTM and GRU architecture development

Initialize the hidden state and cell state of each LSTM layer, $H_0^{LSTM_i} = 0$ and $C_0^{LSTM_i} = 0$ for each $i$, and the hidden state of each GRU layer, $H_0^{GRU_j} = 0$ for each $j$. Iterate through each time step $t$ (usually from $t = 1$ to $T$, where $T$ is the length of the input sequence). For each LSTM layer $i$, calculate the hidden state $H_t^{LSTM_i}$ and cell state $C_t^{LSTM_i}$ as in (1) to (6), and for each GRU layer $j$, calculate the hidden state $H_t^{GRU_j}$ as in (7) to (10).

$$f_t^{LSTM_i} = \sigma\left(W_f^{LSTM_i} \cdot [H_{t-1}^{LSTM_i}, X_t] + b_f^{LSTM_i}\right) \tag{1}$$

$$i_t^{LSTM_i} = \sigma\left(W_i^{LSTM_i} \cdot [H_{t-1}^{LSTM_i}, X_t] + b_i^{LSTM_i}\right) \tag{2}$$

$$\hat{C}_t^{LSTM_i} = \tanh\left(W_c^{LSTM_i} \cdot [H_{t-1}^{LSTM_i}, X_t] + b_c^{LSTM_i}\right) \tag{3}$$

$$C_t^{LSTM_i} = f_t^{LSTM_i} \cdot C_{t-1}^{LSTM_i} + i_t^{LSTM_i} \cdot \hat{C}_t^{LSTM_i} \tag{4}$$

$$o_t^{LSTM_i} = \sigma\left(W_o^{LSTM_i} \cdot [H_{t-1}^{LSTM_i}, X_t] + b_o^{LSTM_i}\right) \tag{5}$$

$$H_t^{LSTM_i} = o_t^{LSTM_i} \cdot \tanh\left(C_t^{LSTM_i}\right) \tag{6}$$

$$z_t^{GRU_j} = \sigma\left(W_z^{GRU_j} \cdot [H_{t-1}^{GRU_j}, X_t] + b_z^{GRU_j}\right) \tag{7}$$

$$r_t^{GRU_j} = \sigma\left(W_r^{GRU_j} \cdot [H_{t-1}^{GRU_j}, X_t] + b_r^{GRU_j}\right) \tag{8}$$

$$\hat{H}_t^{GRU_j} = \tanh\left(W_h^{GRU_j} \cdot [r_t^{GRU_j} \cdot H_{t-1}^{GRU_j}, X_t] + b_h^{GRU_j}\right) \tag{9}$$

$$H_t^{GRU_j} = \left(1 - z_t^{GRU_j}\right) \cdot \hat{H}_t^{GRU_j} + z_t^{GRU_j} \cdot H_{t-1}^{GRU_j} \tag{10}$$

The output of the last LSTM layer and the last GRU layer at the final time step $T$ is the final result of the model, as in (11). Here $X_t$ is the input at time step $t$, $H_t^{LSTM_i}$ is the hidden state of the $i$-th LSTM layer at time step $t$, $C_t^{LSTM_i}$ is the cell state of the $i$-th LSTM layer at time step $t$, and $H_t^{GRU_j}$ is the hidden state of the $j$-th GRU layer at time step $t$.

$$Output = \left[H_T^{LSTM_{last}}, H_T^{GRU_{last}}\right] \tag{11}$$

Pseudocode 1 is a representation of stacking LSTM and GRU layers in a recurrent neural network (RNN).

PSEUDOCODE 1. LSTM and GRU Stack

```python
from tensorflow.keras.layers import Input, LSTM, GRU, Concatenate, Dense
from tensorflow.keras.models import Model


def build_lstm_gru_stack(sequence_length, input_size, output_size,
                         num_layers_lstm, num_layers_gru,
                         hidden_size_lstm, hidden_size_gru):
    # Symbolic input; the batch dimension is implicit in Keras.
    input_data = Input(shape=(sequence_length, input_size))

    # LSTM branch: each layer feeds the next one. Only the last layer drops
    # the time axis, so it returns the hidden state at time step T, as in (11).
    lstm_out = input_data
    for i in range(num_layers_lstm):
        is_last = i == num_layers_lstm - 1
        lstm_out = LSTM(hidden_size_lstm, return_sequences=not is_last)(lstm_out)

    # GRU branch, built the same way from the same input.
    gru_out = input_data
    for j in range(num_layers_gru):
        is_last = j == num_layers_gru - 1
        gru_out = GRU(hidden_size_gru, return_sequences=not is_last)(gru_out)

    # Concatenate the two final hidden states, as in (11).
    combined_hidden_state = Concatenate(axis=-1)([lstm_out, gru_out])

    # Linear output and MSE loss, since the task is price regression
    # evaluated with MSE.
    output_layer = Dense(output_size, activation='linear')(combined_hidden_state)
    model = Model(inputs=input_data, outputs=output_layer)
    model.compile(loss='mse', optimizer='adam')
    return model


def train(model, train_inputs, target_data, num_epochs, batch_size):
    # train_inputs: array of shape (samples, sequence_length, input_size).
    return model.fit(train_inputs, target_data,
                     epochs=num_epochs, batch_size=batch_size)
```

Pseudocode 1 is a high-level algorithmic outline for constructing a deep neural network architecture that combines LSTM and GRU layers. It specifies the critical steps for building a stacked RNN, starting with the definition of hyperparameters and the input, followed by the creation of multiple LSTM and GRU layers with their respective hidden states. In this listing the LSTM and GRU layers form two branches whose final hidden states are concatenated, as in (11); they can also be combined in other ways as needed for downstream tasks. By stacking LSTM and GRU units, the model aims to capture complex sequential patterns, making it particularly useful for tasks involving sequential data analysis.

Traditional LSTM and GRU models have several limitations compared to model stacks that combine LSTM and GRU; the main ones are described in the remainder of this subsection.
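As an illustration only, the standard LSTM and GRU recurrences corresponding to (1) to (10) can be traced by hand with scalar weights. The sketch below is not the implementation used in this study: the function names `lstm_step` and `gru_step` and the dictionary layout of the weights are assumptions made for the example, and each pair `(w_h, w_x)` stands in for the weight matrix applied to the concatenation of the previous hidden state and the current input.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def lstm_step(x_t, h_prev, c_prev, w, b):
    """One scalar LSTM step. w[g] = (w_h, w_x), b[g] = bias, g in 'f','i','c','o'."""
    def pre(g):
        # Scalar stand-in for W_g . [H_{t-1}, X_t] + b_g
        return w[g][0] * h_prev + w[g][1] * x_t + b[g]

    f_t = sigmoid(pre('f'))            # forget gate
    i_t = sigmoid(pre('i'))            # input gate
    c_hat = math.tanh(pre('c'))        # candidate cell state
    c_t = f_t * c_prev + i_t * c_hat   # new cell state
    o_t = sigmoid(pre('o'))            # output gate
    h_t = o_t * math.tanh(c_t)         # new hidden state
    return h_t, c_t


def gru_step(x_t, h_prev, w, b):
    """One scalar GRU step. w[g] = (w_h, w_x), b[g] = bias, g in 'z','r','h'."""
    z_t = sigmoid(w['z'][0] * h_prev + w['z'][1] * x_t + b['z'])   # update gate
    r_t = sigmoid(w['r'][0] * h_prev + w['r'][1] * x_t + b['r'])   # reset gate
    # Candidate state uses the reset-gated previous hidden state.
    h_hat = math.tanh(w['h'][0] * (r_t * h_prev) + w['h'][1] * x_t + b['h'])
    return (1.0 - z_t) * h_hat + z_t * h_prev


# With all weights and biases at zero, every gate equals 0.5 and every
# candidate equals 0, which makes the arithmetic easy to check by hand.
zero_lstm_w = {g: (0.0, 0.0) for g in 'fico'}
zero_lstm_b = {g: 0.0 for g in 'fico'}
zero_gru_w = {g: (0.0, 0.0) for g in 'zrh'}
zero_gru_b = {g: 0.0 for g in 'zrh'}

h_lstm, c_lstm = lstm_step(1.0, 0.3, 2.0, zero_lstm_w, zero_lstm_b)
h_gru = gru_step(1.0, 0.8, zero_gru_w, zero_gru_b)
```

In this zero-weight case the LSTM cell state is halved (from 2.0 to 1.0) and the GRU hidden state is an even mix of the zero candidate and the previous state (0.4), which shows how the gates interpolate between old and new information.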
• Limited ability to handle long-term information [57]: Although LSTM and GRU are designed to overcome the vanishing gradient problem of RNN models, they still have limitations in handling long-term information. These models can remember information from several previous time steps, but over very long periods they may still have difficulty.
• Higher computational cost: LSTM and GRU models are relatively computationally complex [58], particularly when used in deep or layered networks, which can result in longer training times and require greater computing resources [59].
• Susceptibility to overfitting: LSTM and GRU models are more susceptible to overfitting when used on relatively small datasets [60]. Because the number of parameters in these models is large, they can "memorize" the training data rather than learn general patterns.
• Not optimal for specific tasks: While LSTM and GRU are reasonable solutions for many time series modeling tasks, some specialized tasks, such as text processing (NLP), require more specialized architectures, such as transformers [61].

To overcome these limitations, a stack of LSTM and GRU models can provide several advantages, including:

• Richer representation capabilities: A stack of LSTM and GRU models uses multiple LSTM and GRU layers sequentially [61], allowing the model to represent the data better and describe more complex relationships in the time series.
• Hierarchical learning: The model stack can learn a hierarchy of information. The first layer can capture more basic patterns, while subsequent layers can capture increasingly abstract and complex patterns [61].
• Reduced risk of overfitting: Combined with techniques such as dropout between layers, model stacks can help reduce the risk of overfitting, provided they are managed carefully [62].
• Flexible architectural combinations: Combining LSTM and GRU in various configurations in a model stack allows flexibility in designing the most appropriate architecture for a particular task [62].

However, stacked LSTM and GRU models also require careful tuning and attention to overfitting. The selection of an appropriate architecture and parameters significantly influences the quality of model predictions.

B. Data Separation

The dataset is divided into training data (600 days), test data (100 days), and prediction data (28 days). Figure 3 shows the division of training data and test data as a history graph. The training procedure is conducted to create a model. Predictions were performed on the training and test data to evaluate the performance of the resulting model, as shown in Figure 4.

Fig. 3. Separation of training data (green), test data (blue), and predictive data (yellow)

Fig. 4. Predicted results of training data (magenta), and predicted results of test data (cyan)

The prediction data (28 days) are held out and used only to evaluate prediction outcomes; they are not included in the training process. To forecast the following 28 days without using them as training data, we employ a repeated training approach in which a separate training run is carried out for each forecast horizon from 1 day to 28 days ahead. The input data spans 7 days, whereas the target data spans 1 day. The forecast for the first day is based on one day of target data that lies one day after the input training data. The forecast for the second day uses one day of target data that lies two days after the input training data, and so on, until the prediction for the 28th day uses one day of target data that lies 28 days after the input training data. Each training procedure is repeated ten times, each time with a distinct 4-layer LSTM-GRU arrangement [63], to get the best outcomes.
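The data separation and the per-horizon input and target segmentation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in this study: the helper names `split_series` and `make_windows` are hypothetical, a synthetic list stands in for the 728 closing prices, and the exact way windows are drawn near the end of the training data is assumed.

```python
def split_series(close, n_pred=28, n_test=100):
    """Hold out the last n_pred days, then split the rest into train and test."""
    body, pred = close[:-n_pred], close[-n_pred:]
    train, test = body[:-n_test], body[-n_test:]
    return train, test, pred


def make_windows(series, horizon, window=7):
    """Pair each `window`-day input with the single value `horizon` days
    after the last day of that input window."""
    inputs, targets = [], []
    last_start = len(series) - window - horizon
    for start in range(last_start + 1):
        end = start + window                       # first index after the window
        inputs.append(series[start:end])
        targets.append(series[end + horizon - 1])  # value `horizon` days ahead
    return inputs, targets


# Synthetic stand-in for the 728 daily closing prices (day index as value).
close = list(range(728))
train, test, pred = split_series(close)

# One (inputs, targets) training set per forecast horizon, 1 to 28 days ahead.
datasets = {h: make_windows(train, horizon=h) for h in range(1, 29)}
```

Each horizon gets its own training set and therefore its own trained model, which is what distinguishes this direct approach from a single model with 28 outputs. With the synthetic series, the first 7-day window (days 0 to 6) is paired with day 7 for the 1-day horizon and with day 34 for the 28-day horizon.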
Figure 5 depicts the separation of input and target data for forecasts from 1 to 28 days.

Fig. 5. Illustration of training and target data separation for predictions ranging from 1 to 28 days

C. Modeling

To get the best model performance, each training procedure is carried out with 10 variations of four-layer LSTM-GRU arrangements: Var-01: GGGG, Var-02: GGGL, Var-03: GGLL, Var-04: GLGL, Var-05: GLLG, Var-06: LGGL, Var-07: LGLG, Var-08: LLGG, Var-09: LLLG, Var-10: LLLL. The letter L represents LSTM, and the letter G represents GRU. Each training procedure uses four epoch settings (200, 500, 750, and 1000).

Choosing the number of epochs (iterations through the entire training dataset) when training a neural network model is an important decision. The reasons for choosing these four epoch settings are as follows:

• Convergence Requirements: The number of epochs used in model training depends mainly on the complexity of the model, the volume of data, and the desired level of convergence. The more complex the model, the longer it takes to reach convergence. Spanning four points (200, 500, 750, and 1000) reflects an attempt to examine how the model behaves at various points in training, from early to more advanced stages.
• Performance Monitoring: During training, it is essential to monitor model performance on validation or test datasets to prevent overfitting. Using several different epoch points makes it possible to examine how the model behaves over time. Seeing whether the model's performance continues to increase, reaches a peak, or even decreases at a certain point helps decide when to stop training or take other actions, such as reducing the learning rate or adjusting the model architecture.
• Probability Map Exploration: Trying several different epoch points also makes it possible to explore the range of the model's behavior. For example, at the initial epoch point (200), the model may not have converged enough and may be biased towards the training data. At the midpoints (500 and 750), the model can approach convergence and begin to fit the validation data. At the endpoint (1000), one can see whether the model continues to improve or has reached a saturation point.
• Stability Evaluation: The stability of the model can also be assessed through these four epoch points. Highly fluctuating behavior at early points in training may indicate that the learning rate is too high or that the complexity of the model needs to be adjusted. Conversely, good stability at specific points may indicate that the process has found a good training configuration.
• Testing and Generalisation: Once training is complete at the endpoint (1000), the model can be tested on never-before-seen data to measure its generalization capability. Good results on the test data indicate that the training has been successful.

The selection of these four epoch points provides a rich perspective on how the model develops its performance over time.
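The idea of inspecting a model at several epoch checkpoints can be illustrated with a deliberately small example. The sketch below is an assumption-laden stand-in, not the training code of this study: it fits a one-variable linear model by full-batch gradient descent on synthetic data, and the names `train_with_checkpoints`, `line`, and the learning rate of 0.1 are chosen only for the illustration. It records the validation MSE at the same four epoch points.

```python
def mse(predicted, actual):
    return sum((p - r) ** 2 for p, r in zip(predicted, actual)) / len(actual)


def train_with_checkpoints(train, val, checkpoints=(200, 500, 750, 1000), lr=0.1):
    """Fit y = w*x + b by gradient descent and record validation MSE
    at each checkpoint epoch."""
    xs, ys = train
    w, b = 0.0, 0.0
    n = len(xs)
    history = {}
    for epoch in range(1, max(checkpoints) + 1):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of the training MSE with respect to w and b.
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2.0 * sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
        if epoch in checkpoints:
            val_pred = [w * x + b for x in val[0]]
            history[epoch] = mse(val_pred, val[1])
    return history


def line(x):
    # Synthetic, noise-free target used only for this illustration.
    return 1.0 + 2.0 * x


train_x = [0.0, 0.25, 0.5, 0.75, 1.0]
val_x = [0.1, 0.6, 0.9]
history = train_with_checkpoints(
    (train_x, [line(x) for x in train_x]),
    (val_x, [line(x) for x in val_x]),
)
best_epoch = min(history, key=history.get)
```

On this noise-free toy problem the validation MSE keeps falling, so the last checkpoint is the best one. On real price data the curve need not be monotone, which is exactly why several checkpoints are compared instead of relying on the final epoch.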
In practice, the choice of the number of epochs must also be weighed together with other factors such as the learning rate, batch size, model complexity, and the characteristics of the data used.

The model is trained with the Adam optimizer, using a learning rate of 1.001, 50 nodes in each layer, and a batch size of 64. Figure 6 depicts the process from input to the deep learning models with 10 variations, the predictions, and the MSE values produced for each model variant. Adam combines the concepts of momentum (to help handle local minima) and RMSprop (to scale the learning rate) in one algorithm. It uses exponential moving averages of the gradient (first moment, as in momentum) and of the squared gradient (second moment, as in RMSprop) to calculate weight updates. The effective step size can therefore differ for each parameter, based on its gradient history. These estimates are bias-corrected to account for their initialization at zero.

Fig. 6. Input, model, prediction results, and performance evaluation using MSE

Learning Rate 1.001: The learning rate is the factor that controls the extent to which the model adjusts its weights based on the gradient computed from the training data. A value of 1.001 is relatively high; usually, smaller learning rate values (e.g., 0.001) are used to ensure stable convergence.

Hidden Layer 50: This refers to the number of nodes (neurons) in each hidden layer of the neural network and reflects the complexity of the model. The more nodes, the greater the model's ability to capture complex patterns in the data, but more nodes can also increase the risk of overfitting if the training data is limited.

Batch Size 64: This is the number of data samples used in each weight update (mini-batch iteration). Larger batches can speed up training through more efficient computation, but they also require more memory. Too small a batch can cause unstable convergence. A batch size of 64 is a commonly used value.

D. Evaluation Criteria

To assess model effectiveness, we employ a statistical measure known as the Mean Square Error (MSE). MSE is calculated as the sum of the squared errors between the predicted outcomes and the 28 previously hidden observations (actual data), divided by the sample size. A lower MSE value suggests improved performance [64]. The formulation for MSE is shown in (12), where $p_k$ are the predicted data, $r_k$ are the actual data (observations) that are held out, and $n$ is the number of samples.

$$MSE = \frac{1}{n} \sum_{k=1}^{n} (p_k - r_k)^2 \tag{12}$$

A lower MSE value indicates that the model predicts stock prices more accurately, meaning that the difference between model predictions and actual stock prices tends to be smaller. Conversely, a high MSE value indicates a significant mismatch between predicted and actual stock prices. MSE is a simple and easy-to-understand metric.
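The MSE in (12) can be computed in a few lines. The sketch below is illustrative only: the function name `mean_squared_error` and the two small prediction series are made up for the example and are not data from this study. Both series have the same mean absolute error, but the one with a single large miss has a four times larger MSE.

```python
def mean_squared_error(predicted, actual):
    """MSE as in (12): mean of the squared differences p_k - r_k."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - r) ** 2 for p, r in zip(predicted, actual)) / len(actual)


def mean_absolute_error(predicted, actual):
    return sum(abs(p - r) for p, r in zip(predicted, actual)) / len(actual)


# Made-up values for illustration.
actual = [100.0, 102.0, 101.0, 103.0]
evenly_off = [101.0, 101.0, 102.0, 102.0]    # every day is off by 1
one_big_miss = [100.0, 102.0, 101.0, 107.0]  # three exact days, one off by 4

mse_even = mean_squared_error(evenly_off, actual)     # 1.0
mse_miss = mean_squared_error(one_big_miss, actual)   # 4.0
mae_even = mean_absolute_error(evenly_off, actual)    # 1.0
mae_miss = mean_absolute_error(one_big_miss, actual)  # 1.0
```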
MSE gives high weight to large errors in predictions, which is helpful in cases where outliers (large differences between predicted and actual values) must be taken into account. Using MSE to evaluate forecasting models for the next 28 days helps to measure the quality of model predictions, to compare different models, and to update the model if necessary.

III. Results and Discussions

The training procedure used 10 model variations and 4 epoch settings (200, 500, 750, and 1000), resulting in 40 prediction graphs with 120 MSE measures. Because of page limits, we provide only one graph (out of 40) of the predicted outcomes for the training data, the test data, and the 28 days of prediction data (Figure 7). To make the 28-day forecast more visible, Figure 8 shows an enlarged section of the chart. Figure 8 indicates that the 28-day forecast keeps acceptable fluctuations until day 28 and continues to follow the pattern of the original data. This contrasts starkly with long-term prediction approaches in general, which tend to converge towards a specific value, with a larger bias for longer forecast horizons.

Fig. 7. Training data, test data, and 28-day predicted data prediction results in full size

Fig. 8. Expanded sizes for test predictions and 28-day predictions

Tables 1 and 2 list all MSE values for the training data predictions, the test data predictions, and the 28-day forecasts numerically, while Figures 9 to 11 show the MSE values graphically. They show that the best model for predicting the training and test data is Var-10 with the LSTM-LSTM-LSTM-LSTM (LLLL) stack architecture at epoch 200, with MSE values of 1795.1927 and 1485.7672, respectively. Meanwhile, Var-07 with the LSTM-GRU-LSTM-GRU (LGLG) stack architecture is the best model for the 28-day predictive data, with an MSE of 63.4376.
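Selecting the best variant amounts to taking the minimum over the reported MSE values. As a sketch, the 28-day prediction MSE values from Tables 1 and 2 can be entered into a small lookup and searched; the names `VARIANTS`, `PRED28_MSE`, and `best_model` are introduced here only for illustration.

```python
import math

# Stack arrangements Var-01 to Var-10 (G = GRU, L = LSTM).
VARIANTS = ["GGGG", "GGGL", "GGLL", "GLGL", "GLLG",
            "LGGL", "LGLG", "LLGG", "LLLG", "LLLL"]

# 28-day prediction MSE per epoch setting, in variant order,
# as reported in Tables 1 and 2.
PRED28_MSE = {
    200: [111.25726, 113.43025, 130.17904, 103.48911, 90.8903,
          115.3705, 113.8544, 113.4182, 116.4838, 139.1183],
    500: [91.91923, 103.7180, 78.1313, 82.2884, 95.7873,
          77.9630, 74.1705, 70.1206, 78.8466, 82.5902],
    750: [215.5718, 156.7340, 152.2221, 122.2034, 63.7433,
          120.0539, 63.4376, 87.0923, 65.8479, 81.2523],
    1000: [237.05712, 246.4624, 202.8599, 129.7584, 107.1425,
           146.0880, 161.1198, 77.7621, 85.3185, 94.5993],
}


def best_model(table):
    """Return (mse, variant id, stack, epoch) with the lowest MSE."""
    return min(
        (value, f"Var-{k + 1:02d}", VARIANTS[k], epoch)
        for epoch, row in table.items()
        for k, value in enumerate(row)
    )


best = best_model(PRED28_MSE)
# Lowest MSE within each epoch setting.
best_per_epoch = {epoch: best_model({epoch: row})
                  for epoch, row in PRED28_MSE.items()}
# Root of the MSE, in the same units as the price series.
best_rmse = math.sqrt(best[0])
```

The search returns Var-07 (LGLG) at epoch 750 with an MSE of 63.4376, which corresponds to a root mean square error of roughly 7.96 price units.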
Table 1 summarizes the MSE evaluation of all training procedures for epochs 200 and 500. Var-05 (GLLG) is a stack of four sequential layers with two different types of memory cells, namely the GRU and the LSTM. At epoch 200, its 28-day prediction MSE of 90.8903961 is the lowest among the variants at that epoch, yet it still indicates a relatively large error rate: the difference between the stock price predicted by the model and the actual stock price at each time point is relatively significant. An MSE of 90.8903 indicates that the GRU, LSTM, LSTM, and GRU stack model needs to be refined to improve the quality of stock price predictions. Careful evaluation and model adjustment are essential to overcome these limitations and achieve more accurate predictions.

Table 1. The MSE of the whole training procedure in numerical form for epochs 200–500

| Var | Stack | Epoch-200 Train | Epoch-200 Test | Epoch-200 Pred-28 | Epoch-500 Train | Epoch-500 Test | Epoch-500 Pred-28 |
|---|---|---|---|---|---|---|---|
| Var-01 | G G G G | 1857.0976 | 1541.8996 | 111.25726 | 1856.6352 | 1529.7655 | 91.91923 |
| Var-02 | G G G L | 1858.9776 | 1533.7580 | 113.43025 | 1846.4071 | 1525.7432 | 103.7180 |
| Var-03 | G G L L | 1924.8036 | 1591.7732 | 130.17904 | 1867.3014 | 1534.2615 | 78.1313 |
| Var-04 | G L G L | 1839.2664 | 1519.7423 | 103.48911 | 1873.7208 | 1560.0412 | 82.2884 |
| Var-05 | G L L G | 1809.4829 | 1495.1547 | 90.8903 | 1853.8268 | 1529.8587 | 95.7873 |
| Var-06 | L G G L | 1811.1107 | 1495.3051 | 115.3705 | 1854.5001 | 1528.7287 | 77.9630 |
| Var-07 | L G L G | 1854.3511 | 1534.2149 | 113.8544 | 1856.7671 | 1534.1344 | 74.1705 |
| Var-08 | L L G G | 1850.9032 | 1527.7330 | 113.4182 | 1890.1816 | 1569.1906 | 70.1206 |
| Var-09 | L L L G | 1854.5735 | 1535.9987 | 116.4838 | 1865.5825 | 1539.1219 | 78.8466 |
| Var-10 | L L L L | 1795.1926 | 1485.7672 | 139.1183 | 1841.9540 | 1530.2617 | 82.5902 |

Table 2 summarizes the results for epochs 750 and 1000. The best prediction for the next 28 days is obtained in the epoch 750 training process, with an MSE value of 63.4376. This result uses variant seven (Var-07), a stack of LSTM, GRU, LSTM, and GRU.
The MSE measures the average of the squared differences between model predictions and actual values. In this context, an MSE of 63.4376 means that the average squared difference between the predicted and the actual stock price over the next 28 days is approximately 63.44, expressed in squared units of the stock price data. A lower MSE indicates that a model predicts better, because the difference between the prediction and the actual value is smaller on average. Relative to the other variants, the MSE of 63.44 therefore indicates that the model has fairly good prediction quality. An epoch is one complete pass through the training dataset. By the 750th epoch, the model has iterated over the data many times and has repeatedly adjusted the weights and parameters it uses to make predictions.

Table 2. The MSE of the whole training procedure in numerical form for epochs 750 and 1000

| Var | Stack | Epoch-750 Train | Epoch-750 Test | Epoch-750 Pred-28 | Epoch-1000 Train | Epoch-1000 Test | Epoch-1000 Pred-28 |
|---|---|---|---|---|---|---|---|
| Var-01 | G G G G | 1889.6429 | 1558.6723 | 215.5718 | 1844.8476 | 1521.5133 | 237.05712 |
| Var-02 | G G G L | 1831.7365 | 1503.1840 | 156.7340 | 1838.4467 | 1524.5493 | 246.4624 |
| Var-03 | G G L L | 1845.0175 | 1523.0437 | 152.2221 | 1845.1898 | 1529.3063 | 202.8599 |
| Var-04 | G L G L | 1844.9218 | 1528.9018 | 122.2034 | 1843.0441 | 1522.9162 | 129.7584 |
| Var-05 | G L L G | 1841.6457 | 1524.5890 | 63.7433 | 1843.1699 | 1531.5649 | 107.1425 |
| Var-06 | L G G L | 1863.3424 | 1538.3165 | 120.0539 | 1855.0585 | 1531.2086 | 146.0880 |
| Var-07 | L G L G | 1881.2632 | 1560.3401 | 63.4376 | 1845.0715 | 1529.2144 | 161.1198 |
| Var-08 | L L G G | 1847.7620 | 1529.9814 | 87.0923 | 1845.5463 | 1521.8910 | 77.7621 |
| Var-09 | L L L G | 1832.0935 | 1514.4305 | 65.8479 | 1868.4365 | 1541.8586 | 85.3185 |
| Var-10 | L L L L | 1813.4162 | 1499.3618 | 81.2523 | 1835.4334 | 1518.0367 | 94.5993 |

Alternating LSTM and GRU layers in an LSTM-GRU-LSTM-GRU stack can give the model the ability to capture complex patterns in time-series data.
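The ranking of variants by 28-day MSE can be checked directly from Tables 1 and 2. The sketch below transcribes the Pred-28 columns into a dictionary and looks up the lowest value per epoch setting and overall. The variable and function names are illustrative assumptions, and the script only re-reads the reported numbers; it does not retrain any model.

```python
# Pred-28 MSE values transcribed from Tables 1 and 2.
# Keys are the four-letter stack codes (G = GRU, L = LSTM); inner keys are epochs.
PRED28_MSE = {
    "GGGG": {200: 111.25726, 500: 91.91923, 750: 215.5718, 1000: 237.05712},
    "GGGL": {200: 113.43025, 500: 103.7180, 750: 156.7340, 1000: 246.4624},
    "GGLL": {200: 130.17904, 500: 78.1313, 750: 152.2221, 1000: 202.8599},
    "GLGL": {200: 103.48911, 500: 82.2884, 750: 122.2034, 1000: 129.7584},
    "GLLG": {200: 90.8903, 500: 95.7873, 750: 63.7433, 1000: 107.1425},
    "LGGL": {200: 115.3705, 500: 77.9630, 750: 120.0539, 1000: 146.0880},
    "LGLG": {200: 113.8544, 500: 74.1705, 750: 63.4376, 1000: 161.1198},
    "LLGG": {200: 113.4182, 500: 70.1206, 750: 87.0923, 1000: 77.7621},
    "LLLG": {200: 116.4838, 500: 78.8466, 750: 65.8479, 1000: 85.3185},
    "LLLL": {200: 139.1183, 500: 82.5902, 750: 81.2523, 1000: 94.5993},
}

LAYER_NAMES = {"G": "GRU", "L": "LSTM"}


def expand(code):
    """Turn a stack code such as 'LGLG' into an ordered list of layer types."""
    return [LAYER_NAMES[letter] for letter in code]


def best_per_epoch(table):
    """Map each epoch setting to the stack code with the lowest Pred-28 MSE."""
    epochs = sorted({epoch for row in table.values() for epoch in row})
    return {epoch: min(table, key=lambda code: table[code][epoch]) for epoch in epochs}


def best_overall(table):
    """Return (code, epoch, mse) for the single lowest Pred-28 MSE."""
    candidates = (
        (code, epoch, value)
        for code, row in table.items()
        for epoch, value in row.items()
    )
    return min(candidates, key=lambda item: item[2])


code, epoch, value = best_overall(PRED28_MSE)
print(best_per_epoch(PRED28_MSE))
print("-".join(expand(code)), epoch, value)
```

Read this way, the tables show that the best variant changes with the epoch setting, so the LGLG result at 750 epochs is the best single run rather than evidence that one stack dominates at every training length.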
LSTM is able to retain information over the long term, while GRU handles short-term information more efficiently, so the combination allows the model to draw on the advantages of both. The prediction results for the next 28 days show that variant seven, with the LSTM-GRU-LSTM-GRU stack, has the potential to provide fairly good stock price predictions. However, these predictions should be integrated into a careful investment strategy that takes into account the risk factors that may influence stock prices. Figures 9 to 11 show the MSE values of the training process for the training data, the test data, and the 28-day predictions, respectively.

Fig. 9. The MSE values of the whole training process for training data

Fig. 10. The MSE values of the whole training process for test data

Fig. 11. The MSE values of the whole training process for 28-day predictions

Table 3 presents a performance comparison with existing models. The study in [31] proposes a model for optimizing stock forecasting that incorporates a range of technical indicators, including investor sentiment indicators and financial data, and reduces the dimensionality of the many factors influencing the stock price using the LASSO and PCA approaches before prediction with LSTM and GRU models. That study finds that LSTM and GRU models can predict stock prices effectively and that the LASSO dimension reduction method performs better than PCA. In [65], LSTM, bi-LSTM, GRU, and ordinary neural network (NN) modules are each designed separately to forecast the stock price, and the performance of each separate model is then compared with that of the proposed hybrid model.
The proposed stock price prediction model is applied to NIFTY-50 stock market data, and its predicted values are compared with the actual stock opening prices over (a) 100 days, (b) 300 days, (c) 500 days, and (d) 1000 days. The study in [66] compares the performance of six deep-learning algorithms for predicting stock closing prices on the Indonesian Stock Exchange and proposes a CNN-LSTM-GRU hybrid algorithm, which outperforms the other methods in terms of accuracy. The study in [67] proposes a trading strategy designed for the Moroccan stock market based on two deep learning models, LSTM and GRU, which predict the closing price for the short- and mid-term horizons, respectively. The proposed strategy outperforms benchmark indices in the Moroccan market, and its authors identify medium- and long-term prediction as future work.

Table 3. Performance study of present models

| Reference | Methods | Results |
|---|---|---|
| [31] | Deep learning with LASSO and PCA approaches; LSTM and GRU models | MSE 733.8773 |
| [65] | Bi-LSTM and GRU models | MSE 0.0018 |
| [66] | CNN-LSTM-GRU hybrid algorithm | RMSE decreased by 14%, MAE reduced by 13.4%, R2 3.9% |
| [67] | LSTM and GRU models | MSE 0.57 |
| [68] | LSTM and GRU | MAPE 97.37% |
| [69] | Two-layer stacked LSTM (TLS-LSTM); correlation analysis between different currency pairs | MSE 0.0015129 |
| [70] | Stacked-Bi-LSTM | RMSE 0.025 |
| Proposed model | LSTM-GRU-LSTM-GRU stack | MSE 63.44 |

The study in [68] also uses LSTM and GRU.
That work proposes eight new architectural models for stock price forecasting, which identify joint movement patterns in the stock market by combining the LSTM and GRU models with four neural network block architectures. The models predict stock prices from grouped time-series data and are evaluated using three accuracy measures. In [69], a TLS-LSTM neural network is used to forecast the trend of the Australian Dollar against the United States Dollar (AUD/USD), together with a correlation analysis between different currency pairs. TLS-LSTM outperforms other models in Forex trend prediction, and the AUD/USD movement is found to affect EUR/AUD and AUD/JPY. The study in [70] offers the Stacked Bi-LSTM (SBiLSTM) architecture, a modification of the conventional Deep Long Short-Term Memory (TDLM). Two time series from oilfield production are used to test the method, and the performance of the proposed SBiLSTM model is compared with that of multi-layer RNNs, Deep GRU, and Deep LSTM.

IV. Conclusions

Machine learning can deliver improved long-term prediction performance for PT Bank Syariah Indonesia Tbk (BRIS) shares, which is critical for investors when making stock market decisions. These results may also assist analysts in developing long-term financial strategy indicators. In this paper, we propose a distinct training approach for 1-day to 28-day forecasts that uses 10 variants of deep learning models built from four-layer LSTM-GRU stacks and a tailored input-target data segmentation algorithm.
The LSTM-LSTM-LSTM-LSTM (LLLL) stack gives the best model for predicting the training and test data, using BRIS stock history data from 01-07-2020 to 01-07-2023 (728 days). The LSTM-GRU-LSTM-GRU (LGLG) stack model gives the most accurate long-term forecast for the next 28 days. The graphs produced with the modified input-target data segmentation approach exhibit fluctuations and follow the pattern of the observed data. By contrast, when the deep learning approach is used alone, without input-target data segmentation, long-term forecasts show little fluctuation and tend to converge towards a constant value. Long-term predictive research with even better accuracy is still possible, either by applying different methodologies or by extending the techniques and procedures we have developed. The LSTM-GRU-LSTM-GRU stack model is a complex model that can handle complex time-series data well. However, managing and maintaining such models requires considerable computing resources and a deep understanding of time-series modeling. Overall, the LSTM-GRU-LSTM-GRU stack model can be a useful tool for forecasting long-term stock prices, but it should be used as one aspect of broader analysis and decision-making when investing in the stock market.

Declarations

Author contribution
All authors contributed equally as the main contributor of this paper. All authors read and approved the final paper.

Funding statement
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Conflict of interest
The authors declare no known conflict of financial interest or personal relationships that could have appeared to influence the work reported in this paper.

Additional information
Reprints and permission information are available at http://journal2.um.ac.id/index.php/keds.
Publisher’s Note: Department of Electrical Engineering and Informatics - Universitas Negeri Malang remains neutral with regard to jurisdictional claims and institutional affiliations.

References

[1] World Bank, Leveraging Islamic Fintech to Improve Financial Inclusion. World Bank, 2020.
[2] M. A. Khattak and N. A. Khan, “Islamic Finance, Growth, and Volatility: a Fresh Evidence From 82 Countries,” J. Islam. Monet. Econ. Financ., vol. 9, no. 1, pp. 39–56, 2023.
[3] E. Santi, B. Budiharto, and H. Saptono, “Pengawasan Otoritas Jasa Keuangan Terhadap Financial Technology (Peraturan Otoritas Jasa Keuangan Nomor 77/POJK.01/2016),” Diponegoro Law Journal, vol. 6, no. 3, pp. 1–20, Jul. 2017.
[4] S. Syarifuddin, R. Muin, and A. Akramunnas, “The Potential of Sharia Fintech in Increasing Micro Small and Medium Enterprises (MSMEs) in The Digital Era in Indonesia,” J. Huk. Ekon. Syariah, vol. 4, no. 1, p. 23, 2021.
[5] R. A. Kasri and M. W. Sosianti, “Determinants of the Intention To Pay Zakat Online: the Case of Indonesia,” J. Islam. Monet. Econ. Financ., vol. 9, no. 2, pp. 275–294, 2023.
[6] H. Hiyanti, L. Nugroho, C. Sukamadilaga, and T. Fitrijanti, “Sharia Fintech (Financial Technology) Opportunities and Challenges in Indonesia,” J. Ilm. Ekon. Islam, vol. 5, no. 03, pp. 326–333, 2019.
[7] M. A. Kurniawan, M. Anwar, and S. R. Nidar, “Developing a Strategy for Islamic Money Market Model to Enhance Quality of Islamic Banking Performance during the Pandemic in Indonesia 2021,” Qual. - Access to Success, vol. 23, no. 190, pp. 261–268, 2022.
[8] N. Nurdin and K. Yusuf, “Knowledge management lifecycle in Islamic bank: the case of syariah banks in Indonesia,” Int. J. Knowl. Manag. Stud., vol. 11, no. 1, pp. 59–80, Jan. 2020.
[9] S. M. Anwar, J. Junaidi, S. Salju, R. Wicaksono, and M. Mispiyanti, “Islamic bank contribution to Indonesian economic growth,” Int. J. Islam. Middle East. Financ. Manag., vol. 13, no. 3, pp. 519–532, Jan. 2020.
[10] M. H. Ali, M. A. Uddin, M. A. R. Khan, and B. Goud, “Faith-based versus value-based finance: Is there any portfolio diversification benefit between responsible and Islamic finance?,” Int. J. Financ. Econ., vol. 26, no. 4, pp. 5570–5583, Oct. 2021.
[11] S. Alhammadi, “Expanding financial inclusion in Indonesia through Takaful: opportunities, challenges and sustainability,” J. Financ. Report. Account., vol. ahead-of-print, no. ahead-of-print, Jan. 2023.
[12] A. D. Songer, J. Diekmann, W. Hendrickson, and D. Flushing, “Situational Reengineering: Case Study Analysis,” J. Constr. Eng. Manag., vol. 126, no. 3, pp. 185–190, May 2000.
[13] M. Mursyid, H. Kusuma, A. Tohirin, and J. Sriyana, “Performance Analysis of Islamic Banks in Indonesia: The Maqashid Shariah Approach,” J. Asian Financ. Econ. Bus., vol. 8, no. 3, pp. 307–318, 2021.
[14] X. Ding, R. Haron, and A. Hasan, “The Influence Of Basel III On Islamic Bank Risk,” J. Islam. Monet. Econ. Financ., vol. 9, no. 1, pp. 167–198, 2023.
[15] E. B. Boukherouaa et al., Powering the Digital Economy: Opportunities and Risks of Artificial Intelligence in Finance. International Monetary Fund, 2021.
[16] M. Asutay, P. F. Aziz, B. S. Indrastomo, and Y. Karbhari, “Religiosity and Charitable Giving on Investors’ Trading Behaviour in the Indonesian Islamic Stock Market: Islamic vs Market Logic,” J. Bus. Ethics, 2023.
[17] D. Defrizal, K. Romli, A. Purnomo, and H. A. Subing, “A Sectoral Stock Investment Strategy Model in Indonesia Stock Exchange,” J. Asian Financ. Econ. Bus., vol. 8, no. 1, pp. 015–022, 2021.
[18] A. Thakkar and K. Chaudhari, “A Comprehensive Survey on Portfolio Optimization, Stock Price and Trend Prediction Using Particle Swarm Optimization,” Arch. Comput. Methods Eng., vol. 28, no. 4, pp. 2133–2164, 2021.
[19] E. I. Ardyanta and H. Sari, “A Prediction of Stock Price Movements Using Support Vector Machines in Indonesia,” J. Asian Financ. Econ. Bus., vol. 8, no. 8, pp. 399–407, 2021.
[20] W. Budiharto, “Data science approach to stock prices forecasting in Indonesia during Covid-19 using Long Short-Term Memory (LSTM),” J. Big Data, vol. 8, no. 1, p. 47, 2021.
[21] M. Kunwar, “Artificial Intelligence In Finance Understanding how automation and machine learning is transforming the financial industry,” no. August, 2019.
[22] A. Saranya and R. Anandan, “Stock market prediction using machine learning algorithms,” Int. J. Recent Technol. Eng., vol. 8, no. 2 Special Issue 4, pp. 280–283, 2019.
[23] S. Ahmed, M. M. Alshater, A. El Ammari, and H. Hammami, “Artificial intelligence and machine learning in finance: A bibliometric review,” Res. Int. Bus. Financ., vol. 61, p. 101646, 2022.
[24] C. Milana and A. Ashta, “Artificial intelligence techniques in finance and financial markets: A survey of the literature,” Strateg. Chang., vol. 30, no. 3, pp. 189–209, May 2021.
[25] W. Hastomo, A. S. B. Karno, N. Kalbuana, E. Nisfiani, and L. ETP, “Optimasi Deep Learning untuk Prediksi Saham di Masa Pandemi Covid-19,” J. Edukasi dan Penelit. Inform., vol. 7, no. 2, p. 133, Aug. 2021.
[26] N. Navarin, B. Vincenzi, M. Polato, and A. Sperduti, “LSTM networks for data-aware remaining time prediction of business process instances,” in 2017 IEEE Symposium Series on Computational Intelligence (SSCI), 2017, pp. 1–7.
[27] M. O. Rahman, M. S. Hossain, T.-S. Junaid, M. S. A. Forhad, and M. K. Hossen, “Predicting prices of stock market using gated recurrent units (GRUs) neural networks,” Int. J. Comput. Sci. Netw. Secur, vol. 19, no. 1, pp. 213–222, 2019.
[28] K. A. Althelaya, E.-S. M. El-Alfy, and S. Mohammed, “Stock Market Forecast Using Multivariate Analysis with Bidirectional and Stacked (LSTM, GRU),” in 2018 21st Saudi Computer Society National Computer Conference (NCC), 2018, pp. 1–7.
[29] M. A. I. Sunny, M. M. S. Maswood, and A. G. Alharbi, “Deep Learning-Based Stock Price Prediction Using LSTM and Bi-Directional LSTM Model,” in 2020 2nd Novel Intelligent and Leading Emerging Sciences Conference (NILES), 2020, pp. 87–92.
[30] Y. Liu, Z. Wang, and B. Zheng, “Application of Regularized GRU-LSTM Model in Stock Price Prediction,” in 2019 IEEE 5th International Conference on Computer and Communications (ICCC), 2019, pp. 1886–1890.
[31] Y. Gao, R. Wang, and E. Zhou, “Stock Prediction Based on Optimized LSTM and GRU Models,” Sci. Program., vol. 2021, p. 4055281, 2021.
[32] M. E. Karim, M. Foysal, and S. Das, “Stock price prediction using Bi-LSTM and GRU-based hybrid deep learning approach,” in Proceedings of Third Doctoral Symposium on Computational Intelligence: DoSCI 2022, 2022, pp. 701–711.
[33] A. Sethia and P. Raut, “Application of LSTM, GRU and ICA for stock price prediction,” in Information and Communication Technology for Intelligent Systems: Proceedings of ICTIS 2018, Volume 2, 2019, pp. 479–487.
[34] J. Zhao, D. Zeng, S. Liang, H. Kang, and Q. Liu, “Prediction model for stock price trend based on recurrent neural network,” J. Ambient Intell. Humaniz. Comput., vol. 12, no. 1, pp. 745–753, 2021.
[35] K. Wang, X. Qi, and H. Liu, “Photovoltaic power forecasting based LSTM-Convolutional Network,” Energy, vol. 189, p. 116225, Dec. 2019.
[36] Z. Karevan and J. A. K. Suykens, “Transductive LSTM for time-series prediction: An application to weather forecasting,” Neural Networks, vol. 125, pp. 1–9, May 2020.
[37] G. Ding and L. Qin, “Study on the prediction of stock price based on the associated network model of LSTM,” Int. J. Mach. Learn. Cybern., vol. 11, no. 6, pp. 1307–1317, Jun. 2020.
[38] S. Chen and L. Ge, “Exploring the attention mechanism in LSTM-based Hong Kong stock price movement prediction,” Quant. Financ., vol. 19, no. 9, pp. 1507–1515, Sep. 2019.
[39] Y. Baek and H. Y. Kim, “ModAugNet: A new forecasting framework for stock market index value with an overfitting prevention LSTM module and a prediction LSTM module,” Expert Syst. Appl., vol. 113, pp. 457–480, 2018.
[40] X. Liang, Z. Ge, L. Sun, M. He, and H. Chen, “LSTM with Wavelet Transform Based Data Preprocessing for Stock Price Prediction,” Math. Probl. Eng., vol. 2019, p. 1340174, 2019.
[41] P. Xu et al., “Automatic evaluation of facial nerve paralysis by dual-path LSTM with deep differentiated network,” Neurocomputing, vol. 388, pp. 70–77, 2020.
[42] A. U. Muhammad, A. S. Yahaya, S. M. Kamal, J. M. Adam, W. I. Muhammad, and A. Elsafi, “A Hybrid Deep Stacked LSTM and GRU for Water Price Prediction,” in 2020 2nd International Conference on Computer and Information Sciences (ICCIS), 2020, pp. 1–6.
[43] M. Ali, D. M. Khan, H. M. Alshanbari, and A. A.-A. H. El-Bagoury, “Prediction of Complex Stock Market Data Using an Improved Hybrid EMD-LSTM Model,” Appl. Sci., vol. 13, no. 3, 2023.
[44] A. Dutta, G. Pooja, N. Jain, R. R. Panda, and N. K. Nagwani, “A Hybrid Deep Learning Approach for Stock Price Prediction,” in Machine Learning for Predictive Analysis, 2021, pp. 1–10.
[45] S. Zaheer et al., “A Multi Parameter Forecasting for Stock Time Series Data Using LSTM and Deep Learning Model,” Mathematics, vol. 11, no. 3, 2023.
[46] J. Chung, C. Gulcehre, K. Cho, and Y. Bengio, “Empirical evaluation of gated recurrent neural networks on sequence modeling,” arXiv Prepr. arXiv:1412.3555, 2014.
[47] P. Malhotra, L. Vig, G. Shroff, and P. Agarwal, “Long Short Term Memory networks for anomaly detection in time series,” 23rd Eur. Symp. Artif. Neural Networks, Comput. Intell. Mach. Learn. ESANN 2015 - Proc., no. April, pp. 89–94, 2015.
[48] J. L. Elman, “Finding structure in time,” Cogn. Sci., vol. 14, no. 2, pp. 179–211, 1990.
[49] L. Medsker and L. C. Jain, Recurrent neural networks: design and applications. CRC press, 1999.
[50] P. J. Werbos, “Backpropagation through time: what it does and how to do it,” Proc. IEEE, vol. 78, no. 10, pp. 1550–1560, 1990.
[51] J. L. Elman and D. Zipser, “Learning the hidden structure of speech,” J. Acoust. Soc. Am., vol. 83, no. 4, pp. 1615–1626, Apr. 1988.
[52] J. T. Connor, R. D. Martin, and L. E. Atlas, “Recurrent neural networks and robust time series prediction,” IEEE Trans. Neural Networks, vol. 5, no. 2, pp. 240–254, 1994.
[53] Y. Bengio, P. Simard, and P. Frasconi, “Learning long-term dependencies with gradient descent is difficult,” IEEE Trans. Neural Networks, vol. 5, no. 2, pp. 157–166, 1994.
[54] J. Brownlee, “How to develop LSTM models for time series forecasting (2018).” 2019.
[55] S. Hochreiter and J. Schmidhuber, “Long Short-Term Memory,” Neural Comput., vol. 9, no. 8, pp. 1735–1780, Nov. 1997.
[56] K. Cho et al., “Learning phrase representations using RNN encoder-decoder for statistical machine translation,” arXiv Prepr., 2014.
[57] S. M. Al-Selwi, M. F. Hassan, S. J. Abdulkadir, and A. Muneer, “LSTM Inefficiency in Long-Term Dependencies Regression Problems,” J. Adv. Res. Appl. Sci. Eng. Technol., vol. 30, no. 3, pp. 16–31, 2023.
[58] C. Hu, S. Martin, and R. Dingreville, “Accelerating phase-field predictions via recurrent neural networks learning the microstructure evolution in latent space,” Comput. Methods Appl. Mech. Eng., vol. 397, p. 115128, Jul. 2022.
[59] M. R. Raza, W. Hussain, and J. M. Merigó, “Cloud Sentiment Accuracy Comparison using RNN, LSTM and GRU,” in 2021 Innovations in Intelligent Systems and Applications Conference (ASYU), 2021, pp. 1–5.
[60] T. Limouni, R. Yaagoubi, K. Bouziane, K. Guissi, and E. H. Baali, “Accurate one step and multistep forecasting of very short-term PV power using LSTM-TCN model,” Renew. Energy, vol. 205, pp. 1010–1024, 2023.
[61] N. Klyuchnikov et al., “NAS-Bench-NLP: Neural Architecture Search Benchmark for Natural Language Processing,” IEEE Access, vol. 10, pp. 45736–45747, 2022.
[62] S. Wang and H. Chen, “A novel deep learning method for the classification of power quality disturbances using deep convolutional neural network,” Appl. Energy, vol. 235, pp. 1126–1140, 2019.
[63] W. Hastomo, N. Aini, A. S. B. Karno, and L. M. R. Rere, “Metode Pembelajaran Mesin untuk Memprediksi Emisi Manure Management,” J. Nas. Tek. Elektro dan Teknol. Inf., vol. 11, no. 2, pp. 131–139, 2022.
[64] W. Hastomo, A. S. Bayangkari Karno, N. Kalbuana, A. Meiriki, and Sutarno, “Characteristic Parameters of Epoch Deep Learning to Predict Covid-19 Data in Indonesia,” J. Phys. Conf. Ser., vol. 1933, no. 1, 2021.
[65] M. E. Karim, M. Foysal, and S. Das, “Stock Price Prediction Using Bi-LSTM and GRU-Based Hybrid Deep Learning Approach,” 2023, pp. 701–711.
[66] B. Sulistio, H. L. H. S. Warnars, F. L. Gaol, and B. Soewito, “Energy Sector Stock Price Prediction Using The CNN, GRU & LSTM Hybrid Algorithm,” in 2023 International Conference on Computer Science, Information Technology and Engineering (ICCoSITE), 2023, pp. 178–182.
[67] Y. Touzani and K. Douzi, “An LSTM and GRU based trading strategy adapted to the Moroccan market,” J. Big Data, vol. 8, no. 1, p. 126, 2021.
[68] A. Lawi, H. Mesra, and S. Amir, “Implementation of Long Short-Term Memory and Gated Recurrent Units on grouped time-series data to predict stock prices accurately,” J. Big Data, vol. 9, no. 1, p. 89, 2022.
[69] M. Ayitey Junior, P. Appiahene, and O. Appiah, “Forex market forecasting with two-layer stacked Long Short-Term Memory neural network (LSTM) and correlation analysis,” J. Electr. Syst. Inf. Technol., vol. 9, no. 1, p. 14, 2022.
[70] B. Sirisha, K. K. C. Goud, and B. T. V. S. Rohit, “A Deep Stacked Bidirectional LSTM (SBiLSTM) Model for Petroleum Production Forecasting,” Procedia Comput. Sci., vol. 218, pp. 2767–2775, 2023.