How to cite: M. Fajar, “An Application of Hybrid Forecasting Singular Spectrum Analysis–Extreme Learning Machine Method in Foreign Tourists Forecasting”, mantik, vol. 5, no. 2, pp. 60-68, October 2019.

An Application of Hybrid Forecasting Singular Spectrum Analysis – Extreme Learning Machine Method in Foreign Tourists Forecasting

Muhammad Fajar
BPS-Statistics Indonesia, mfajar@bps.go.id

doi: https://doi.org/10.15642/mantik.2019.5.2.60-68

Abstract: The number of foreign tourists is one indicator for measuring tourism development. Tourism development is important for the national economy since tourism can boost foreign exchange earnings, create business opportunities, and provide employment. Predictions of future foreign tourist numbers, obtained from forecasting, serve as an input for planning tourism strategies and programs.
In this paper, the hybrid Singular Spectrum Analysis – Extreme Learning Machine (SSA-ELM) method is used to forecast the number of foreign tourists. The data are the monthly numbers of foreign tourists from January 1980 to December 2017, taken from Badan Pusat Statistik (Statistics Indonesia). The results show that the hybrid SSA-ELM method forecasts the number of foreign tourists very well, with a MAPE of 4.91 percent on eight out-of-sample observations.

Keywords: foreign tourist, singular spectrum analysis, extreme learning machine

Jurnal Matematika MANTIK, Volume 5, Nomor 2, October 2019, pp. 60-68
ISSN: 2527-3159 (print) 2527-3167 (online)
http://u.lipi.go.id/1458103791 http://u.lipi.go.id/1457054096

1. Introduction

Indonesia has many mesmerizing landscapes, rich natural resources, and diverse cultures. These tourist attractions could be used to advance the national economy. Tourism development is important for the national economy since tourism can boost foreign exchange earnings, create business opportunities, and provide employment. One indicator for measuring tourism development is the number of visiting foreign tourists.

The number of foreign tourists visiting Indonesia from January 1980 to December 2017 is presented in Fig. 1.1. Over this period, the number of foreign tourists visiting Indonesia shows an increasing trend. The implication for Indonesia as the host is that an infrastructure development strategy is needed, both to avoid a decline in the number of foreign tourists and to reduce the negative environmental impacts caused by the growing number of visits, for example through environmentally friendly transportation, hotels, and recreation facilities.

Figure 1.1.
The Number of Foreign Tourists, January 1980 – December 2017 (horizontal axis: time; vertical axis: wisman_t, the number of foreign tourists)

Therefore, one input for the strategy is the predicted number of foreign tourists obtained from forecasting, specifically time series forecasting. Time series forecasting is a quantitative method that analyzes a series of data collected in time order using an appropriate technique. The result can be used as a reference for forecasting future values of the series [9]. Forecasting methods are developing rapidly and growing more complex as computing technology advances. An interesting development is the construction of hybrid time series forecasting methods, which combine two different types of forecasting method [1]–[5].

In this paper, singular spectrum analysis and extreme learning machine techniques are combined to forecast the number of foreign tourists visiting Indonesia. The extreme learning machine is a special case of the feed-forward neural network (FFNN), namely an FFNN with only one hidden layer; a hybrid singular spectrum analysis – FFNN method has been developed before [5].

2. Data

The data used in this research are the numbers of foreign tourists visiting Indonesia from January 1980 until August 2018. The training data (in sample) cover January 1980 until December 2017, while the test data (out of sample) cover January 2018 until August 2018.

3. Method

3.1 Singular Spectrum Analysis (SSA)

SSA is a non-parametric time series method based on the principles of multivariate statistics. SSA decomposes a time series additively into several independent components, which can be identified as trend, periodic, quasi-periodic, and noise components. The SSA procedure consists of four steps [6]-[10]:

Step 1.
Embedding. Given a time series $x_1, x_2, \ldots, x_T$, choose an even number $L$, the window length, with $2 < L < T/2$, and let $K = T - L + 1$. The trajectory matrix is:

$$\mathbf{X} = (X_1, \ldots, X_K) = \begin{pmatrix} x_1 & x_2 & \cdots & x_K \\ x_2 & x_3 & \cdots & x_{K+1} \\ \vdots & \vdots & \ddots & \vdots \\ x_L & x_{L+1} & \cdots & x_T \end{pmatrix}$$

The trajectory matrix is a Hankel matrix, which means that all elements along each anti-diagonal are equal. $\mathbf{X}$ can therefore be treated as multivariate data with $L$ characteristics and $K$ observations, so that the covariance matrix $\mathbf{S} = \mathbf{X}\mathbf{X}'$ has dimension $L \times L$.

Step 2. Singular Value Decomposition (SVD)
Suppose that $\mathbf{S}$ has eigenvalues $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_L$ with corresponding eigenvectors $U_1, \ldots, U_L$. The SVD of $\mathbf{X}$ is then:

$$\mathbf{X} = E_1 + E_2 + \cdots + E_d \tag{1}$$

where $E_i = \sqrt{\lambda_i}\, U_i V_i'$, $i = 1, 2, \ldots, d$, $E_i$ is the $i$-th main component, $d$ is the number of nonzero eigenvalues $\lambda_i$, and $V_i = \mathbf{X}' U_i / \sqrt{\lambda_i}$.

Step 3. Grouping
In this step, $\mathbf{X}$ is split additively into subgroups according to the patterns that form the time series: trend, periodic, quasi-periodic, and noise components. Partition the index set $\{1, 2, \ldots, d\}$ into groups $I_1, I_2, \ldots, I_n$. The matrix $\mathbf{X}_I$ corresponding to a group $I = \{i_1, i_2, \ldots, i_b\}$ is defined as:

$$\mathbf{X}_I = E_{i_1} + E_{i_2} + \cdots + E_{i_b} \tag{2}$$

The decomposition can thus be written as:

$$\mathbf{X} = \mathbf{X}_{I_1} + \mathbf{X}_{I_2} + \cdots + \mathbf{X}_{I_n} \tag{3}$$

where each $\mathbf{X}_{I_j}$ ($j = 1, 2, \ldots, n$) is a reconstructed component (RC). The contribution of a component $\mathbf{X}_I$ is measured by the share of the corresponding eigenvalues: $\sum_{i \in I} \lambda_i / \sum_{i=1}^{d} \lambda_i$. Grouping by closeness of frequency range follows the study of automatic grouping in [11]: main components with relatively close frequency ranges are grouped into one reconstructed component, and so on, until several reconstructed components are formed.

Step 4. Reconstruction
In this last step, each $\mathbf{X}_{I_j}$ is transformed into a new time series of $T$ observations by diagonal averaging, also called Hankelization.
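The embedding, SVD, and grouping steps above, together with the diagonal averaging of Step 4, can be sketched in Python with NumPy. This is a minimal illustration rather than the paper's implementation: the function names, the toy series, the window length L = 12, and the grouping of components are all assumptions chosen for the example.

```python
import numpy as np

def embed(x, L):
    """Step 1: build the L x K trajectory (Hankel) matrix, K = T - L + 1."""
    x = np.asarray(x, dtype=float)
    K = x.size - L + 1
    return np.column_stack([x[j:j + L] for j in range(K)])

def elementary_matrices(X):
    """Step 2: SVD of X; returns the eigenvalues of S = XX' and the matrices E_i."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    E = [s[i] * np.outer(U[:, i], Vt[i]) for i in range(s.size)]
    return s ** 2, E  # singular values are sqrt(lambda_i)

def hankelize(Y):
    """Step 4: average each anti-diagonal of Y into a series of length L + K - 1."""
    L, K = Y.shape
    flipped = Y[::-1]  # anti-diagonals of Y become diagonals of the row-flipped matrix
    return np.array([flipped.diagonal(k).mean() for k in range(-(L - 1), K)])

# Toy series (assumed): a sine wave plus a linear trend.
x = np.sin(np.arange(40) / 3.0) + 0.05 * np.arange(40)
X = embed(x, L=12)
lam, E = elementary_matrices(X)

# Step 3 with a hypothetical grouping of the first four components.
groups = [[0, 1], [2, 3]]
rcs = [hankelize(sum(E[i] for i in I)) for I in groups]
share = [lam[I].sum() / lam.sum() for I in groups]  # eigenvalue contribution per group
```

Because every step is linear, summing all elementary matrices recovers the trajectory matrix, and hankelizing the trajectory matrix recovers the original series.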
Suppose that $\mathbf{Y}$ is an $L \times K$ matrix with elements $y_{ij}$, $1 \le i \le L$, $1 \le j \le K$. Let $L^* = \min(L, K)$, $K^* = \max(L, K)$, and $T = L + K - 1$. Let $y^*_{ij} = y_{ij}$ if $L < K$ and $y^*_{ij} = y_{ji}$ if $L > K$. The matrix $\mathbf{Y}$ is transformed into the series $y_1, y_2, \ldots, y_T$ by the following formula:

$$y_k = \begin{cases} \dfrac{1}{k} \displaystyle\sum_{m=1}^{k} y^*_{m,k-m+1}, & 1 \le k \le L^* \\[2ex] \dfrac{1}{L^*} \displaystyle\sum_{m=1}^{L^*} y^*_{m,k-m+1}, & L^* \le k \le K^* \\[2ex] \dfrac{1}{T-k+1} \displaystyle\sum_{m=k-K^*+1}^{T-K^*+1} y^*_{m,k-m+1}, & K^* \le k \le T \end{cases} \tag{4}$$

Diagonal averaging in equation (4) is applied to every matrix component $\mathbf{X}_{I_j}$ in equation (3), yielding a series $\tilde{x}^{(k)} = (\check{x}_1^{(k)}, \check{x}_2^{(k)}, \ldots, \check{x}_T^{(k)})$. The series $x_1, x_2, \ldots, x_T$ is thus decomposed into a sum of $m$ reconstructed series:

$$x_t = \sum_{k=1}^{m} \check{x}_t^{(k)}, \quad t = 1, 2, \ldots, T \tag{5}$$

SSA forecasting in this research uses the recurrent SSA method with estimated min-norm LRR (linear recurrence relation) coefficients. The LRR coefficients are calculated with the following algorithm:

1. Input: the matrix $\mathbf{P} = [P_1 : \ldots : P_r]$, composed of a selected group of $r$ ($1 \le r \le L$) eigenvectors $P_1, \ldots, P_r$ from the SVD step. Let $\underline{\mathbf{P}}$ denote $\mathbf{P}$ with its last row removed, and $\overline{\mathbf{P}}$ denote $\mathbf{P}$ with its first row removed.
2. For every column vector $P_i$ of $\mathbf{P}$, calculate $\pi_i$, the last component of $P_i$, and $\underline{P_i}$, which is $P_i$ with its last component removed.
3. Calculate $v^2 = \sum_{i=1}^{r} \pi_i^2$. If $v^2 = 1$, STOP with the warning message “Verticality coefficient equals 1.”
4. Calculate the min-norm LRR coefficient vector $\mathcal{R}$:

$$\mathcal{R} = \frac{1}{1 - v^2} \sum_{i=1}^{r} \pi_i \underline{P_i}$$

5. From step 4, obtain $\mathcal{R} = (\alpha_{L-1}, \ldots, \alpha_1)'$.
6. Calculate the forecast values:

$$\hat{x}_n = \sum_{i=1}^{L-1} \alpha_i \tilde{x}_{n-i}, \quad n = T + 1, \ldots, T + h$$

3.2 Extreme Learning Machine (ELM)

The extreme learning machine is a learning scheme for single-hidden-layer feedforward neural networks (SLFN). ELM can adaptively set the number of hidden nodes and randomly chooses the input weights $\mathbf{W}$ and the biases $b_i$ of the hidden layer.
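To make the scheme concrete, here is a minimal NumPy sketch of an ELM regressor. It is not the paper's implementation: the tanh activation, the uniform range of the random weights, the number of hidden nodes, the lagged-input design in `lag_matrix`, and the toy sine series are all assumptions made for illustration. The output weights are computed with the Moore-Penrose pseudoinverse (`np.linalg.pinv`), which is the least squares solution that the remainder of this section describes.

```python
import numpy as np

def lag_matrix(series, n_lags):
    """Assumed input design: predict s[t] from the n_lags previous values."""
    s = np.asarray(series, dtype=float)
    X = np.column_stack([s[i:s.size - n_lags + i] for i in range(n_lags)])
    return X, s[n_lags:]

def elm_fit(X, T, n_hidden, rng):
    """Random input weights and biases; output weights by least squares."""
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))  # drawn once, never trained
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = np.tanh(X @ W + b)            # hidden-layer output matrix (N x n_hidden)
    beta = np.linalg.pinv(H) @ T      # minimum-norm least squares solution of H beta = T
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Forecast as hidden-layer output times the fitted output weights."""
    return np.tanh(X @ W + b) @ beta

rng = np.random.default_rng(0)
series = np.sin(np.arange(60) / 4.0)  # toy data, not the tourist series
X_train, y_train = lag_matrix(series, n_lags=4)
W, b, beta = elm_fit(X_train, y_train, n_hidden=10, rng=rng)
fitted = elm_predict(X_train, W, b, beta)
```

Only `beta` is estimated from data; `W` and `b` stay at their random draws, which is why training reduces to a single linear solve.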
The output weights of the hidden layer are obtained by the least squares method [12]. Consider training an SLFN with $K$ hidden nodes and a vector activation function $\mathbf{g}(\check{x}) = (g_1(\check{x}), g_2(\check{x}), \ldots, g_K(\check{x}))$ on $N$ samples $(\check{x}_i, \mathbf{p}_i)$, with $\check{x}_i = [\check{x}_{i1}, \check{x}_{i2}, \ldots, \check{x}_{in}]'$ and $\mathbf{p}_i = [p_{i1}, p_{i2}, \ldots, p_{in}]'$. If the SLFN can approximate the $N$ samples with zero error, then:

$$\sum_{j=1}^{N} \| \mathbf{y}_j - \mathbf{p}_j \| = 0 \tag{6}$$

where $\mathbf{y}_j$ is the actual output of the SLFN. There are then parameters $\boldsymbol{\beta}_i = [\beta_{i1}, \ldots, \beta_{i\mathbb{m}}]'$, $\mathbf{w}_i = [w_{i1}, \ldots, w_{i\mathbb{m}}]'$, and $b_i$ that satisfy:

$$\sum_{i=1}^{K} \boldsymbol{\beta}_i\, g_i(\mathbf{w}_i \check{x}_j + b_i) = \mathbf{p}_j, \quad j = 1, \ldots, N \tag{7}$$

Here $\mathbf{w}_i$ is the weight vector connecting the $i$-th hidden node and the input nodes, $\boldsymbol{\beta}_i$ is the weight vector connecting the $i$-th hidden node and the output nodes, and $b_i$ is the threshold of the $i$-th hidden node. Equation (7) can be expressed as:

$$\mathbf{H}\boldsymbol{\beta} = \mathbf{T} \tag{8}$$

where $\mathbf{H} = \{h_{ij}\}$ is the output matrix of the hidden layer, $h_{ij} = g(w_j \check{x}_i + b_j)$ is the output of the $j$-th hidden neuron for $\check{x}_i$, $\boldsymbol{\beta} = [\boldsymbol{\beta}_1, \ldots, \boldsymbol{\beta}_K]$ is the output weight matrix, and $\mathbf{T} = [\mathbf{p}_1, \ldots, \mathbf{p}_N]'$ is the target matrix. The output weights (the weights connecting the hidden layer and the output) are obtained as the least squares solution of the linear system (8):

$$\hat{\boldsymbol{\beta}} = \mathbf{H}^{+}\mathbf{T} \tag{9}$$

where $\mathbf{H}^{+}$ is the Moore-Penrose generalized inverse of $\mathbf{H}$. This solution is unique and has the smallest norm among all least squares solutions of equation (8).

As noted in [12], ELM tends to give good general performance together with a high learning speed by using the Moore-Penrose generalized inverse. The ELM algorithm can be summarized in four steps [12]:

1. Define the number of hidden nodes ($K$), then randomly choose the values of $\mathbf{w}_i$ and $b_i$.
2. Calculate the matrix $\mathbf{H}$.
3.
Based on equation (9), calculate the output weights $\hat{\boldsymbol{\beta}}$.
4. Calculate the forecasting result $\hat{\mathbf{Y}}$:

$$\hat{\mathbf{Y}} = \mathbf{H}\hat{\boldsymbol{\beta}} \tag{10}$$

3.3 Hybrid Singular Spectrum Analysis – Extreme Learning Machine

This paper proposes the hybrid SSA – ELM forecasting method as follows:

1. The time series is decomposed into main components using the SSA method.
2. The main components obtained in step 1 are identified as trend, periodic, quasi-periodic, and noise components.
3. Reconstructed components are formed by adding up main components whose frequency ranges are close.
4. ELM is applied to every reconstructed component, so the ELM architecture differs from one reconstructed component to another.
5. The final hybrid SSA-ELM forecast is the sum of the forecasts from the individual ELM architectures, each computed with equation (10).

These steps are presented visually in Figure 3.1.

3.4 Forecasting Accuracy

Forecasting accuracy on the test data (out of sample) is measured in this research by the MAPE (Mean Absolute Percentage Error), formulated as follows:

$$\mathrm{MAPE} = \frac{1}{v - T} \sum_{t=T+1}^{v} \left| \frac{F_t - A_t}{A_t} \times 100\% \right| \tag{11}$$

where $F_t$ is the $t$-th forecast value, $A_t$ is the $t$-th actual value, and $v$ is the index of the last test observation, so that $v - T$ is the number of test observations. The MAPE is interpreted as follows: (1) if the MAPE is below 10%, the forecasting performance of the hybrid SSA – ELM method is very good; (2) if the MAPE is in the range of 10% - 20%, the performance is good; (3) if the MAPE is in the range of 20% - 50%, the performance is adequate; and (4) if the MAPE is above 50%, the performance is bad.
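Equation (11) and the four performance bands can be sketched in plain Python. The function names are hypothetical, and assigning the boundary values 20 and 50 to the lower band is an assumption, since the text leaves the boundaries open. The input values are the January to August 2018 actual numbers and SSA-ELM forecasts reported in Table 4.1.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent, over paired observations."""
    errors = [abs((f - a) / a) * 100.0 for a, f in zip(actual, forecast)]
    return sum(errors) / len(errors)

def mape_category(value):
    """Map a MAPE in percent to one of the four performance bands."""
    if value < 10:
        return "very good"
    if value <= 20:   # boundary handling is an assumption
        return "good"
    if value <= 50:
        return "adequate"
    return "bad"

# Actual values and SSA-ELM forecasts for January-August 2018 (Table 4.1).
actual = [1100677, 1201001, 1363339, 1300277, 1242588, 1318094, 1540549, 1510764]
ssa_elm = [1215770.91, 1218719.07, 1277141.99, 1322031.55,
           1321390.12, 1347513.63, 1424376.79, 1461883.81]

score = mape(actual, ssa_elm)      # close to the 4.91 percent reported in the paper
label = mape_category(score)
```

Computing the score from the raw forecasts, rather than averaging the rounded monthly errors, gives the same 4.91 percent to two decimals.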
[Flowchart: Start → original time series → SSA decomposition → reconstructed components 1, 2, …, m → ELM architectures 1, 2, …, m (one per component) → forecast from each ELM architecture → summation of the forecasts → hybrid SSA-ELM forecasting result → Finish]
Figure 3.1. Hybrid SSA – ELM Method Flowchart

4. Results and Discussion

In hybrid SSA – ELM forecasting, the first step is to decompose the foreign tourist data with SSA. The window length L is set in this paper to the number of training observations (528) divided by two, so L is 264. The SVD (Singular Value Decomposition) of the training data yields 264 components, but only the first ten are retained for further analysis, because together they explain 99.38 percent of the variation in the number of foreign tourists.

Figure 4.1. (a) Variation of the first ten components, (b) Data scree plot

Grouping is performed by comparing the plot patterns of the ten components, which indirectly indicate how similar the components are. The ten components are grouped into ten groups: group 1 consists of the 1st component, group 2 of the 2nd component, group 3 of the 3rd component, and so on.

Table 4.1 presents the forecasts of the SSA-ELM, SSA, and ELM methods for the test data. SSA-ELM has the lowest MAPE (4.91 percent), compared with SSA (28 percent) and ELM (9.07 percent). Based on the MAPE characteristics, the performance of the SSA-ELM and ELM methods is very good, while the performance of SSA is adequate.

Table 4.1.
The Forecasting Result of Hybrid SSA-ELM, SSA, and ELM

| Month (2018) | Actual number of foreign tourists | Forecast: SSA-ELM | Forecast: SSA | Forecast: ELM | Absolute percentage error (%): SSA-ELM | Absolute percentage error (%): SSA | Absolute percentage error (%): ELM |
|---|---|---|---|---|---|---|---|
| January | 1100677 | 1215770.91 | 925266.00 | 1141727.56 | 10.46 | 15.94 | 3.73 |
| February | 1201001 | 1218719.07 | 929846.08 | 1057204.11 | 1.48 | 22.58 | 11.97 |
| March | 1363339 | 1277141.99 | 935022.03 | 1093649.66 | 6.32 | 31.42 | 19.78 |
| April | 1300277 | 1322031.55 | 941383.50 | 1205315.22 | 1.67 | 27.60 | 7.30 |
| May | 1242588 | 1321390.12 | 944461.27 | 1182573.77 | 6.34 | 23.99 | 4.83 |
| June | 1318094 | 1347513.63 | 948271.13 | 1178043.33 | 2.23 | 28.06 | 10.63 |
| July | 1540549 | 1424376.79 | 954890.47 | 1404689.88 | 7.54 | 38.02 | 8.82 |
| August | 1510764 | 1461883.81 | 960299.96 | 1427398.44 | 3.24 | 36.44 | 5.52 |
| Average (MAPE) | | | | | 4.91 | 28.00 | 9.07 |

Source: author

5. Conclusions

Based on the previous discussion, it can be concluded that the hybrid SSA-ELM method performs very well in forecasting the number of foreign tourists, as shown by a MAPE of 4.91 percent on eight out-of-sample observations.

References

[1] C. H. Aladag, E. Egrioglu, and C. Kadilar, “Improvement in forecasting accuracy using the hybrid model of ARFIMA and feed forward neural network,” American Journal of Intelligent Systems, vol. 2, no. 2, pp. 12-17, 2012.
[2] D. Rahmani, “A forecasting algorithm for singular spectrum analysis based on bootstrap linear recurrent formula coefficients,” International Journal of Energy and Statistics, vol. 2, no. 4, pp. 287-299, 2014.
[3] M. Fajar, “Perbandingan kinerja peramalan pertumbuhan ekonomi Indonesia antara ARMA, FFNN dan hybrid ARMA-FFNN,” 2016. DOI: 10.13140/RG.2.2.34924.36483.
[4] M. Fajar, “Meningkatkan akurasi peramalan dengan menggunakan metode hybrid singular spectrum analysis-multilayer perceptron neural networks,” 2018. DOI: 10.13140/RG.2.2.32839.60320.
[5] M. Fajar, “Perbandingan kinerja peramalan antara metode hybrid singular spectrum analysis-multilayer perceptrons neural network dan hybrid singular spectrum analysis-feed forward neural network pada data inflasi,” 2018. DOI: 10.13140/RG.2.2.10312.98561.
[6] N. Golyandina, V. Nekrutkin, and A. Zhigljavsky, Analysis of Time Series Structure: SSA and Related Techniques. Chapman and Hall/CRC, 2001.
[7] R. Siregar, D. Prariesa, and G. Darmawan, “Aplikasi Metode Singular Spectral Analysis (SSA) dalam Peramalan Pertumbuhan Ekonomi Indonesia Tahun 2017,” mantik, vol. 3, no. 1, pp. 5-12, Oct. 2017.
[8] Y. Jatmiko, R. Rahayu, and G. Darmawan, “Perbandingan Keakuratan Hasil Peramalan Produksi Bawang Merah Metode Holt-Winters dengan Singular Spectrum Analysis (SSA),” mantik, vol. 3, no. 1, pp. 13-22, Oct. 2017.
[9] D. Lubis, M. Johra, and G. Darmawan, “Peramalan Indeks Harga Konsumen dengan Metode Singular Spectral Analysis (SSA) dan Seasonal Autoregressive Integrated Moving Average (SARIMA),” mantik, vol. 3, no. 2, pp. 74-82, Oct. 2017.
[10] Th. Alexandrov and N. Golyandina, “Automatic extraction and forecast of time series cyclic components within the framework of SSA,” in Proceedings of the 5th St. Petersburg, 2005.
[11] S. Makridakis, S. C. Wheelwright, and V. E. McGee, Metode dan Aplikasi Peramalan. Binarupa Aksara, 1999.
[12] S. Ding, H. Zhao, Y. Zhang, X. Xu, and R. Nie, “Extreme learning machine: algorithm, theory and applications,” Artificial Intelligence Review, vol. 44, no. 1, pp. 103-115, 2013.