

How to cite: M. Fajar, “An Application of Hybrid Forecasting Singular Spectrum Analysis–Extreme Learning Machine Method in Foreign Tourists Forecasting”, mantik, vol. 5, no. 2, pp. 60-68, October 2019.

An Application of Hybrid Forecasting Singular Spectrum Analysis – Extreme Learning Machine Method in Foreign Tourists Forecasting
 

Muhammad Fajar 

BPS-Statistics Indonesia, mfajar@bps.go.id 
 

doi: https://doi.org/10.15642/mantik.2019.5.2.60-68 

 
Abstrak: Wisman adalah salah satu indikator untuk melihat perkembangan pariwisata. 

Perkembangan pariwisata mempunyai andil penting bagi perekonomian karena pariwisata 

merupakan booster peningkatan devisa, menciptakan peluang usaha, dan membuka kesempatan 

kerja.  Sebagai bahan input untuk strategi dan program pariwisata adalah prediksi terhadap jumlah 

wisman di masa depan yang diperoleh dari peramalan.  Dalam paper ini, penulis menggunakan 

metode Hybrid singular spectrum analysis – extreme learning machine untuk meramalkan jumlah 

wisman. Data yang digunakan dalam penelitian adalah jumlah wisman yang bersumber dari Badan 

Pusat Statistik. Hasil penelitian ini bahwa kemampuan metode Hybrid SSA-ELM sangat baik dalam 

meramalkan jumlah wisman. Hal tersebut ditunjukkan oleh nilai MAPE sebesar 4.91 persen, dengan 

out sample sebanyak delapan observasi. 
 

Kata kunci: wisata mancanegara, singular spectrum analysis, extreme learning machine 
 
 
Abstract: The number of foreign tourists is one indicator of tourism development. Tourism development is important for the national economy since tourism can boost foreign exchange, create business opportunities, and provide employment opportunities. Predictions of future foreign tourist numbers, obtained from forecasting, are used as an input for planning tourism strategies and programs. In this paper, the Hybrid Singular Spectrum Analysis – Extreme Learning Machine (SSA-ELM) method is used to forecast the number of foreign tourists. The data used are the numbers of foreign tourists from January 1980 to December 2017, taken from Badan Pusat Statistik (Statistics Indonesia). The results show that the Hybrid SSA-ELM method performs very well in forecasting the number of foreign tourists, as shown by a MAPE of 4.91 percent over eight out-of-sample observations.
 

Keywords: foreign tourist, singular spectrum analysis, extreme learning machine 
 

Jurnal Matematika MANTIK 

Volume 5, Nomor 2, October 2019, pp. 60-68 

 
ISSN: 2527-3159 (print) 2527-3167 (online) 

http://u.lipi.go.id/1458103791
http://u.lipi.go.id/1457054096



1. Introduction 

Indonesia has many mesmerizing landscapes, abundant natural resources, and diverse cultures. These tourist attractions could be utilized optimally to advance the national economy. Tourism development is important for the national economy because tourism can boost foreign exchange, create business opportunities, and provide employment opportunities.

One indicator of tourism development is the number of foreign tourist visits. The number of foreign tourists visiting Indonesia from January 1980 to December 2017 is presented in Fig. 1.1. Over this period, the number of foreign tourists visiting Indonesia shows an increasing trend. The implication for Indonesia as the host is that an infrastructure development strategy is needed, both to prevent a decline in the number of foreign tourists and to reduce the negative environmental impacts of the growing number of visits, for example through environmentally friendly transportation, hotels, and recreation facilities.

 
Figure 1.1. The Number of Foreign Tourists January 1980 – December 2017 

 
Therefore, one input for such a strategy is a prediction of the number of foreign tourists obtained from forecasting, specifically time series forecasting. Time series forecasting is a quantitative method that analyzes data collected in time order with an appropriate technique. The result can be used as a reference for forecasting future values of the series [9]. Forecasting methods have developed rapidly and become more complex along with advances in computing technology. An interesting development is the hybrid time series forecasting method, which is constructed from two different types of forecasting methods [1] – [5].

In this paper, singular spectrum analysis and the extreme learning machine are combined to forecast the number of foreign tourists visiting Indonesia. The extreme learning machine is a special case of the feed-forward neural network (FFNN) with only one hidden layer. A hybrid singular spectrum analysis – FFNN method has been developed previously [5].

  
[Figure 1.1 plot. Horizontal axis: Time (tick labels 1980 to 2010); vertical axis: wisman_t, the number of foreign tourists.]



2. Data 

The data used in this research are the numbers of foreign tourists visiting Indonesia from January 1980 until August 2018. The training data (in-sample) cover January 1980 until December 2017, consistent with the period shown in Fig. 1.1. The test data (out-of-sample) cover January 2018 until August 2018.

 
3. Method 

3.1 Singular Spectrum Analysis (SSA) 

SSA is a non-parametric time series method based on the principles of multivariate statistics. SSA decomposes a time series additively into several independent components, which can be identified as trend, periodic, quasi-periodic, and noise components. The SSA procedure consists of four steps [6]-[10]:

 
Step 1. Embedding 

Given a time series $x_1, x_2, \ldots, x_T$, choose the window length $L$, an integer with $2 < L < T/2$, and let $K = T - L + 1$. The trajectory matrix is:

\[
\mathbf{X} = (X_1, \ldots, X_K) =
\begin{pmatrix}
x_1 & x_2 & \cdots & x_K \\
x_2 & x_3 & \cdots & x_{K+1} \\
\vdots & \vdots & \ddots & \vdots \\
x_L & x_{L+1} & \cdots & x_T
\end{pmatrix}
\]

The trajectory matrix is a Hankel matrix, which means that all elements along each anti-diagonal are equal. $\mathbf{X}$ can therefore be treated as multivariate data with $L$ variables and $K$ observations, so that the covariance matrix is $\mathbf{S} = \mathbf{X}\mathbf{X}'$, with dimension $L \times L$.
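As an illustration of the embedding step, the following minimal Python sketch builds the trajectory matrix for a short toy series. The function name `trajectory_matrix`, the toy series, and the use of NumPy are assumptions made for this example and are not taken from the original study.

```python
import numpy as np

def trajectory_matrix(x, L):
    """Build the L x K trajectory (Hankel) matrix, with K = T - L + 1."""
    x = np.asarray(x, dtype=float)
    K = x.size - L + 1
    # Column j holds the lagged window x[j], ..., x[j + L - 1].
    return np.column_stack([x[j:j + L] for j in range(K)])

x = np.arange(1.0, 11.0)        # toy series x_1, ..., x_10 (T = 10)
X = trajectory_matrix(x, L=4)   # K = 10 - 4 + 1 = 7 columns
S = X @ X.T                     # L x L matrix S = XX'
```

Each anti-diagonal of `X` repeats a single value of the series, which is the Hankel property described above.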
 

Step 2. Singular Value Decomposition (SVD) 

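Suppose that $\mathbf{S}$ has eigenvalues $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_L$ with corresponding eigenvectors $U_1, \ldots, U_L$. The SVD of $\mathbf{X}$ is then obtained as follows:

\[
\mathbf{X} = E_1 + E_2 + \cdots + E_d \tag{1}
\]

where $E_i = \sqrt{\lambda_i}\, U_i V_i'$, $i = 1, 2, \ldots, d$, is the $i$-th main component, $d$ is the number of non-zero eigenvalues $\lambda_i$, and $V_i = \mathbf{X}' U_i / \sqrt{\lambda_i}$.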
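The SVD step can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the helper name `main_components`, the relative tolerance used to decide which eigenvalues count as non-zero, and the synthetic series are all hypothetical choices for this example.

```python
import numpy as np

def main_components(X, tol=1e-12):
    """Return the non-zero eigenvalues of S = XX' and the matrices E_i of equation (1)."""
    S = X @ X.T
    lam, U = np.linalg.eigh(S)          # eigh returns eigenvalues in ascending order
    order = np.argsort(lam)[::-1]       # reorder so that lambda_1 >= lambda_2 >= ...
    lam, U = lam[order], U[:, order]
    d = int(np.sum(lam > tol * lam[0])) # number of (numerically) non-zero eigenvalues
    E = []
    for i in range(d):
        V_i = X.T @ U[:, i] / np.sqrt(lam[i])
        E.append(np.sqrt(lam[i]) * np.outer(U[:, i], V_i))
    return lam[:d], E

# Synthetic series (trend + seasonality + noise), used only for illustration.
rng = np.random.default_rng(0)
t = np.arange(40)
x = 0.5 * t + 5 * np.sin(2 * np.pi * t / 12) + rng.normal(size=t.size)
L = 8
X = np.column_stack([x[j:j + L] for j in range(x.size - L + 1)])
lam, E = main_components(X)
```

Summing all matrices in `E` recovers `X`, which is the content of equation (1).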

 
Step 3. Grouping 

In this step, the main components of 𝑿 in equation (1) are grouped additively according to the patterns that form the time series: trend, periodic, quasi-periodic, and noise components. Partition the index set {1,2,…,𝑑} into several disjoint groups 𝐼1, 𝐼2,…,𝐼𝑛. The matrix 𝑿𝐼 corresponding to the group 𝐼 = {𝑖1, 𝑖2,…, 𝑖𝑏} is defined as:
 


\[
\mathbf{X}_I = E_{i_1} + E_{i_2} + \cdots + E_{i_b} \tag{2}
\]

Thus, the decomposition is represented as:

\[
\mathbf{X} = \mathbf{X}_{I_1} + \mathbf{X}_{I_2} + \cdots + \mathbf{X}_{I_n} \tag{3}
\]

where each $\mathbf{X}_{I_j}$ ($j = 1, 2, \ldots, n$) is a reconstructed component (RC). The contribution of the component $\mathbf{X}_I$ is measured by the corresponding share of the eigenvalues, $\sum_{i \in I} \lambda_i / \sum_{i=1}^{d} \lambda_i$. Grouping is based on the closeness of the frequency ranges of the main components, following the study of the grouping process using auto grouping [11]. Main components with relatively close frequency ranges are grouped into one reconstructed component, and so on, until several reconstructed components are formed.
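The eigenvalue contribution of each group can be computed as in the short sketch below. The eigenvalues and the grouping are made-up numbers chosen for illustration, and indices are zero-based as usual in Python.

```python
import numpy as np

def group_contribution(lam, groups):
    """Share of the total eigenvalue sum carried by each index group I."""
    lam = np.asarray(lam, dtype=float)
    total = lam.sum()
    return [lam[list(I)].sum() / total for I in groups]

lam = np.array([50.0, 30.0, 10.0, 6.0, 4.0])   # hypothetical eigenvalues
groups = [[0], [1, 2], [3, 4]]                  # hypothetical grouping I_1, I_2, I_3
shares = group_contribution(lam, groups)        # shares of the three groups
```

Because the groups partition the index set, the shares add up to one.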

 
Step 4. Reconstruction 

In this last step, each $\mathbf{X}_{I_j}$ is transformed into a new time series with $T$ observations by diagonal averaging (Hankelization). Suppose that $\mathbf{Y}$ is an $L \times K$ matrix with elements $y_{ij}$, $1 \le i \le L$, $1 \le j \le K$. Let $L^* = \min(L, K)$, $K^* = \max(L, K)$, and $T = L + K - 1$. Further, let $y^*_{ij} = y_{ij}$ if $L \le K$ and $y^*_{ij} = y_{ji}$ if $L > K$. Matrix $\mathbf{Y}$ is transformed into the series $y_1, y_2, \ldots, y_T$ using the following formula:

\[
y_k =
\begin{cases}
\dfrac{1}{k} \displaystyle\sum_{m=1}^{k} y^*_{m,\,k-m+1}, & 1 \le k \le L^* \\[2ex]
\dfrac{1}{L^*} \displaystyle\sum_{m=1}^{L^*} y^*_{m,\,k-m+1}, & L^* \le k \le K^* \\[2ex]
\dfrac{1}{T-k+1} \displaystyle\sum_{m=k-K^*+1}^{T-K^*+1} y^*_{m,\,k-m+1}, & K^* \le k \le T
\end{cases}
\tag{4}
\]

 
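Diagonal averaging in equation (4) is applied to every matrix component $\mathbf{X}_{I_k}$ in equation (3), resulting in a series $\tilde{x}^{(k)} = (\check{x}_1^{(k)}, \check{x}_2^{(k)}, \ldots, \check{x}_T^{(k)})$. Thus, the series $x_1, x_2, \ldots, x_T$ is decomposed into a sum of $m$ reconstructed series, where $m$ is the number of groups in equation (3):

\[
x_t = \sum_{k=1}^{m} \check{x}_t^{(k)}, \quad t = 1, 2, \ldots, T \tag{5}
\]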
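The grouping and reconstruction steps can be checked numerically with the following sketch. It averages each anti-diagonal directly, which is equivalent to the three cases of equation (4). The helper name `diagonal_average`, the random-walk test series, and the particular grouping are assumptions made only for this illustration.

```python
import numpy as np

def diagonal_average(Y):
    """Hankelize an L x K matrix into a series of length L + K - 1."""
    L, K = Y.shape
    flipped = Y[::-1]  # anti-diagonals of Y become diagonals of the row-flipped matrix
    return np.array([flipped.diagonal(k).mean() for k in range(-(L - 1), K)])

rng = np.random.default_rng(1)
x = np.cumsum(rng.normal(size=30))          # illustrative series, T = 30
L = 6
K = x.size - L + 1
X = np.column_stack([x[j:j + L] for j in range(K)])
U, s, Vt = np.linalg.svd(X, full_matrices=False)

groups = [[0], [1, 2], list(range(3, L))]   # hypothetical grouping of the main components
rc = np.array([
    diagonal_average(sum(s[i] * np.outer(U[:, i], Vt[i]) for i in I))
    for I in groups
])
```

Here `rc` holds one reconstructed series per group, and `rc.sum(axis=0)` reproduces the original series, as stated in equation (5).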

 
The SSA forecasting used in this research is recurrent SSA forecasting, which estimates the min-norm LRR (Linear Recurrence Relation) coefficients. The LRR coefficients are calculated with the following algorithm:

1. Input: select a group of $r$ ($1 \le r \le L$) eigenvectors $P_1, \ldots, P_r$ from the SVD step and form the matrix $\mathbf{P} = [P_1 : \ldots : P_r]$. Let $\underline{\mathbf{P}}$ denote $\mathbf{P}$ with its last row removed, and $\overline{\mathbf{P}}$ denote $\mathbf{P}$ with its first row removed.



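2. For every column vector $P_i$ of $\mathbf{P}$, take $\pi_i$, the last component of $P_i$, and $\underline{P_i}$, the vector $P_i$ with its last component removed.

3. Calculate $v^2 = \sum_{i=1}^{r} \pi_i^2$. If $v^2 = 1$, then STOP with the warning message “Verticality coefficient equals 1.”

4. Calculate the min-norm LRR coefficient vector ($\mathcal{R}$):

\[
\mathcal{R} = \frac{1}{1 - v^2} \sum_{i=1}^{r} \pi_i \underline{P_i}
\]

5. From step 4, obtain $\mathcal{R} = (\alpha_{L-1}, \ldots, \alpha_1)'$.

6. Then, calculate the forecast values with:

\[
\hat{x}_n = \sum_{i=1}^{L-1} \alpha_i \tilde{x}_{n-i}, \quad n = T+1, \ldots, T+h
\]

where $\tilde{x}_{n-i}$ is the reconstructed value for $n - i \le T$ and the previously forecast value otherwise.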
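The algorithm above can be sketched in Python as follows. This is a minimal illustration, not the implementation used in the study: the function names, the use of `numpy.linalg.svd` to obtain the eigenvectors, and the linear toy series are assumptions. A linear series has a trajectory matrix of rank 2, so with r = 2 the recurrence should continue it almost exactly.

```python
import numpy as np

def diagonal_average(Y):
    """Average each anti-diagonal of an L x K matrix (Hankelization)."""
    L, K = Y.shape
    return np.array([Y[::-1].diagonal(k).mean() for k in range(-(L - 1), K)])

def ssa_recurrent_forecast(x, L, r, h):
    """Forecast h steps ahead with the min-norm LRR from the first r eigenvectors."""
    x = np.asarray(x, dtype=float)
    K = x.size - L + 1
    X = np.column_stack([x[j:j + L] for j in range(K)])
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    P = U[:, :r]                          # step 1: selected eigenvectors
    pi = P[-1, :]                         # step 2: last components pi_i
    v2 = float(np.sum(pi ** 2))           # step 3: verticality coefficient
    if np.isclose(v2, 1.0):
        raise ValueError("Verticality coefficient equals 1.")
    R = (P[:-1, :] @ pi) / (1.0 - v2)     # step 4: R = (alpha_{L-1}, ..., alpha_1)'
    # Reconstructed series from the r selected components.
    series = list(diagonal_average((P * s[:r]) @ Vt[:r, :]))
    for _ in range(h):                    # step 6: apply the recurrence h times
        series.append(float(np.dot(R, series[-(L - 1):])))
    return np.array(series[x.size:])

x = np.arange(1.0, 21.0)                  # toy linear series 1, 2, ..., 20
forecast = ssa_recurrent_forecast(x, L=8, r=2, h=3)
```

Because `R` is ordered from the oldest to the newest lag, it is multiplied directly with the last L - 1 values of the series in chronological order.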

 
3.2 Extreme Learning Machine (ELM) 

The extreme learning machine is a learning scheme for single-hidden-layer feedforward neural networks (SLFN). ELM can adaptively set the number of hidden nodes and randomly chooses the input weights $\mathbf{W}$ and the biases $b_i$ of the hidden layer. The output weights of the hidden layer are then obtained by the least squares method [12].

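Suppose that an SLFN with $K$ hidden nodes and an activation vector function $\mathbf{g}(\check{\mathbf{x}}) = (g_1(\check{\mathbf{x}}), g_2(\check{\mathbf{x}}), \ldots, g_K(\check{\mathbf{x}}))$ is trained on $N$ samples $(\check{\mathbf{x}}_i, \mathbf{p}_i)$, with $\check{\mathbf{x}}_i = [\check{x}_{i1}, \check{x}_{i2}, \ldots, \check{x}_{in}]'$ and $\mathbf{p}_i = [p_{i1}, p_{i2}, \ldots, p_{in}]'$. If the SLFN can approximate the $N$ samples without any error (zero error), then:

\[
\sum_{j=1}^{N} \lVert \mathbf{y}_j - \mathbf{p}_j \rVert = 0, \tag{6}
\]

where $\mathbf{y}_j$ is the actual output value of the SLFN. There are also parameters $\boldsymbol{\beta}_i = [\beta_{i1}, \ldots, \beta_{im}]'$, $\mathbf{w}_i = [w_{i1}, \ldots, w_{im}]'$, and $b_i$, which are related by:

\[
\sum_{i=1}^{K} \boldsymbol{\beta}_i\, g_i(\mathbf{w}_i \check{\mathbf{x}}_j + b_i) = \mathbf{p}_j, \quad j = 1, \ldots, N, \; i = 1, \ldots, K \tag{7}
\]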

 
𝒘𝑖 is a weight vector connecting the 𝑖-th hidden node and the input node, 𝜷𝑖 is a 
weight vector connecting the 𝑖-th hidden node and the output node, and 𝑏𝑖 is the 
threshold of the 𝑖-th hidden node. Equation (7) could be expressed as: 
 

𝐇𝛃 = 𝐓                                                                 (8) 
 

where $\mathbf{H} = \{h_{ij}\}$ is the output matrix of the hidden layer, $h_{ij} = g(\mathbf{w}_j \check{\mathbf{x}}_i + b_j)$ represents the output of the $j$-th hidden neuron corresponding to $\check{\mathbf{x}}_i$, $\boldsymbol{\beta} =$



$[\boldsymbol{\beta}_1, \ldots, \boldsymbol{\beta}_K]$ is the output weight matrix, and $\mathbf{T} = [\mathbf{p}_1, \ldots, \mathbf{p}_N]'$ is the target matrix. The output weights (the weights connecting the hidden layer and the output) are obtained as the least squares solution of the linear system (8):

\[
\hat{\boldsymbol{\beta}} = \mathbf{H}^{+}\mathbf{T} \tag{9}
\]

where $\mathbf{H}^{+}$ is the Moore-Penrose generalized inverse of $\mathbf{H}$. Solution (9) is unique and has the smallest norm among all least squares solutions of equation (8). As mentioned in [12], ELM tends to give good generalization performance together with a much higher learning speed, owing to the use of the Moore-Penrose generalized inverse.

The ELM algorithm can be summarized in four steps [12]:

1. Define the number of hidden nodes ($K$), then randomly choose the values of the input weights $\mathbf{w}_i$ and the biases $b_i$.

2. Calculate the matrix $\mathbf{H}$.

3. Based on equation (9), calculate the output weights $\hat{\boldsymbol{\beta}}$.

4. Then, calculate the forecasting result $\hat{\mathbf{Y}}$ with:

\[
\hat{\mathbf{Y}} = \mathbf{H}\hat{\boldsymbol{\beta}} \tag{10}
\]
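The four steps can be illustrated with the following minimal Python sketch. The sigmoid activation, the uniform range of the random weights, the lag order of 3, the 20 hidden nodes, and the sine test series are all assumptions chosen for this example, since the text does not fix these settings.

```python
import numpy as np

def hidden_output(X, W, b):
    """Hidden layer output matrix H with a sigmoid activation (an assumed choice)."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def elm_train(X, T, n_hidden, rng):
    # Step 1: random input weights and biases, never updated afterwards.
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    # Step 2: hidden layer output matrix H.
    H = hidden_output(X, W, b)
    # Step 3: output weights from the Moore-Penrose inverse, equation (9).
    beta = np.linalg.pinv(H) @ T
    return W, b, beta

def elm_predict(X, W, b, beta):
    # Step 4: forecast as H times beta, equation (10).
    return hidden_output(X, W, b) @ beta

# Illustrative data: predict a sine series from its previous p = 3 values.
rng = np.random.default_rng(0)
s = np.sin(2 * np.pi * np.arange(120) / 12)
p = 3
X = np.column_stack([s[i:len(s) - p + i] for i in range(p)])
T = s[p:]
W, b, beta = elm_train(X, T, n_hidden=20, rng=rng)
rmse = float(np.sqrt(np.mean((elm_predict(X, W, b, beta) - T) ** 2)))
```

Only `beta` is estimated from the data. The input weights and biases stay at their random values, which is why training reduces to a single least squares problem.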
 

3.3 Hybrid Singular Spectrum Analysis – Extreme Learning Machine 

This paper proposes the hybrid SSA – ELM forecasting method as follows:

1. The time series is decomposed into main components using the SSA method.

2. The main components obtained in step 1 are identified as trend, periodic, quasi-periodic, or noise components.

3. Each reconstructed component is formed by adding up several main components whose frequency ranges are close.

4. ELM is applied to every reconstructed component, so the ELM architecture is different for every reconstructed component.

5. The final hybrid SSA-ELM forecast is the sum of the forecasts from the ELM architectures, each computed using equation (10).

The steps above are presented visually in Fig. 3.1.
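The steps of the hybrid method can be sketched end to end as below. This is only an illustrative sketch under several assumptions that the paper does not specify: each ELM takes the previous `p` values of its component as input, uses a tanh activation, is trained on a component rescaled by its maximum absolute value, and produces multi-step forecasts iteratively. Each main component is treated as its own group, and all function names, parameter values, and the synthetic series are hypothetical.

```python
import numpy as np

def ssa_components(x, L, n_comp):
    """Reconstructed series for the first n_comp main components (one per group)."""
    x = np.asarray(x, dtype=float)
    K = x.size - L + 1
    X = np.column_stack([x[j:j + L] for j in range(K)])
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    comps = []
    for i in range(n_comp):
        E = s[i] * np.outer(U[:, i], Vt[i])
        # Diagonal averaging: mean of each anti-diagonal of E.
        comps.append([E[::-1].diagonal(k).mean() for k in range(-(L - 1), K)])
    return np.array(comps)

def elm_forecast(series, p, h, n_hidden, rng):
    """Fit an ELM on p lagged values of one component and forecast h steps ahead."""
    scale = np.max(np.abs(series)) or 1.0     # rescale inputs (assumed preprocessing)
    z = series / scale
    Z = np.column_stack([z[i:len(z) - p + i] for i in range(p)])
    W = rng.uniform(-1.0, 1.0, size=(p, n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    beta = np.linalg.pinv(np.tanh(Z @ W + b)) @ z[p:]
    hist = list(z)
    for _ in range(h):                        # iterated one-step-ahead forecasts
        lags = np.array(hist[-p:])
        hist.append(float(np.tanh(lags @ W + b) @ beta))
    return np.array(hist[-h:]) * scale

def hybrid_ssa_elm(x, L, n_comp, p, h, n_hidden, seed=0):
    rng = np.random.default_rng(seed)
    comps = ssa_components(x, L, n_comp)
    # Final forecast: sum of the per-component ELM forecasts.
    return sum(elm_forecast(c, p, h, n_hidden, rng) for c in comps)

t = np.arange(120)
x = 100 + 0.5 * t + 10 * np.sin(2 * np.pi * t / 12)   # synthetic monthly-like series
forecast = hybrid_ssa_elm(x, L=24, n_comp=4, p=12, h=8, n_hidden=10)
```

A separate random hidden layer is drawn for every component, which mirrors the statement that the ELM architecture differs between reconstructed components.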

 
3.4 Forecasting Accuracy 

The forecasting accuracy on the test data (out-of-sample) in this research is measured with the MAPE (Mean Absolute Percentage Error), formulated as follows:

\[
\mathrm{MAPE} = \frac{1}{v - T} \sum_{t=T+1}^{v} \left| \frac{F_t - A_t}{A_t} \times 100\% \right| \tag{11}
\]

 
where 𝐹𝑡 is the 𝑡-th forecast value, 𝐴𝑡 is the 𝑡-th actual value, and 𝑣 − 𝑇 is the number of test observations. The MAPE is interpreted as follows: (1) if the MAPE < 10%, then the Hybrid SSA – ELM method forecasting performance is very good, (2) if the MAPE is in the range of 10% - 20%, then the Hybrid SSA – ELM method forecasting performance is good, (3) if the



MAPE is in the range of 20% - 50%, then the Hybrid SSA – ELM method forecasting performance is adequate, and (4) if the MAPE > 50%, then the Hybrid SSA – ELM method forecasting performance is poor.

 
[Flowchart: Start → Original Time Series → SSA Decomposition → 1st, 2nd, …, m-th Reconstruction Component → 1st, 2nd, …, m-th Architecture of ELM → Forecasts of the 1st, 2nd, …, m-th ELM Architecture → Summation of the ELM forecasts → Result of Hybrid Forecasting SSA-ELM → Finish]

 
Figure 3.1. Hybrid SSA – ELM Method Flowchart 

 
4. Results and Discussion 

In the Hybrid SSA – ELM forecasting process, the first step is decomposing the foreign tourist data with SSA. The window length L in this paper is the number of training observations (528) divided by two, so L is 264. Based on the SVD (Singular Value Decomposition), 264 components are obtained from the SSA of the training data. Only the first ten components are analyzed further, because these ten components alone explain 99.38 percent of the variation in the number of foreign tourists.

 
Figure 4.1. (a). Variation of the first ten components,  (b). Data scree plot 

 
Grouping is performed by comparing the plot patterns of the ten components, since similar patterns indirectly indicate similar components. The ten components are grouped into ten groups. Group 1 consists of the 1st component,



group 2 consists of the 2nd component, group 3 consists of the 3rd component, and so on.

Table 4.1 presents the forecasts of the SSA-ELM, SSA, and ELM methods for the test data. SSA-ELM has the lowest MAPE (4.91 percent) compared to SSA (28.00 percent) and ELM (9.07 percent). Based on the MAPE criteria, the performance of the SSA-ELM and ELM methods is very good, while the performance of SSA is adequate.

 
Table 4.1. The Forecasting Result of Hybrid SSA-ELM, SSA, and ELM 

Month      Actual number of   Forecast result                          Absolute percentage error (%)
(2018)     foreign tourists   SSA-ELM      SSA         ELM             SSA-ELM   SSA     ELM
January    1100677            1215770.91   925266.00   1141727.56      10.46     15.94    3.73
February   1201001            1218719.07   929846.08   1057204.11       1.48     22.58   11.97
March      1363339            1277141.99   935022.03   1093649.66       6.32     31.42   19.78
April      1300277            1322031.55   941383.50   1205315.22       1.67     27.60    7.30
May        1242588            1321390.12   944461.27   1182573.77       6.34     23.99    4.83
June       1318094            1347513.63   948271.13   1178043.33       2.23     28.06   10.63
July       1540549            1424376.79   954890.47   1404689.88       7.54     38.02    8.82
August     1510764            1461883.81   960299.96   1427398.44       3.24     36.44    5.52
Average (MAPE)                                                          4.91     28.00    9.07

Source: author 
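The averages in Table 4.1 can be checked against equation (11) with a few lines of Python. The actual values and forecasts below are copied from Table 4.1; the function name `mape` and the use of NumPy are choices made for this sketch.

```python
import numpy as np

def mape(actual, forecast):
    """Mean absolute percentage error, equation (11), in percent."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(np.mean(np.abs((forecast - actual) / actual)) * 100.0)

# January to August 2018, taken from Table 4.1.
actual = [1100677, 1201001, 1363339, 1300277, 1242588, 1318094, 1540549, 1510764]
ssa_elm = [1215770.91, 1218719.07, 1277141.99, 1322031.55,
           1321390.12, 1347513.63, 1424376.79, 1461883.81]
ssa = [925266.00, 929846.08, 935022.03, 941383.50,
       944461.27, 948271.13, 954890.47, 960299.96]
elm = [1141727.56, 1057204.11, 1093649.66, 1205315.22,
       1182573.77, 1178043.33, 1404689.88, 1427398.44]

results = {name: mape(actual, f)
           for name, f in [("SSA-ELM", ssa_elm), ("SSA", ssa), ("ELM", elm)]}
```

Up to rounding, `results` should match the averages reported in the last row of Table 4.1.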

 
5. Conclusions 

Based on the previous discussion, it can be concluded that the Hybrid SSA-ELM method performs very well in forecasting the number of foreign tourists. This is shown by the MAPE value of 4.91 percent over eight out-of-sample observations.

 
References 

[1] C.H. Aladag, E. Egrioglu, and C. Kadilar, “Improvement in forecasting accuracy using the hybrid model of ARFIMA and feed forward neural network,” American Journal of Intelligent Systems, vol. 2, no. 2, pp. 12-17, 2012.

[2] D. Rahmani, “A Forecasting algorithm for singular spectrum analysis based 

on bootstrap linear recurrent formula coefficients,” International Journal of 

Energy and Statistics, vol.2, no.4, pp. 287-299, 2014. 

[3] M. Fajar, “Perbandingan kinerja peramalan pertumbuhan ekonomi 

Indonesia antara ARMA, FFNN dan hybrid ARMA-FFNN,” 2016. 
DOI:10.13140/RG.2.2.34924.36483. 

[4] M. Fajar, “Meningkatkan akurasi peramalan dengan menggunakan metode 

hybrid singular spectrum analysis-multilayer perceptron neural networks,” 

2018. DOI: 10.13140/RG.2.2.32839.60320. 
[5] M. Fajar, “Perbandingan kinerja peramalan antara metode hybrid singular 

spectrum analysis-multilayer perceptrons neural network dan hybrid 

singular spectrum analysis-feed forward neural network pada data inflasi,” 



2018. DOI: 10.13140/RG.2.2.10312.98561. 
[6] N. Golyandina, V. Nekrutkin, and A. Zhigljavsky, Analysis of Time Series Structure: SSA and Related Techniques. Chapman and Hall/CRC, 2001.
[7] R. Siregar, D. Prariesa, and G. Darmawan, “Aplikasi Metode Singular 

Spectral Analysis (SSA) dalam Peramalan Pertumbuhan Ekonomi 

Indonesia Tahun 2017”, mantik, vol. 3, no. 1, pp. 5-12, Oct. 2017 

[8] Y. Jatmiko, R. Rahayu, and G. Darmawan, “Perbandingan Keakuratan 

Hasil Peramalan Produksi Bawang Merah Metode Holt-Winters dengan 

Singular Spectrum Analysis (SSA)”, mantik, vol. 3, no. 1, pp. 13-22, Oct. 

2017. 

[9] D. Lubis, M. Johra, and G. Darmawan, “Peramalan Indeks Harga 

Konsumen dengan Metode Singular Spectral Analysis (SSA) dan Seasonal 

Autoregressive Integrated Moving Average (SARIMA)”, mantik, vol. 3, 

no. 2, pp. 74-82, Oct. 2017. 

[10] Th. Alexandrov and N. Golyandina, “Automatic extraction and forecast of time series cyclic components within the framework of SSA,” in Proceedings of the 5th St.Petersburg, 2005.

[11] W. Makridakis, and MacGee, Metode dan Aplikasi Peramalan. Binarupa 
Aksara, 1999. 

[12] S. Ding, H. Zhao, Y. Zhang, X. Xu, and R. Nie, “Extreme learning 

machine: algorithm, theory and applications,” Artificial Intelligence 

Review, vol.44, no.1 pp. 103-115, 2013.