

Knowledge Engineering and Data Science (KEDS) pISSN 2597-4602 
Vol 2, No 2, December 2019, pp. 90–100 eISSN 2597-4637 
 
 
https://doi.org/10.17977/um018v2i22019p90-100  
©2019 Knowledge Engineering and Data Science | W : http://journal2.um.ac.id/index.php/keds | E : keds.journal@um.ac.id 
This is an open access article under the CC BY-SA license (https://creativecommons.org/licenses/by-sa/4.0/) 

Comparison of Indonesian Imports Forecasting  
by Limited Period using SARIMA Method 

Harits Ar Rosyid a, 1, Mutyara Whening Aniendya a, 2, Heru Wahyu Herwanto a, 3, * 

a Electrical Engineering Department, Universitas Negeri Malang  
Jl. Semarang No. 5, Malang 65145, Indonesia 

1 harits.ar.ft@um.ac.id; 2 mutyaraaniendya@gmail.com; 3 heru_wh@um.ac.id *  

* corresponding author 
 

I. Introduction 

Indonesia is a country with rapid economic growth. Sound economic growth is one of the national
benchmarks for delivering welfare to its people. Growth is particularly strong in international
trade, that is, exports and imports. Import is the activity of bringing goods from another country
into the Indonesian customs area. Imports fall into three categories of goods commonly needed by
Indonesian society: consumption goods, raw materials, and capital goods.

Based on import trade balance data from the Ministry of Trade of the Republic of Indonesia for
January 2002 until July 2019, imports rise and fall irregularly. Indonesian imports in January 2019 –
July 2019 amounted to 111.88 billion USD, a decrease of 9.89 % compared with the 124.167 billion USD
imported in January 2018 – July 2018. A persistently high need for imports can reduce national income
because payments flow abroad, whereas exports bring money in because overseas buyers purchase
domestic goods. If the value of imports is higher than that of exports, it can threaten the
Indonesian economy, especially local businesses. For instance, recent rice imports appear to have
been excessive. This led to the decision to destroy tons of local rice products just to maintain the
market price. When the trade balance deteriorates unchecked, inflation could act like a time bomb
for the Indonesian economy.

ARTICLE INFO

Article history:
Received 9 December 2019
Revised 13 December 2019
Accepted 13 December 2019
Published online 23 December 2019

Keywords:
Import dataset
Forecasting model
Limited period
SARIMA
MAPE

ABSTRACT

The development of Indonesia's imports fluctuates over the years. Failure to anticipate
such rapid changes can cause an economic slump due to inappropriate policy. For
instance, rice imports in recent years led to the destruction of rice reserves in order
to maintain the market price of rice in Indonesia. To cope with these changes,
forecasting the amount of imports should assist the Government in determining the
optimum policy. This can be done by using an algorithm to forecast time series
data, in this case the amount of imports in the next few months, with a high degree of
accuracy. This study uses data obtained from the official website of the Indonesian
Ministry of Trade. The Seasonal Autoregressive Integrated Moving Average
(SARIMA) method is then applied to forecast the imports. This method is suitable for
data whose observations are interdependent and for forecasting seasonal data patterns.
The experiments showed that the 6-period forecast is the most accurate compared with
forecasting 12 and 24 periods. The best model, ARIMA(0,1,3)(0,1,1)12, produces
forecasts with a MAPE value of 7.210 % or an accuracy rate of 92.790 %. By applying
this import forecast model, the government can form forward-looking strategic plans,
such as importing products selectively and carefully deciding the amount of products
coming into Indonesia. This could maintain or improve economic conditions so that
local businesses can grow confidently.


 H.A. Rosyid et al. / Knowledge Engineering and Data Science 2019, 2 (2): 90–100 91 

 
These inconsistent import developments can be anticipated by forecasting imports in future
periods. The results of a forecasting method can then be used by the Government as input when
adopting new policies or measures to reduce import spending, so that the Indonesian economy
improves. The main consideration in forecasting is the accuracy of the method used.

Several studies have forecast import results. One is Forecasting Iron Ore Import and
Consumption of China Using Grey Model Optimized by Particle Swarm Optimization Algorithm [1].
That study concluded that the proposed hybrid model performs better than a single method such as
basic GM(1,1), PSO-GM(1,1), or rolling GM(1,1). The PSO-rolling GM(1,1) approach to modeling iron
ore imports and consumption in China is both reliable and efficient. The prediction accuracies of
the proposed model for imports and consumption reached 3.2 % and 2.3 %, respectively. Other studies
have used the ARIMA method for forecasting, among them Identifying an Appropriate Forecasting
Model For Forecasting Total Import of Bangladesh [2]. That study produced ARIMA(0,1,1)(1,0,0)12 as
the best model, with an MSE value of 15747374 and a MAPE value of 22.97802 %. Another is
Forecasting International Tourism Demand in Malaysia Using Box Jenkins SARIMA Application [3],
which produced ARIMA(1,0,1) as the best model, with an RMSE value of 0.2914, an MAE value of
0.2075, and a MAPE value of 1.4319 %.

SARIMA is an extension of the ARIMA model for data with seasonal patterns. ARIMA is a
forecasting model that ignores independent variables entirely and relies on the dependent variable,
whose observations are interconnected. The advantages of the ARIMA method are that it produces
highly accurate short-term forecasts, is flexible enough to represent a wide range of short-term
time series characteristics, and can handle random, trending, and seasonal data. For data with
seasonal patterns, such as Indonesian import data, the appropriate method is Seasonal
Autoregressive Integrated Moving Average (SARIMA).

Based on the problem in import trade, this research uses the SARIMA method to forecast
Indonesian imports. SARIMA is chosen because it can predict time series data and achieve a high
level of accuracy in short-term forecasting. The method is therefore expected to produce good
forecasts and to support innovation and the establishment of a strategic plan for determining
measures to reduce import spending.

II. Materials and Methods 

The research is divided into 5 main phases (shown in Figure 1), namely data collection, 
preprocessing, model candidate determination, model assessment and evaluation, and best model 
determination. 

A. Data Collection 
The dataset used in this research is sourced from the official website kemendag.go.id. The website
of the Ministry of Trade of the Republic of Indonesia (Kemendag) provides various information about
trade in Indonesia, such as the development of exports and imports, the trade balance, the foreign
exchange rate against the rupiah, inflation, and other trading activities. The dataset contains 211
monthly records of Indonesia's imports from January 2002 until July 2019. It has 5 attributes: Year,
Total, Consumption Goods, Raw Material Support, and Capital Goods.

 
Fig. 1. Research design  



B. Preprocessing 
1) Attribute Removal 

Attribute removal is a trivial process of eliminating attributes that are not used in the
forecasting process. The original data consists of time-series records of imports in Indonesia.
Because forecasting imports only requires the time attribute as the independent variable (x-axis)
and the import amount as the target output (y-axis), the remaining attributes were removed from the
dataset. The removal was done manually in a spreadsheet application. In addition, the time attribute
of the resulting dataset was converted to dd-mm-yyyy format. Then, the dataset was reordered into
increasing time order.

2) Stationary Test 

Stationarity means that the statistical properties of a time series, such as its mean and variance,
do not change over time. A series with a trend is not stationary: like a linear function with a
non-zero slope, its level keeps changing as time progresses. The same holds for a series with
seasonal patterns. In contrast, a stationary time series shows no predictable pattern in the long
term; wherever it is observed, its values fluctuate around roughly the same level.

A stationarity test is performed to determine whether the data is stationary [4]. It can be
performed in two ways. The first is to inspect the time plot of the dataset: if the series
fluctuates around a constant horizontal level, for example around zero, the dataset is already
stationary. The second is to inspect the Auto-Correlation Function (ACF) and Partial
Auto-Correlation Function (PACF) plots of the dataset. The ACF plot measures the correlation between
the time series and its own values at each time lag. The PACF plot measures the correlation between
the series and a given lag after removing the effect of the intermediate lags. If the ACF and PACF
plots cut off or die down quickly toward zero as the lag increases, the dataset is stationary.
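The ACF values read from such a plot can be sketched in a few lines. The block below is a minimal illustration in plain Python (the paper itself works in R); the function name `acf` and the 1.96/sqrt(n) significance bound are assumptions for illustration, and the input is the first twelve monthly values listed in Table 3.

```python
def acf(series, max_lag):
    """Sample autocorrelation r_k for k = 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]


# First twelve monthly values from Table 3 (January to December 2002).
imports_2002 = [2087.90, 2182.30, 2362.71, 2382.90, 2498.09, 2438.90,
                2646.30, 2823.70, 2860.20, 3104.80, 2955.90, 2945.20]

r = acf(imports_2002, max_lag=4)

# Approximate 95 % bound commonly drawn as the dashed line on ACF plots.
bound = 1.96 / len(imports_2002) ** 0.5

for k, rk in enumerate(r):
    note = "exceeds bound" if k > 0 and abs(rk) > bound else ""
    print(f"lag {k}: r = {rk:+.3f} {note}")
```

A slowly decaying sequence of large positive values, as a trending series produces, is the pattern the text describes as non-stationary.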

3) Differencing 

Differencing is a technique for making time-series data stationary, which is a requirement of the
SARIMA model. It therefore applies only to non-stationary time-series data. Differencing removes
the dependence of the series on time, including structures such as trend and seasonality. A
non-stationary time series is not suitable for forecasting directly. Differencing is done by
calculating the change between consecutive observations. The differenced series is then checked for
stationarity; if it is still not stationary, the process is repeated. Equation (1) shows the
differencing formula between $Y_t$ and $Y_{t-1}$ [5].

$$Y'_t = Y_t - Y_{t-1} \tag{1}$$

Higher-order differences are calculated in the same way. For example, the second-order difference
(d = 2) differences the first-differenced series once more, and thereby includes the second lag of
the original series, as follows [5].

$$Y''_t = Y'_t - Y'_{t-1} = Y_t - 2Y_{t-1} + Y_{t-2} \tag{2}$$

The number of differencing steps performed determines the order d, which is then used together
with the Auto Regressive (AR) and Moving Average (MA) orders to specify the candidate models.
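Equations (1) and (2) can be illustrated with a short sketch. The helper name `difference` is an assumption made for this example (the study itself uses R); the input values are the first six entries of Table 3.

```python
def difference(series, lag=1):
    """Return y_t - y_{t-lag}; the result is `lag` elements shorter."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# First six monthly values from Table 3.
y = [2087.90, 2182.30, 2362.71, 2382.90, 2498.09, 2438.90]

first = difference(y)        # first-order difference, d = 1
second = difference(first)   # second-order difference, d = 2

print([round(v, 2) for v in first])
print([round(v, 2) for v in second])
```

Each differencing step shortens the series by one observation, which is why the order d should be kept as small as stationarity allows.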

C. SARIMA 
Seasonal Autoregressive Integrated Moving Average (SARIMA) is an extension of the ARIMA
model for data with seasonal patterns. Seasonal patterns are patterns that repeat every season,
such as weekly, monthly, quarterly, or yearly. ARIMA is a method developed by George Box and
Gwilym Jenkins in 1970 and is commonly referred to as the Box-Jenkins method [6][7]. ARIMA is one
of the models used in time-series forecasting and is recognized for its accuracy in short-term
forecasting. ARIMA is a forecasting model that ignores independent variables entirely and relies
on the dependent variable, whose observations are interconnected, and it rests on assumptions about
the data, such as autocorrelation, trend, or seasonality, that must be fulfilled [8]. ARIMA uses
previous data values to produce accurate short-term forecasts. In the SARIMA model
(p,d,q)(P,D,Q)S, the parameters p and P indicate the non-seasonal and seasonal AR orders, the
parameters q and Q indicate the non-seasonal and seasonal MA orders, and the parameters d and D
indicate the number of differencing steps in the non-seasonal and seasonal parts, respectively [9].
The ARIMA method is divided into 4 groups, namely AR, MA, ARMA and ARIMA.

Fitting a SARIMA model to data involves a four-step iterative cycle: (a) identify the SARIMA
structure (p, d, q)(P, D, Q)S; (b) estimate the unknown parameters; (c) perform diagnostic tests on
the estimated residuals; (d) forecast future values based on the known data [10]. SARIMA is
intended for data that has a repeating pattern within a fixed period of time. Since there are
seasonal patterns, the model is written as ARIMA(p, d, q)(P, D, Q)S with the formula in (3) [11].

$$\Phi_P(B^S)\,\phi_p(B)\,(1 - B)^d\,(1 - B^S)^D X_t = \theta_q(B)\,\Theta_Q(B^S)\,e_t \tag{3}$$

Here $B$ is the backshift operator ($B X_t = X_{t-1}$), $\phi_p$ and $\theta_q$ are the non-seasonal
AR and MA polynomials, $\Phi_P$ and $\Theta_Q$ are their seasonal counterparts, and $e_t$ is the
error term.
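As a worked illustration, the general SARIMA equation can be written out for one concrete order, here (0,1,3)(0,1,1) with S = 12. The sign convention of the moving-average polynomials is an assumption (plus signs, as in R's arima parameterisation); other texts write them with minus signs.

```latex
% SARIMA(0,1,3)(0,1,1)_{12}: no AR terms, so the AR polynomials equal 1.
(1 - B)(1 - B^{12})\, X_t
  = (1 + \theta_1 B + \theta_2 B^2 + \theta_3 B^3)(1 + \Theta_1 B^{12})\, e_t

% Expanding the differencing operators on the left-hand side:
X_t - X_{t-1} - X_{t-12} + X_{t-13}
  = (1 + \theta_1 B + \theta_2 B^2 + \theta_3 B^3)(1 + \Theta_1 B^{12})\, e_t
```

The left-hand side shows that one non-seasonal and one seasonal difference together consume 13 observations before the first usable value.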

D. Model Candidate Determination 
The SARIMA model in this study has three orders, p, d, and q, for the non-seasonal part, three
orders, P, D, and Q, for the seasonal part, and the order S for the seasonal frequency of the data.
Candidate SARIMA orders can be determined by analyzing the plots of the Autocorrelation Function
(ACF) and the Partial Autocorrelation Function (PACF). The ACF plot measures the correlation
between the time series and its lagged values; it is used to indicate the Moving Average (MA)
orders (q, Q). The PACF plot measures the correlation between the series and a given lag after
removing the linear dependency on the shorter lags; it is used to indicate the Autoregressive (AR)
orders (p, P). The orders (d, D) are determined by the number of differencing steps performed to
make the data stationary. The order S is determined by the frequency of the data used, such as
weekly, monthly, or yearly.

The order values can be read from the ACF and PACF plots through the presence of dies-down and
cut-off patterns. A dies-down pattern occurs when the values decrease slowly toward 0. A cut-off
pattern occurs when the values drop to near 0 right after the initial lags, that is, the plot shows
a drastic decrease. The rules for determining order values from the ACF and PACF plots can be seen
in Table 1.

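After the ACF and PACF plots are produced for each dataset, a white noise test is performed with
the Ljung-Box statistic to determine whether any autocorrelation remains in the residuals across
lags. If the resulting p-value is greater than α = 0.05, the model meets the white noise criterion.
The Ljung-Box formula is as follows [12].

$$Q = n(n + 2) \sum_{k=1}^{m} \frac{\hat{\rho}_k^{2}}{n - k} \tag{4}$$

Here $n$ is the number of observations, $\hat{\rho}_k$ is the sample autocorrelation of the
residuals at lag $k$, and $m$ is the number of lags tested.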
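The Ljung-Box statistic can be computed directly from its definition. The sketch below is illustrative only: the function name `ljung_box_q` and the residual values are hypothetical, and the p-value step (comparison against a chi-square distribution) is left out because it needs a statistics library.

```python
def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic over lags 1..max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [e - mean for e in residuals]
    denom = sum(d * d for d in dev)
    total = 0.0
    for k in range(1, max_lag + 1):
        # Sample autocorrelation of the residuals at lag k.
        rho_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        total += rho_k ** 2 / (n - k)
    return n * (n + 2) * total


# Hypothetical residuals, for illustration only (not taken from the paper).
residuals = [0.5, -1.2, 0.3, 0.9, -0.4, -0.7, 1.1, -0.2, 0.6, -0.9, 0.1, 0.4]

q_stat = ljung_box_q(residuals, max_lag=3)
print(f"Q = {q_stat:.3f}")
```

A small Q (large p-value) means the residuals show no remaining autocorrelation, which is the white noise condition the text requires.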

Next, the candidate models are compared using Akaike's Information Criterion (AIC), which balances
goodness of fit against the number of parameters. The AIC formula is as follows [13].

$$AIC = -2 \log L + 2V \tag{5}$$

Here $L$ is the maximized likelihood of the model and $V$ is its number of free parameters. The
candidate with the smallest AIC value is selected as the SARIMA model for the forecasting process.
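A minimal sketch of the AIC comparison follows. The log-likelihood values and model names are hypothetical and chosen only to show how the parameter penalty can outweigh a slightly better fit.

```python
def aic(log_likelihood, n_params):
    """AIC = -2 log L + 2V."""
    return -2.0 * log_likelihood + 2.0 * n_params


# Hypothetical log-likelihoods, for illustration only.
candidates = {
    "model A (2 parameters)": aic(-1632.0, 2),
    "model B (4 parameters)": aic(-1631.5, 4),
}

# The candidate with the smallest AIC is preferred.
best = min(candidates, key=candidates.get)
for name, value in candidates.items():
    print(f"{name}: AIC = {value:.2f}")
print("selected:", best)
```

Model B fits marginally better, but its two extra parameters raise its AIC above that of model A.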

E. Testing Model for Prediction and Evaluation 
After obtaining the SARIMA model candidates, the next step is to test each model. The testing
process is divided into two stages: forecasting and evaluation. The models built in this experiment
forecast imports over different horizons: 6 periods, 12 periods, and 24 periods. The evaluation
then calculates the error rate of each forecasting model using the Mean Absolute Percentage Error
(MAPE) [14].

Table 1. ACF and PACF plot criteria

Model       ACF Trend                               PACF Trend
AR(p)       Decreases exponentially                 Drastically decreased on certain lag
MA(q)       Drastically decreased on certain lag    Decreases exponentially
ARMA(p,q)   Decreases exponentially                 Decreases exponentially
 


MAPE is a measure of the accuracy of a forecasting model expressed as a percentage. It is the
average of the absolute percentage errors between the actual data and the forecast data. A low MAPE
value indicates that the forecasts are close to the actual values. The MAPE formula is shown in
(6) [15].

$$MAPE = \frac{100\,\%}{n} \sum_{t=1}^{n} \left| \frac{A_t - F_t}{A_t} \right| \tag{6}$$

Here $A_t$ is the actual value, $F_t$ is the forecast value, and $n$ is the number of forecast
periods.
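The MAPE calculation can be sketched as follows. The function name `mape` and the sample values are hypothetical and serve only to show the arithmetic.

```python
def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Hypothetical values, for illustration only.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 180.0, 400.0]

# Absolute percentage errors are 10 %, 10 % and 0 %, so the mean is about 6.67 %.
print(f"MAPE = {mape(actual, forecast):.3f} %")
```

Because each error is divided by the actual value, MAPE is undefined when an actual value is zero; this does not arise for import totals.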

For testing, the import dataset of 211 observations is divided into a training dataset and a
testing dataset under three scenarios (shown in Table 2).
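The three hold-out scenarios can be reproduced with a short sketch. The helper names `month_labels` and `holdout_split` are assumptions for illustration; only the month labels are generated, not the import values.

```python
def month_labels(start_year, start_month, count):
    """Generate `count` consecutive 'YYYY-MM' labels."""
    labels = []
    year, month = start_year, start_month
    for _ in range(count):
        labels.append(f"{year}-{month:02d}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return labels


def holdout_split(series, horizon):
    """Keep the last `horizon` observations as the test set."""
    return series[:-horizon], series[-horizon:]


months = month_labels(2002, 1, 211)  # January 2002 to July 2019

for horizon in (6, 12, 24):
    train, test = holdout_split(months, horizon)
    print(f"{horizon:>2} periods: train {train[0]}..{train[-1]} ({len(train)}), "
          f"test {test[0]}..{test[-1]} ({len(test)})")
```

The test set is always the most recent block of months, so the training set shrinks as the forecast horizon grows.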

F. Best Model Determination 
The MAPE scores of all forecasting model candidates act as the selection criterion. The best model
is the one with the lowest MAPE score (error rate), that is, the highest accuracy.

III. Results and Discussions 

A. Preprocessing 
1) Attribute Removal 

At this stage, only two of the five attributes are used: Year and Total. The Consumption Goods,
Raw Material Support, and Capital Goods attributes are excluded from the forecasting process. The
removal itself is trivial, but the choice of attributes was based on the forecasting target: the
total amount of monthly imports.

The Year attribute was reordered to ensure it is consistent with the time-series x-axis. In
addition, it was reformatted from mm-yyyy to dd-mm-yyyy. A sample of the final import dataset can
be seen in Table 3.

2) Stationary Test 

A stationarity test can be done in two ways: by looking at the plot of the original data or by
viewing the ACF plot of the data. Figure 2 indicates that the data is not stationary because the
graph does not fluctuate around a horizontal straight line. The ACF in the same figure exceeds the
significance line at the initial lags and decreases very slowly. Both tests confirm that the data
is not yet stationary.

Table 3. Import dataset after attribute removal process 

Date Value 

1/1/2002 2,087.90 

1/2/2002 2,182.30 

1/3/2002 2,362.71 

1/4/2002 2,382.90 

1/5/2002 2,498.09 

1/6/2002 2,438.90 

1/7/2002 2,646.30 

1/8/2002 2,823.70 

1/9/2002 2,860.20 

1/10/2002 3,104.80 

1/11/2002 2,955.90 

1/12/2002 2,945.20 
 

Table 2. Data distribution scenario

Period   Training Data                                      Testing Data
6        January 2002 to January 2019 (205 observations)    February 2019 to July 2019 (6 observations)
12       January 2002 to July 2018 (199 observations)       August 2018 to July 2019 (12 observations)
24       January 2002 to July 2017 (187 observations)       August 2017 to July 2019 (24 observations)
 


 
3) Differencing 

The next step is to difference the data using the diff() function for time series in R.
Differencing is done once, and Figure 3 shows the resulting ACF plot. From the graph in Figure 3,
the trend has been removed from the import dataset. The ACF plot shows significant spikes, marked
by the boxed values, that repeat in a seasonal pattern. On the ACF plot, these spikes recur at lag
increments of roughly 12, so the S value used is S = 12. The series is then differenced again at
the seasonal lag to determine the candidate orders for the seasonal pattern. From the data graph
and the ACF/PACF plot in Figure 4, the data has become stationary. The data graph fluctuates around
a straight line at a value of 0, which shows that the data is stationary, and the ACF plot also
changes significantly between lags and does not exceed the significance limits.
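The combination of one regular difference and one seasonal difference at lag 12 can be sketched on synthetic data. Everything in the block below is hypothetical: the series is built from a linear trend plus a fixed monthly effect so that the result of each step is easy to verify.

```python
def difference(series, lag=1):
    """Return y_t - y_{t-lag}; the result is `lag` elements shorter."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Hypothetical monthly effect and a synthetic three-year series (trend + seasonality).
season = [0, 3, 8, 5, 2, -1, -4, -6, -3, 1, 6, 9]
y = [10 * t + season[t % 12] for t in range(36)]

step1 = difference(y, lag=1)        # d = 1 removes the linear trend
step2 = difference(step1, lag=12)   # D = 1 at S = 12 removes the repeating pattern

print(len(y), len(step1), len(step2))
print(step2)
```

On this idealised series the second step leaves only zeros; real data would instead leave a series fluctuating around zero.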

B. Model Candidate Determination 
The next stage is to determine the candidate orders p, q, P and Q by observing the ACF and PACF
plots. Based on Figure 4, the data graph fluctuates around a straight line at a value of 0, which
shows the data is stationary, and the ACF plot has also changed significantly. Candidate orders for
the non-seasonal pattern are determined from the initial lags (lag 1, 2, 3, and so on), while
candidate orders for the seasonal pattern are determined from lags 12, 24 and 36. Both the seasonal
and non-seasonal parts need only one differencing step, thus d = 1 and D = 1.

Based on the ACF/PACF plots for the non-seasonal pattern, cut-offs appear at lags 1, 2 and 10,
while the PACF plot dies down. Meanwhile, the ACF plot for the seasonal pattern shows no lag that
exceeds the significance line, and the PACF plot shows a spike exceeding the line at the 12th lag.
From these results, the candidate values for the orders p and q in the non-seasonal pattern are 1,
2, and 3, while the candidate value for the orders P and Q in the seasonal pattern is 1 in the
SARIMA model. The orders d and D are 1 because differencing was done once for each part.

 
Fig. 2. Graph and ACF plot on import dataset 

 
Fig. 3. ACF plot after differencing on import dataset 

 

Combining the candidate orders p, d, q, P, D, Q and S yields twelve candidate forecasting models:
(1,1,0)(1,1,0)12, (2,1,0)(1,1,0)12, (3,1,0)(1,1,0)12, (1,1,0)(0,1,1)12,
(2,1,0)(0,1,1)12, (3,1,0)(0,1,1)12, (0,1,1)(1,1,0)12, (0,1,2)(1,1,0)12, (0,1,3)(1,1,0)12, (0,1,1)(0,1,1)12,
(0,1,2)(0,1,1)12, and (0,1,3)(0,1,1)12.
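The candidate list can be generated programmatically. The sketch below assumes, as the list itself suggests, that each candidate uses either a pure AR or a pure MA non-seasonal part and either a seasonal AR or a seasonal MA term; the variable and function names are hypothetical.

```python
# Non-seasonal parts: pure AR (p, 1, 0) or pure MA (0, 1, q), with orders 1..3.
ar_orders = [(p, 1, 0) for p in (1, 2, 3)]
ma_orders = [(0, 1, q) for q in (1, 2, 3)]

# Seasonal parts: seasonal AR (1, 1, 0) or seasonal MA (0, 1, 1).
seasonal_orders = [(1, 1, 0), (0, 1, 1)]

candidates = [
    (order, seasonal)
    for group in (ar_orders, ma_orders)
    for seasonal in seasonal_orders
    for order in group
]


def label(order, seasonal, period=12):
    """Format a candidate the way the paper writes it, e.g. (1,1,0)(1,1,0)12."""
    return "({},{},{})({},{},{}){}".format(*order, *seasonal, period)


for order, seasonal in candidates:
    print(label(order, seasonal))
```

Restricting the search to this small grid keeps the number of fitted models at twelve instead of every combination of p and q.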

1) White Noise Test 

The white noise assumption is met when the p-value resulting from the Ljung-Box test is greater
than α = 0.05 [16]. The p-values for the import dataset can be seen in Table 4. All candidate
models have a p-value that exceeds α = 0.05, so the white noise assumption is fulfilled. The
Ljung-Box p-value is obtained with the Box.test() function in R.

2) Akaike’s Information Criterion (AIC) 

The best model is the one with the smallest AIC value among all candidate models [17]. The AIC
value of each candidate model can be seen in Table 5. From both stages of selection, it can be
concluded that the best model is ARIMA(0,1,3)(0,1,1)12, because it has the smallest AIC value.
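The two selection stages can be traced with the values reported in Tables 4 and 5. The sketch below simply filters on the Ljung-Box p-value and then takes the minimum AIC; the variable names are assumptions made for illustration.

```python
# (Ljung-Box p-value from Table 4, AIC from Table 5) for each candidate model.
results = {
    "(1,1,0)(1,1,0)12": (0.3167, 3274.87),
    "(2,1,0)(1,1,0)12": (0.7164, 3272.56),
    "(3,1,0)(1,1,0)12": (0.5898, 3268.29),
    "(1,1,0)(0,1,1)12": (0.1658, 3271.90),
    "(2,1,0)(0,1,1)12": (0.6858, 3263.48),
    "(3,1,0)(0,1,1)12": (0.6807, 3262.28),
    "(0,1,1)(1,1,0)12": (0.09715, 3282.45),
    "(0,1,2)(1,1,0)12": (0.4701, 3262.51),
    "(0,1,3)(1,1,0)12": (0.9819, 3260.28),
    "(0,1,1)(0,1,1)12": (0.07206, 3276.74),
    "(0,1,2)(0,1,1)12": (0.5218, 3256.99),
    "(0,1,3)(0,1,1)12": (0.9662, 3255.22),
}

ALPHA = 0.05

# Stage 1: keep models whose residuals pass the white noise test.
passing = {m: a for m, (p_value, a) in results.items() if p_value > ALPHA}

# Stage 2: pick the smallest AIC among the remaining models.
best = min(passing, key=passing.get)
print(len(passing), "models pass; selected:", best, "AIC =", passing[best])
```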

Table 4. Ljung-Box p-values of the ARIMA model candidates on the import dataset

ARIMA Model Ljung-Box p-value

(1,1,0)(1,1,0)12 0.3167 

(2,1,0)(1,1,0)12 0.7164 

(3,1,0)(1,1,0)12 0.5898 

(1,1,0)(0,1,1)12 0.1658 

(2,1,0)(0,1,1)12 0.6858 

(3,1,0)(0,1,1)12 0.6807 

(0,1,1)(1,1,0)12 0.09715 

(0,1,2)(1,1,0)12 0.4701 

(0,1,3)(1,1,0)12 0.9819 

(0,1,1)(0,1,1)12 0.07206 

(0,1,2)(0,1,1)12 0.5218 

(0,1,3)(0,1,1)12 0.9662 
 

Fig. 4. Data graph and ACF/PACF plot on import dataset after differencing 



 
C. Testing Model for Forecasting 
Testing was conducted with the training and test data specified in Table 2. The testing process is
divided into two parts: forecasting and calculating the error rate of the forecasting results.
Forecasting is conducted to obtain import forecasts for several periods. The error rate of each
forecast is then calculated using MAPE, as given in (6), with the help of the MAPE() function from
the MLmetrics package in RStudio.

The first testing phase tests the model for forecasting the import results. A sample test was done
on the model ARIMA(1,1,0)(1,1,0)12. The function used to fit the model is Arima(x, order =
c(p,d,q), seasonal = c(P,D,Q)). The results can be seen in Figure 5. The model output lists the
estimated non-seasonal and seasonal AR coefficients. These coefficients are then used to forecast
the subsequent periods using (3).

Forecasting results using model ARIMA(1,1,0)(1,1,0)12 for the next 6 periods can be seen in
Figure 6. The forecast() function is used to display the forecasting results. Its output consists
of the forecast value and the lower and upper limits of the forecast. The error rate of the
predictions against the actual values is then calculated using the MAPE() function, and the result
can be seen in Figure 7. The other candidate models undergo the same testing and evaluation.

Table 5. AIC values of the ARIMA model candidates on the import dataset

ARIMA Model AIC

(1,1,0)(1,1,0)12 3274.87 

(2,1,0)(1,1,0)12 3272.56 

(3,1,0)(1,1,0)12 3268.29 

(1,1,0)(0,1,1)12 3271.9 

(2,1,0)(0,1,1)12 3263.48 

(3,1,0)(0,1,1)12 3262.28 

(0,1,1)(1,1,0)12 3282.45 

(0,1,2)(1,1,0)12 3262.51 

(0,1,3)(1,1,0)12 3260.28 

(0,1,1)(0,1,1)12 3276.74 

(0,1,2)(0,1,1)12 3256.99 

(0,1,3)(0,1,1)12 3255.22 

 
Fig. 5. ARIMA(1,1,0)(2,1,0)12 coefficient value 
 

Fig. 6. ARIMA(1,1,0)(1,1,0)12 forecasting results 



D. Evaluation of Forecasting Results 
After testing each model and obtaining the predicted results for each period, the next step is to
calculate the prediction error rate using MAPE. The final results for each model can be seen in
Table 6 for the 6-period forecast, Table 7 for the 12-period forecast, and Table 8 for the
24-period forecast.

Forecasting the import dataset over 6 periods resulted in ARIMA(0,1,3)(0,1,1)12 as the best model
with a MAPE value of 7.516 % or an accuracy rate of 92.79 %. Forecasting over 12 periods resulted
in different best models from the 6-period case. Two models tie for best because they produce the
same MAPE: ARIMA(1,1,0)(0,1,1)12 and ARIMA(2,1,0)(0,1,1)12, with a MAPE value of 16.029 % or an
accuracy rate of 83.971 %.
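The 12-period selection, including the tie, can be reproduced from the values in Table 7. The sketch below treats the accuracy rate as 100 minus MAPE, which is how the reported accuracy figures relate to the MAPE values; the variable names are hypothetical.

```python
# MAPE (%) per candidate model for the 12-period forecast, from Table 7.
mape_12 = {
    "(1,1,0)(1,1,0)12": 23.530,
    "(2,1,0)(1,1,0)12": 22.658,
    "(3,1,0)(1,1,0)12": 22.085,
    "(1,1,0)(0,1,1)12": 16.029,
    "(2,1,0)(0,1,1)12": 16.029,
    "(3,1,0)(0,1,1)12": 17.452,
    "(0,1,1)(1,1,0)12": 24.445,
    "(0,1,2)(1,1,0)12": 22.360,
    "(0,1,3)(1,1,0)12": 23.374,
    "(0,1,1)(0,1,1)12": 19.267,
    "(0,1,2)(0,1,1)12": 18.445,
    "(0,1,3)(0,1,1)12": 19.875,
}

lowest = min(mape_12.values())

# Keep every model that reaches the lowest MAPE, so ties are not hidden.
best_models = [m for m, v in mape_12.items() if v == lowest]

accuracy = round(100.0 - lowest, 3)
print(best_models, f"MAPE = {lowest} %", f"accuracy = {accuracy} %")
```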

Forecasting the import dataset over 24 periods resulted in a different best model from the
6-period and 12-period cases, namely ARIMA(0,1,3)(1,1,0)12 with a MAPE value of 9.526 % or an
accuracy rate of 90.474 %. From the test results, it can be concluded that increasing the number of
forecasting periods increases the MAPE value compared with the short-term forecast, as shown in
Table 9. This indicates that the SARIMA method can do short-term forecasting with a high degree of
accuracy.

Table 6. MAPE values on the import dataset for 6-period forecasting

ARIMA Model MAPE (%)

(1,1,0)(1,1,0)12 10.301 

(2,1,0)(1,1,0)12 11.688 

(3,1,0)(1,1,0)12  8.759 

(1,1,0)(0,1,1)12 11.973 

(2,1,0)(0,1,1)12 11.973 

(3,1,0)(0,1,1)12  9.279 

(0,1,1)(1,1,0)12 12.913 

(0,1,2)(1,1,0)12  8.847 

(0,1,3)(1,1,0)12  8.807 

(0,1,1)(0,1,1)12 14.038 

(0,1,2)(0,1,1)12  8.307 

(0,1,3)(0,1,1)12  7.516 

 
Table 7. MAPE values on the import dataset for 12-period forecasting

ARIMA Model MAPE (%)

(1,1,0)(1,1,0)12 23.530 

(2,1,0)(1,1,0)12 22.658 

(3,1,0)(1,1,0)12 22.085 

(1,1,0)(0,1,1)12 16.029 

(2,1,0)(0,1,1)12 16.029 

(3,1,0)(0,1,1)12 17.452 

(0,1,1)(1,1,0)12 24.445 

(0,1,2)(1,1,0)12 22.360 

(0,1,3)(1,1,0)12 23.374 

(0,1,1)(0,1,1)12 19.267 

(0,1,2)(0,1,1)12 18.445 

(0,1,3)(0,1,1)12 19.875 
 

Fig. 7. ARIMA(1,1,0)(2,1,0)12 MAPE value 



 
E. Discussions 
The test results of 12 model candidates for forecasting Indonesia's imports over 6, 12, and 24
periods produce interesting error rates. The 6-period forecast is the best, with the smallest MAPE
of 7.210 %, obtained by the ARIMA(0,1,3)(0,1,1)12 model. Interestingly, the best 12-period forecasts
(the ARIMA(1,1,0)(0,1,1)12 and ARIMA(2,1,0)(0,1,1)12 models) have MAPE values much larger than the
shorter or longer period forecasts. The 12-period forecast experienced a greater increase in MAPE
because the dataset has high values in its last observations.

From these experiments, SARIMA is better suited to short-term forecasting. This result is
consistent with previous research [18] stating that the more periods available from the dataset,
the higher the forecasting accuracy can be. Forecasting imports over 6 periods (months) leaves more
data for building the forecast and thus yields better accuracy.

IV. Conclusion 

This research produced a forecasting model for Indonesia's imports. In experiments using months as
the forecast period, the best result was obtained when forecasting the next 6 periods of imports.
The best model for forecasting imports is ARIMA(0,1,3)(0,1,1)12 because it produces the smallest
MAPE and AIC values. The MAPE value for forecasting Indonesian imports is 7.210 %, corresponding to
an accuracy of 92.79 %, with an AIC value of 3255.22. This research also indicates that the SARIMA
method is best used for forecasting short-term trends in Indonesia's imports. In this regard, the
6-period forecast should help the government be more aware in its development planning and better
prepared with contingency plans in import policy. The strategic plan to strengthen local businesses
can then be supported by effective and efficient imports. In this research, the forecasting model
was developed with a hold-out validation method in which the test set was the last part of the time
series. Hence, the resulting model may not be the most general one. There is therefore an open
challenge to improve this research by applying cross-validation. It is expected that this would
yield a more general forecasting model.

Table 8. MAPE values on the import dataset for 24-period forecasting

ARIMA Model MAPE (%)

(1,1,0)(1,1,0)12  10.035 

(2,1,0)(1,1,0)12  9.982 

(3,1,0)(1,1,0)12  9.930 

(1,1,0)(0,1,1)12  15.322 

(2,1,0)(0,1,1)12   15.322 

(3,1,0)(0,1,1)12  15.175 

(0,1,1)(1,1,0)12  10.376 

(0,1,2)(1,1,0)12  9.663 

(0,1,3)(1,1,0)12  9.526 

(0,1,1)(0,1,1)12  13.579 

(0,1,2)(0,1,1)12  15.383 

(0,1,3)(0,1,1)12  15.236 
 

Table 9. MAPE result comparison 

Period  MAPE 

6 period  7.210 % 

12 period  16.029 % 

24 period  9.526 % 

 

Acknowledgement 

We thank everyone who contributed to the completion of this paper in one way or another. First of
all, we thank God for the ability to do the work. We are also very grateful to our informants.
Their identities cannot be published, but we want to recognize and appreciate their support and
accountability during our research. We are also grateful to our fellow students, whose efforts and
constructive criticism helped in the search for new ideas. Lastly, we would like to thank PUI
Disruptive Learning Innovation, Universitas Negeri Malang, for the intensive support and guidance
that allowed this research to run well.

Declarations 

Author contribution  
All authors contributed equally as the main contributor of this paper. All authors read and approved the final paper. 

Funding statement 
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors. 

Conflict of interest  
The authors declare no conflict of interest. 

Additional information 
No additional information is available for this paper. 

References 
[1] W. Ma, X. Zhu, and M. Wang, “Forecasting iron ore import and consumption of China using grey model optimized by particle swarm optimization algorithm,” Resour. Policy, vol. 38, no. 4, pp. 613–620, 2013.
[2] T. Khan, “Identifying an Appropriate Forecasting Model for Forecasting Total Import of Bangladesh,” Int. J. Trade, Econ. Financ., vol. 2, no. 3, pp. 242–246, 2011.
[3] Y. Ibrahim, Nanthakumar, and Loganathan, “Forecasting International Tourism Demand in Malaysia Using Box Jenkins Sarima Application,” South Asian J. Tour. Herit., vol. 3, no. 2, pp. 50–60, 2010.
[4] T. S. Rao and M. M. Gabr, “A test for linearity of stationary time series,” Journal of Time Series Analysis, vol. 1, no. 2, pp. 145–158, 1980.
[5] R. J. Hyndman, “Forecasting: Principles & Practice,” no. September, p. 138, 2014.
[6] E. B. Dagum, The X-11-ARIMA seasonal adjustment method. Ottawa: Statistics Canada, 1980.
[7] G. Box, “Box and Jenkins: time series analysis, forecasting and control,” in A Very British Affair, pp. 161–215. Palgrave Macmillan, London, 2013.
[8] A. Qonita, A. G. Pertiwi, and T. Widiyaningtyas, “Prediction of rupiah against us dollar by using arima,” Int. Conf. Electr. Eng. Comput. Sci. Informatics, vol. 4, no. September, pp. 746–750, 2017.
[9] K. K. Sumer, O. Goktas, and A. Hepsag, “The application of seasonal latent variable in forecasting electricity demand as an alternative method,” Energy Policy, vol. 37, no. 4, pp. 1317–1322, 2009.
[10] K. Y. Chen and C. H. Wang, “A hybrid SARIMA and support vector machines in forecasting the production values of the machinery industry in Taiwan,” Expert Syst. Appl., vol. 32, no. 1, pp. 254–264, 2007.
[11] F. M. Tseng and G. H. Tzeng, “A fuzzy seasonal ARIMA model for forecasting,” Fuzzy Sets Syst., vol. 126, no. 3, pp. 367–376, 2002.
[12] W. W. S. Wei, “Time Series Analysis: Univariate and Multivariate Methods 2nd Edition.” Pearson Addison Wesley, New York, 2006.
[13] E. J. Wagenmakers and S. Farrell, “AIC model selection using Akaike weights,” Psychon. Bull. Rev., vol. 11, no. 1, pp. 192–196, 2004.
[14] M. V. Shcherbakov, A. Brebels, N. L. Shcherbakova, A. P. Tyukov, T. A. Janovsky, and V. A. E. Kamaev, “A survey of forecast error measures,” World Applied Sciences Journal, vol. 24, no. 24, pp. 171–176, 2013.
[15] A. de Myttenaere et al., “Mean Absolute Percentage Error for regression models,” Neurocomputing, vol. 192, pp. 38–48, 2016.
[16] R. Serra and A. C. Rodríguez, “The Ljung-Box test as a performance indicator for VIRCs,” International Symposium on Electromagnetic Compatibility-EMC EUROPE, IEEE, pp. 1–6, 2012.
[17] T. W. Arnold, “Uninformative parameters and model selection using Akaike's Information Criterion,” The Journal of Wildlife Management, vol. 74, no. 6, pp. 1175–1178, 2010.
[18] T. Widiyaningtyas, Muladi, and A. Qonita, “Use of ARIMA Method to Predict the Number of Train Passenger in Malang City,” Proceeding - 2019 Int. Conf. Artif. Intell. Inf. Technol. ICAIIT 2019, pp. 359–364, 2019.