CAUCHY - Jurnal Matematika Murni dan Aplikasi
Volume 5(1) (2017), Pages 29-35
p-ISSN: 2086-0382; e-ISSN: 2477-3344
Submitted: 11 July 2017; Reviewed: 12 July 2017; Accepted: 2 November 2017
DOI: http://dx.doi.org/10.18860/ca.v5i1.4288

Modelling Multi Input Transfer Function for Rainfall Forecasting in Batu City

Priska Arindya Purnama, Ni Wayan Surya Wardhani, Rahma Fitriani
Department of Statistics, Brawijaya University, Malang
Email: priska.arindya@gmail.com, wswardhani@yahoo.com, rahmafitriani@ub.ac.id

ABSTRACT
The aim of this research is to obtain an appropriate transfer function model for rainfall data in Batu City and to assess how accurately rainfall in Batu City can be forecast with a transfer function model based on air temperature, humidity, wind speed and cloud. A transfer function model is a multivariate time series model consisting of an output series ($Y_t$) that is expected to be affected by an input series ($X_t$) and by other inputs grouped into a noise series ($N_t$). Air temperature, humidity, wind speed and cloud are used as input series ($X_t$) and rainfall as the output series ($Y_t$). The multi-input transfer function model obtained is $(b_1, s_1, r_1)(b_2, s_2, r_2)(b_3, s_3, r_3)(b_4, s_4, r_4)(p_n, q_n) = (0,0,0)(23,0,0)(1,2,0)(0,0,0)([5,8],2)$, which shows that rainfall on a given day is affected by air temperature and cloud on that day, by humidity 23 days earlier and by wind speed on the previous day. The rainfall forecasts for Batu City from the multi-input, single-output transfer function model are accurate according to model validation with the $t$ test and a MAPE statistic below 20%.

Keywords: multi input transfer function model, rainfall, forecasting

INTRODUCTION
Time series analysis is a form of data analysis that takes the effect of time into account.
A time series is a sequence of observations taken sequentially in time at equal intervals, i.e. daily, weekly, monthly, yearly or over other time periods [1]. Time series models fall into two classes, univariate and multivariate. One univariate time series model is the Autoregressive Integrated Moving Average (ARIMA) model, whereas one multivariate time series model is the transfer function model. The transfer function model consists of an output series ($Y_t$) that is influenced by an input series ($X_t$) and by other inputs grouped into a noise series ($N_t$) [2]. It combines characteristics of the univariate ARIMA model and of multiple regression analysis, that is, it combines time series analysis with a causal approach. The purpose of transfer function modelling is to establish a simple model that connects the output series ($Y_t$) with the input series ($X_t$) and the noise series ($N_t$).

Several studies have used the transfer function model for forecasting. Tankersley et al. [3], who studied groundwater fluctuations in Florida, showed that the transfer function model is more accurate than ARIMA. Edlund and Karlsson [4], who studied the unemployment rate in Sweden, showed that the transfer function model forecasts better than other classical forecasting models such as Vector Autoregressive (VAR) and ARIMA. In addition, Thomakos and Guerard [5], who examined unemployment forecasting in St. Louis, United States, showed that the transfer function model performs better than the naive model and ARIMA.

Rainfall information is important for agriculture because it affects agricultural output and is used to determine the beginning of the growing season.
One of the areas in East Java Province with potential in the agricultural sector is Batu City, so it is important to link rainfall with the agricultural sector there. Agriculture is a leading sector that is expected to synergize with other sectors such as tourism, trade and industry [6]. Factors that can affect rainfall include air temperature, humidity, wind speed and cloud [7].

Previous research on rainfall forecasting includes Faulina [8], who compared the accuracy of ensemble ARIMA in Batu City, and Tresnawati et al. [9], who examined the Kalman Filter method with SST Nino 3.4 predictors in Purbalingga. These studies share the weakness that they involve only one variable. The Badan Meteorologi, Klimatologi dan Geofisika (BMKG) forecasts rainfall with an ensemble method (a combination of individual methods that merges several forecasting models) and with BMA (Bayesian Model Averaging) modelling, but the forecasts are rarely accurate. Therefore, this research discusses rainfall forecasting that involves more than one variable.

Air temperature, humidity, wind speed and cloud can be used to model and forecast rainfall with the transfer function model. In this study, rainfall is the response variable (output series), while air temperature, humidity, wind speed and cloud are the predictor variables (input series). The model used here is called the multi-input transfer function model because it uses more than one predictor variable (input series). Including the factors that affect rainfall as predictor variables is expected to yield accurate forecasts.
FUNDAMENTAL THEORIES
Building a transfer function model consists of five steps: prewhitening of the input series and output series, identification of the impulse response function and transfer function, identification of the noise model, estimation and significance testing of the transfer function model parameters, and diagnostic testing of the transfer function model [10].

Before prewhitening, the input series and the output series must be prepared. This preparation covers inspection of the time series plots, stationarity tests and ARIMA modelling of the input series and output series. The stationarity test has two parts, stationarity in the variance and stationarity in the mean. Stationarity in the variance is assessed from the Box-Cox plot and the value of $\lambda$: the data are said to be stationary in the variance if $\lambda$ is equal to or close to 1. A series that is not stationary in the variance can be corrected with the Box-Cox transformation. Stationarity in the mean is assessed from the autocorrelation function (ACF) plot: for a series that is stationary in the mean, the autocorrelations are no longer significant after the second or third lag. A series that is not stationary in the mean can be made stationary by differencing.

Once a stationary series has been obtained, the next step is to identify the ARIMA model of the input series. The ARIMA model is identified from the ACF and PACF plots of the series that is already stationary in the variance and the mean. This ARIMA model is then used to prewhiten the input series and the output series.

The first step in identifying the impulse response function and transfer function is to calculate the cross-correlation and the autocorrelation of the prewhitened input series and the prewhitened output series.
The cross-correlation function (CCF) between the input series $x_t$ and the output series $y_t$ at lag $k$ is [10]

$$\rho_{xy}(k) = \frac{\gamma_{xy}(k)}{\sigma_x \sigma_y},$$

where $\gamma_{xy}(k)$ is the cross-covariance at lag $k$ and $\sigma_x$, $\sigma_y$ are the standard deviations of the two series. The second step is the direct estimation of the impulse response weights, which are used to establish the order of the transfer function model [2]:

$$\hat{v}_k = \frac{C_{\alpha\beta}(k)}{S_\alpha^2} = \hat{\rho}_{\alpha\beta}(k)\,\frac{S_\beta}{S_\alpha},$$

where $C_{\alpha\beta}(k)$ is the sample cross-covariance between the prewhitened input $\alpha_t$ and the prewhitened output $\beta_t$, and $S_\alpha$, $S_\beta$ are their sample standard deviations. The last step is to assign the values $(b, s, r)$, the three key parameters of the transfer function model. The values of $b$, $s$ and $r$ can be determined by inspecting the sample cross-correlations or by directly estimating the impulse response weights.

Identification of the noise series model consists of two steps. The first is the initial estimate of the noise series ($n_t$) [10],

$$\hat{n}_t = y_t - \frac{\hat{\omega}(B)}{\hat{\delta}(B)} B^b x_t,$$

and the second is ARIMA $(p_n, 0, q_n)$ modelling of the noise series ($n_t$). The parameters of the transfer function model are estimated by maximum likelihood, and their significance is tested with the $t$ test statistic. The last step is the diagnostic test of the transfer function model, which examines the residual autocorrelation and the cross-correlation (CCF) between the residuals and the input series using the Ljung-Box ($Q$) test statistic.

The multi-input transfer function model with a single output is formulated as

$$y_t = \sum_{j=1}^{k} \nu_j(B)\, x_{jt} + n_t,$$

where $y_t$ is the stationary output series, $x_{jt}$ is the $j$-th stationary input series, $n_t$ is the noise series and $\nu_j(B)$ is the transfer function of the $j$-th input series ($x_{jt}$).
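The CCF and impulse response weight formulas above can be illustrated with a minimal Python sketch. It assumes non-negative lags, divides the sums by the series length, and uses made-up prewhitened series; the function names are hypothetical and the code is not taken from the software used in the paper.

```python
import math


def mean(values):
    return sum(values) / len(values)


def std(values):
    """Sample standard deviation (divisor n)."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))


def cross_covariance(x, y, k):
    """Sample cross-covariance between x_t and y_{t+k} for a lag k >= 0."""
    n = len(x)
    mx, my = mean(x), mean(y)
    return sum((x[t] - mx) * (y[t + k] - my) for t in range(n - k)) / n


def cross_correlation(x, y, k):
    """Cross-covariance at lag k scaled by both standard deviations."""
    return cross_covariance(x, y, k) / (std(x) * std(y))


def impulse_response_weight(alpha, beta, k):
    """Estimate v_k as rho_alpha_beta(k) * S_beta / S_alpha."""
    return cross_correlation(alpha, beta, k) * std(beta) / std(alpha)


# Made-up prewhitened input; the output responds with a delay of one period.
alpha = [0.5, -1.2, 0.3, 0.9, -0.4, 1.1, -0.7, 0.2]
beta = [0.0] + [2.0 * a for a in alpha[:-1]]

for lag in range(4):
    print(lag, round(impulse_response_weight(alpha, beta, lag), 4))
```

Plotting these weights against the lag is one way to read off a delay $b$: the weights stay near zero until the first lag at which the input starts to affect the output.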
RESULTS AND DISCUSSION
This research used daily secondary data on rainfall ($Y_t$), air temperature ($X_{1t}$), humidity ($X_{2t}$), wind speed ($X_{3t}$) and cloud ($X_{4t}$) in Batu City, obtained from www.worldweatheronline.com. Daily data for January 2016 to December 2016 were used as training data to build the multi-input transfer function model, while daily data for January 2017 were used as testing data for model validation. The time series plot of each input series and of the output series is shown in Figure 1.

[Figure 1, panels (a) to (d): time series plots of rainfall (mm), air temperature (°C), humidity (%) and wind speed (mph) against time.]

Figure 1. (a) Time series plot of rainfall (b) Time series plot of air temperature (c) Time series plot of humidity (d) Time series plot of wind speed (e) Time series plot of cloud

The time series plots indicate that daily rainfall, air temperature, humidity, wind speed and cloud fluctuated throughout 2016 and did not show a trend. The test of stationarity in the variance indicated that none of the input series or the output series is stationary in the variance: in the Box-Cox plots, the $\lambda$ values of the input and output series are not close to 1. The Box-Cox transformation was therefore applied to each input series and to the output series.
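As an illustration of this step, the following Python sketch applies a Box-Cox transformation and picks $\lambda$ from a grid by maximizing the usual profile log-likelihood. The paper does not report its $\lambda$ values or how zero-rainfall days were handled, so the data, the grid and the positive `shift` below are assumptions made only for this example.

```python
import math


def box_cox(series, lam):
    """Box-Cox transform of a strictly positive series."""
    if any(v <= 0 for v in series):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in series]
    return [(v ** lam - 1.0) / lam for v in series]


def profile_log_likelihood(series, lam):
    """Profile log-likelihood of lambda, up to an additive constant."""
    n = len(series)
    z = box_cox(series, lam)
    z_mean = sum(z) / n
    variance = sum((v - z_mean) ** 2 for v in z) / n
    return -0.5 * n * math.log(variance) + (lam - 1.0) * sum(math.log(v) for v in series)


def estimate_lambda(series, grid=None):
    """Return the grid value of lambda with the largest profile log-likelihood."""
    if grid is None:
        grid = [i / 10.0 for i in range(-20, 21)]  # -2.0, -1.9, ..., 2.0
    return max(grid, key=lambda lam: profile_log_likelihood(series, lam))


# Made-up daily rainfall in mm; zeros are common, so a hypothetical shift
# is added before transforming (this is not stated in the paper).
rainfall = [0.0, 2.5, 10.1, 0.3, 25.4, 1.2, 0.0, 7.8]
shift = 1.0
shifted = [v + shift for v in rainfall]

lam = estimate_lambda(shifted)
transformed = box_cox(shifted, lam)
print(lam, [round(v, 3) for v in transformed])
```

A value of $\lambda$ far from 1 on the grid corresponds to the situation described above, in which the series is treated as non-stationary in the variance and is transformed.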
The test of stationarity in the mean, based on the ACF and PACF plots, showed that none of the input series or the output series is stationary in the mean. Each input and output series was therefore differenced once to make it stationary. The ARIMA models identified for the input series are ARIMA (1,1,2) for the air temperature input series ($X_{1t}$), ARIMA ([3,5,7],1,2) for the humidity input series ($X_{2t}$), ARIMA (1,1,1) for the wind speed input series ($X_{3t}$) and ARIMA (6,1,1) for the cloud input series ($X_{4t}$). From the ARIMA model of each input series, the prewhitening models for the input series and the output series are:

$$\alpha_{1t} = x_{1t} - 0.26995\,x_{1,t-1} + 0.63921\,\alpha_{1,t-1} + 0.21176\,a_{1,t-2} \tag{1}$$

$$\alpha_{2t} = x_{2t} + 0.39196\,x_{2,t-1} - 0.24251\,x_{2,t-2} + 0.19483\,x_{2,t-3} + 0.18366\,x_{2,t-5} + 0.36474\,x_{2,t-7} + 0.31646\,\alpha_{2,t-1} + 0.72181\,\alpha_{2,t-2} \tag{2}$$

$$\alpha_{3t} = x_{3t} - 0.51882\,x_{3,t-1} + 0.94334\,\alpha_{3,t-1} \tag{3}$$

$$\alpha_{4t} = x_{4t} - 0.58649\,x_{4,t-1} + 0.14606\,x_{4,t-2} + 0.42116\,x_{4,t-3} + 0.42977\,x_{4,t-4} + 0.16236\,x_{4,t-5} + 0.34483\,x_{4,t-6} + 0.90410\,a_{4,t-1} \tag{4}$$

Equations (1), (2), (3) and (4) are the prewhitening models for the air temperature, humidity, wind speed and cloud input series, respectively. The prewhitening model for the output series has the same form as that of the corresponding input series, with $\alpha_{jt}$ and $a_{jt}$ replaced by $\beta_{jt}$ and $x_{jt}$ replaced by $y_{jt}$, where $j$ is the index of the input series. The cross-correlations between each prewhitened input series (air temperature, humidity, wind speed and cloud) and the prewhitened rainfall output series are presented in Figure 2.
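The following Python sketch shows how a prewhitening filter of this kind can be applied recursively, using the coefficients of equation (3) for wind speed. It assumes that the filter is applied to the already transformed and differenced series and that pre-sample values are zero; the input values and function names are made up for illustration.

```python
def difference(series):
    """First difference: w_t = z_t - z_{t-1}."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]


def prewhiten(series, ar_coef=0.51882, ma_coef=0.94334):
    """Apply alpha_t = x_t - ar_coef * x_{t-1} + ma_coef * alpha_{t-1}.

    The defaults are the coefficients of equation (3). Values before the
    start of the series are taken as zero, which is an assumption.
    """
    filtered = []
    prev_x = 0.0
    prev_alpha = 0.0
    for x in series:
        alpha = x - ar_coef * prev_x + ma_coef * prev_alpha
        filtered.append(alpha)
        prev_x, prev_alpha = x, alpha
    return filtered


# Made-up transformed wind speed and rainfall values.
wind = [4.0, 5.5, 3.2, 6.1, 4.8, 5.0, 7.3]
rain = [0.0, 1.2, 0.4, 3.1, 2.2, 0.0, 4.5]

alpha_3 = prewhiten(difference(wind))  # prewhitened input series
beta_3 = prewhiten(difference(rain))   # output filtered with the same model
print([round(v, 4) for v in alpha_3])
print([round(v, 4) for v in beta_3])
```

Filtering the output with the same coefficients as the input is what allows the cross-correlation between the two filtered series to be read as an estimate of the impulse response.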
[Figure 1, panel (e): time series plot of cloud (%) against time.]

Figure 2. Cross-correlation plots of the prewhitened input series and output series

Based on the cross-correlations and the impulse response weights of each input series with the prewhitened output series, the values of $b$, $s$, $r$ can be identified for each input series. They are $(b = 0, s = 0, r = 0)$ for the air temperature input series, $(b = 23, s = 0, r = 0)$ for the humidity input series, $(b = 1, s = 2, r = 0)$ for the wind speed input series and $(b = 0, s = 0, r = 0)$ for the cloud input series, each paired with the rainfall output series.

The noise series ($n_t$) obtained from the impulse response weights is formulated as follows:

$$n_t = y_t - \hat{y}_t = y_t - \sum_{j=1}^{k} \hat{v}_j(B)\, x_{jt}$$

$$n_t = y_t - \{(-0.1166)\,x_{1t} + \cdots + (-0.8663)\,x_{1,t-23}\} - \{(0.1391)\,x_{2t} + \cdots + (0.3001)\,x_{2,t-23}\} - \{(-0.2078)\,x_{3t} + \cdots + (-0.4434)\,x_{3,t-23}\} - \{(0.1681)\,x_{4t} + \cdots + (0.0439)\,x_{4,t-23}\}$$

This equation yields the values $n_{24}, n_{25}, \ldots, n_{365}$, while the values $n_1, \ldots, n_{23}$ are set equal to zero. The ARIMA $(p_n, 0, q_n)$ model for the noise series ($n_t$) is ARIMA ([5,8],0,2), formulated as

$$n_t = \frac{1 + 0.0378B - 0.9109B^2}{1 + 0.72949B - 0.23092B^2 - 0.06788B^3 - 0.01765B^4 + 0.1179B^5 + 0.10773B^8}\, a_t$$

The values of $b$, $s$, $r$ for the multi-input transfer function model are obtained by combining the $b$, $s$, $r$ values of the single-input transfer function models of each input series with the output series. The candidate multi-input transfer function model is $(b_1, s_1, r_1)(b_2, s_2, r_2)(b_3, s_3, r_3)(b_4, s_4, r_4)(p_n, q_n) = (0,0,0)(23,0,0)(1,2,0)(0,0,0)([5,8],2)$.
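The construction of the noise series from the impulse response weights can be sketched in Python as below. The paper lists only the first and last of the 24 weights for each input, so the sketch takes the weights as an argument and is demonstrated with short made-up weight lists; `noise_series` is a hypothetical helper name.

```python
def noise_series(y, inputs, weights):
    """Compute n_t = y_t - sum_j sum_k v_jk * x_{j,t-k}.

    `inputs` is a list of input series and `weights` a list of weight
    lists, one per input, ordered from lag 0 upward. Values of n_t that
    would need observations before the start of the data are set to zero,
    as is done for n_1 to n_23 in the text.
    """
    max_lag = max(len(w) for w in weights) - 1
    noise = [0.0] * len(y)
    for t in range(max_lag, len(y)):
        fitted = 0.0
        for x, w in zip(inputs, weights):
            for k, v in enumerate(w):
                fitted += v * x[t - k]
        noise[t] = y[t] - fitted
    return noise


# Made-up example with two inputs and weights for lags 0 to 2.
y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x1 = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
x2 = [1.0, 0.0, 1.0, 0.0, 1.0, 0.0]
example_weights = [[0.2, 0.1, 0.05], [0.3, 0.0, -0.1]]

print([round(v, 4) for v in noise_series(y, [x1, x2], example_weights)])
```

With 24 weights per input (lags 0 to 23) and 365 observations, the same function would leave the first 23 values at zero and fill the remaining 342, matching the range $n_{24}, \ldots, n_{365}$ given above.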
The estimates and significance tests of the parameters of the multi-input transfer function model are shown in Table 1.

Table 1. Significance Test Results for the Parameters of the Multi Input Transfer Function Model

Input Series   Order (b, s, r)         Parameter       Estimate   t      P-Value   Conclusion
$X_{1t}$       b = 0, s = 0, r = 0     $\omega_{01}$   0.1042     3.84   0.0089    Significant
$X_{2t}$       b = 23, s = 0, r = 0    $\omega_{02}$   0.2459     2.03   0.0423    Significant
$X_{3t}$       b = 1, s = 2, r = 0     $\omega_{03}$   0.4318     3.53   0.0180    Significant
$X_{4t}$       b = 0, s = 0, r = 0     $\omega_{04}$   0.1992     4.12   <.0001    Significant

Table 1 shows that all parameters of the multi-input transfer function model are significant, because the p-value of each parameter is smaller than $\alpha$. The diagnostic results of the multi-input transfer function model, obtained by testing the residual autocorrelation with the Ljung-Box test, are shown below.

Table 2. Diagnostic Test Result of Multi Input Transfer Function Model

Model: $(b_1, s_1, r_1)(b_2, s_2, r_2)(b_3, s_3, r_3)(b_4, s_4, r_4)(p_n, q_n) = (0,0,0)(23,0,0)(1,2,0)(0,0,0)([5,8],2)$

Lag   Chi-Square   P-Value   Conclusion
12    2.80         0.5921    Fit
18    7.12         0.7142
24    9.66         0.8840

Under the Ljung-Box test, a model is said to be feasible if every lag tested yields a p-value greater than $\alpha = 0.05$. The test result therefore indicates that the multi-input transfer function model with a single output is feasible to use. The multi-input, single-output transfer function model for rainfall forecasting in Batu City is

$$y_t = 0.10418\,x_{1t} + 0.24597\,x_{2,t-23} + 0.43178\,x_{3,t-1} + 0.19916\,x_{4t} + n_t$$

The model shows that rainfall on a given day is affected by air temperature and cloud on that day, by humidity 23 days earlier and by wind speed on the previous day.
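To make the lag structure of the fitted equation concrete, the Python sketch below evaluates its systematic part, that is, everything except the noise term $n_t$. It assumes the inputs are the transformed and differenced series indexed from 0, and it uses made-up values; forecasting $n_t$ from the ARIMA ([5,8],0,2) noise model and back-transforming to millimetres are not covered.

```python
# Coefficients of the fitted multi-input model as reported in the text.
OMEGA_TEMPERATURE = 0.10418  # x1 at lag 0
OMEGA_HUMIDITY = 0.24597     # x2 at lag 23
OMEGA_WIND = 0.43178         # x3 at lag 1
OMEGA_CLOUD = 0.19916        # x4 at lag 0


def systematic_part(x1, x2, x3, x4, t):
    """Return the fitted value of y_t without the noise term n_t."""
    if t < 23:
        raise ValueError("t must be at least 23 so that x2[t - 23] exists")
    return (OMEGA_TEMPERATURE * x1[t]
            + OMEGA_HUMIDITY * x2[t - 23]
            + OMEGA_WIND * x3[t - 1]
            + OMEGA_CLOUD * x4[t])


# Made-up stationary input series of length 30.
n_obs = 30
temperature = [0.1 * ((i % 5) - 2) for i in range(n_obs)]
humidity = [0.5 * ((i % 7) - 3) for i in range(n_obs)]
wind = [0.2 * ((i % 3) - 1) for i in range(n_obs)]
cloud = [1.0 * ((i % 4) - 1.5) for i in range(n_obs)]

for t in range(23, n_obs):
    print(t, round(systematic_part(temperature, humidity, wind, cloud, t), 4))
```

Because humidity enters at lag 23, the first value that can be computed from the data alone is the one at index 23, which is consistent with the noise series being available only from $n_{24}$ onward.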
The precision of the forecasts was measured with the MAPE (Mean Absolute Percentage Error) statistic, which was 16.35% for the testing data. Model validation with a paired $t$ test showed that the forecast values were not significantly different from the actual values.

CONCLUSION
The analysis showed that, under the multi-input, single-output transfer function model, rainfall in Batu City on a given day is affected by air temperature and cloud on that day, by humidity 23 days earlier and by wind speed on the previous day. The rainfall forecasts for Batu City from this model are accurate according to model validation with the $t$ test and a MAPE statistic below 20%.

REFERENCES
[1] G. Box, G. Jenkins and G. Reinsel, Time Series Analysis: Forecasting and Control, 4th ed., New Jersey: Wiley, 2015.
[2] S. Makridakis, S. C. Wheelwright and V. E. McGee, Metode dan Aplikasi Peramalan, 2nd ed., Jakarta: Erlangga, 1988.
[3] C. D. Tankersley, W. D. Graham and K. Hatfield, "Comparison of Univariate and Transfer Function Models of Groundwater Fluctuations," Water Resources Research, no. 29, pp. 3517-3533, 1993.
[4] P. Edlund and S. Karlsson, "Forecasting The Swedish Unemployment Rate VAR vs Transfer Function Modelling," International Journal of Forecasting, no. 9, pp. 61-76, 1995.
[5] D. D. Thomakos and J. B. Guerard, "Naive, ARIMA, Nonparametric, Transfer Function and VAR Models: A Comparison of Forecasting Performance," International Journal of Forecasting, no. 20, pp. 53-67, 2004.
[6] BPS Kota Batu, Kota Batu dalam Angka, Batu: BPS Kota Batu, 2015.
[7] E. Wilson, Hidrologi Teknik, 4th ed., Jakarta: Erlangga, 1993.
[8] R. Faulina, "Perbandingan Akurasi Ensemble ARIMA dalam Peramalan Curah Hujan di Kota Batu, Malang, Jawa Timur," Jurnal Matematika, Sains dan Teknologi, no. 15, pp. 75-83, 2014.
[9] R. Tresnawati, T. A. Nuraini and W. Hanggoro, "Prediksi Curah Hujan Bulanan Menggunakan Metode Kalman Filter dengan Prediktor SST Nino 3.4 Diprediksi," Jurnal Meteorologi dan Geofisika, no. 11, pp. 108-119, 2010.
[10] W. Wei, Time Series Analysis: Univariate and Multivariate Methods, 2nd ed., Addison-Wesley Publishing Co, 2006.