Simulation Study of the Using of Bayesian Quantile Regression in Non-normal Error

CAUCHY - Jurnal Matematika Murni dan Aplikasi, Volume 5(3) (2018), Pages 121-126
p-ISSN: 2086-0382; e-ISSN: 2477-3344
Submitted: 04 October 2018; Reviewed: 08 October 2018; Accepted: 30 November 2018
DOI: http://dx.doi.org/10.18860/ca.v5i3.5633

Catrin Muharisa1, Ferra Yanuar2*, Dodi Devianto3
1,2,3 Department of Mathematics, Faculty of Mathematics and Natural Science, Andalas University, Kampus Limau Manis, 25163, Padang - Indonesia
Email: catrinmuharisa@gmail.com, ferrayanuar@sci.unand.ac.id, ddevianto@sci.unand.ac.id
*Corresponding Author Email: ferrayanuar@sci.unand.ac.id

ABSTRACT
The purpose of this paper is to demonstrate the ability of the Bayesian quantile regression method to handle non-normal errors. We carry out a simulation study in which data are generated with errors that follow an asymmetric Laplace distribution. We then fit the data with both the quantile regression method and the Bayesian quantile regression method and compare the results. In the quantile regression approach, the data are divided into quantiles, the conditional quantile function is estimated, and an asymmetrically weighted absolute error is minimized. The Bayesian quantile regression method uses the asymmetric Laplace distribution in the likelihood function, and a Markov Chain Monte Carlo method based on the Gibbs sampling algorithm is then applied to estimate the parameters. Convergence and the confidence intervals of the estimated parameters are also checked. The Bayesian quantile regression method yields more significant parameters and narrower confidence intervals than the quantile regression method. The best regression model is the Bayesian quantile regression model at quantile 0.50.
This study shows that the Bayesian quantile regression method can produce acceptable parameter estimates under non-normal errors.

Keywords: Quantile regression; Asymmetric Laplace distribution; Gibbs sampling; Markov Chain Monte Carlo; Bayesian quantile regression

INTRODUCTION
Regression analysis is a well-developed statistical tool that is used in many areas of life. Its purpose is to estimate the relationship between a dependent variable and independent variables [1]. Several methods are available for estimating the parameters of a regression equation; one of the most commonly used is the Ordinary Least Squares (OLS) method. The OLS method rests on several assumptions, one of which is normality. The quantile regression method was developed later and is widely used in econometrics. In the quantile regression approach, the data are divided into quantiles, the conditional quantile function is estimated, and an asymmetrically weighted absolute error is minimized. Quantile regression usually requires a large sample size.

Bayes introduced a method for estimating parameters that makes use of initial information, called the prior distribution. This method is known as the Bayesian method. The prior distribution can be derived from earlier research data or based on the researcher's intuition [2]. Prior information about the distribution of the parameters is combined with the information in the sampled data, expressed through the likelihood function, to obtain the posterior distribution. The mean and variance of this posterior distribution, that is, the posterior mean and posterior variance, can then be estimated. The Bayesian method uses a Markov Chain Monte Carlo (MCMC) algorithm.
MCMC makes it straightforward to obtain the posterior distribution even in complex situations [3]. In this research, the model parameters are estimated by combining the quantile regression method with the Bayesian regression method, an approach called the Bayesian quantile regression method.

METHODS
In this section we introduce two models, quantile regression and Bayesian quantile regression. We denote by $Y$ the dependent variable and by $X$ the independent variable.

1. Quantile Regression
The quantile function is denoted by $Q_\theta$, where $0 \le \theta \le 1$. Let $Y$ be a random variable with cumulative distribution function $F_Y(y) = P(Y \le y)$. The $\theta$th quantile of $Y$ can be written as follows [4]:

$$Q_\theta(Y) := F_Y^{-1}(\theta) = \inf\{y : F_Y(y) \ge \theta\} \qquad (1)$$

Let $Y$ be the dependent variable and $X$ a $p$-dimensional independent variable, and let $F_Y(y \mid X = x) = P(Y \le y \mid X = x)$ denote the conditional cumulative distribution function of $Y$ given $X = x$. The $\theta$th conditional quantile of $Y$ is defined as follows [4]:

$$Q_\theta(Y \mid X = x) = \inf\{y : F_Y(y \mid x) \ge \theta\} \qquad (2)$$

Extending the idea of the median, the estimate of $\beta$ for the $\theta$th quantile regression is obtained by minimizing the sum of absolute errors, with weight $\theta$ for positive errors and weight $1 - \theta$ for negative errors [5]:

$$\hat{\beta}(\theta) = \arg\min_{\beta \in \mathbb{R}^p} \sum_{i=1}^{n} \rho_\theta\big(y_i - Q_\theta(Y \mid X = x_i)\big) \qquad (3)$$

where $\theta \in (0, 1)$ is the quantile index, $\rho_\theta(u) = u\,(\theta - I(u < 0))$ is the asymmetric loss function, and $Q_\theta(Y \mid X) = X^T\beta$.

2. Bayesian Quantile Regression
The Bayesian method uses Markov Chain Monte Carlo (MCMC) to estimate the posterior distribution. Let $\mathbf{y} = (y_1, y_2, \ldots, y_n)$ be the observed data and let $p(\beta)$ be the prior distribution of $\beta$. The prior used in this research is an informative prior taken from previous research [6]. The choice of the prior distribution parameters is highly subjective and depends on the researcher's intuition.
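Returning briefly to the loss in equation (3): the weighting can be made concrete with a small numerical sketch. The Python code below is an illustration only (the study itself used R), and the names `check_loss` and `quantile_by_check_loss` are hypothetical. It treats the intercept-only case, where the model has no covariates and the minimizer of the summed loss is simply a sample quantile.

```python
import random


def check_loss(u, theta):
    """Asymmetric absolute loss: weight theta if u >= 0, weight (1 - theta) if u < 0."""
    return u * (theta - (1.0 if u < 0 else 0.0))


def quantile_by_check_loss(ys, theta):
    """Return the observation that minimizes the summed check loss.

    This is the intercept-only version of equation (3): the fitted
    quantile is a single constant, searched over the observed values.
    """
    return min(ys, key=lambda q: sum(check_loss(y - q, theta) for y in ys))


random.seed(0)
ys = [random.gauss(0.0, 1.0) for _ in range(201)]  # illustrative data only

q50 = quantile_by_check_loss(ys, 0.50)
q75 = quantile_by_check_loss(ys, 0.75)
print("theta = 0.50:", q50)
print("theta = 0.75:", q75)
```

With 201 observations, the minimizer for $\theta = 0.50$ is the sample median and the minimizer for $\theta = 0.75$ is the 151st order statistic, which is the smallest observation whose empirical distribution function is at least 0.75, in line with equation (1). The regression case replaces the constant by $x_i^T\beta$ and needs a proper optimizer, which this sketch does not attempt.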
A random variable $Y$ is said to follow the asymmetric Laplace distribution (ALD) if its probability density function is [2]:

$$f_\theta(y) = \theta(1 - \theta)\exp\{-\rho_\theta(y - \mu)\} \qquad (4)$$

where $\mu$ is the location parameter, the skewness parameter is the quantile index $\theta$, and $\rho_\theta$ is the loss function of equation (3). The likelihood function is

$$L(y \mid \beta) = \theta^n (1 - \theta)^n \exp\Big\{-\sum_{i=1}^{n} \rho_\theta(y_i - x_i^T\beta)\Big\} \qquad (5)$$

The posterior distribution of $\beta$, $f(\beta \mid y)$, is then given by

$$f(\beta \mid y) \propto L(y \mid \beta)\,p(\beta) \propto \theta^n (1 - \theta)^n \exp\Big\{-\sum_{i=1}^{n} \rho_\theta(y_i - x_i^T\beta)\Big\}\,p(\beta) \qquad (6)$$

RESULTS AND DISCUSSION
In this study the data were generated in R and consist of two independent variables ($X_1$, $X_2$) and one response variable ($Y$). Each independent variable follows the standard normal distribution, $X_1 \sim N(0, 1)$ and $X_2 \sim N(0, 1)$. The response variable is set to $y = 0.7x_1 + x_2 + \varepsilon$, where $\varepsilon \sim ALD(0, 1, 0.75)$. Each variable has 150 observations.

The estimates of the model parameters at each quantile obtained with the quantile regression method are given in Table 1.

Table 1. Estimated parameter model of quantile regression method

Quantile ($\theta$)   $\beta_1$ (p-value)    Se        $\beta_2$ (p-value)     Se
0.05                  0.97237 (0.30104)      0.9369    4.43896 (0.000055**)    1.06125
0.25                  0.7057 (0.11758)       0.44829   0.92898 (0.06935)       0.50779
0.50                  1.00287 (0.00031**)    0.27177   1.25697 (0.00007**)     0.30784
0.75                  1.20465 (0.0000**)     0.19633   1.22291 (0.0000**)      0.30784
0.95                  1.12467 (0.1943)       0.86254   1.27023 (0.1956)        0.97702

**Significant at level $\alpha = 0.05$; Se = standard error

Based on Table 1, $\beta_1$ is significant at quantiles 0.50 and 0.75, while $\beta_2$ is significant at quantiles 0.05, 0.50, and 0.75. Hence, the coefficient estimates are considered satisfactory at quantile 0.50.

The estimates at each quantile obtained with the Bayesian quantile regression method are given in Table 2.

Table 2.
Estimated parameter model of Bayesian quantile regression method

Quantile ($\theta$)   $\beta_1$   $\beta_2$
0.05                  0.527       3.499
0.25                  0.749       1.037
0.50                  1.04        1.22
0.75                  1.077       1.195
0.95                  1.08        1.48

Based on Table 2, the coefficient estimates are considered satisfactory at quantile 0.25. The point estimates from Bayesian quantile regression are closer to the true values of $\beta$ than those from quantile regression.

The model parameters have now been estimated with both the quantile regression method and the Bayesian quantile regression method. Next, the two methods are compared using confidence intervals. The comparison is shown in Table 3.

Table 3. Confidence interval of quantile regression and Bayesian quantile regression

Quantile ($\theta$)   Limit         $\beta_1$ QR   $\beta_1$ BQR   $\beta_2$ QR   $\beta_2$ BQR
0.05                  Lower Limit   -1.84973       -0.819          -2.8179        1.663
                      Upper Limit    2.54512        1.17            4.63321       4.77
                      Difference     4.39485        1.989           7.45111       3.107
0.25                  Lower Limit   -0.07067        0.189           0.34168       0.462
                      Upper Limit    1.51635        1.33            1.84061       1.63
                      Difference     1.58702        1.141           1.49893       1.168
0.50                  Lower Limit    0.7346         0.706           0.69714       0.801
                      Upper Limit    1.53139        1.41            1.54094       1.61
                      Difference     0.79679        0.704           0.8438        0.809
0.75                  Lower Limit    0.49631        0.601           0.87919       0.795
                      Upper Limit    1.46396        1.5283          1.4644        1.5898
                      Difference     0.96765        0.9273          0.58521       0.7948
0.95                  Lower Limit    0.6571         0.369           0.58451       0.337
                      Upper Limit    1.64607        1.72            2.69848       2.7
                      Difference     0.98897        1.351           2.11397       2.363

QR = Quantile Regression; BQR = Bayesian Quantile Regression

Based on Table 3, at quantile 0.05 the interval for $\beta_1$ contains zero under both methods, so $\beta_1$ is not significant with either quantile regression or Bayesian quantile regression. For $\beta_2$ at quantile 0.05, the quantile regression interval contains zero and the parameter is not significant, whereas the Bayesian quantile regression interval excludes zero and the parameter is significant.
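The comparison in Table 3 uses two quantities for each interval: its width (upper limit minus lower limit, the "Difference" rows) and whether it excludes zero. The short Python sketch below shows that arithmetic for the quantile 0.50 rows of Table 3. It assumes the usual reading that a parameter is significant when its interval excludes zero, and the function names are illustrative rather than taken from the study, which used R.

```python
def interval_width(lower, upper):
    """Width of an interval, as in the 'Difference' rows of Table 3."""
    return upper - lower


def excludes_zero(lower, upper):
    """True when zero lies outside the interval (assumed significance rule)."""
    return lower > 0 or upper < 0


# Quantile 0.50 rows of Table 3: (lower limit, upper limit)
intervals = {
    ("beta1", "QR"): (0.7346, 1.53139),
    ("beta1", "BQR"): (0.706, 1.41),
    ("beta2", "QR"): (0.69714, 1.54094),
    ("beta2", "BQR"): (0.801, 1.61),
}

summary = {
    key: (interval_width(lo, hi), excludes_zero(lo, hi))
    for key, (lo, hi) in intervals.items()
}

for (param, method), (width, significant) in summary.items():
    print(f"{param} {method}: width={width:.5f}, excludes zero={significant}")
```

For these rows all four intervals exclude zero, and both BQR intervals are narrower than their QR counterparts, which agrees with the Difference rows of Table 3 at quantile 0.50.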
At quantiles 0.25, 0.50, 0.75, and 0.95, the parameters are significant under both the quantile regression and the Bayesian quantile regression methods, with the exception of $\beta_1$ at quantile 0.25 under quantile regression, whose interval in Table 3 contains zero. From this analysis, the best regression model is obtained with Bayesian quantile regression at quantile 0.50.

The next step in the Bayesian quantile regression approach is to test the convergence of the estimated model parameters. The test uses trace plots and density plots [3]. Figures 1 and 2 present trace plots, and Figures 3 and 4 present density plots, for selected parameters.

Figure 1. Trace plot of parameter $\beta_1$ for $\theta = 0.50$
Figure 2. Trace plot of parameter $\beta_2$ for $\theta = 0.50$

From Figure 1 and Figure 2 it can be concluded that the convergence assumption is satisfied. The sampled values are stable, remaining between two parallel horizontal lines.

Figure 3. Density plot of parameter $\beta_1$ for $\theta = 0.50$
Figure 4. Density plot of parameter $\beta_2$ for $\theta = 0.50$

In Figure 3 and Figure 4, the density plots of the selected parameters resemble a normal curve. This indicates that the chains for the selected model parameters have converged. Based on the examination of the trace plots and density plots, it can be concluded that the fitted model satisfies the convergence criterion.

CONCLUSIONS
In this research, we used quantile regression and Bayesian quantile regression to analyze simulated data with non-normal errors. The best regression model is the Bayesian quantile regression model at quantile 0.50. The Bayesian quantile regression method using the Gibbs sampling algorithm is a better estimator than the quantile regression method, because it yields more significant parameters and narrower confidence intervals.

REFERENCES
[1] Walpole, R.E. and Myers, R.H.
1995. Ilmu Peluang dan Statistika untuk Insinyur dan Ilmuwan Edisi ke-4. ITB: Bandung.
[2] Yu, K. and Moyeed, R. 2001. Bayesian Quantile Regression. Statistics & Probability Letters, 54(4), 437-447.
[3] Ntzoufras, I. 2009. Bayesian Modeling Using WinBUGS. John Wiley & Sons, Inc.: New Jersey.
[4] Davino, C., Furno, M. and Vistocco, D. 2014. Quantile Regression: Theory and Applications. John Wiley & Sons, Ltd.
[5] Koenker, R. and Bassett, G. Jr. 1978. Regression Quantiles. Econometrica, 46: 33-50.
[6] Benoit, D.F. and Van den Poel, D. 2017. bayesQR: A Bayesian Approach to Quantile Regression. Journal of Statistical Software, 76(7), 1-32.