DYNAMIC ECONOMETRIC MODELS
Vol. 15 (2015) 71−87
DOI: http://dx.doi.org/10.12775/DEM.2015.004
© 2015 Nicolaus Copernicus University Press. All rights reserved. http://www.dem.umk.pl/dem
Submitted October 30, 2015. Accepted December 15, 2015.
ISSN (online) 2450-7067. ISSN (print) 1234-3862.

Błażej Mazur*

Density Forecasts Based on Disaggregate Data: Nowcasting Polish Inflation**

* Correspondence to: Błażej Mazur, Cracow University of Economics, Chair of Econometrics and Operations Research, e-mail: eomazur@cyfronet.pl.
** This research was supported by the Polish National Science Center (NCN) based on decision number DEC-2013/09/B/HS4/01945.

Abstract. The paper investigates gains in performance of density forecasts from models using disaggregate data when forecasting aggregate series. The problem is considered within a restricted VAR framework with alternative sets of exclusion restrictions. Empirical analysis of Polish CPI m-o-m inflation rate (using its 14 sub-categories for disaggregate modelling) is presented. Exclusion restrictions are shown to improve density forecasting performance (as evaluated using log-score and CRPS criteria) relative to aggregate and also disaggregate unrestricted models.

Keywords: prediction, model comparison, density forecasting, inflation, VAR models, shrinkage.

JEL Classification: E31, E37, C53, C32.

Introduction

The paper focuses on the question whether point forecasts, and in particular density forecasts, of a univariate series can be improved using disaggregate information (assuming that it is available) or, more generally, whether economic fluctuations are more accurately modeled at the aggregate or at the disaggregate level. The crucial assumption here is that a multivariate model for disaggregate data is used only as a tool for obtaining the implied univariate forecast of the aggregate series. The aggregating weights are assumed to be known and fixed. There is no direct and necessary relationship between forecasting power in the disaggregate (multivariate) context and the validity of the implied univariate (aggregate) forecast. Therefore the usual statistical procedures used for forecast evaluation or model comparison at the disaggregate level have to be adjusted in order to take the context into account.

The probabilistic approach to inference has been steadily gaining popularity. In particular, events like the recent economic crisis have made it evident that uncertainty quantification is an inherent part of the forecasting process. Some problems that seem uncomplicated from the point forecast perspective become more complex from the density forecasting perspective. The paper investigates the consequences of aggregation for the uncertainty of the aggregate forecast. An alternative perspective on the problem is the general question whether economic fluctuations are better modeled at the aggregate or the disaggregate level, given some specific model classes.

The framework for formal investigation of the problems mentioned above builds on a Gaussian VAR specification with restrictions. The use of such relatively simple models is dictated by complexity that increases rapidly with the dimension (i.e. the disaggregation level). Moreover, as the problem considered here can be interpreted in the context of variable selection (resulting in a very large number of possible exclusion restrictions), numerical complexity becomes considerable even within the relatively simple model class of Gaussian VARs.

The objective of the paper is to investigate empirical evidence supporting the use of disaggregate data for aggregate forecasting.
The application under consideration deals with forecasting the Polish monthly CPI inflation rate (relative to the previous month) using 164 observations on growth rates of 14 price sub-indices. A pseudo real-time forecasting experiment is performed and the out-of-sample predictive performance of competing specifications is evaluated over a verification window consisting of the last 44 observations. All the criteria (those relevant for point forecasts and for density forecasts) show evidence of substantial gains from considering a large menu of competing specifications at the disaggregate level. However, as the total number of alternative restricted specifications is enormous, a stochastic search algorithm was used to explore the model space.

The results suggest that the problem of aggregate vs. disaggregate modeling for the sake of aggregate forecasting is not trivial and illustrate a potential for gains in predictive performance. Realizing these gains requires some form of promoting parametric parsimony, as attributing non-zero probability to models with zero restrictions might be interpreted as introducing some form of shrinkage within the overall (unrestricted) model.

The rest of the paper is organized as follows. Firstly, the general problem of forecasting the aggregate series is presented in more detail. Secondly, the formal model framework used here is described, together with the relevant forecasting methodology. Thirdly, methods for evaluation of density forecasts are outlined. Fourthly, the empirical illustration is provided. A summary and some remarks about possible directions for further developments conclude.

1. Aggregate vs. Disaggregate Approach to Forecasting

A natural approach to forecasting is to use a model formulated directly in terms of the quantity of interest itself. However, if one considers the context of e.g.
inflation forecasting, an approach that is often taken by practitioners is to obtain the aggregate forecast using individual disaggregate inflation forecasts for a number of sub-categories.

The aggregate approach seems to be conceptually simpler and more elegant. Moreover, because there is no dimensionality problem at the aggregate level, it is possible to make use of more sophisticated models. Aggregation could have a regularizing impact as well. For example, one could expect that in certain cases the assumption of, say, homoscedastic Gaussian errors is more likely to hold at the aggregate level.

By contrast, at the disaggregate level one has to deal with increased heterogeneity. Another issue is that of modeling dependencies across the variables in a multivariate process, which might be challenging. However, more information is available, and sometimes expert knowledge can be utilized at the disaggregate level only.

In the specific context considered here, the only objective of the analysis is the aggregate forecast. It is not obvious that efforts to deal with heterogeneity and dependence modeling that improve goodness of fit at the disaggregate level would lead to improved forecasting performance for the aggregate series. This is even more true in the density forecasting context. In general it is possible to obtain a point forecast of the aggregate series using disaggregate point forecasts obtained from individual, unrelated univariate models for the sub-aggregates. However, if one attempts to generate a density forecast in the same way, the omitted dependence between sub-aggregates might misspecify the uncertainty of the aggregate forecast, resulting in poor density predictive performance.
The tradeoffs between various aspects of heterogeneity and dependence modelling and possible gains in the aggregate density forecasting performance are among the points of interest in the paper. The essential empirical issue under consideration is whether there exist potential gains from working at the disaggregate level in terms of the aggregate density predictive performance. Moreover, one might be interested in verifying to what extent the aggregate predictive performance deteriorates as one neglects e.g. the stochastic dependence among the series in the model for sub-aggregates.

The problems of disaggregate vs. aggregate modeling and forecasting (also with a focus on inflation forecasting) have been considered by many authors. In particular, Hubrich (2005) considers a similar inflation forecasting problem, but does not deal with density forecasts. At a more general level, the question was addressed by Hendry and Hubrich (2011). Some theoretical developments are related to the work of Giacomini and Granger (2004); Lütkepohl (2009) considers aggregation within an interesting class of disaggregate DGPs. Contributions that also involve empirical applications, especially on inflation forecasting, include Aron and Muellbauer (2013), Castle and Hendry (2010) and Faust and Wright (2013), among others. The distinctive features of the approach taken here include the focus on density forecasting and the role of exclusion restrictions at the disaggregate level.

2. Model Framework

In order to establish the notation used below, some fairly standard results concerning estimation of multivariate linear models are recalled.
Consider a Gaussian VAR(p) model for m variables:

\[ y_t = \alpha^{s} + y_{t-1} A_1 + \dots + y_{t-p} A_p + \varepsilon_t, \qquad t = 1, \dots, T, \tag{1} \]

where:
$y_t$ is an m-dimensional row vector (corresponding to period t),
$\alpha^{s}$ is an m-vector of seasonal intercept terms (consisting of mS unknown parameters in total, where S is the number of different seasons),
$A_1, \dots, A_p$ are $m \times m$ matrices of parameters, with $A_p$ including at least one non-zero element,
$\{\varepsilon_t\}$ is a Gaussian vector white noise process with covariance matrix $\Sigma$.

Assuming m = 1 leads to a Gaussian autoregressive process AR(p). If, in turn, one assumes that $A_1, \dots, A_p$ and $\Sigma$ are diagonal, (1) describes a set of m unrelated Gaussian AR(p)-type processes.

Assuming $\Sigma$ to be positive definite and otherwise unrestricted and $A_1, \dots, A_p$ to be unrestricted (i.e. containing no zero restrictions) results in a standard VAR formulation, where OLS estimates of the parameters are also MLEs and thus are asymptotically efficient. However, for unrestricted $\Sigma$, if $A_1, \dots, A_p$ include zero restrictions (i.e. the set of explanatory variables is not the same in all the equations), OLS estimates are no longer equivalent to ML estimates (and are not asymptotically efficient anymore). Issues of non-stationarity (existence of unit or explosive roots in the characteristic polynomial of (1)) are not considered here; the autoregressive parameters are assumed to be unrestricted for the sake of simplicity. The initial conditions $Y_0$ are assumed to be fixed and the values are taken from the data, as the further analysis is conditional on $Y_0$.

In order to consider asymptotically efficient estimation of the model parameters in special cases (with exclusion restrictions), the following SUR-type form of (1) is considered:

\[ \tilde{y} = \tilde{X} \beta + \tilde{\varepsilon}, \tag{2} \]

where:
$\tilde{y} = (y_1 \; y_2 \; \dots \; y_T)'$ is a Tm-dimensional column vector,
$\beta$ is a k-dimensional column vector consisting of the unrestricted elements of $A_1, \dots, A_p$ and $\alpha^{1}, \dots, \alpha^{S}$,
$\tilde{X}$ is a $Tm \times k$ matrix with rows and columns arranged in a way that matches the convention assumed for $\tilde{y}$,
$\tilde{\varepsilon} = (\varepsilon_1 \; \varepsilon_2 \; \dots \; \varepsilon_T)'$ is a Tm-dimensional column vector of Gaussian error terms with zero mean and covariance matrix of the form $I \otimes \Sigma$.

Asymptotically efficient estimation of $\beta$ can be achieved by means of a Zellner-type estimator of the form:

\[ \hat{\beta}_{\Sigma} = \left[ \tilde{X}' \left( I \otimes \Sigma^{-1} \right) \tilde{X} \right]^{-1} \tilde{X}' \left( I \otimes \Sigma^{-1} \right) \tilde{y}, \tag{3} \]

with $\Sigma$ replaced by its consistent estimate, $S_{\beta}$, based on residuals (e.g. obtained in the previous iteration of the procedure, as the procedure could be iterated):

\[ S_{\beta} = T^{-1} E'E, \tag{4} \]

where E is a $T \times m$ matrix of residuals. In order to initialize the procedure, the first estimate of $\beta$ can be obtained as

\[ \hat{\beta}_{OLS} = \left( \tilde{X}' \tilde{X} \right)^{-1} \tilde{X}' \tilde{y}. \]

The procedure is assumed to stop once some estimate of $\Sigma$ is considered the final one, here denoted by $\hat{\Sigma}$, with $\hat{\beta}_{\hat{\Sigma}}$ being the final estimate of $\beta$ (being also an approximation of the MLE of $\beta$).

3. Forecasting Methodology

For the purpose of one-period-ahead prediction, it is assumed that the point forecast of $y'_{T+1}$ (its estimated conditional expectation) is given by:

\[ \hat{y}'_{T+1} = \tilde{x}_f \hat{\beta}_{\hat{\Sigma}}, \tag{5} \]

where $\tilde{x}_f$ is an $m \times k$ matrix consisting of known constants and lagged values of the dependent variables (all of which are readily available for a one-step-ahead forecast), preserving the structure of $\tilde{X}$.

For the sake of density forecasting, the predictive distribution of $y'_{T+1}$ considered here is m-dimensional Gaussian, with mean given by (5) and variance-covariance matrix given as:

\[ \hat{V}(y_{T+1}) = \hat{\Sigma} + \tilde{x}_f \left[ \tilde{X}' \left( I \otimes \hat{\Sigma}^{-1} \right) \tilde{X} \right]^{-1} \tilde{x}_f'. \tag{6} \]

Consequently, a Gaussian distribution with mean and covariance matrix given by (5) and (6) respectively is perceived as an approximation to the Bayesian predictive distribution obtained with diffuse priors. The approximation can be quite satisfactory, especially if the number of observations is not very small and the prior information on parameters is not very strong.

The motivation for using the approximation instead of the exact Bayesian results is twofold. Firstly, the recursive out-of-sample prediction experiment with extensive specification search imposes a considerable numerical burden, which requires some simplifications for feasibility reasons. Secondly, one of the reasons often stated in favor of the Bayesian approach in VAR forecasting (e.g. with continuous Minnesota-type priors) is that the approach introduces some form of shrinkage, which is beneficial for the forecasting performance. In the paper the counterpart of shrinkage is the specification search described below. It could be perceived as analogous to the so-called Stochastic Search Variable Selection approach, which in turn imposes very strong (“hard”) shrinkage by means of discrete-continuous prior distributions. The paper focuses on the results obtained using the exclusion restrictions (similar to “hard” shrinkage) without imposing any “soft” restrictions (which usually imply a reduction of the prior variance for certain parameters). The question whether imposing additional “soft” shrinkage could improve forecasting performance even more is left for further research.

It is assumed here that the variable of interest (labeled $z_t$) is a linear combination of the variables in $y_t$ with known, fixed weights $c_t$:

\[ z_t = c_t y_t', \tag{7} \]

where $c_t$ is an m-dimensional row vector of weights.
The predictive distribution of $z_{T+1}$ is therefore univariate Gaussian with mean given by:

\[ \hat{z}_{T+1} = c_{T+1} \tilde{x}_f \hat{\beta}_{\hat{\Sigma}}, \tag{8} \]

and variance of the form:

\[ \widehat{\mathrm{var}}(z_{T+1}) = c_{T+1} \hat{\Sigma} c_{T+1}' + c_{T+1} \tilde{x}_f \left[ \tilde{X}' \left( I \otimes \hat{\Sigma}^{-1} \right) \tilde{X} \right]^{-1} \tilde{x}_f' c_{T+1}'. \tag{9} \]

The above formulas simplify in certain cases, in particular when $\Sigma$ is assumed to be diagonal. The restriction leads to a considerable reduction of the numerical burden. Empirical verification of its predictive consequences is therefore of great practical importance.

The ex post (point) forecast error of $z_{T+1}$ is given by:

\[ z_{T+1} - \hat{z}_{T+1} = c_{T+1} y'_{T+1} - c_{T+1} \tilde{x}_f \hat{\beta}_{\hat{\Sigma}} = c_{T+1} \left( y'_{T+1} - \tilde{x}_f \hat{\beta}_{\hat{\Sigma}} \right), \tag{10} \]

so it is a linear combination of the forecast errors of $y_{T+1}$.

Minimization of the ex ante forecast error of $z_{T+1}$ does not necessarily require the forecast error of $y_{T+1}$ to be minimized (in the ‘mean squared’ sense, for instance). In particular, it is possible that the disaggregate forecast errors just cancel out in the aggregate. Moreover, within a given model class, the forecast error resulting from application of the model directly to $z_t$ could be higher than the one obtained for the implicit forecast based on the analogous model for the disaggregate data $y_t$.

As density forecasts are considered, it is of course crucial to provide an adequate description of uncertainty: formulas (6) and (9) take into account potential stochastic dependence between the model equations (represented by the estimated $\Sigma$) and correlations within the joint distribution of the estimator for the whole $\beta$ in all the model equations (or a joint posterior for all the structural parameters). Therefore in general no ‘limited information’ approximation is used, unless it follows directly from the model structure (e.g. as a special case in which a full information method is equivalent to a limited information one).
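The mapping from the multivariate predictive distribution to the aggregate one, as in (8) and (9), can be illustrated with a minimal Python sketch. It assumes that the predictive mean and covariance matrix of the disaggregate vector are already available (here they are simply hypothetical numbers, not results from the paper), and the function name `aggregate_predictive` is introduced only for this illustration. The example also shows that dropping the off-diagonal covariances leaves the aggregate mean unchanged but alters the aggregate variance.

```python
import numpy as np

def aggregate_predictive(c, mean_y, cov_y):
    """Mean and variance of z = c y' when y is Gaussian with the given moments."""
    c = np.asarray(c, dtype=float)
    mean_y = np.asarray(mean_y, dtype=float)
    cov_y = np.asarray(cov_y, dtype=float)
    mean_z = float(c @ mean_y)       # linear combination of the means
    var_z = float(c @ cov_y @ c)     # quadratic form c V c'
    return mean_z, var_z

# Hypothetical inputs for m = 3 sub-indices (illustrative values only).
c = np.array([0.5, 0.3, 0.2])
mean_y = np.array([0.2, -0.1, 0.4])
cov_full = np.array([[0.04, 0.01, 0.00],
                     [0.01, 0.09, -0.02],
                     [0.00, -0.02, 0.16]])
cov_diag = np.diag(np.diag(cov_full))  # same variances, dependence ignored

mean_full, var_full = aggregate_predictive(c, mean_y, cov_full)
mean_diag, var_diag = aggregate_predictive(c, mean_y, cov_diag)
print(mean_full, var_full, var_diag)
```

In this toy case the two variances differ only through the covariance terms, which is the channel through which neglected dependence between sub-aggregates would distort the aggregate density forecast.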
Another advantage of taking the general perspective of multivariate modeling is that it provides a framework in which the “naïve” disaggregate forecast strategy, based on separate sub-models, can be evaluated against more complicated alternatives.

The crucial issue is that of model choice (or comparison) at the disaggregate level. The approach used here is based on two basic premises. Firstly, the importance of the exclusion restrictions is emphasized. This amounts to recognizing the fact that applying some reasonable variable selection or model selection procedures can lead to huge gains in predictive performance compared to a basic, unrestricted Gaussian VAR model. The related idea of Stochastic Search Variable Selection (see e.g. Frühwirth-Schnatter and Wagner, 2010) can be interpreted in terms of approximating the results of a Bayesian inference pooling experiment, dealing with the problem of finding the relevant model-reducing restrictions. Introducing some form of parsimony is particularly important in the case of VAR models that are potentially heavily overparametrized.

Secondly, having in mind the model pooling or model comparison context, it seems reasonable to consider criteria that measure predictive performance with respect to $z_t$. Consequently, though individual models are estimated on the disaggregate data, meaning that the estimation procedure itself aims at optimizing goodness of fit at the disaggregate level, the model comparison (or model selection) is undertaken based on predictive criteria at the aggregate level. Issues similar to the ones presented above are formally considered by Lütkepohl (2009) within a broader model class (though Lütkepohl does not attempt to introduce parsimony and does not consider density forecasts).

In order to describe the above procedure in detail, an overview of some criteria for density forecast evaluation is provided below.
4. Density Forecast Evaluation

Awareness of the need for a more careful investigation of forecast uncertainty has become widespread in the econometric literature (see e.g. Clark and Ravazzolo, 2015). The discussion on ex ante predictability of the recent economic crisis has also contributed to the recognition of the importance of the probabilistic approach to forecasting. Tasks such as evaluating the probability of extreme events are no longer considered non-standard. This strand of research is nevertheless deeply rooted in the history of statistical inference, and some of the ideas can be traced back to classical texts. For the purpose of the paper, only a brief review of selected techniques is presented.

Standard ways of reporting ex post forecasting performance for point forecasts include the mean forecast error, the mean absolute forecast error and the root mean squared forecast error (RMSFE). The concept of absolute forecast error can be generalized to the density context, leading to the so-called continuous ranked probability score (CRPS). The generalization is based on the fact that the CRPS formula reduces to the absolute error if the predictive distribution is a point mass. The definition of CRPS, together with a closed-form analytical formula for the case where the predictive distribution is Gaussian, is given by Gneiting and Raftery (2007, p. 367). Here the negatively-oriented version of CRPS is used (labeled CRPS* by the authors), which takes positive values and directly generalizes the absolute error. For two predictive distributions with location parameter corresponding exactly to the actual outturn, CRPS (intuitively, ‘a density forecast error’) will be higher for the density that is more dispersed. For the sake of ex post predictive performance analysis, the averaged CRPS is reported.
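As an illustration of the properties just described, the negatively-oriented CRPS of a Gaussian predictive distribution can be sketched in Python using the standard closed form, σ[z(2Φ(z) − 1) + 2φ(z) − 1/√π] with z = (y − μ)/σ. The function name `crps_gaussian` and the numerical inputs are assumptions made for this example and are not taken from the paper.

```python
import math

def crps_gaussian(y, mu, sigma):
    """Negatively-oriented CRPS of a N(mu, sigma^2) forecast at outturn y."""
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)   # standard normal pdf
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))          # standard normal cdf
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))

def average_crps(outturns, means, sigmas):
    """Average CRPS over a sequence of one-step-ahead Gaussian forecasts."""
    scores = [crps_gaussian(y, m, s) for y, m, s in zip(outturns, means, sigmas)]
    return sum(scores) / len(scores)

# Outturn exactly at the predictive mean: the more dispersed density scores worse.
tight = crps_gaussian(0.0, 0.0, 1.0)
wide = crps_gaussian(0.0, 0.0, 2.0)

# Nearly degenerate forecast: CRPS approaches the absolute error |y - mu|.
near_point = crps_gaussian(1.0, 0.0, 1e-6)
print(tight, wide, near_point)
```

The last line of the example mirrors the point-mass argument in the text: as the predictive standard deviation shrinks, the score tends to the absolute forecast error.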
Another popular measure is the so-called log score, which is simply the logarithm of the predictive probability density function evaluated at the actual outturn. The log-score has a Bayesian interpretation: the sum of one-period-ahead log-scores in a recursive predictive experiment on an expanding sample is related to the so-called Predictive Bayes Factor, a modification of the Bayes Factor, which in turn is a basic tool of formal Bayesian model comparison. Relative to the CRPS, the log-score (and the Bayes Factor) is considered to be sensitive to tail outcomes. The log of a Gaussian pdf falls quadratically, not proportionally, with the distance from the mean (compare outturns two, three or four standard deviations away), so for a badly mis-predicted outturn the log-score discriminates between models more strongly than CRPS. Details and more elaborate theoretical justifications of the above measures are summarized by Gneiting and Raftery (2007).

As the analysis conducted here is intended to provide an approximation to the results of Bayesian inference, the log-score is used as the main criterion for model choice.

5. Model Comparison Framework

The model framework used here has the advantage of nesting a simple “practitioner’s approach” of the following form. Consider the problem of inflation forecasting. A simple practitioner’s solution would be to use disaggregate data on sub-indices and apply, say, AR(p) forecasting models to each of the individual series. The total CPI forecast would then be obtained making use of the fact that the weights of the sub-aggregates in the total index are usually known. Such an approach accounts for heterogeneity to some extent, but neglects possible dependence. Individual predictive models are chosen based on, say, goodness of fit for the individual series.
However, an advantage of the framework provided here is that it highlights the fact that the collection of such individual processes can be perceived as a single model for total CPI forecasting. From such a perspective the models for sub-aggregates should be evaluated jointly rather than separately. Consequently, none of the sub-models should be evaluated without taking the actual combination of the remaining ones into account. This of course results in a big increase in the number of specifications being considered.

The approach makes some generalizations of the “simple practitioner’s approach” readily available. For instance, it is easy to impose stochastic linkages between equations by allowing the contemporaneous variance-covariance matrix to be non-diagonal. Another option would be to include lags of the other sub-aggregates in the individual equations. The framework allows for consideration of a broad menu of models that differ substantially. One end of the model spectrum would be the simplistic approach using individual processes (which is likely to be too restrictive), whereas the other end would correspond to an unrestricted VAR model (which is very far from being parsimonious). The solution proposed here is to explore options that are somewhere in between, meaning that these are less restrictive than the simple approach, but introduce “hard” shrinkage (parsimony by exclusion restrictions) compared to full, unrestricted VARs. This would correspond to including only some lags of some variables in the equations at the disaggregate level.

Alternative models (or sets of restrictions) should be compared by the predictive accuracy of the aggregate, here measured with the log-score. The number of possible combinations of the restrictions is far too large to allow for any systematic specification search.
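The aggregate log-score criterion used for this comparison can be sketched as follows for Gaussian predictive distributions. This is a minimal illustration and not the author's routine: the function names are hypothetical, and base 10 is used as the default only because the paper reports sums of decimal logs in its results table.

```python
import math

def gaussian_log_score(y, mu, sigma, base=10.0):
    """Log of the N(mu, sigma^2) predictive density at outturn y, in the given base."""
    log_pdf = -0.5 * math.log(2.0 * math.pi * sigma ** 2) - 0.5 * ((y - mu) / sigma) ** 2
    return log_pdf / math.log(base)

def summed_log_score(outturns, means, sigmas, base=10.0):
    """Sum of one-step-ahead log-scores over a verification window (higher is better)."""
    return sum(gaussian_log_score(y, m, s, base)
               for y, m, s in zip(outturns, means, sigmas))

# Score at the mean of a standard normal forecast, decimal logs.
at_mean = gaussian_log_score(0.0, 0.0, 1.0)

# Tail sensitivity in natural logs: moving the outturn from 2 to 4 standard
# deviations lowers the score by 0.5 * (16 - 4) = 6.
tail_gap = (gaussian_log_score(4.0, 0.0, 1.0, base=math.e)
            - gaussian_log_score(2.0, 0.0, 1.0, base=math.e))
print(at_mean, tail_gap)
```

A specification search would call something like `summed_log_score` once per candidate model, with `means` and `sigmas` taken from the aggregate predictive distributions in (8) and (9).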
Instead, a random-search algorithm is applied. Even if it is not capable of achieving the global optimum, it is interesting to check whether it can find a combination of restrictions that results in a model with predictive performance clearly dominating the simple special cases already mentioned. If such a specification can be found within reasonable computational time, the strategy would be empirically attractive and the issue should be investigated in more detail.

The objective of the paper is to present a strategy that is feasible for more than ten disaggregate variables. For each model (a set of exclusion restrictions that corresponds to some particular form of $\tilde{X}$) a predictive experiment is conducted, with recursive model reestimation and one-step-ahead prediction (with the number of repetitions denoted by N, being also the number of one-step-ahead forecasts obtained from one specification). Results of the repeated out-of-sample predictive exercise are summarized (here by a sum of log-score values) and the best specification is retained. A modification of the algorithm that aims at inference pooling instead of model selection could also be considered, though the possibility is not explored here.

As the stochastic specification search requires thousands of model specifications to be checked, the computational burden is substantial. The computational limitations are the reason for which the attention here is restricted to simple Gaussian VAR models only (instead of, say, VARMA class members); for analogous applications see George, Sun and Ni (2008).

One more thing should be kept in mind: if N is not large, the specification search might result in overfitting. This might show spurious predictive gains that are not necessarily observed out-of-sample, resulting from using the statistical noise to obtain perfect prediction in the verification period.
In order to consider possible empirical consequences in depth, it would be necessary to add another data window over which performance is not maximized but just analyzed. However, for Polish macroeconomic time series the number of available observations is not large enough to do so. In order to avoid overfitting problems one might want to promote specification parsimony (imposing e.g. a limit on the number of non-zero autoregressive parameters, or using some penalty function).

6. Empirical Analysis: Nowcasting Polish CPI Inflation Rate

In order to illustrate the practical applicability of the approach and possible gains in terms of forecast performance, an empirical analysis of Polish inflation is provided¹.

The analysis makes use of the month-over-month CPI inflation rate in the Polish economy for the period 2002M01–2015M08, with T = 164 (see Figure 1). The last N = 44 observations are treated as a verification window². It is important to notice that the verification period is quite challenging, as it includes a period of deflation unprecedented in the Polish economy.

Figure 1. CPI m-o-m inflation rate [%] in Poland, 2000–2015

As for the data at the disaggregate level, the Polish Statistical Office (GUS) reports disaggregation into 10 main price groups. However, as two of the components include both energy and non-energy prices, an effort has been made to separate the price indices. Moreover, housing expenditures and alcoholic beverages with tobacco are also separated from food, resulting in a total of 14 categories. The unrestricted model would therefore correspond to a VAR for 14 variables, having an extra 196 parameters for each additional lag in the unrestricted version.

¹ Other studies examining similar problems include Clark (2006), Dees and Güntner (2014), Huwiler and Kaufmann (2013) and Ibarra (2012); see also Stock and Watson (2015).
² All the calculations were conducted using own routines written in Ox (see Doornik and Ooms, 2003); details of the specification search procedure are available from the author upon request.

The predictive exercise is pseudo-real time, which means that no revisions are taken into account (though for inflation the revisions are rather minor, occurring once a year only). It is assumed that there are no aggregation errors (that is, that the percentage m-o-m change of the total CPI is equal to the weighted sum of changes of the sub-aggregate categories), which is in accordance with (7). Moreover, the weights are assumed to be known, which is not always the case in real time, as the weights for a calendar year are published in March. The disaggregate categories are listed in Table 1, together with average weights in the sample period.

Table 1. Disaggregate CPI sub-categories used in the empirical analysis

| No. | Code | Category label | Average weight |
|---|---|---|---|
| 1 | 1a | Food and non-alcoholic beverages | 0.258 |
| 2 | 1b | Alcoholic beverages, tobacco | 0.060 |
| 3 | 2 | Clothing and footwear | 0.052 |
| 4 | 3a | Dwelling: housing, water (excluding energy) | 0.116 |
| 5 | 3b | Dwelling: electricity, gas and other fuels | 0.089 |
| 6 | 3c | Dwelling: furnishings, household equipment and routine maintenance of the house | 0.049 |
| 7 | 4 | Health | 0.050 |
| 8 | 5a | Transport: fuels for personal transport equipment | 0.045 |
| 9 | 5b | Transport: excluding fuels | 0.044 |
| 10 | 6 | Communication | 0.049 |
| 11 | 7 | Recreation and culture | 0.070 |
| 12 | 8 | Education | 0.013 |
| 13 | 9 | Restaurants and hotels | 0.052 |
| 14 | 10 | Miscellaneous goods and services | 0.053 |

Note: The average weight is computed as the average over the years included in the sample (as the sample end does not correspond to December).
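The no-aggregation-error assumption in (7) can be illustrated with the average weights from Table 1. The sketch below is only indicative: the paper works with period-specific weights $c_t$, whereas the sample averages are used here for simplicity, and the sub-index changes are hypothetical values chosen for the example.

```python
import numpy as np

# Average weights from Table 1, in row order 1 to 14.
weights = np.array([0.258, 0.060, 0.052, 0.116, 0.089, 0.049, 0.050,
                    0.045, 0.044, 0.049, 0.070, 0.013, 0.052, 0.053])

def aggregate_inflation(weights, sub_rates):
    """Total m-o-m change as the weighted sum of sub-index changes, as in (7)."""
    weights = np.asarray(weights, dtype=float)
    sub_rates = np.asarray(sub_rates, dtype=float)
    if weights.shape != sub_rates.shape:
        raise ValueError("weights and sub_rates must have the same length")
    return float(weights @ sub_rates)

# Hypothetical month: food prices up 1%, transport fuels down 2%, others flat.
sub_rates = np.zeros(14)
sub_rates[0] = 1.0    # row 1: food and non-alcoholic beverages
sub_rates[7] = -2.0   # row 8: fuels for personal transport equipment

weight_sum = float(weights.sum())
total = aggregate_inflation(weights, sub_rates)
print(weight_sum, total)
```

The rounded average weights in Table 1 add up to one, so the weighted sum can be read directly as the implied change in the total index.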
The benchmark models considered in the comparison are stationary and include unrestricted VAR models with one and two lags, as well as an AR(1) process applied at the aggregate level and the result of a specification search applied to an AR(p) process for the aggregate data (resulting in the omission of certain own lags of inflation). For the stochastic specification search algorithm, the maximum number of autoregressive parameters is limited to 42 (which reflects the idea that on average there are 3 parameters per equation, though these need not be uniformly allocated) and the maximum lag order is set to 13. Additional parameters (not included in the parameter count mentioned here) are the seasonal dummy variables, though the stochastic algorithm also switches between inclusion and exclusion of the seasonal dummies in each equation. Moreover, two options are considered separately for the contemporaneous covariance matrix. In the first version, the Σ matrix is assumed to be unrestricted (while of course being positive definite); in the other version it is assumed to be diagonal.

The results are reported in Table 2, including the number of parameters (with a sub-total for intercept/seasonal terms). The criteria of forecasting performance include RMSFE and average CRPS (the lower the better) and the summed (decimal) log-score (the higher the better). All the criteria give consistent results (none contradicts the others).

Table 2. Comparison of the results from alternative specifications: predictive performance for aggregate inflation based on aggregate/disaggregate data

| Model | Σ | st. dev. | RMSFE | CRPS | Log-score | No. of parameters: total | No. of parameters: intercepts |
|---|---|---|---|---|---|---|---|
| AR(1) for total CPI | – | 0.248 | 0.250 | 0.142 | -0.726 | 14 | 12 |
| AR(2) for total CPI | – | 0.249 | 0.249 | 0.142 | -0.657 | 15 | 12 |
| AR with variable selection (total CPI) | – | 0.294 | 0.224 | 0.129 | 0.306 | 7 | 1 |
| VAR(1) | nd | 0.268 | 0.280 | 0.156 | -2.283 | 455 | 168 |
| VAR(2) | nd | 0.280 | 0.348 | 0.180 | -4.949 | 651 | 168 |
| VAR with variable selection | d | 0.228 | 0.174 | 0.100 | 5.058 | 147 | 91 |
| VAR with variable selection | nd | 0.247 | 0.187 | 0.107 | 3.668 | 224 | 102 |

Note: Full estimation results are available from the author upon request. Verification period: 2012M01–2015M08; the forecasts are one-period-ahead (which amounts to nowcasting due to the publication lag). CRPS (Continuous Ranked Probability Score) is averaged over the realized forecasts; for the log-score the sum of decimal logs is taken. For AR(1), AR(2), VAR(1) and VAR(2) seasonal dummies are included. The second column indicates the assumption regarding the contemporaneous variance-covariance matrix (relevant for disaggregate models only), with 'd' denoting a diagonal matrix and 'nd' a non-diagonal one; 'st. dev.' denotes the ex ante forecast error (standard deviation of the predictive distribution) averaged over all forecasts. The column labeled 'intercepts' also takes into account parameters corresponding to seasonality modelling.

Unrestricted VARs for disaggregate data show relatively poor performance, deteriorating with the number of lags. The AR(1) and AR(2) models perform similarly, though the AR model with variable selection is clearly better. The poor performance of the AR(1), AR(2), VAR(1) and VAR(2) models results from restrictive assumptions about the highest lag order allowed, though VAR(2) is much worse than AR(2), so other aspects seem to matter as well.

VARs with variable selection are clearly the best models, strongly dominating the other specifications by all the criteria. Moreover, the model assuming a diagonal contemporaneous covariance matrix yields slightly better results.
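The three evaluation criteria used in Table 2 can be sketched in code. The following Python fragment is a minimal illustration, assuming that each one-step-ahead predictive distribution is Gaussian with a given mean and standard deviation; it computes the RMSFE, the average CRPS (using the closed-form expression for the normal distribution given in Gneiting and Raftery, 2007) and the summed decimal log-score. The function names and the sample numbers are hypothetical, and the author's Ox routines may differ in detail.

```python
import math


def gaussian_crps(y, mu, sigma):
    """CRPS of a N(mu, sigma^2) forecast at outcome y (lower is better)."""
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))


def gaussian_log10_score(y, mu, sigma):
    """Decimal log of the N(mu, sigma^2) density at y (higher is better)."""
    z = (y - mu) / sigma
    log_density = -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)
    return log_density / math.log(10.0)


def evaluate_forecasts(outcomes, means, sds):
    """Return RMSFE, average CRPS and summed decimal log-score."""
    n = len(outcomes)
    triples = list(zip(outcomes, means, sds))
    rmsfe = math.sqrt(sum((y - m) ** 2 for y, m, _ in triples) / n)
    avg_crps = sum(gaussian_crps(y, m, s) for y, m, s in triples) / n
    log_score = sum(gaussian_log10_score(y, m, s) for y, m, s in triples)
    return {"rmsfe": rmsfe, "avg_crps": avg_crps, "log_score": log_score}


# Hypothetical realized m-o-m inflation rates and Gaussian forecasts.
outcomes = [0.1, -0.2, 0.3]
means = [0.0, 0.0, 0.1]
sds = [0.2, 0.2, 0.2]

scores = evaluate_forecasts(outcomes, means, sds)
print(scores)
```

As in Table 2, a lower RMSFE and average CRPS and a higher summed log-score indicate better predictive performance.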
For both VAR specifications with variable selection the ex ante errors are large compared to the RMSFE; consequently, the forecast densities seem to be overdispersed.

One caveat should be mentioned here: as the models with a non-diagonal covariance matrix of shocks and exclusion restrictions are computationally more demanding, the specification search algorithm might be less efficient there. It is therefore possible that the algorithm used here gets trapped in a local optimum that is not the global one, and that the result could be somewhat improved by a more intensive specification search. However, the practical conclusion would be that it is easier to find a useful model assuming uncorrelated shocks. Such a simplification is quite important from a computational point of view as well.

The predictive gains obtained here are non-negligible, though attention is restricted to one-period-ahead forecasts. However, the disaggregate information is likely to be useful at rather short horizons only. Moreover, it seems that the dependence between variables induced by incorporating lags of other variables into the equations is more important than the dependence represented by contemporaneous correlations of errors (at least from the viewpoint of the quality of the forecasts of the aggregate series).

Conclusions

The paper aims at demonstrating potential gains in terms of density forecasting that can be obtained by allowing for the use of disaggregate data. The analysis of density forecasts reflects the importance of uncertainty quantification in the current econometric literature.

The explicit objective of forecasting the aggregate series only is introduced here. Consequently, from a theoretical point of view it might be plausible to consider all the models for sub-aggregates jointly, as one model (even if the individual models are fully stochastically independent).
The comparison of such models (or the corresponding sets of exclusion restrictions) is based on the predictive performance for the aggregate series.

The results obtained for the Polish CPI inflation series confirm that considering multivariate models for the disaggregate data might result in substantial improvements in the prediction of aggregate inflation. The conclusion holds for various criteria of both point and density ex post forecast evaluation. However, the result stems from introducing parsimony-oriented exclusion restrictions. Some cross-variable dependence matters, as the best performing models include lags of other variables as well. On the other hand, there is no evidence that allowing for contemporaneous correlations of the shocks in the disaggregate series is really important for the aggregate predictive performance in the case under consideration.

There are many possibilities to extend the scope of the analysis presented here, in particular by considering more general model classes. Besides that, two aspects are worth pointing out.

Firstly, a fully consistent theoretical framework for statistical inference in such situations should be developed. It was pointed out to the author that such a development could be based on a reparametrization of the observation space (at the disaggregate level) in such a way that the aggregate variable is explicitly considered (see footnote 3). This opens the possibility of considering specifications that take advantage of concepts like the Bayesian Cut.

Secondly, the stochastic search algorithm used here to explore the model space (or, equivalently, the space containing various sets of exclusion restrictions) was rather simple and heuristic, with no theoretical premises for its efficiency. Perhaps some solutions used in the Stochastic Search Variable Selection setup could be adapted to match the specification search problem considered here.
The most important conclusion from the empirical analysis is that even within quite a simple class of multivariate models for disaggregate data a substantial improvement in predictive performance (over the standard unrestricted specifications) is possible. However, it requires a thorough specification search in terms of variable selection, which might be non-trivial when the dimension is large, that is, when the disaggregation is detailed.

3 The author would like to thank Jacek Osiewalski for the valuable suggestion.

References

Aron, J., Muellbauer, J. (2013), New Methods for Forecasting Inflation, Applied to the US, Oxford Bulletin of Economics and Statistics, 75(5), 637–661, DOI: http://dx.doi.org/10.1111/j.1468-0084.2012.00728.x.

Castle, J. L., Hendry, D. F. (2010), Nowcasting From Disaggregates in the Face of Location Shifts, Journal of Forecasting, 29(1–2), 200–214, DOI: http://dx.doi.org/10.1002/for.1140.

Clark, T. (2006), Disaggregate Evidence on the Persistence of Consumer Price Inflation, Journal of Applied Econometrics, 21(5), 563–587, DOI: http://dx.doi.org/10.1002/jae.859.

Clark, T., Ravazzolo, F. (2015), Macroeconomic Forecasting Performance under Alternative Specifications of Time-Varying Volatility, Journal of Applied Econometrics, 30(4), 551–575, DOI: http://dx.doi.org/10.1002/jae.2379.

Dees, S., Guntner, J. (2014), Analysing and Forecasting Price Dynamics Across Euro Area Countries and Sectors: A Panel VAR Approach, Economics Working Papers 2014-10, Department of Economics, Johannes Kepler University Linz, Austria.

Doornik, J. A., Ooms, M. (2003), Computational Aspects of Maximum Likelihood Estimation of Autoregressive Fractionally Integrated Moving Average Models, Computational Statistics & Data Analysis, 42(3), 333–348, DOI: http://dx.doi.org/10.1016/S0167-9473(02)00212-8.

Faust, J., Wright, J. H. (2013), Forecasting Inflation, in Elliott, G., Timmermann, A. (eds.), Handbook of Economic Forecasting, vol. 2A, Amsterdam, North Holland, DOI: http://dx.doi.org/10.1016/B978-0-444-53683-9.00001-3.

Frühwirth-Schnatter, S., Wagner, H. (2010), Stochastic Model Specification Search for Gaussian and Partial Non-Gaussian State Space Models, Journal of Econometrics, 154, 85–100, DOI: http://dx.doi.org/10.1016/j.jeconom.2009.07.003.

George, E. I., Sun, D., Ni, S. (2008), Bayesian Stochastic Search for VAR Model Restrictions, Journal of Econometrics, 142(1), 553–580, DOI: http://dx.doi.org/10.1016/j.jeconom.2007.08.017.

Giacomini, R., Granger, C. (2004), Aggregation of Space-Time Processes, Journal of Econometrics, 118(1–2), 7–26, DOI: http://dx.doi.org/10.1016/S0304-4076(03)00132-5.

Gneiting, T., Raftery, A. (2007), Strictly Proper Scoring Rules, Prediction, and Estimation, Journal of the American Statistical Association, 102(477), 359–378, DOI: http://dx.doi.org/10.1198/016214506000001437.

Hendry, D. F., Hubrich, K. (2011), Combining Disaggregate Forecasts or Combining Disaggregate Information to Forecast an Aggregate, Journal of Business & Economic Statistics, 29(2), 216–227, DOI: http://dx.doi.org/10.1198/jbes.2009.07112.

Hubrich, K. (2005), Forecasting Euro Area Inflation: Does Aggregating Forecasts by HICP Component Improve Forecast Accuracy?, International Journal of Forecasting, 21(1), 119–136, DOI: http://dx.doi.org/10.1016/j.ijforecast.2004.04.005.

Huwiler, M., Kaufmann, D. (2013), Combining Disaggregate Forecasts for Inflation: The SNB's ARIMA model, Economic Studies 2013-07, Swiss National Bank.

Ibarra, R. (2012), Do Disaggregated CPI Data Improve the Accuracy of Inflation Forecasts?, Economic Modelling, 29(4), 1305–1313, DOI: http://dx.doi.org/10.1016/j.econmod.2012.04.017.

Lütkepohl, H. (2009), Forecasting Aggregated Time Series Variables: A Survey, Economics Working Papers ECO2009/17, European University Institute.

Stock, J. H., Watson, M. (2015), Core Inflation and Trend Inflation, NBER Working Paper No. 21282.

Properties of Predictive Distributions from Models for Disaggregate Data: Forecasting Inflation in Poland

Summary. The paper addresses the question of whether gains in forecasting quality arise (also assessing the ex post accuracy of forecast distributions) when an aggregate is forecast on the basis of models for disaggregate data. The problem is considered within a vector autoregression model with restrictions, with alternative specifications corresponding to different sets of zero restrictions. An empirical analysis of the m-o-m CPI inflation rate in Poland is presented (considering 14 sub-categories for the disaggregate data). Restricted models for the disaggregate data led to better forecasts of the aggregate than models for aggregate data and unrestricted models for disaggregate data (criteria for forecast distributions such as CRPS and the log predictive density were used to compare the forecasts).

Keywords: prediction, model comparison, predictive distribution, inflation, VAR models