Layout 6 Adaptively smoothed seismicity earthquake forecasts for Italy Maximilian J. Werner1,*, Agnès Helmstetter2, David D. Jackson3, Yan Y. Kagan3, Stefan Wiemer1 1 ETH Zurich, Swiss Seismological Service, Zurich, Switzerland 2 Université Joseph Fourier and CNRS, Laboratoire de Géophysique Interne et Tectonophysique, Grenoble, France 3 University of California, Department of Earth and Space Sciences, Los Angeles, USA ANNALS OF GEOPHYSICS, 53, 3, 2010; doi: 10.4401/ag-4839 ABSTRACT We present a model for estimation of the probabilities of future earthquakes of magnitudes m ≥ 4.95 in Italy. This model is a modified version of that proposed for California, USA, by Helmstetter et al. [2007] and Werner et al. [2010a], and it approximates seismicity using a spatially heterogeneous, temporally homogeneous Poisson point process. The temporal, spatial and magnitude dimensions are entirely decoupled. Magnitudes are independently and identically distributed according to a tapered Gutenberg- Richter magnitude distribution. We have estimated the spatial distribution of future seismicity by smoothing the locations of past earthquakes listed in two Italian catalogs: a short instrumental catalog, and a longer instrumental and historic catalog. The bandwidth of the adaptive spatial kernel is estimated by optimizing the predictive power of the kernel estimate of the spatial earthquake density in retrospective forecasts. When available and reliable, we used small earthquakes of m ≥ 2.95 to reveal active fault structures and 29 probable future epicenters. By calibrating the model with these two catalogs of different durations to create two forecasts, we intend to quantify the loss (or gain) of predictability incurred when only a short, but recent, data record is available. Both forecasts were scaled to five and ten years, and have been submitted to the Italian prospective forecasting experiment of the global Collaboratory for the Study of Earthquake Predictability (CSEP). An earlier forecast from the model was submitted by Helmstetter et al. [2007] to the Regional Earthquake Likelihood Model (RELM) experiment in California, and with more than half of the five-year experimental period over, the forecast has performed better than the others. 1. Introduction Here, we document the calibration of a previously published, time-independent model of earthquake occurrences in the region of Italy. We extracted probabilities of future m ≥ 4.95 shocks for five-year and ten-year periods in a format suitable for prospective testing within the Italian earthquake predictability experiment [Schorlemmer et al. 2010a]. Previously, Helmstetter et al. [2007] calculated a probabilistic earthquake forecast for m ≥ 4.95 for the region of California, USA, over a five-year duration. The forecast is currently being tested within the Regional Earthquake Likelihood Model (RELM) experiment [Field 2007]. After more than half of the five years over, the forecast has not been rejected following a suite of tests, and has performed better than other forecasts [Schorlemmer et al. 2010b]. Werner et al. [2010a] made some modifications to the model by Helmstetter et al. [2007] and re-calibrated it on up-dated data to generate a new earthquake forecast for California. This forecast is under test within the California branch of the global Collaboratory for the Study of Earthquake Predictability (CSEP) [Jordan 2006, Zechar et al. 2009]. To calculate future earthquake potential in Italy, we used the same model, with some minor modifications. One modification relates to the estimation of the completeness threshold, which was difficult to estimate on the small spatial scales here, which were possible with the high-quality dataset available in California [Werner et al. 2010a, Helmstetter et al. 2007]. Instead, we set a single magnitude threshold for the entire region. Smoothed seismicity models, such as the present one, usually do not incorporate geological or tectonic observations. Rather, the models are calibrated on the seismicity data available from earthquake catalogs. One can justifiably question the hypothesized validity that the short (decadal) periods covered by high-quality instrumental catalogs are sufficient to forecast locations of future large earthquakes that have very low occurrence probabilities. Even if the spatial distribution of seismicity is reasonably stable up to geological timescales, estimating this distribution from a short time window of observations is difficult. A partial solution to this problem is to estimate a predictive spatial distribution, rather than the observed spatial distribution. That is, rather than estimating the density of past earthquakes, the available data is divided into separate learning and target sets, to estimate a predictive density from the learning catalog that can be evaluated and optimized on the target earthquakes. This cross-validation method was used here, and it generated smoother forecasts than a simple Article history Received March 23, 2010; accepted July 1, 2010. Subject classification: Earthquake interactions and probability, Seismic risk, Statistical Analysis, Mathematical Geophysics, Inverse methods. 107 kernel-density estimation method because the locations of future – rather than past – earthquakes are predicted, and these locations can be in regions of little previous seismicity. Nonetheless, this is only a partial solution: Kagan and Jackson [1994] conjectured that the optimal predictive horizon of a forecast based on smoothed seismicity will scale proportionally with the learning catalog because of spatio- temporal clustering. Thus, the forecasts by [Werner et al. 2010a] and Helmstetter et al. [2007], which were calculated from about 25 years of high-quality Californian data, should perform well for moderate earthquakes over similar time scales, but the longer periods that are relevant for seismic hazard estimates and building codes might require longer input datasets. On the other hand, according to Kagan and Jackson [1994], it remains an untested hypothesis that seismicity estimates based on geological observations, i.e. the earthquake history of several thousands of years, can provide more predictive and relevant information for engineering design than high-quality, low-threshold instrumental catalogs. Their argument was based on two points: first, the quality of geological observations for seismic hazard is often low compared to the quality of modern earthquake catalogs; secondly, seismicity shows long-term spatio-temporal variations that necessitate an appropriate weighting of the information contained in observations of recent earthquakes and those that occurred in the distant past. To begin to quantify the impact of the duration of the learning catalog on the predictive power of smoothed seismicity forecasts, we calculated two forecasts of the smoothed seismicity model by calibrating the model on two catalogs of different durations. One forecast was based on the most recent 30 years of instrumental data, while the other one was calculated from 100 years of combined historic and instrumental data. We have submitted the forecasts to the European CSEP Testing Center at ETH Zurich, to be tested and compared with other forecasts within the framework of the Italian earthquake predictability experiment, CSEP-Italy [see Schorlemmer et al. 2010a]. Comparing the performances of these two forecasts might shed light on the impact of the length of the learning catalog. The present study is structured as follows. In Section 2, we describe the Italian earthquake catalogs from which we estimated the future earthquake potential in Italy. Section 3 describes the model and its calibration on the two datasets, while in Section 4 we present the earthquake forecasts; we discuss our conclusions in Section 5. 2. Data 2.1. The CSI 1.1: 1981-2002 For more details about the catalogs discussed here and below, see [Schorlemmer et al. 2010a] and references therein. As the primary data source for the forecast based on smoothed instrumental seismicity, we used the catalog of Italian seismicity (Catalogo della Sismicità Italiana, CSI 1.1) [Castello et al. 2007, Chiarabba et al. 2005], which is available from http://www.cseptesting.org/regions/italy. The CSI spans the time period from 1981 until 2002, and it reports the local magnitudes, in agreement with the magnitude scale that will be used during the prospective evaluation of these forecasts. Schorlemmer et al. [2010a] reported a clear change in the earthquake numbers per year in 1984 due to the numerous network changes in the early 1980s, and they recommended using the CSI data from mid-1984 onwards. We therefore selected the earthquakes listed in the CSI from July 1, 1984, to December 31, 2002, within the CSEP-Italy collection region defined by Schorlemmer et al. [2010a]. To avoid possible contamination from quarry blasts and volcanic microseismicity, we set a uniform completeness magnitude threshold of mt = 2.95 for the entire collection region, which was higher than the threshold of mt = 2.5 calculated by Schorlemmer et al. [2010a] for onshore seismicity. 2.2. The BSI catalog: 2003-2009 For prospective tests of the forecasts submitted [see Schorlemmer et al. 2010a], we used the Italian seismic bulletin (Bollettino Sismico Italiano, BSI) earthquake catalog that is recorded by the Istituto Nazionale di Geofisica e Vulcanologia (INGV) [BSI Working Group 2002, Amato et al. 2006]. The BSI is available at http://bollettinosismico.rm.ingv.it, and since July 2007 at http://ISIDe.rm.ingv.it/. We used earthquakes listed in the BSI from January 1, 2003, until June 25, 2009. As the quality of the data for small earthquakes in the BSI catalog between 2003 and 2005 was questionable, we applied a relatively high magnitude threshold of mt = 2.95. For the forecast based on recent instrumental observations, we merged the BSI and CSI catalogs (hereinafter referred to as the «merged instrumental catalog», MIC). 2.3. The CPTI08, 1901-2006 For the forecast based on instrumental and historic seismicity, we used the parametric catalog of Italian earthquakes (Catalogo Parametrico dei Terremoti Italiani, version CPTI08) [Rovida and the CPTI Working Group 2008], which was available from http://www.cseptesting. org/regions/italy. The CPTI08 is a preliminary revision of the 2004 CPTI04 [CPTI Working Group 2004]. It covered the period from 1901 to 2006, and it was based on both instrumental and historic observations. The CPTI08 lists the moment magnitudes that were either estimated from macroseismic data or calculated using linear («ad hoc» [Rovida and the CPTI Working Group 2008]) regression relationships between surface waves, body waves or local magnitudes [MPS Working Group 2004]. Castellaro et al. [2006] showed that standard linear regression can lead to biased and uncertain SMOOTHED SEISMICITY EARTHQUAKE FORECASTS 108 109 magnitude conversions, because magnitude observations and the associated errors can violate the simplifying assumptions of standard linear regression. To remedy this, Castellaro et al. [2006] advocated the use of general orthogonal regression. Therefore, the CPTI08 can be expected to include bias and large uncertainties (like most historic catalogs) that might affect the quality of the earthquake forecasts. In their global forecasts, Kagan and Jackson [2010a] therefore used more homogeneous catalogs: the global Centroid Moment Tensor catalog [Ekström et al. 2005] and the global Preliminary Determination of Epicenters catalog by the U.S. Geological Survey [U.S. Geological Survey 2001]. Nonetheless, the CPTI08 offers a much longer training catalog (100 years) than either of these two global catalogs (30 to 40 years), which allowed our investigation of the effects of the length of the training catalog on the forecasts generated. Therefore, we accepted these potential shortcomings of the CPTI08 for this study. As the prospective experiment used local magnitudes, we converted the moment magnitudes to local magnitudes using the same regression equation that was used to convert the original local magnitudes to moment magnitudes for the creation of the CPTI [MPS Working Group 2004, Schorlemmer et al. 2010a]: (1) As for the BSI catalog and the CSI, we only selected shocks within the collection region and with depths of less than 30 km. Some of the earthquakes were not assigned depths, mostly during the early part of the CPTI. We included these earthquakes as observations within the testing region because it was very unlikely that these were deeper than 30 km [see also Schorlemmer et al. 2010a]. We selected the earthquakes with moment magnitudes mW ≥ 4.75 [Schorlemmer et al. 2010a], which corresponded to local magnitudes of mL ≥ 4.45. 3. Model calibration This model has previously been documented by Helmstetter et al. [2007] and Werner et al. [2010a], and so we provide only a brief overview of it here. First, the earthquake catalogs were declustered to remove the strong influence of triggered sequences (Section 3.1); without declustering, there would be the need to use a more complicated, time-dependent model that used the Omori-Utsu law to remove the influence of aftershocks [Omori 1894, Utsu et al. 1995]. Once declustered, the seismicity was smoothed with an adaptive power-law kernel (Section 3.2). The bandwidth of the kernel at each earthquake epicenter was adapted to the distance to the k-th nearest neighbor. To estimate the optimal number of neighbors to include in the smoothing, we divided the catalogs into two non-overlapping sets: a learning catalog and a testing catalog. In Section 3.3, we determine the optimal number of neighbors by calculating the spatial density of seismicity from the learning catalog and evaluating its predictive power on the testing catalog. The spatial density was scaled to the number of expected earthquakes by using the mean number of observed earthquakes (Section 3.6). Finally, to obtain a rate- space-magnitude forecast, we multiplied the scaled spatial density by a tapered Gutenberg-Richter magnitude-frequency distribution (Section 3.5). In contrast to the model presented by Helmstetter et al. [2007] and Werner et al. [2010a], we used a homogeneous threshold for the completeness magnitude as the previous method that estimated the threshold as a function of space did not perform well for Italy. 3.1. Declustering To estimate the spatial distribution of spontaneous earthquakes, we used the declustering algorithm proposed by Reasenberg [1985], as modified by Helmstetter et al. [2007] and Werner et al. [2010a]. As in these prior studies, we set the input parameters to rfact = 8, xk = 0.5, p1 = 0.95, xmin = 1 day and xmax = 5 days. We varied xmeff according to the different learning catalogs used. As the interaction distance, we used the scaling r = 0.01 × 100.5M km, as suggested by Wells and Coppersmith [1994], instead of r = 0.011 × 100.4M km and r < 30 km in the Reasenberg algorithm. Like Werner et al. [2010a], we set the localization errors to 1 km horizontal and 2 km vertical. We found that about 80% of the earthquakes in the MIC were spontaneous, while in the CPTI, about 92% of all of the shocks were independent, according to the Reasenberg classification. 3.2 Adaptive kernel smoothing of declustered seismicity We estimated the density of spontaneous seismicity in each 0.1˚× 0.1 ̊cell by smoothing the location of each earthquake i with an isotropic adaptive power-law kernel : (2) where di is the adaptive smoothing distance, and C (di) is a normalizing factor, so that the integral of over an infinite area was equal to 1.0. We measured the smoothing distance di associated with an earthquake i as the horizontal distance between event i and its k-th closest neighbor. The number of neighbors, k, was an adjustable parameter that was estimated by optimizing the forecasts (see Section 3.3). We imposed di > 0.5 km to account for location uncertainty. The kernel bandwith di thus decreased if the density of the seismicity was high at the location ri of the earthquake i, so that we had higher resolution (smaller di) where the density was higher. The density at any point was estimated by: (3) where Nl is the total number of earthquakes in the learning SMOOTHED SEISMICITY EARTHQUAKE FORECASTS 1.231 ( 1.145) .m mL W= - ( )K rdi ( )K rdi ( ) ( ) ( ) K r r d C d 2 2 1.5d i i i = +; ; ( )r ( )r K r r( ) 1 d i N ii l =n - = / SMOOTHED SEISMICITY EARTHQUAKE FORECASTS 110 Model t1 t2 mt Nl t1 t2 mmin Nt L G k 1 1984 2003 2.95 2,632 2004 2008 4.95 6 −45.4 2.12 5 14.2 2 1984 2003 3.45 755 2004 2008 4.95 6 −45.0 2.26 4 25.5 3 1984 2003 3.95 223 2004 2008 4.95 6 −44.5 2.44 2 32.5 4 1984 2003 4.45 63 2004 2008 4.95 6 −44.1 2.60 1 48.8 5 1984 2003 4.95 19 2004 2008 4.95 6 −45.5 2.07 1 95.6 6 1984 2003 5.45 5 2004 2008 4.95 6 −48.1 1.34 1 221.3 7* 1984 1998 2.95 1,900 1999 2003 4.95 8 −65.4* 0.94* 50* 60.1 8 1984 1998 3.45 550 1999 2003 4.95 8 −64.6 1.04 48 123.3 9 1984 2003 3.95 147 1999 2003 4.95 8 −63.0 1.27 27 206.3 10 1984 2003 4.45 27 1999 2003 4.95 8 −63.1 1.25 8 248.7 11 1984 2003 4.95 12 1999 2003 4.95 8 −63.3 1.22 4 290.9 12 1984 2003 5.45 4 1999 2003 4.95 8 −63.7 1.16 2 500.9 13 1984 1993 2.95 1,145 1994 1998 4.95 13 −86.3 3.14 1 7.8 14 1984 1993 3.45 328 1994 1998 4.95 13 −86.5 3.09 1 16.1 15 1984 1993 3.95 81 1994 1998 4.95 13 −99.4 1.15 5 107.5 16 1984 1993 4.45 15 1994 1998 4.95 13 −97.2 1.37 6 407.1 17 1984 1993 4.95 4 1994 1998 4.95 13 −97.3 1.35 1 388.4 18 1984 1993 5.45 0 1994 1998 4.95 13 _ _ _ 19 1984 2009 2.95 3,522 2004 2008 4.95 6 −37.8 7.48 1 5.0 20! 1984 2009 2.95 3,522 2004 2008 4.95 6 −41.6! 3.98! 6! 13.8 Imput Catalog (MIC) Target Catalog (MIC) Results Model t1 t2 mt Nl t1 t2 mmin Nt L G k 1 1901 2001 4.45 605 2002 2006 4.95 7 −55.5 1.39 35 95.9 2 1901 2001 4.95 166 2002 2006 4.95 7 −54.8 1.54 12 102.9 3 1901 2001 5.45 51 2002 2006 4.95 7 −55.2 1.46 6 127.8 4 1901 1996 4.45 576 1997 2001 4.95 15 −97.1 3.41 17 63.7 5 1901 1996 4.95 155 1997 2001 4.95 15 −101.8 2.49 8 84.6 6 1901 1996 5.45 46 1997 2001 4.95 15 −104.3 2.12 5 120.8 7 1901 1991 4.45 565 1992 1996 4.95 2 −15.9 4.41 2 17.8 8 1901 1991 4.95 153 1992 1996 4.95 2 −16.5 3.17 1 22.7 9 1901 1991 5.45 46 1992 1996 4.95 2 −18.0 1.53 8 199.3 10 1901 1986 4.45 551 1987 1991 4.95 6 −46.6 1.72 2 18.0 11 1901 1986 4.95 147 1987 1991 4.95 6 −46.2 1.84 1 22.6 12 1901 1986 5.45 44 1987 1991 4.95 6 −48.2 1.32 4 101.7 13 1901 2006 4.45 623 2002 2006 4.95 7 −39.4 13.78 1 11.3 14 1901 2006 4.45 623 2002 2006 4.95 7 −39.4! 1.35! 6! 34.1 Imput Catalog (CPTI) Target Catalog (CPTI) Results Table 1. Results of the optimization of the spatial density estimate using the MIC from July 1, 1984, to June 25, 2009. We varied the learning and target catalogs. The target catalog is the MIC in the testing region. The input catalog is the declustered MIC in the collection region. t1: first year of data taken from the learning or target catalog; t2: last year of data taken from the learning or target catalog; mt: magnitude threshold of the learning catalog; mmin: magnitude threshold of the target catalog; Nl and Nt: number of earthquakes in the learning and testing catalogs, respectively; L: log-likelihood score of a model; G: model probability gain per earthquake over a spatially uniform model; k: optimal number of neighbors to include in the bandwidth of the smoothing kernel; (di): mean adaptive bandwidth (km); ! : indication where k was not optimized, but constrained to k = 6; *: maximum k = 50 of the range [1, 50] was attained. Table 2.As for Table 1, but using the CPTI from 1901 to 2006 as the dataset. ! indicates that kwas not optimized, but constrained. See Table 1 for further details. di di 111 catalog. However, the forecasts were given as an expected number of events in each cell of 0.1˚. We therefore integrated Equation (2) over each cell and summed over all of the contributing earthquakes, to obtain the seismicity rate of each cell. 3.3. Optimizing the spatial smoothing We estimated the parameter k, the number of neighbors used to compute the smoothing distance di in Equation (3), by maximizing the likelihood of the model. We built the model n´(ix, iy) in each cell (ix, iy) from the data in the learning catalog, and evaluated the likelihood of target earthquakes in the testing catalog. As we assumed independence of the spatial density from the magnitude distribution and the total expected number of events, we optimized the normalized spatial density estimate in each cell (ix, iy) using: (4) where Nt is the number of observed target events. The expected number of events for the model n* was thus equal to the observed number Nt. The log-likelihood of the model is given by: (5) where n is the number of events that occurred in cell (ix, iy). To adhere to the rules of the CSEP-Italy predictability experiment, we assumed that the probability p of observing n events in cell (ix, iy) given a forecast of n*(ix, iy) in that cell was given by the Poisson distribution: (6) We built a large set of background models n* by varying: (i) the start times, end times and magnitude thresholds of the learning and testing catalogs; and (ii) the catalog (either the MIC or the CPTI). We evaluated the performance of each model by calculating its probability gain per target earthquake relative to a model with a uniform spatial density: (7) where L0 is the log-likelihood of the spatially homogeneous model. 3.4. Results of the spatial optimization Tables 1 and 2 show the results of the spatial optimization on the MIC and the CPTI, respectively. For each model, we determined the optimal smoothing parameter k in the range [1, 50] by choosing the value that maximized the likelihood of the target earthquakes in the target catalog, given the smoothed spatial density estimated from the learning catalog. We varied the magnitude threshold of the input catalog to test whether including small earthquakes results in greater predictability of future m ≥ 4.95 earthquakes. We also changed the target periods to test the robustness of the results. In Figure 1, we show the probability gains per earthquake against the magnitude threshold of the two learning catalogs. For comparisons, we also included the gains obtained for the five-year period of 2004-2008 (inclusive) in California by Werner et al. [2010a]. The gains obtained in Italy fluctuated strongly for different target periods, and it was difficult to detect a systematic trend in the gain as a function of the magnitude threshold of the learning catalog. In contrast to California, where around 25 earthquakes of m ≥ 4.95 tend to occur every five years, Italy has experienced far fewer shocks of such sizes; during the 1992-1996 period in the CPTI, the gains were calculated from only two target earthquakes. The small sample size of these target earthquakes might explain the observed fluctuations in the calculated gains (see also below, and Section 5). For the target period 1994-1998 of the MIC, the gains were especially small for low thresholds. For mt = 2.95, the smoothing parameter reached the maximum value of 50 (Table 1, model 7), providing a gain smaller than unity, i.e. the uniform forecast outperformed the smoothed seismicity forecast (further increasing the amount of smoothing eventually leads to a uniform density, such that the gain would be equal to unity). The long-range smoothing required by the target events can be traced back to the occurrence of three earthquakes in 2002: the September 6 Sicily earthquake north-east of Palermo, and the October 31 Molise earthquake and one of its aftershocks. The predicted densities in the relevant cells was increased by a factor of almost ten as the algorithm increased the smoothing from SMOOTHED SEISMICITY EARTHQUAKE FORECASTS Figure 1. Probability gain per earthquake versus magnitude threshold of the learning catalogs for various five-year target periods: blue, MIC; red, CPTI; Cal. 2004-2008, gains obtained for California by Werner et al. [2010a]; homogeneous, reference gain of a spatially homogeneous forecast. * ( , ) ( , ) ( , ) i i i i i i N x y x yii x y t yx =n n n l l // log * ( , ),L p i i nx y ii yx = n6 @// * ( , ), * ( , ) ! exp * ( , ) p i i n i i n i i x y x y n x y=n n n-6 6 6@ @ @ exp N L L G 0 t = -c m k = 1 to k = 50 (see also discussion in Section 5). Whenever target earthquakes occur in previously active regions, the optimal amount of smoothing is small (k = 1) and the gains tend to be higher (see, e.g., Table 1, model 13 for the 1994-1998 target period, which included the 1997 Colfiorito earthquake sequence). However, exceptions exist to the expected anti-correlation between k and G: the 15 targeted earthquakes during the 1997-2001 target period of the CPTI were forecast best with a smoothing parameter k = 15, although they realized a gain per earthquake of G = 3.41. To calculate the spatial densities for the final forecasts for the predictability experiment (Table 1, model 20!obtained from the MIC, and Table 2, model 14! obtained from the CPTI), we had to decide which magnitude threshold to apply to the learning catalog, which smoothing parameter to use, and whether to use all of the existing data up to the end of the two catalogs. We decided to use all of the available data in each catalog for the final density estimates so that the forecasts would benefit from as much data as possible. Moreover, despite the observed variability in gains against the magnitude threshold of the input catalog (discussed above), Figure 1 shows that, on average, there appears to be an advantage in including small earthquakes for estimating the predictive spatial density (see also discussion in Section 5). Therefore, to calculate the spatial densities for the final forecasts, we used the lowest reliable magnitude threshold for each catalog. Finally, we used an optimal smoothing parameter of k = 6 despite the large variability across the magnitude thresholds and target periods, because the resulting density was slightly smoother than that obtained from the median (k = 5) of the optimal values for the lowest magnitude thresholds, and because both Werner et al. [2010a] and Helmstetter et al. [2007] used the same value. The two final predictive spatial densities based on the MIC and the CPTI were model 20! in Table 1 and model 14! in Table 2, respectively. We discuss future improvements of the spatial optimization method in Section 5. 3.5. Magnitude distribution We assumed that the cumulative magnitude probability distribution followed a tapered Gutenberg-Richter magnitude frequency distribution [Gutenberg and Richter 1944] with a uniform b-value and corner magnitude mc [Helmstetter et al. 2007, Equation (10)]: (8) with a minimum target magnitude mmin= 4.95 (for the five- year and ten-year CSEP forecast groups). Bird and Kagan [2004, p. 2393] classified the tectonic setting of Italy as an orogen situated at a continental convergent boundary and estimated mc= . Kagan et al. [2010: fig. 1, table 1] assigned onshore Italy to the category of «active continent» with mc = and the southern off-shore region of Italy to «trench» with mc = . For simplicity, we set a uniform value of mc = 8.0, to reflect these studies. This is likely to be a conservative choice for the most seismically active region of onshore Italy. We further used a b-value of 1.0 (a maximum likelihood estimate based on magnitudes above mt = 2.95 in the MIC resulted in = 1.07). We integrated the magnitude distribution of Equation (8) in discrete bins of width of 0.1 to conform to the rules of the experiment. 3.6. Expected number of events The expected number of events per year in each space- magnitude bin (ix, iy, im) was calculated from: (9) where n* is the normalized spatial background density, P (im) is the integrated probability of an earthquake in magnitude bin (im) defined according to Equation (8), and m is the expected number of earthquakes over a five-year or ten-year period. To estimate the expected number of earthquakes, we counted the total number of observed m ≥ 4.95 earthquakes in each (non-declustered) catalog and divided by the duration to obtain the mean number of events of m ≥ 4.95 per year. For the MIC, we estimated mmic = 1.24 per year, while there was an average of mcpti = 1.72 earthquakes of m ≥ 4.95 per year in the CPTI. To obtain the five-year and ten-year forecasts, we simply multiplied m by the number of years. Thus, based on the shorter MIC, we expect 6.2 (12.4) earthquakes from January 1, 2010 to December 31, 2014 (to December 31, 2019), while based on the longer CPTI, we predict 8.6 (17.2) earthquakes over the same periods. 4. Five-year and ten-year m ≥ 4.95 forecasts The five-year forecasts based on the MIC and the CPTI are shown in Figures 2 and 3, respectively. To obtain the ten- year forecasts, we doubled the rates in each space magnitude bin because we assumed a temporally homogeneous Poisson process. The forecast based on the CPTI is smoother than the map based on the shorter MIC, because the 3,522 earthquakes of mostly small magnitude in the MIC (m ≥ 2.95) clustered more strongly than the 623 events of larger magnitude in the CPTI (m ≥ 4.45). The more evenly distributed epicenters of the CPTI were more uncertain than those from the MIC, resulting in less clustering. We can compare the forecasts in Figures 2 and 3 with those of Kagan and Jackson [2010a: fig. 4] and Zechar and Jordan [2010: fig. 2]. Kagan and Jackson [2010a] used a fixed- bandwidth power-law kernel with an optimized bandwidth rs = 5 km, to smooth the seismicity in Italy listed in the Preliminary Determination of Epicenters catalog above a threshold mt = 4.7. By visual inspection, their forecast was similar to the forecast based on the CPTI in Figure 3, SMOOTHED SEISMICITY EARTHQUAKE FORECASTS 112 ( ) 10 exp 10 10P m ( ) 1.5( ) 1.5( )b m m m m m mmin min c c= -- -- - 6 @ 8.46 0.39 0.21+ - .7 59 0. 0. 25 72+ - .8 57 0. ? 35 + - bt ( , , ) * ( , ) ( )E i i i i i P ix y x y mm = m n 113 although their fixed bandwidth of rs = 5 km was much smaller than the average of our optimal adaptive bandwidths (di) = 34.1 (Table 2, model 14 !). The optimal smoothing bandwidths are different because Kagan and Jackson [2010a] optimized their bandwidth for the 2004-2006 target period and smoothed earthquakes from a different data source. Given the observed dependence of the optimal smoothing distance on the chosen target period, we should expect to see differences in the optimal bandwidths. Zechar and Jordan [2010] smoothed the CSI, the CPTI and a merged («hybrid») catalog with an optimized fixed- bandwidth Gaussian kernel. Again, different choices for the magnitude threshold and the learning data make direct comparisons difficult, except for the forecasts based on the CPTI optimized for the 2002-2006 target period. Figure 1 of Zechar and Jordan [2010] shows that the optimal smoothing lengthscale is v = 75 km, while we obtained a mean bandwidth of ≈ 95.6 (Table 2, model 1), indicating broad agreement between the two methods. Neither Kagan and Jackson [2010a] nor Zechar and Jordan [2010] investigated whether the optimal smoothing length scale varied with target periods. 5. Discussion and conclusions Werner et al. [2010b] evaluated all of the time- independent five-year and ten-year forecasts of the CSEP-Italy experiment retrospectively on data from the CSI and CPTI. Using the forecasts from the first round of submissions from August 1, 2009, they found that several of the modelers had committed errors in the calibration of their models for the calculation of their forecasts, principally in the conversion of the moment magnitude scale of the CPTI to the local magnitude scale that is used for prospective testing. Our first submission of the forecast based on the CPTI also contained an error, because of a mistake in the conversion formula we used (see Equation 1). In this report, we only discuss the corrected forecasts that we submitted during the second round (January 1, 2010). In the future, we would like to make a number of improvements to the model. First, we used a relatively arbitrary declustering procedure based on Reasenberg's algorithm [Reasenberg 1985], although there are more objective methods that can be used [e.g. Zhuang et al. 2002, Console et al. 2010]. Secondly, contrary to the studies of Helmstetter et al. [2007] and Werner et al. [2010a], we could not estimate the completeness threshold as a function of space using their method and then attempt to correct for missing small earthquakes because the results were not stable. In the future, we intend to use a more robust method for estimating the completeness threshold. Thirdly, we found that the optimal smoothing parameter varied substantially for different target periods, much more so than observed by Werner et al. [2010a]. The small number of target earthquakes might have caused these fluctuations. In the future, the optimal smoothing parameter should be optimized jointly over many target periods. More generally, we intend to assess the influence of the choice of kernel function. For example, do anisotropic kernels [e.g., Kagan and Jackson 1994] improve spatial forecasts? Does the optimal kernel choice depend on the tectonic regime [e.g., Kagan and Jackson 2010a, Kagan and Jackson 2010b]? Should large earthquakes count more towards the density than small earthquakes [e.g., Kagan et al. 2007]? Smoothed seismicity models make the implicit SMOOTHED SEISMICITY EARTHQUAKE FORECASTS Figure 2.Earthquake forecast based on the MIC. Expected number of earthquakes of mL ≥ 4.95 over the five-year period from January 1, 2010, to December 31, 2014 per 0.1˚× 0.1˚ based on smoothing the locations of earthquakes of mL ≥ 2.95 in the instrumental catalog from July 1, 1984, to June 25, 2009. Figure 3. Earthquake forecast based on the CPTI. Expected number of earthquakesof mL ≥ 4.95 over the five-year period from January 1, 2010, to December 31, 2014, per 0.1˚× 0.1˚ based on smoothing the locations of earthquakes of mL ≥ 4.45 in the CPTI from 1901 to 2006. di assumption that the available earthquake catalogs are long enough to estimate predictive spatial densities. Kagan and Jackson [1994], however, conjectured that the optimal forecast horizon of an earthquake forecast based on smoothed seismicity is related to the duration of the learning catalog. To begin to address this question, we provided two earthquake forecasts: one based on a relatively short (30 years) dataset with a lower magnitude threshold, and the other based on a longer (100 years) catalog with a higher magnitude threshold. If the conjecture by Kagan and Jackson [1994] is correct, the forecast based on the MIC should perform better than the forecast based on the CPTI over shorter periods, while the CPTI-based forecast should show more predictive accuracy at longer time scales. Helmstetter et al. [2007] and Werner et al. [2010a] have previously provided evidence for the hypothesis that in California, including the locations of small earthquakes improved the forecasts of future epicenters of m ≥ 4.95 earthquakes [see also Hanks 1992, Marsan 2005, Helmstetter et al. 2005, Sornette and Werner 2005a, Sornette and Werner 2005b, for perspectives on the importance of small earthquakes]. The results of the present study do not provide conclusive evidence for or against this hypothesis, perhaps because of the fluctuations induced by the small number of target earthquakes. A more robust cross-validation method for the optimal smoothing parameter might be able to address this outstanding issue. Like its predecessor, the RELM experiment, the CSEP- Italy experiment required the use of a Poisson distribution for the number of earthquakes per space-magnitude bin [Schorlemmer et al. 2010a, Schorlemmer et al. 2007]. In principle, however, each model should provide its own model- dependent uncertainty bounds [Werner and Sornette 2008]. For the time-independent Poisson process model described here, the distribution of the number of shocks over the relevant five-year or ten-year time scales is assumed to be time-independent. However, a negative binomial distribution fits the number distribution better than the Poisson distribution [Schorlemmer et al. 2010b, Kagan 2010, Werner et al. 2010a, Werner et al. 2010b]. Therefore, in future iterations of the model, Poisson distributions should be replaced by appropriate alternatives in each space- magnitude bin. A difficulty with this approach will be the estimation of parameter values based on small and possibly correlated samples. Nonetheless, despite the simplicity of the model and its approximations, the five-year forecast submitted by Helmstetter et al. [2007] to the RELM experiment is outperforming others after the first 2.5 years [Schorlemmer et al. 2010b]. Whether the model will perform similarly in Italy, which is different tectonically from the much more seismically active California, will be an interesting test of the universal applicability of the model assumptions. Data and sharing resources We used three earthquake catalogs for this study: the parametric catalog of Italian earthquakes (Catalogo Parametrico dei Terremoti Italiani, CPTI08) [Rovida and the CPTI Working Group 2008], the catalog of Italian seismicity (Catalogo della Sismicità Italiana, CSI 1.1) [Castello et al. 2007, Chiarabba et al. 2005], and the Italian seismic bulletin (Bollettino Sismico Italiano, BSI) catalog [BSI Working Group 2002, Amato et al. 2006]. The BSI catalog is available at http://bollettinosismico.rm.ingv.it, and since July 2007, at http://ISIDe.rm.ingv.it/. The particular versions of the CSI and CPTI used are available at http://www.cseptesting.org/ regions/italy. Acknowledgements. We thank A. Christophersen, L. Gulia, J. Woessner and J. Zechar for stimulating discussions and F. Euchner for computational assistance. MJW was supported by the EXTREMES project of the ETH Competence Center Environment and Sustainability (CCES). AH was supported by the European Commission under grant TRIGS- 043251, and by the French National Research Agency under grant ASEISMIC. DDJ and YYK received support from the National Science Foundation through grant EAR-0711515, as well as from the Southern California Earthquake Center (SCEC). MJW thanks SCEC for travel support. SCEC is funded by the NSF Cooperative Agreement EAR- 0106924 and the USGS Cooperative Agreement 02HQAG0008. The SCEC contribution number for this study is 1437. References Amato, D., L. Badiali, M. Cattaneo, A. Delladio, F. Doumaz and F.M. Mele (2006). The real-time earthquake monitoring system in Italy, Geosciences (BRGM), 4, 70-75. Bird, P. and Y.Y. Kagan (2004). Plate-tectonic analysis of shallow seismicity: Apparent boundary width, beta, corner magnitude, coupled lithosphere thickness, and coupling in seven tectonic settings, Bull. Seismol. Soc. Am., 94, 2380- 2399, plus electronic supplement. BSI Working Group (2002). Bollettino Sismico Italiano, 2002- 2009, Istituto Nazionale di Geofisica e Vulcanologia, Bologna; http://bollettinosismico.rm.ingv.it/. Castellaro, S., F. Mulargia and Y.Y. Kagan (2006). Regression problems for magnitudes, Geophys. J. Intern., 165, 913- 930; doi: 10.1111/j.1365-246X.2006.02955.x. Castello, B., M. Olivieri and G. Selvaggi (2007). Local and duration magnitude determination for the Italian earthquake catalog, 1981-2002, Bull. Seismol. Soc. Am, 97, 128-139; doi: 10.1785/0120050258. Chiarabba, C., L. Jovane and R. Stefano (2005). A new view of Italian seismicity using 20 years of instrumental recordings, Tectonophysics, 395, 251-268. Console, R., D.D. Jackson and Y.Y. Kagan (2010). Using the ETAS model for catalog declustering and seismic background assessment, Pure Appl. Geophys. (The Frank Evison Volume), 167, 819-830; doi: 10.1007/s00024-010-0065-5. CPTI Working Group (2004). Catalogo Parametrico dei Ter- remoti Italiani, versione 2004 (CPTI04), INGV, Bologna; http://emidius.mi.ingv.it/CPTI04/. SMOOTHED SEISMICITY EARTHQUAKE FORECASTS 114 115 Ekström, G., A.M. Dziewonski, N.N. Maternovskaya and M. Nettles (2005). Global seismicity of 2003: Centroid- moment-tensor solutions for 1087 earthquakes, Physics Earth Planet. Inter., 148, 327-351. Field, E.H. (2007). A summary of previous Working Groups on California Earthquake Probabilities, Bull. Seismol. Soc. Am., 97, 1033-1053; doi: 10.1785/0120060048. Gutenberg, B. and C.F. Richter (1944). Frequency of earthquakes in California, Bull. Seis. Soc. Am., 34, 184-188. Hanks, T.C. (1992). Small earthquakes, tectonic forces, Science, 256, 1430-1432. Helmstetter, A., Y.Y. Kagan and D.D. Jackson (2005). Importance of small earthquakes for stress transfers and earthquake triggering, J. Geophys. Res., 110; doi: 10.1029/2004JB003286. Helmstetter, A., Y.Y. Kagan and D.D. Jackson (2007). High- resolution, time-independent grid-based forecast for M ≥ 5 earthquakes in California, Seismol. Res. Lett., 78, 78-86; doi: 10.1785/gssrl.78.1.78. Jordan, T.H. (2006). Earthquake predictability: Brick by brick, Seismol. Res. Lett., 77, 3-6. Kagan, Y.Y. and D.D. Jackson (1994). Long-term probabilistic forecasting of earthquakes, J. Geophys. Res., 99, 13,685-13,700. Kagan, Y.Y., D.D. Jackson and Y. Rong (2007). A testable five- year forecast of moderate and large earthquakes in Southern California based on smoothed seismicity, Seismol. Res. Lett., 78, 94-98; doi: 10.1785/gssrl.78.1.94. Kagan, Y.Y. (2010). Statistical distributions of earthquake num- bers: Consequence of branching process, Geophys. J. Intern., 180, 1313-1328; doi: 10.1111/j.1365-246X.2009.04487.x. Kagan, Y.Y. and D.D. Jackson (2010a). Earthquake forecasting in diverse tectonic zones of the globe, Pure Appl. Geophys. (The Frank Evison Volume), 167, 709-719; doi: 10.1007/ s00024-010-0074-4. Kagan, Y.Y. and D.D. Jackson (2010b). Short- and long-term earthquake forecasts for California and Nevada, Pure Appl. Geophys. (The Frank Evison Volume), 167, 685-692; doi: 10.1007/s00024-010-0073-5. Kagan, Y.Y., P. Bird and D.D. Jackson (2010). Earthquake patterns in diverse tectonic zones of the globe, Pure Appl. Geophys. (The Frank Evison Volume), 167, 721-741; doi: 10.1007/ s00024-010-0075-3. Marsan, D. (2005). The role of small earthquakes in redistributing crustal elastic stress, Geophys. J. Int., 163, 141-151; doi: 10.1111/j.1365-246X.2005.02700.x. MPS Working Group (2004). Redazione della mappa di pericolosità sismica prevista dall’Ordinanza PCM 3274 del 20 marzo 2003, Rapporto conclusivo per il Dipartimento della Protezione Civile, INGV, Milano-Roma, April 2004 (MPS04), 65 pp. + 5 appendices; http://zonesismiche. mi.ingv.it. Omori, F. (1894). On after-shocks of earthquakes, J. Coll. Sci. Imp. Univ. Tokyo, 7, 111-200. Reasenberg, P. (1985). Second-order moment of central Cali- fornia seismicity, 1969-82, J. Geophys. Res., 90, 5479-5495. Rovida, A. and the CPTI Working Group (2008). Catalogo Parametrico dei Terremoti Italiani, 1901-2006, versione 2008 (CPTI08), INGV, Milano-Pavia; http://www.cseptesting. org/regions/italy. Schorlemmer, D., M.C. Gerstenberger, S. Wiemer, D.D. Jackson and D.A. Rhoades (2007). Earthquake likelihood model testing, Seismol. Res. Lett., 78, 17. Schorlemmer, D., A. Christophersen, A. Rovida, F. Mele, M. Stucci and W. Marzocchi (2010a). Setting up an earthquake forecast experiment in Italy, Annals of Geophysics, 53, 3 (present issue). Schorlemmer, D., J.D. Zechar, M.J. Werner, E. Field, D.D. Jackson and T.H. Jordan (2010b). First results of the regional earthquake likelihood models experiment, Pure Appl. Geophys. (The Frank Evison Volume), 167, 859-876; doi: 10.1007/s00024-010-0081-5. Sornette, D. and M.J. Werner (2005a). Constraints on the size of the smallest triggering earthquake from the epidemic- type aftershock sequence model, Bath's law, and observed aftershock sequences, J. Geophys. Res., 110; doi: 10.1029/2004JB003535. Sornette, D. and M.J. Werner (2005b). Apparent clustering and apparent background earthquakes biased by undetected seismicity, J. Geophys. Res., 110; doi: 10.1029/2005JB003621. U.S. Geological Survey (2001). Preliminary determination of epicenters (PDE), monthly listings, US Department of Interior/ Geological Survey, National Earthquake Information Center, April-June 2000, Open File Report 2000-600-B; http://earthquake.usgs.gov/earthquakes/ eqarchives/epic/epic_global.php. Utsu, T., Y. Ogata and R.S. Matsu'ura (1995). The centenary of the Omori formula for a decay law of aftershock activity, J. Phys. Earth, 43, 1-33. Wells, D.L. and K.J. Coppersmith (1994). New empirical relationships among magnitude, rupture length, rupture width, rupture area, and surface displacement, Bull. Seismol. Soc. Am., 84, 974-1002. Werner, M.J. and D. Sornette (2008). Magnitude uncertainties impact seismic rate estimates, forecasts and predictability experiments, J. Geophys. Res. Solid Earth, 113; doi: 10.1029/2007JB005427. Werner, M.J., A. Helmstetter, D.D. Jackson and Y.Y. Kagan (2010a). High resolution long- and short-term earthquake forecasts for California, Bull. Seismol. Soc. Am., in revision, preprint available at http://arxiv.org/abs/0910.4981. Werner, M.J., J.D. Zechar, W. Marzocchi and S. Wiemer (2010b). Retrospective evaluation of the five-year and ten-year CSEP-Italy earthquake forecasts, Annals of Geophysics, 53, 3 (present issue). Zechar, J.D., D. Schorlemmer, M. Liukis, J. Yu, F. Euchner, P.J. Maechling and T.H. Jordan (2009). The Collaboratory for the Study of Earthquake Predictability perspective on SMOOTHED SEISMICITY EARTHQUAKE FORECASTS computational earthquake science, Concurr. Comput. Pract. Exp.; doi: 10.1002/cpe.1519. Zechar, J.D. and T.H. Jordan (2010). Simple smoothed seismicity earthquake forecasts for Italy, Annals of Geophysics, 53, 3 (present issue). Zhuang, J., Y. Ogata and D. Vere-Jones (2002). Stochastic declustering of space-time earthquake occurrences, J. Am. Stat. Assoc., 97, 369-380. *Corresponding author: Maximilian J. Werner, ETH Zurich, Swiss Seismological Service, Zurich, Switzerland; email: mwerner@sed.ethz.ch © 2010 by the Istituto Nazionale di Geofisica e Vulcanologia. All rights reserved. SMOOTHED SEISMICITY EARTHQUAKE FORECASTS 116