497_511.pdf


497

ANNALS OF GEOPHYSICS, VOL. 45, N. 3/4, June/August 2002

Nonparametric analysis of the
time structure of seismicity

in a geographic region

Graciela Estévez-Pérez (1), Henrique Lorenzo-Cimadevila (2) and Alejandro Quintela-del-Río (3)
(1) Departamento de Matemáticas, Facultad de Ciencias, Universidad de A Coruña, Spain

(2) Departamento de Ingeniería de los recursos naturales y medio ambiente, Escuela de Ingeniería
Técnica Industrial, Universidad de Vigo, Spain

(3) Departamento de Matemáticas, Facultad de Informática, Universidad de A Coruña, Spain

Abstract
As an alternative to traditional parametric approaches, we suggest nonparametric methods for analyzing temporal
data on earthquake occurrences. In particular, the kernel method for estimating the hazard function and the intensity
function are presented. One novelty of our approaches is that we take into account the possible dependence of the
data to estimate the distribution of time intervals between earthquakes, which has not been considered in most
statistics studies on seismicity. Kernel estimation of hazard function has been used to study the occurrence process
of cluster centers (main shocks). Kernel intensity estimation, on the other hand, has helped to describe the occurrence
process of cluster members (aftershocks). Similar studies in two geographic areas of Spain (Granada and Galicia)
have been carried out to illustrate the estimation methods suggested.

1.  Introduction

The problem of searching for stochastic
models to describe the sequence of occurrence
times of earthquakes from some geographic region
is of great interest to seismologists. In effect, a
detailed analysis of such process might reveal new
aspects of the pattern of occurrence of earth-
quakes, and suggest important ideas on the me-
chanism of earthquakes.

The development of detailed stochastic models
to describe the list of origin times or equivalently
that of time intervals between consecutive
earthquakes is quite recent. Vere-Jones (1970)
surveys some of the stochastic models (clustering
models and stochastic models for aftershock
sequences) proposed in the literature and describes
their behavior in several data sets. More spe-
cifically, Udias and Rice (1975) propose the
gamma distribution to describe the series of time
intervals between consecutive shocks. These
authors also deal with hazard and intensity
functions. Other more recent models include the
trigger models (Lomnitz and Nava, 1983), the
Epidemic-Type Aftershock Sequence (ETAS)
model (Ogata, 1988), whose extensions can be
seen in Ogata (1998), or refinements of Hawkes’
(1971) self-exciting point process model, which
describes spatial-temporal patterns in a catalog.

Mailing address: Dr. Graciela Estévez-Pérez, De-
partamento de Matemáticas, Facultad de Ciencias,
Universidad de A Coruña, Rua de Maestranza s/n, 15003
A Coruña, Spain; e-mail: graci@udc.es.

Key  words nonparametric estimation hazard function
intensity function  clustering  dependent data


498

Graciela Estévez-Pérez, Henrique Lorenzo-Cimadevila and Alejandro Quintela-del-Río

However, standard models applied to seismic
data do not always fit the data well. In part, this
is because parametric models are usually only
well suited for a sequence of seismic events that
have similar causes. Moreover, parametric
models can be insensitive to poorly-fitting events,
which often are at least as interesting as well-
fitting ones (see Ogata, 1989).

In this article, we suggest nonparametric
methods for analyzing seismic data. These methods
do not require formulation of structural models, so
they are not affected by the deficiencies noted
above. They involve several different approaches
to nonparametric estimation of the hazard and
intensity functions of point processes that evolve
with time. This enables us to split up and analyze
the occurrence of temporal processes of earth-
quakes within a region without constraining them
to having predetermined properties. We argue that
nonparametric methods for the analysis of
earthquake data are valuable supplements to more
conventional parametric approaches, especially as
tools for exploratory data analysis.

The objective of our analysis is to show two
statistical tools (hazard and intensity functions)
which could help to describe the whole cycle of
seismic activity in a region without imposing
predetermined conditions on this activity. That
is, our analysis is based on the information
provided by the data and on the universally
accepted assumption of temporal grouping of
earthquakes. The hazard function is used to
confirm this grouping and characterize the
occurrence process of main shocks. On the other
hand, the aftershock sequences (clusters) have
been studied by means of the intensity function.

The paper is organized as follows: in Section
2 we describe the occurrence process of earth-
quakes in terms of its evolution in time. Section
3 introduces the nonparametric estimator of
hazard function and Section 4 contains the
analysis of seismic activity of two Spanish
geographic regions using the nonparametric
methods beforehand mentioned.

2.  The occurrence process of earthquakes

Earthquakes can be represented by point
events in a five-dimensional space-time-energy

continuum (
i
,

i
, h

i
, t

i
, M

i
) where 

i
and

i
are

the latitude and longitude of the epicenter, h
i

the depth of the focus, t
i

the origin time and M
i

the magnitude. A complete statistical analysis
of earthquakes must consider the distributions
and correlations of these five parameters. This
would involve handling a five-dimensional
series, which generally constitutes a very
complex problem. The starting point of such
problem is the consideration of the one-
dimensional series of occurrence times {t

i
}. To

give a precise meaning to this time series, its
space, time and magnitude boundaries must be
specified. Obviously, these boundaries will be
chosen according to the objectives of the study:
to characterize different seismic areas in the
same time period, to analyze several seismic
series in a particular region, etc.

Space specifications define the volume from
which the population of earthquakes, repre-
sented by the time series {ti } is taken. This may
be done in practice by specifying an area A and
a lower limit in depth H. Since detectability is
limited, a lower limit of magnitude M0 must al-
so be specified. This limit is a function of the
station distribution and sensitivity, and defines
the lowest magnitude for which all events from
anywhere in the bounded region can be detected.
A bounded set of earthquakes will then be a
series {t

i
} defined for a certain area A, depth H

and lower limit of magnitude M
0
, such that, it

would include all shocks originating from inside
the defined volume of magnitude larger than M0
for a particular time interval from t0 to tn.

Once a bounded set of time occurrence {t
i
} is

established, a series can be constructed with the
time intervals between consecutive earthquakes
{ t

i
}, such that t

i
= t

i
t

i 1 for i = 1, ..., n. The
distribution of the values of these intervals is of
great interest to specify the time structure of the
seismicity of a region.

The simplest statistical model to fit a series
of occurrence times of earthquakes is the
Poisson process, under which the time intervals
between consecutive events are exponentially
distributed. This model presupposes inde-
pendence of the events, so that the occurrence
of one earthquake is not influenced by that of
previous ones, which is very far from reality. In
effect, several authors (Knopoff, 1964; Lomnitz,


499

Nonparametric analysis of the time structure of seismicity in a geographic region

3.  Kernel estimation of hazard function

The distribution of time intervals between
consecutive earthquakes can be characterized
using the hazard function, which is defined by

where t is the random variable that measures
the time between consecutive shocks and P (. .)
indicates the conditional probability. This
function can also be written as

with f (.) and F(.) being the density and
distribution functions of t , respectively. Thus,
r(t) �t can be interpreted as the approximate
probability of a shock in the time interval (t, t +�t)
given that the immediately preceding one
happened at time 0. In other words, it is con-
sidered the instantaneous hazard earthquake
occurrence at time t.

Nonparametric estimation of hazard function
started with Watson and Leadbetter (1964 a,b)
who introduced the kernel estimator (see (3.1)),
and from that time on many papers on this topic
have appeared in the literature (see e.g., Hassani
et al., 1986 for a survey). Most of this literature
is based on the assumption of independence of
the data. However, in the case of microearthquake
studies, for instance, the hypothesis of depen-
dence, that is, the existence of causality or inter-
action between the occurrence times of shocks,
is more suitable (Rice and Rosenblatt, 1976).
Papers on dependent hazard estimation include
Sarda and Vieu (1989), Vieu (1991), Estévez and
Quintela (1999, 2002) and many others.

Throughout this paper we will use the kernel
estimator of the hazard function (Watson and
Leadbetter, 1964 a,b) because it has been studied
in great detail in dependence contexts. It is
defined as

(3.1)

1966; Vere-Jones, 1970; Udias and Rice, 1975)
have found, for different populations of earth-
quakes, that the Poisson fit to the time series
of microearthquakes is very poor, especially
for active periods. The deviation from the
model is principally due to the existence of a
much larger number of small intervals than
expected. The reason for such a large number
of small intervals is that the earthquakes happen
forming clusters, that is, one main shock is
followed and/or preceded by a stream of smaller
shocks, called aftershocks and/or precursors
(Lomnitz and Hax, 1967; Vere-Jones and
Davies, 1966) produced in the same general
focal region.

Therefore, some authors (e.g., Vere-Jones,
1970; Hawkes, 1971 and many others) have
defined the occurrence process of earthquakes
in terms of two components: i) a process of
cluster centers; and ii) a subsidiary process
defining the configuration of the members within
a cluster. The final process is taken as the
superposition of all the clusters. Several possible
models for these processes can be found in the
literature, for example the compound Poisson
processes (Vere-Jones and Davis, 1966),
trigger models (Vere-Jones and Davies, 1966;
Lomnitz and Nava, 1983) or epidemic-type
models (Hawkes, 1971; Lomnitz, 1974; Ogata
and Akaike, 1982; Ogata, 1988 and their re-
ferences). All these models assume structural
conditions on the occurrence of earthquakes,
as for example, that the process of cluster
centers is stationary and Poissonian. In Section
4.1, we will show that this hypothesis is not
very likely either for particular cases. A draw-
back of these approaches is that they depend
very much on the models, and so are subject
to the instability and goodness of fit problems
noted in Section 1.

Because the standard parametric models do
not always fit the earthquake time series well,
we suggest making use of nonparametric
analysis techniques. As mentioned in the
previous section, nonparametric methods for
analyzing the distribution of time intervals
between consecutive earthquakes will be
considered, obtaining another attempt to
describe temporal behavior of an earthquake
series in a geographic region.

r t
P t t t t t

tt
( ) =

< +( )
lim
�

�

�0

r t
f t

F t
( ) =

( )

( )1

r t
f t

F t
h

h

h

( ) =
( )

( )1


500

Graciela Estévez-Pérez, Henrique Lorenzo-Cimadevila and Alejandro Quintela-del-Río

Estévez and Quintela (1999, 2002) proposed
two versions of the cross-validation method to
select the bandwidth h when the data are
dependent: the global and the local choice. The
first one selects only one h, searching the lowest
global estimation error. That is, if the global es-
timation error is

the value of cross-validation bandwidth used by these
authors is h

cv
, that satisfies                 , with

(3.2)

where

and

are the kernel estimators of f (t) and F(t), respect-
ively; K(.) is a kernel function, , and
h = h(n) R+ the smoothing parameter or
bandwidth. This parameter is crucial in the
estimation method since the shape of the
resulting estimator varies greatly according to
its value.

  f t nh K
t t

hh
i

i

n

( ) =
=

1

1

( )

  F t f x dx n H
t t

h
h h

i

n
it( ) = ( ) =

=

1

1
0

( )
H t K x dx

t
( ) = ( )

MISE h E r x r x dxh( ) = ( ) ( )( )
2

h CV hcv
h

= ( )argmin

CV h r x dxh( ) = ( )
2

x
n

f X

F X F X
h

i
i

h
i

i n ii

n ( )
( )( ) ( )( )=1

2

1 1

Fig.  1. Seismicity of Granada in the period 1983-1999.


501

Nonparametric analysis of the time structure of seismicity in a geographic region

an estimation of global error MISE(h). The
functions f

h
i(.), F h

i(.) in (3.2) are kernel
estimations of density and distribution functions
modified to take into account the dependence of
the data and Fn(.) is the empirical distribution
function. Note that MISE(h) depends on r (.) that
is an unknown function and hence it is necessary
to estimate it.

On the other hand, the local cross-validation me-
thod takes a different  h

cv, x  
for each estimation point

x, obtaining the lowest estimation error for each x.

In this case,    with CV
x
(h) an

estimator of local error MISE
x
(h) = E(r

h
(x) r(x))2.

We will illustrate these two approaches in the
sections that follow.

4.  Examples

In this section we try to study and compare
the seismic activity of two Spanish geographic
regions: Granada in the southeast and Galicia in
the northwest of Spain.

4.1.  Granada earthquakes data

The data-set to study is a group of micro-
earthquakes recorded during the period 1983-
1999 with epicenters between 36.5° to 37.7°N
and 3.5° to 4.5°W and magnitude larger than or
equal to 2.5 on the Richter scale. The scatter plot
of epicenters (fig. 1) for the n = 1117 shallow
shocks (focal depths of less than 30 km) il-
lustrates the seismicity of the region studied.

Basic descriptive plots of the event magni-
tudes are given in fig. 2a,b. The first one presents
the cumulative number of shocks with time with
the associated magnitudes. The second one
represents the distribution of magnitudes in this
region. Note in particular the straight line fit in
(b), which indicates that the Gutenberg-Richter

h CV hcv x
h

x, = argmin ( )

Fig.  2a,b. Basic descriptive plots of the seismic data:
a) cumulative number of shocks and plot of the
magnitudes versus the occurrence times; b) cumulative
distribution of magnitudes.

a

b


502

Graciela Estévez-Pérez, Henrique Lorenzo-Cimadevila and Alejandro Quintela-del-Río

time intervals between consecutive earthquakes
{ t

i
 }

i =1
1116 (fig. 4) show a prominent trend towards

clustering, particularly clear in April, 1998 with
more than 100 events per 5 days. Note that large
time intervals ( t

i
) indicate slack periods and

times close to 0 show strong clustering. Arrows
pointing downward at the upper part of the
histogram mark the occurrence of one or more
earthquakes (the number of bars indicates the
number of shocks) with magnitude equal to or
larger than 4 on the Richter scale.

The average number of earthquakes per day
is 0.18 but the activity, as is shown by figs. 2a, 3
and 4, is not uniform with time. Figure 2a shows
five time periods whose seismic activity seems
fairly different. Combining the information pro-
vided by this graph and fig. 4, we can distinguish:
a quite active time span from 1985 to 1989, a
slack span from 1990 to 1995, and finally, strong
seismic activity in April 1998. Thus, it is obvious
that the process { t

i
 }

i = 1
1116 is not stationary. A

possible cause of the lack for stationarity could
be that the studied time period is quite short
because, as Ogata (1988) points out, «stationary
models are considered here in the prior belief
that such geophysical activity for a long time span
should be stationary».

Concerning the hypothesis of dependence of
the data, the theory of runs and the analysis of
autocorrelations establish that this hypothesis is
verified. In effect, the run test up and down gives
a p-value of 0.039, the run test above and below
the Median produces 2.81.10 18and the autocor-
relation function in fig. 5 gives autocorrelations
significantly different from zero.

The previously mentioned non-stationarity
makes the estimation of hazard function difficult,
however, the transformed process y

i
= ln tj + 1 ln tj ;

j =1,..., n, which is stationary as shown in fig. 6,
solves the problem. In effect, as

and the random variable ln t
n+1

conditioned to
has the same distribu-

tion as         , that is

(1944) law of magnitude frequency holds for
these data. This relation suggests that the catalog
includes essentially all the events of this magni-
tude that occurred.

The histogram for the origin times in hours
{t

i
 }

i = 1
1117 (fig. 3) and the sequence graph for the

Fig.  3. Time history of seismic activity during the
period 1983-1999 given in number of shocks per 208
days. Arrows indicate the occurrence of one or more
earthquakes with magnitude M  4.

y t tn n n= +ln ln1

t t t t tn n n1 1 1,..., ,...,ln +( )

ln lnt t t y t t tn n
d

n n n+ +( )1 1 1,..., ,...,   

y t t t
n n n

+( )ln 1,...,

Fig.  4.  Sequence of time intervals between conse-
cutive earthquakes { t

i
}

i =1
1116.


503

Nonparametric analysis of the time structure of seismicity in a geographic region

So, first we compute

with

      K(u) =

the Epanechnikov kernel and h calculated in the
same way as in Estévez and Quintela (1999).
Then, by means of a translation and a change of
variable, we estimate the hazard function of time
intervals between earthquakes, given that the last
one has happened at time t

n
17/12/1999 at

22:31:59.06 h and t
n
= e

7 . 5= 1807.65 h, obtaining
the curve in fig. 7.

The shape of this curve (high hazard for small
times and low hazard in other cases) suggests a
prominent clustering. Moreover, the last time
interval, t

n
= e

7 . 5= 1807.65, plays an important
role in the estimation process: if, for example,

t
n
was equal to e3.9 5= 51.93 h or e5.25= 190.57 h

(the two last shocks could be closer), the instan-
taneous hazard earthquake occurrence at time t

n

(dotted line and dashed line, respectively) would
be higher.

The main drawback of this method is that we
can only estimate the hazard for time tn.

Now, let us suppose that the time series  { t
i
}

is stationary, that is, that we have not detected
the lack of stationarity. In such case, we would
estimate the hazard function of { t

i
} using

for t > 0 and h the cross-validation bandwidth
(Estévez and Quintela, 1999). By comparing this
curve (fig. 8) with the previous ones, we can
conclude that both methods engender very

the hazard function of time intervals between
consecutive earthquakes, given that the last
registered time interval was t

n
, is estima-

ted using the hazard function estimation of
y

n
(r

h,1
(.)).

Fig.  5. Sample autocorrelation function for { t
i
}

i =1
1116.

Fig.  6. Sequence graph of transformed process { y
j
}.

  
r y

f y

F y

nh
K

y y

h

n
H

y y

h

h
h

h

i

i

n

i

i

n
,

,

,
1

1

1

1

1

1

1

1
1

( ) =
( )

( )
=

=

=

( )
( )

  
{ 34 1 2( )u
                        0

r t
f t

F t
h

h

h

( ) =
( )

( )1

if    u  [ 1, 1]

if    u  [ 1, 1]


504

Graciela Estévez-Pérez, Henrique Lorenzo-Cimadevila and Alejandro Quintela-del-Río

Fig.  7. Conditional hazard at t
n

(solid line). Hazard
estimation for ln t

n
= 3.95 (dashed line) and for

ln t
n
= 5.25 (dotted line).

Fig.  8.  Hazard estimation under stationarity.

similar estimations when  ln t
n
is not an outlier.

So, if we are interested in estimating the hazard
at a particular time tn, the first procedure, which
takes into account the information provided by

tn, should be used. However, to estimate the
hazard function globally, the stationary could be
ignored and the second procedure, which is
easier, would be used.

In both cases the estimated hazard function
suggests the natural strong clustering of the
occurrence times of shocks.

By defining a cluster as a set of earthquakes
originating from a relatively small volume and
separated in time by intervals smaller than a fixed
duration («cluster length»), and cluster center as
a representative of the cluster (the first shock,
for instance), the occurrence process of earth-
quakes is taken as the superposition of all the
clusters.

In this problem we have considered clusters
with «cluster length» of less than 120 h, obtaining
a set of independent cluster centers. That is, after
removing aftershocks and precursors, the
remaining shocks can be taken as independent
events. A length of 120 h was chosen because it
was the smallest value producing independent
shocks, that is, removing the effect of the
aftershocks. Figure 9a,b presents the sequence
graphs of the cluster sizes (fig. 9a) and the
magnitudes of the centers (fig. 9b).

By comparing fig. 9a,b with fig. 3, we ob-
serve that the stronger earthquakes occur at the
start of or within small clusters. In other words,
the great clusters are formed by events of small
magnitude.

The largest cluster will be studied later.
We begin by analyzing the occurrence pro-

cess of cluster centers by means of the dis-
tribution of time intervals between them { t

i
�}

i=1
345.

These events happen independently (the p-values
of runs test above and below the median, and up
and down are 0.553 and 0.522, respectively) but
they do not have exponential distribution (the
p-value of the Kolmogorov-Smirnov test for
goodness of fit is less than 0.01). Thus, such
centers do not form a stationary Poisson pro-
cess contrary to what Vere-Jones and Davies
(1966), Vere-Jones (1970) and Udias and Rice
(1975) used. Since several tests for goodness
of fit show that any known model of distri-
bution seems suitable to describe the distri-
bution of { t

i
�}

i=1
345, we resort to nonparametric

methods to estimate this distribution. In
particular, we use the kernel estimator of
hazard function (3.1). Figure 10 presents this
estimation for { t

i
�}

i=1
345 when the bandwidths

have been globally selected (see Estévez and
Quintela, 1999) (solid line) and locally selected


505

Nonparametric analysis of the time structure of seismicity in a geographic region

Fig.  10. Hazard estimation of time intervals between
cluster centers with global bandwidth (solid line) and
local bandwidths (dashed line).

Fig.  9a,b. a) Sequence graph of cluster sizes (a total of 346); b) sequence graph of magnitudes of cluster centers.

(see Estévez and Quintela, 2002) (dashed line).
The low values of hazard function and the small
fluctuations throughout indicate that clusters
happen with a low occurrence rate and are almost
constant with time.

The occurrence process of the members
within the largest cluster is formed by 133 events

which happened from April 11 to April 30, 1998.
Their epicenters are concentrated in a small area
close to Loja, southwest of Granada (Spain).
Although the number of shocks per day was large
(6.65), none exceeded magnitude 3.9 on the
Richter scale.

The sequence graph of time intervals between
consecutive cluster members { t

i
��}

i=1
132, shown in

fig. 11, indicates that such process is not sta-
tionary and the last span t132

��  is large compared
to the other ones.

Therefore, it will not be possible to estimate
the distribution of time intervals between cluster
members.

The behavior of the cluster will be studied
using the process of origin times {t

i
��}

i=1
133 and its

intensity function. This function gives, for each
time t , (t) the average number of earthquakes
per unit of time, t hours after the main shock
(cluster center). The advantage of using the
intensity function is two-fold: firstly place is not
affected by the lack of stationarity of the time
intervals between shocks. Secondly, it clearly
shows the evolution of the cluster with time, and
therefore, it is suitable for comparing the seismic
activity of several geographic regions. In this
work, we propose to estimate the intensity
function by means of a kernel estimator, which

a b


506

Graciela Estévez-Pérez, Henrique Lorenzo-Cimadevila and Alejandro Quintela-del-Río

is defined as

(Wand and Jones, 1995; Choi and Hall, 1999 and
their references), where K(.) is the kernel function
and h the bandwidth, as in (3.1). For this example,
the estimated curve appears in fig. 12, showing
a clear increase in the number of shocks at the

beginning of the cluster until the time with
highest intensity (approximately at t = 30 h) and
then a decrease.

4.2.  Galicia earthquakes data

Seismic activity increased in this region
during the 1990’s. The data analyzed correspond
to the 978 earthquakes occurring from January

Fig.  11. Sequence graph of time intervals between
consecutive cluster members.

Fig.  12. Intensity estimation of the largest cluster.

Fig.  13. Seismicity of Galicia in the period 1987-
2000.

  
ˆ ,

''

t
h

K
t t

h
t R

i

i

n

( ) =
=

+
1

1

  ( )

Fig.  14.  Cumulative distribution of magnitudes.


507

Nonparametric analysis of the time structure of seismicity in a geographic region

15, 1987 to May 3, 2000. Their epicenters, scat-
tered throughout Galicia (fig. 13), show small
spatial groupings at high risk areas: Sarria,
Monforte and Celanova.

The magnitude distribution of this region is
shown in fig. 14, which supports the Gutenberg-

Richter law of magnitude frequency. The
histogram of origin times {t

i
}

i =1
978 , the sequence

of time intervals between consecutive shocks
{ t

i
}

i=1
977 and the graph of the cumulative number

of shocks with their magnitudes are shown in
fig. 15a-c. These graphs show a quiet period until

c

Fig.  15a-c. a) Histogram of origin times {t
i
}

i =1
978, number of shocks per 208 days; b) sequence graph of time

intervals between consecutive earthquakes { t
i
}

i =1
977; c) cumulative number of shocks with sequence graph of

magnitudes.

a b


508

Graciela Estévez-Pérez, Henrique Lorenzo-Cimadevila and Alejandro Quintela-del-Río

September 1995 and three temporal groupings
in December 1995, May 1997 and May-June
1998, which are related to the strongest earth-
quakes.

The hazard function estimation (fig. 16),
which was constructed as in the previous ex-
ample, presents a common shape: it decreases

suddenly in the first hours and then it fluctuates
around a small risk. This shape confirms the well
known fact that earthquakes form clusters.

Similarly, as in the Granada case, 206 clusters
with «cluster length» less than 144 h have been
formed. The sequence of sizes and magnitudes
of cluster centers (fig. 17a,b) show three im-
portant clusters, the first one beginning with a
big shock (4.7 on the Richter scale). Figures
17a,b and 15c also suggest that the strongest
earthquakes are related to small clusters, that is,
such events are not followed or preceded by
many earthquakes.

The hazard estimation of time intervals
between cluster centers (fig. 18) suggests small
and stable risk of clusters in time. However, as
in the Granada case, the distribution of time
intervals is not exponential either (the p-value
of the Kolmogorov-Smirnov test for goodness
of fit is 0.0006).

Finally, the occurrence of cluster members
in each grouping was analyzed using the intensity
function. The first cluster (fig. 19a) is composed
of 147 shocks, whose epicenters are contained
in a small area close to Sarria (Lugo). The in-
tensity estimation indicates that: i)  the cluster
begins with a high number of shocks (the first of
magnitude 4.6); ii) there is another grouping of

Fig.  16. Hazard estimation of time intervals between
consecutive shocks.

Fig.  17a,b. a) Sequence graph of cluster sizes (a total of 206); b) sequence graph of magnitudes of cluster
centers.

a b


509

Nonparametric analysis of the time structure of seismicity in a geographic region

events coinciding with another earthquake of
magnitude 4.6, and iii) the intensity function is
low in the remainder of the range.

The second cluster (fig. 19b) consists of 190
earthquakes, whose epicenters are also near
Sarria (Lugo). The shape of the intensity function
reflects the behavior of the cluster members: a
small group of precursors warns that important
shocks are to arrive (one of magnitude 5.1 and
another of 4.9) and then there is a sequence of
aftershocks.

Finally, the third cluster (fig. 19c) involves
79 events, with epicenters in Celanova (Orense).
This is the weakest cluster (few shocks of low
magnitude). Its estimated intensity function
shows fluctuations around quite small values the
whole time.

5.  Conclusions

In this work we show how the hazard and
intensity functions represent another way of
describing the temporal structure of seismic
activity in a geographic region. Kernel estimation
of hazard function has confirmed what Vere-
Jones (1970), Udias and Rice (1975) and many
others have noted: earthquakes have the tendency

Fig.  18. Hazard estimation of time intervals between
center clusters with global bandwidth (solid line) and
local bandwidths (dashed line).

Fig.  19a-c. Intensity estimations for important
clusters: a) December 1995; b) May 1997, and c) May-
June 1998.

c

a

b


510

Graciela Estévez-Pérez, Henrique Lorenzo-Cimadevila and Alejandro Quintela-del-Río

to group. The occurrence process of these groups
was also studied by means of the hazard function
and each important cluster has been described
using the intensity function.

As we argued, a major advantage of non-
parametric methods is that they do not require
formulation of structural models, which are
often well suited only for data that have closely
related seismic causes. Another novelty is that
we take into account the possible dependence
of the data to estimate the distribution of time
intervals between earthquakes, which has not
been considered in most statistics studies on
seismicity.

In reference to the results obtained on the
seismic activity of Granada and Galicia, we may
point out that: due to the effect of aftershocks,
the hazard function of time intervals between
consecutive shocks in an earthquake catalog will
present a shape as that shown in fig. 8 and 16. Its
greater or smaller above zero is an indicator of
the intensity of seismic activity of each region.
In Granada and Galicia the maximum values of
hazard are similar when shocks with magnitude
from 2 to 5 are considered.

Next, to analyze the seismic activity of the
Granada and Galicia region we formed clusters
lasting less than or equal to 120 and 144 h,
respectively. In both cases, a sequence of main
shocks mutually independent with low rate of
occurrence and almost constant with time has
been obtained. Thus, consecutive earthquakes
far away from 120 or 144 h do not present
interaction. The processes of main shocks seem
very similar for both regions.

Concerning the greatest clusters, although
the intensities take different shapes and values
(see fig. 12 and 19a-c), there is a clear predomi-
nance of shocks at the beginning of clusters.
However, the Granada cluster is shorter and
intense and in the Galicia region the aftershocks
are more aloof in time. Moreover, the clusters
of Galician earthquakes present intensity
functions with different shapes. This charac-
teristic suggests the use of intensity function,
of its shape in particular, to characterize and
compare different seismic regions. Note this
characterization is possible thanks to the use of
nonparametric methods, which do not pre-
determine the shape of such curves.

It is important to point out that although our
conclusions contradict the belief that Granada has
more seismic activity than Galicia, seismicity in
the latter has grown much in recent years.

Acknowledgements

We would like to express our gratitude to J.
Ibáñez and J.A. Esquivel (Instituto Andaluz de
Geofísica, Universidad de Granada, Spain) for
their generosity in providing the data used
throughout the work and for their comments,
which have helped prepare a better version of
the paper.

This research was financed by the Xunta
de Galicia (Spain) under research Project
PGIDT01PXI10504PR and by the DGES under
research Project PB98-0182-C02-01.

REFERENCES

CHOI, E. and P. HALL (1999): Nonparametric approach to
analysis of space-time data on earthquake occurrences,
J. Comput. Graphic. Stat., 8 (4), 733-748.

ESTÉVEZ, G. and A. QUINTELA (1999): Nonparametric
estimation of the hazard function under dependence
conditions, Commun. Stat. Theory Methods, 28 (10),
2297-2331.

ESTÉVEZ, G. and A. QUINTELA (2002) : Estimación no
paramétrica de la función de riesgo: aplicaciones a
sismología, Questiió, 25 (3), 437-477.

GUTENBERG, R. and C.F RICHTER (1944): Frequency of
earthquakes in California, Bull. Seismol. Soc. Am., 34,
185-188.

HASSANI, S., P. SARDA and P. VIEU (1986): Approche non
paramétrique en théorie de la fiabilité, Rev. Stat. Appl.,
35 (4), 653-676.

HAWKES,  A.G. (1971): Point spectra of some mutually exciting
point processes, J. R. Stat. Soc., Ser. B., 33, 438-443.

KNOPOFF,  L. (1964): The statistics of earthquakes in Southern
California, Bull. Seismol. Soc. Am., 54, 1871-1873.

LOMNITZ, C. (1966): Statistical prediction of earthquakes,
Rev. Geophys., 4, 377-393.

LOMNITZ, C. (1974): Global Tectonics and Earthquakes Risk
(Amsterdam: Elsevier).

LOMNITZ, C. and A. HAX (1967): Clustering in aftershock
sequences in The Earth Beneath the Continents,
Geophys. Monogr. 10, Am. Geophys. Union.

LOMNITZ, C. and F. A. NAVA (1983): The predictive value of
seismic gaps, Bull. Seismol. Soc. Am., 73, 1815-1824.

OGATA, Y. (1988): Statistical models for earthquakes
occurrences and residual analysis for point processes,
J. Am. Stat. Ass., 83 (401), 9-27.

OGATA, Y. (1989): Statistical model for standard seismicity


511

Nonparametric analysis of the time structure of seismicity in a geographic region

and detection of anomalies by residual analysis,
Tectonophysics, 169, 159-174.

OGATA, Y. (1998): Space-time point-process models for
earthquake occurrences, Ann. Inst. Stat. Mathem., 50,
379-402.

OGATA, Y. and H.  AKAIKE (1982): On linear intensity models
for mixed doubly stochastic Poisson and self-exciting
point processes, J. R. Stat. Soc., Ser. B, 44, 102-107.

RICE, J. and M. ROSENBLATT (1976): Estimation of the log
survival function and hazard function, Sankhya, Ser. A,
36, 60-78.

SARDA, P. and P. VIEU (1989): Empirical distribution function
for mixing random variables. Application in nonparametric
hazard estimation, Statistics, 20 (4), 559-571.

UDIAS, A. and J. RICE (1975): Statistical analysis of
microearthquakes activity near San Andres Geophysical
Observatory, Hollister, California, Bull. Seismol. Soc.
Am., 65, 809-828.

VERE-JONES, D. (1970): Stochastic models for earthquake
occurrence, J. R. Stat. Soc., Ser. B, 32, 1-62.

VERE-JONES, D. and R.B. DAVIES (1966): A statistical survey
of earthquakes in the main seismic region of New
Zealand. Part 2. Time Series Analyses, N. Z. J. Geol.
Geophys., 3 (9), 251-284.

VIEU, P. (1991): Quadratic errors for nonparametric estimates
under dependence, J. Multivariate Anal., 39, 324-347.

WAND, M.P. and M.C. JONES (1995): Kernel Smoothing
(Chapman & Hall), 167-168.

WATSON, G.S. and M.R. LEADBETTER (1964a): Hazard
analysis I, Biometrika, 51, 175-184.

WATSON, G.S. and M.R. LEADBETTER (1964b): Hazard
analysis II, Sankhya, Ser. A, 26, 110-116.

(received February 15, 2002;
accepted July 29, 2002)