Koch 
 

63 
 

A comparison of variability in two Cassin’s Auklet (Ptychoramphus 

aleuticus) colonies using spline-based nonparametric models 

 

Dean Koch 

 

Abstract: While variability in the reproductive performance of a 

population over time is a familiar and useful concept to ecologists, 

it can be difficult to capture mathematically. Commonly used 

ecological variability statistics, such as the standard deviation of 

the logarithm and coefficient of variation, discard the time-

ordering of observations and consider only the unordered response 

variable values. We used a relatively new methodology, the cubic 

regression spline (a flexible curve fitted to a scatterplot of data), 

both to illustrate trends in reproductive performance over time and 

to explore the utility of the cubic regression spline roughness 

penalty (J) as a statistic for measuring variability while retaining 

time-ordering information. We concluded that although J measures 

variability in a mathematical sense, it can be inappropriate in a 

population ecology context because of sensitivity to small-scale 

fluctuations. To illustrate our methodology, we used the CRS 

approach in an analysis of historical data from two Cassin’s Auklet 

colonies located on Frederick and Triangle Islands in coastal BC, 

developing a model for the annual mean nestling growth rate on 

each island over seven contiguous years. Model selection indicated 

a complex (nonlinear) trend in growth rate on both islands. We 

report higher variability in the resident bird population of Triangle 

Island than Frederick Island, based on a comparison of the fitted 

curves, and the values of the coefficient of variation and 

population variability summary statistics. 

 

Key Terms: Cassin’s Auklets; Alcid ecology; reproductive 

performance; chick-growth rate; ecological variability; standard 

deviation of the logarithm; coefficient of variation; population 

variability; historical time series; population time series; 

population dynamics; semiparametric modeling; nonparametric 

modeling; splines; cubic regression splines; additive models; 

roughness penalty 

 



The Arbutus Review Vol. 4, No. 1 (2013) 

 

64 
 

Introduction 

Background 

Our interest in ecological variability statistics began while investigating 

data on two breeding populations of Cassin’s Auklet (Ptychoramphus 

aleuticus). This small, rarely seen seabird of the auk family spends most of 

its life out on the open sea, coming ashore only in the spring to breed in 

colonies (Bertram, Harfenist & Smith, 2005). Like other auks, the Cassin’s 

Auklet is an impressive diver, using its wings like flippers to propel itself 

through the water in search of food, regularly descending to depths of forty 

metres (Burger & Powell, 1989). During the breeding season, the Cassin’s 

Auklet spends its entire day foraging, returning onshore only at night to 

regurgitate food for its young in underground burrows (Bertram et al., 

2005). 

As plankton is the main food source for the Cassin’s Auklet, changes in 

production at the lowest levels of the marine food chain can impact the bird 

in a very immediate way. Bertram, Mackas & McKinnell (2001) argued that 

because of this close linkage, large-scale shifts in oceanographic conditions 

(e.g., El Niño / La Niña cycles, and long-term climate change) are reflected 

in the reproductive successes and failures of auklets. In this way, the 

Cassin’s Auklet can be thought of as a bellwether for the marine ecosystem 

(Bertram et al., 2005). By studying its population dynamics, we gain insight 

not only into auklet ecology, but also into the state of the ocean. 

A large body of research on Cassin’s Auklets provides the groundwork 

for the present study, including examinations of adult survival and diet, 

colony density, and year-by-year measurements of reproductive 

performance (e.g. Bertram et al., 2000; Bertram et al., 2001; Pyle, 2001; 

Hedd et al., 2002; Bertram et al., 2005; Bertram, Harfenist & Hedd, 2009). 

In some cases, variability in reproductive performance is discussed 

qualitatively (by simultaneously comparing multiple point estimates, one 



Koch 
 

65 
 

from each year); however, we are aware of no study that has attempted to 

quantify it in a statistically rigorous way.  

 The variability in a population time series is a concept of great interest 

to ecologists. It is widely accepted, for example, that higher temporal 

variability in abundance (more increases and decreases in the population 

size) translates into higher extinction risk (Vucetich et al., 2000; Inchausti 

& Halley, 2003). Not surprisingly, there are a number of different tools 

available for measuring variability in a population time series (see 

Fraterrigo & Rusak, 2008). However, the most common of these, including 

the standard deviation of the logarithm (SDL), coefficient of variation (CV), 

and Heath’s population variance (PV), are calculations on the response 

variable alone, and do not take into account the sequential ordering of 

observations available in a time series. We wished to improve on these 

statistics by finding a model that captures the trends in time, while also 

yielding a statistic for variability.  

We present a case study in which Cassin’s Auklet reproductive 

performance data is modeled using a statistical tool that is relatively new to 

population ecology, the cubic regression spline (CRS). Roughly speaking, a 

spline is a type of curve, and a CRS is a curve fitted to a plot of points to 

reveal patterns in a cloud of noisy data. We argue for the usefulness of the 

CRS approach and discuss the properties of J, a component of the CRS 

model, in the context of measuring variability.  

 

Cassin’s Auklet Data 

Every spring between 1994 and 2000, Cassin’s Auklet breeding colonies 

located on Frederick Island (off the west coast of Haida Gwaii, in northern 

BC) and Triangle Island (off the Northern tip of Vancouver Island, southern 

BC) were visited.  During the spring nesting period of each year, hatch date 

and nestling mass at 5 and 25 days of age were recorded for a sample of 

nestlings. Sample sizes ranged from 28 to 143 chicks per year per island. 



The Arbutus Review Vol. 4, No. 1 (2013) 

 

66 
 

No sampling was done on Frederick Island in 1999. Both islands host large 

breeding colonies of seabirds (Triangle Island boasts the largest known 

colony of Cassin’s Auklet in the world), and have been the subject of 

ecological research for decades (Bertram et al., 2009). For this study, we 

use a consolidation of historical datasets (D. Bertram & A. Harfenist, 

personal communication, May 1
st
, 2012). 

For each chick surveyed, an estimation of growth rate,   , was 

computed as the scaled difference in mass over 20 days of chick-rearing:  

                 ⁄ , where      is the mass of the i
th

 chick at age 

a days.  Because chick growth is, in reality, sigmoid (roughly S-shaped, if 

plotting weight against time), it is a simplification to summarize the rate of 

growth by a single number. However, this practice is common (Harfenist, 

1995; Bertram et al., 2001; Hedd et al., 2002; Bertram et al., 2005) and 

justified by observations that the 5-25 day age range falls within a roughly 

linear (straight line) period of the sigmoid growth curve (Harfenist, 1995). 

In the Cassin’s Auklet literature, growth rates of individual chicks are 

commonly interpreted as proxies for the reproductive success of a colony as 

a whole. This is because chick-growth rate is a good predictor of hatching 

and fledging success, quality of food provisioning, and first-year survival 

(Harfenist, 1995; Bertram et al., 2001; Hedd et al., 2002; Bertram et al., 

2005). In short, chick-growth rates tell us how healthy a population is, in 

terms of its capacity to grow and to rebound after disturbances (productivity 

and persistence). 

Anecdotal reports and qualitative examinations by seabird biologists 

suggest that Triangle Island exhibits much more variability in reproductive 

performance than Frederick Island (Bertram et al., 2001; Bertram et al., 

2009). Our goal was to back this claim with quantitative evidence from the 

1990s, and to find a statistical model illustrating the overall seven-year 

trend on both islands. 



Koch 
 

67 
 

In technical terms, we required a model explaining the colony-wide 

mean growth rate,     ̅̅ ̅̅ , where c indicates colony (F=Frederick or 

T=Triangle) and t indicates year. To compare variability in reproductive 

performance we also required a measure of the variability of     ̅̅ ̅̅  with 

respect to years. To introduce the trends on both islands, the     ̅̅ ̅̅  alues 

connected by a year-to-year linear interpolation line are plotted in Figure 1.  

 

Figure 1. Linear interpolation of the annual mean growth rates on Frederick 

Island (solid line connecting     ̅̅ ̅̅ ̅ values) and Triangle Island (dashed line c 
connecting     ̅̅ ̅̅ ̅ values), overlaid on a scatterplot of the raw growth-rate 

data,   , against time (one point per nestling). To distinguish data between 
the two islands, all points from Triangle Island have been shifted ahead by 

0.1 years on this plot. Sample sizes within each year are listed above, nT and 

nF for Triangle and Frederick islands, respectively. *Note the absence of 

data for Frederick Island in 1999. 

 

Figure 1 reveals a large amount of spread in the growth rate data and 

hints at some year-to-year differences between islands. Some further 

exploration of the data revealed that hatch date is responsible for some of 

the variance in the growth rates. The biological explanation is that the 

nourishment of nestlings hinges on seasonal prey items only available 

during a small part of the year. Chicks reared too late in the season are at a 



The Arbutus Review Vol. 4, No. 1 (2013) 

 

68 
 

disadvantage when they miss this narrow window of high food quality 

(Bertram et al., 2001; Bertram et al., 2009).  

Figure 2 illustrates the correlation between growth rate    and hatch 

date    by island. 

Figure 2. Growth rate data plotted against Julian hatch date, by island (one 

point per nestling, all years combined). In both Frederick and Triangle 

Islands the correlation is significant (  
             ; and   

  
           ). The dotted line indicates a straight line fit to the data, 
i.e., a simple linear regression. 

 

This relationship between    and    suggests that, by including hatch date 

in our analysis as a covariate, we may remove some of the noise from our 

dependent variable (chick-growth rate). 

 

Model Development: Parametric Regression 

The observation that the Triangle Island colony performed worse than 

Frederick Island during the 1990s as a whole, and in terms of year-by-year 

comparisons, has been reported already in previous research (Bertram et al., 

2001; Bertram et al., 2005; Bertram et al., 2009). Statistical methods in 

these studies typically involved making point estimates (e.g., of average 

chick-growth rate, or average adult survival) for individual years, then 

comparing these numbers. However, our goals of testing for a variability 



Koch 
 

69 
 

difference between islands, and modeling reproductive performance 

patterns over the long term, forced us to take a different tack.  

The issue is how to model the relationship between chick-growth rate, 

    ̅̅ ̅̅ ,  and time. This relationship does not appear linear (straight line) given 

the dips that we observed in        in the years 1996 and 1998 (Figure 1). 

We therefore required a curvilinear relationship in time, t, which is referred 

to as the function     . 

In the language of statistics, curve fitting, or finding the best form for 

    , is called regression. Regression comes in two flavours, parametric 

and nonparametric. Parametric methods are simpler and yield results that 

are more directly interpreted, but come at the cost of assuming far more, at 

the outset, about the shape of the function (Faraway, 2006). For example, 

polynomial regression (a parametric method) is a very popular choice for 

curve fitting in ecology (Montgomery, Peck & Vining, 2012). However, 

one must specify the degree of the polynomial at the outset, which, among 

other things, restricts the number of upward and downward swings in the 

curve, and the variety of shapes that it may adopt.  

In our case, we believed it unwise to use parametric methods, because 

in selecting one for     , we may exclude many other plausible forms, 

knowing as little as we do about the complex underlying year-dependent 

processes that generated our growth rate values. These processes might 

include shifts in marine production of zooplankton and yearly differences in 

the age-structure of the colony, the number of experienced nesters, etc.  

 

Model Development: Nonparametric Regression 

The philosophy of nonparametric regression is that one gains flexibility and 

convenience in fitting the curve by sacrificing some knowledge of what, in 

terms of an explicit equation, is the relationship between   and t (Wood,  



The Arbutus Review Vol. 4, No. 1 (2013) 

 

70 
 

2006). While one may still write out a formula for     ,  it would be 

convoluted (even to the statistician), and uninformative to the analysis. For 

our purposes, the main advantage of nonparametric regression is that we let 

the data speak for itself, rather than enforcing a preconceived shape for the 

long-term reproductive performance patterns. 

In nonparametric regression the method used to calculate      is 

referred to as a “smoother” and the fitted curve is called a “smooth.” A 

variety of smoothers are available (see Rupert, Wand & Carroll, 2003, pp. 

57-88, or Faraway 2006, pp. 211-288, for an overview). We used the cubic 

regression spline (CRS), originally developed by Wahba (1990), and 

recommended as a good all-purpose smoother by Faraway (2006) and 

Wood (2006).  

The flexibility of smoothers makes them especially useful in situations 

where the relationship being modeled is complex or nonlinear (since, 

compared with parametric methods, they impose far fewer constraints on 

the final shape of the fitted curve). They are also particularly well suited to 

data exploration in larger datasets (Faraway, 2006). Where traditional 

methods (e.g., a series of boxplots, or a linear interpolation plot of average 

values, as in Figure 1) become crowded and unclear, smoothers can deliver 

a clearer and more aesthetically pleasing result.  

Smoothing methods have gained widespread popularity in recent years, 

with applications as varied as: forecasting in economics (Yatchew, 1998); 

age-depth dating in archaeology and paleoecology (Telford, Heegaard & 

Birks, 2004); and understanding links between suicide rates and the weather 

(Likhvar, Honda & Ono, 2011). Examples in the same vein as our present 

work include Fleming et al. (2010), who used smooths to model the role of 

environmental factors on migratory songbird populations, and Siriwardena 

et al. (1998), who smoothed time series data on farmland birds to identify 

and compare long-term population trends. The present study is, to our 



Koch 
 

71 
 

knowledge, the first use of smoothing methods on Cassin’s Auklet 

population data. 

Sections 1.5 and 1.6 introduce some mathematical details of CRS 

models. Readers from non-mathematics/statistics backgrounds may wish to 

skip to the non-technical summary in section 1.7. 

 

Model Development: Properties of the CRS Smoother 

In spline-based methods (including the CRS), the observed range of the 

covariate (e.g., time, in our case) is partitioned into intervals whose 

endpoints are referred to as knots. Within each interval, a curve is fitted 

under the constraint that all of the pieces align end to end. In the case of the 

CRS, the sections of the curve are third-degree polynomials, there is a knot 

at each unique covariate value, and the complete spline is continuous up to 

second derivatives. If no further constraints are imposed, such a spline 

typically overfits the data. In the extreme case, this would result in 

obtaining a curve which passes through every data point; this is undesirable 

because the actual pattern in time is obscured by errors in our measurements 

which we seek to remove, not track. CRS smoothers avoid this problem by 

imposing a roughness penalty in the objective function. If the dataset 

contains n paired data points, (     ) for   1, 2, 3, …n, then the objective 

function is: 

∑ (        )
  

           (1) 

The first term is the sum of squared errors (quantifying variance), and 

the second is a roughness penalty, J, scaled by multiplication with , the 

smoothing parameter. The function (1) is minimised to obtain the fitted 

smooth     . 

The minimiser of (1) keeps the prediction error low while maintaining a 

level of smoothness (the bias-variance trade-off), the smoothness being 



The Arbutus Review Vol. 4, No. 1 (2013) 

 

72 
 

tuned via a single number, . Figures 3a-f illustrate the role of  with CRS 

smooths fitted to the Triangle Island data for a range of  values. 

 

Figure 3. CRS smoother fitted to Triangle Island growth rates against year. 

A range of values are demonstrated for the smoothing parameter , as well 

as the resulting J values: (a) =0, J=151.7; (b) =0.1, J=107.3; (c)  =0.5, 

J=41.3; (d)  =1, J=19.8; (e) =5, J=2.5; (f) =100, J=0.1. 
 

The problem of selecting the smoothing parameter  has been 

investigated at length (Wahba 1990, pp. 45-65; Wood, 2011). If  is 

selected too high, we are underfitting (obtaining too simple a curve, as in 

Figure 3f), and if it is too low, we are overfitting. We used restricted 

maximum likelihood (REML) to automatically select . Automated 

methods are preferable in keeping with our desire to let the data speak for 

itself, and REML is less prone to underfitting than the more popular 

generalized cross validation (Wood, 2011). 

 

Roughness Penalty as a Measure of Variability 

J measures roughness in the fitted spline (i.e., it quantifies the number and 

magnitude of fluctuations in the curve, or as Wood (2006) calls it, the 

“wiggliness”). Mathematically, the usual definition is  

        ∫          
 

 
  , where       is the smallest interval 

containing all of the observed covariate values (Wood, 2006). A wonderful 

result emerges with J so defined: Of all functions with absolutely  



Koch 
 

73 
 

continuous first derivatives on      , the natural cubic spline (a CRS with 

some additional boundary constraints) is the unique minimiser of the 

objective (1) (Green and Silverman, 1994, pp. 13-19). This means that if we 

accept J as a valid measure of wiggliness, and (1) a valid measure of model 

fit, a CRS curve is, mathematically speaking, the ideal fit for the data (out 

of any smooth curve imaginable). 

Given the above definition, J indeed reflects the variability in a time 

series fitted with a smooth. Unfortunately, because the concept of 

population variability has no single well-established mathematical 

definition in ecology (Heath, 2006), it is difficult to argue one way or 

another about the robustness of an estimator. However, given the following 

example, it appears that J can be misleading if used as a measure of 

variability for the purpose of evaluating population viability (i.e., risk of 

extinction). 

 Consider a time series generated from the sinusoidal function      

          over the time interval       . This function has      

     . Let us compare   values for two sine waves (say    and   )  

having different amplitude and frequency (say amplitude       and 

frequency      ). With        , we find that       

whenever      
  

. This implies that if the frequency of    is twice 

that of   , then its amplitude must only exceed ¼ that of    to achieve a 

higher   value. To frame this in a population ecology context, suppose    

and    are abundance indices for two populations. Now, suppose that the 

first population oscillates between 1% and 199% of its average size, once 

per year.  Finally, we suppose the second population oscillates less 

dramatically, between 75% and 125%, but does so twice each year. Then in 

fact the population (  ) would have a higher J value (indicating higher 

variability). The oscillations in population 1 introduce far more of a 



The Arbutus Review Vol. 4, No. 1 (2013) 

 

74 
 

viability risk than those of population 2 (because of the periodic slumps to 

1%), yet a comparison of the J values would lead the unwary ecologist to 

the opposite conclusion. 

 

Summary 

In order to model the complex relationship between our covariates (time 

and hatch date) and response variable (chick-growth rate), we used a 

nonparametric method known as the CRS smoother to obtain fitted curves 

known as smooths. We reported our final CRS-based model using smooths 

to illustrate reproductive performance trends in time on Frederick and 

Triangle Islands, as well as their dependence on hatch date. A component of 

the CRS, the roughness penalty, J, measures wiggliness in the fitted curve. 

We saw that, by definition, J is a variability measure, yet the mathematical 

investigation in Section 1.6 suggests it would be inappropriate for use in 

comparing viability and persistence of populations. We supported this claim 

with a simulation study, in which J values led us to an incorrect conclusion 

about population viability; whereas the well-established population 

variability statistics, CV and PV (Heath, 2006; Fraterrigo & Rusak, 2008), 

yielded the correct result. Finally, we compared variability in reproductive 

performance on Frederick and Triangle Islands using CV and PV. 

 

Methods 

Additive Models 

The additive model (AM) is a convenient statistical framework for 

employing CRS smoothers. For the Auklets dataset, we used a special case 

of the varying-coefficients AM (Hastie & Tibshirani, 1993), which permits 

smooth terms conditioned on a categorical predictor (allowing us to fit 

separate curves for each island). We used smoothing splines (a special type 

of CRS) and REML for selection of . We explored models that included 

hatch date and year smooths, both island-specific and common to both 



Koch 
 

75 
 

islands. To test for nonlinearity, we considered models where smooths are 

replaced by linear terms. We used approximate F-tests to rule out certain 

models. Then, we ranked the remaining ones using Akaike’s Information 

Criterion (AIC) to identify a preferred model (Wood, 2006). For all the AM 

analysis, we used the R package “mgcv” (Wood, 2011).  

We also considered J as a metric for variability in the chick-growth 

rates over time. We computed the roughness penalty for a simulated dataset 

to demonstrate that Jcan be misleading in a population ecology context. 

Finally, we compared the values of coefficient of variation (CV) and 

population variability (PV) of the within-year means of each island to 

identify quantitative differences in the variability of reproductive 

performance. 

 

Roughness Parameter Simulation Study 

To investigate the use of   as a measure of variability we generated data 

from                ⁄
 
  at each of 13 values of t equally spaced 

over the time interval         with two sets of parameter values, with the 

first having (a)    ,     and the second having (b)     ⁄ , 

   . A CRS was then fitted to each dataset, using REML for smoothing 

parameter selection and the   values computed for the fitted splines. This 

process was repeated 1000 times to produce histograms of the   values and 

to approximate their distributions. In this example, the sinusoidal process 

generating data for (a) had amplitude six times that of (b), yet (b) had a 

higher   value because it oscillated three times as often. 

 

 

 

 



The Arbutus Review Vol. 4, No. 1 (2013) 

 

76 
 

Results 

Model Selection 

The most complex model considered here (model i in Table 1) expresses the 

linear predictor for mean growth rate as the sum of two CRS smooths, one 

for year,     , and one for hatch date,     , conditioned on island. 

Defining an island indicator variable       if observation i comes from 

Triangle Island, and zero otherwise, model i is: 

                                                (3) 

where         
   and             

Referring back to Figure 1, while the Triangle Island growth rate trend 

seems nonlinear, a flat straight line is plausible for Frederick. We tested for 

nonlinearity in the Frederick Island year trend by fitting model ii (Table 2), 

which has a linear term replacing the CRS. The approximate F-test rejected 

this latter model in favour of model i (      ,            ), 

indicating the Frederick Island data is generated from a nonlinear year 

effect. Similarly, we tested the hypothesis that the Triangle Island year term 

effect was linear with model iii, which the F-test rejected in favour of 

nonlinearity (      ,            ). Finally, we considered the 

hypothesis that a single year smooth representing a common trend on both 

islands is sufficient, using model iv.  The much higher AIC score indicates 

that a single smooth is inadequate to model the trends on Triangle and 

Frederick Islands (note that because model iv is not nested inside model i, 

an F-test is not appropriate here) so we retained the island-specific year 

smooths.  

Similarly, we evaluated nonlinearity in the hatch date effects by 

replacing the appropriate smooth by linear terms (models v and vi). In both 

cases the smooth terms performed better, as indicated by AIC and 

approximate F-test scores (     ,            and      , 



Koch 
 

77 
 

           respectively).Finally, we considered whether one 

nonlinear hatch date effect was sufficient to explain the variability on both 

islands using model vii. Model i performed slightly better, although the 

difference in AIC was small. The hatch date term was significant in model 

vii (         ), confirming its usefulness in the AM. 

Model viii eliminates island effects, having year and hatch date effects 

fitted as smooths across the pooled data. This model performed the worst in 

AIC, and captured very little structure in the dataset (R
2
=7.8%). 

 

Table 1. Model selection results for the Cassin’s Auklet data.  Effective 

degrees of freedom (EDF), AIC, and R
2
 values for all candidate additive 

models are provided. Effect terms differing from those in model i are 

bolded for clarity. Models are listed in ascending order by AIC. 

# Model for       EDF AIC  R
2 
(%)

 

i                       

                              

25.02 2693.7 37.1 

vii              

                          

18.66 2698.6 36.4 

vi                   

                              

19.5 2721.2 35.2 

v                       

                         

18.2 2732.9 34.4 

ii                       22.3 2749.2 33.7 



The Arbutus Review Vol. 4, No. 1 (2013) 

 

78 
 

                         

iv                    

                    

19.9 2859.4 26.7 

iii                   

                              

17.6 2879.7 25.3 

viii           10.0 3107.9 7.8 

 

The Full Model 

Based on AIC scores and approximate F-tests, model i, the sum of island-

dependent smooths for both year and hatch date, was preferred. Figure 4 

shows the fitted smooths from model i (  ,   ,   , and   ) individually. 

Although our model yields a 3D response surface (briefly, the response 

surface is a pictorial representation of growth rate as a function of time and 

hatch date), the trend over years can be appreciated by looking at predicted 

values in a 2D slice obtained by fixing hatch date to some constant value 

(Figure 5). Because of the additive structure of the model, a change in hatch 

date results only in vertical translations of the two response curves plotted 

in Figure 5 relative to each other. 



Koch 
 

79 
 

Figure 4. Fitted smooths from model i along with predicted values and 

approximate 95% confidence bands. Year smooths (a)       and (b)       
are fitted with a cubic regression spline (CRS) of 6 and 7 knots, 

respectively. Hatch date smooths (c)      , and (d)       are fitted with 
a CRS of 48 and 59 knots, respectively. 

 

Figure 5. Predicted values of Cassin’s Auklet growth rate from model i for 

all years sampled, at the median hatch date 139 (April 29th). Dotted lines 

indicate approximate 95% pointwise confidence intervals. 

 

 

 



The Arbutus Review Vol. 4, No. 1 (2013) 

 

80 
 

Wiggliness Simulation Results 

In simulations, the CRS fitted to series (b) had a higher Jvalue 46% of the 

time, indicating higher variability in (b) despite its less wiggly appearance. 

In contrast, CV and PV computed for this simulated data were higher in 

series (a) 100% of the time, indicating strongly that (a) is more variable.  

 

Figure 6. Scatterplots showing one example of the simulated data and fitted 

splines (solid lines) overlaid on the original sine curves (dotted lines) for 

parameter sets (a)    ,     and (b)     ⁄ ,    . Histograms 
of the   values for (a) and (b) are plotted below. 
 

Discussion 

Model selection not only served to identify the best model, but also to test 

hypotheses about trends in the data. The approximate F-tests revealed that 

the year-wise trends on Triangle and Frederick Islands were not only 

different, but both nonlinear (i.e., straight lines are inadequate to represent 

these effects). Likewise, we found good evidence that the hatch date effect 

was nonlinear and island-specific.  



Koch 
 

81 
 

The hatch date smooths in Figures 4c-d are interesting to the seabird 

biologist as we see a marked difference between islands, with Triangle 

growth rates exhibiting more of a negative trend as the nesting season 

progresses, indicating that within-year timing is more of an issue in the 

Triangle population than for birds from Frederick. Note that in Figure 4c, 

the confidence bands widen dramatically towards the extremes of the hatch 

date range because of a scarcity of observations (fewer than 1% lie outside 

of [125, 166]). This smooth should therefore be interpreted cautiously near 

its endpoints.  

The year effects plots are most interesting as they illustrate the multi-

year variability in growth rate means. Frederick is fairly flat throughout the 

study period; whereas, Triangle exhibits fluctuations which are larger and 

more frequent, with dips in the years 1996 and 1998. 

The simulations with the roughness penalty were informative as we had 

originally hoped to develop a test for a difference in variability between    

and     by estimating the distribution of J (either by simulating from the 

posterior distribution of J given our data, or by bootstrapping). Such a test 

might still have uses, but with the aim of comparing population time series, 

the example in Section 1.6 suggests that it could be misleading in an 

ecological context. In the realm of population ecology, short-term low 

amplitude changes are of less importance to population viability than long-

term high amplitude changes. The latter represent events that could threaten 

the persistence of the population, such as depressing it to levels at which 

recovery is hindered by small population size  (e.g., small population levels 

can make it more difficult for a colony to fend off predators, or to cooperate 

in finding good foraging patches), or bolstering it to levels at which density-

dependent effects could lead to a population crash (e.g., resulting from 

competition for food, or from parasite outbreaks; Boyce, 1992). In this light, 

if the goal is to compare the viability of populations through variability 

measurements, a good measure should not be overly sensitive to high-



The Arbutus Review Vol. 4, No. 1 (2013) 

 

82 
 

frequency, low-amplitude oscillations, but rather pick up more on the higher 

amplitude swings. 

Qualitatively, the higher variability in the Triangle Island time series is 

obvious from Figures 4 and 5. We conclude with quantitative evidence to 

reinforce this observation. We computed CV and PV on the yearly averages 

on each island. As expected, Triangle Island had higher values of both 

statistics than Frederick Island (CVT =15.5 and PVT=0.17 versus CVF =5.5 

and PVF=0.06). 

 

Conclusion 

Nonparametric regression has considerable value for conveying trends in an 

ecological time series. In terms of clarity and aesthetics, these models 

provide a large improvement over graphs of linear interpolation of the 

averages or yearly boxplots. They also provide a degree of flexibility, which 

is necessary when modeling nonlinear trends. For these reasons we expect 

to see more of the CRS and other smoothing methods in the future, not only 

in ecology, but in any field where exploratory data analysis is needed (e.g., 

economics, climate research, geography, and health sciences). While we 

failed to find a variability measure in the AM with CRS smooths, these 

models remained very useful for data exploration and illustration. 

To researchers using the CRS smoother, we advise caution in the 

interpretation of the roughness penalty   as a statistic representing temporal 

variability. Despite its role as a wiggliness measure, J can be misleading in 

the context of population ecology, where variability is often a sign of 

extinction risk. We recommend established variability measures, such as PV 

or CV, as a simple and convenient option for investigators seeking a 

quantitative reading of variability. Future work will return to the CRS 

model to examine the possibility of formulating a variability statistic as a 

function of the fitted parameters,  , since confidence intervals for such a 

statistic can be estimated through simulation from the posterior distribution 



Koch 
 

83 
 

of    . One could for example devise a hypothesis test for a difference in 

CV( ̂) or PV( ̂) using this approach. This would offer an improvement 

over simply calculating CV or PV for the yearly means (as we did), since 

information about the temporal ordering of observations would be included 

as part of the process of fitting  ̂.  New statistics such as these could prove 

useful in any field in which the variability in a historical time series is of 

interest (such as research on climate change seeking to tease apart natural 

variability from that introduced by human activity).  

Our investigation showed that Cassin’s Auklets on Triangle Island deal 

with more year-to-year fluctuations in reproductive performance than those 

on Frederick Island. This identifies Triangle as a more at-risk population, 

which could be useful in advising future conservation efforts. Furthermore, 

the difference in population variability is suggestive of a more unstable 

pattern of zooplankton productivity in the local marine ecosystem near 

Triangle Island. A detailed discussion of this last point is the subject of 

forthcoming work (Bertram et al., in prep.). 

 

References 

Bertram, D. F., Jones, I. L., Cooch, E. G., Knechtel, H. A., & Cooke, F. 

(2000) Survival rates of Cassin's and Rhinoceros Auklets at Triangle 

Island, British Columbia. The Condor, 102, 155-162. 

Bertram, D. F., Mackas, D. L., & McKinnell, & S. M. (2001). The seasonal 

cycle revisited: interannual variation and ecosystem consequences. 

Progress in Oceanography, 49, 283-307. 

Bertram, D. F., Harfenist, A., & Smith, B. D. (2005). Ocean climate and El 

Niño impacts on survival of Cassin’s Auklets from upwelling and 

downwelling domains of British Columbia. Canadian Journal of 

Fisheries and Aquatic Sciences, 62, 2841-2853. doi: 10.1139/F05-190. 

Bertram, D. F., Harfenist, A., & Hedd, A. (2009). Seabird nestling diets 

reflect latitudinal temperature-dependent variation in availability of key 

zooplankton prey populations. Marine Ecology Progress Series, 393, 

199-200. doi: 10.3354/meps08223. 



The Arbutus Review Vol. 4, No. 1 (2013) 

 

84 
 

Bertram, D. F., Harfenist, A., Cowen, L. L., Koch, D., Lemon, M. J. F., 

Hipfner, J. M., & Drever, M. (in prep.). Cassin’s Auklet nestling 

reproductive failure is related to latitudinal temperature dependent 

mismatches with copepod prey populations. 

Boyce, M. S. (1992). Population Viability Analysis. Annual Review of 

Ecology and Systematics, 23, 481-506. 

Burger, A.E., & Powell, D.W. (1989). Diving depths and diet of Cassin’s 

Auklet at Reef Island, British Columbia. Canadian Journal of Zoology, 

68, 1572–1577. 

Faraway, J. J. (2006). Extending the Linear Model with R. Boca Raton, FL: 

Chapman and Hall/CRC Press.  

Fleming, J. M., Cantoni, E., Field, C., McLaren, I. (2010). Extracting long-

term patterns of population changes from sporadic counts of migrant 

birds. Environmetrics, 21, 482-492. 

Fraterrigo, J. M., & Rusak, J. A. (2008). Disturbance-driven changes in the 

variability of ecological patterns and processes. Ecology Letters, 11, 

756-770. doi: j.1461-0248.2008.01191.x 

Green, P. J., & Silverman, B. W. (1994). Nonparametric Regression and 

Generalized Linear Models. Suffolk, England: Chapman & Hall. 

Harfenist, A. (1995).  Effects of Growth-Rate Variation on Fledging of 

Rhinoceros Auklets (Cerorhinca monocerata). The Auk, 112, 60-66. 

Hastie, T., & Tibshirani, R. (1993). Varying-coefficient models. Journal of 

the Royal Statistical Society: Series B, 55, 757-796. 

Hedd, A., Ryder, J. L., Cowen L. L., & Bertram, D. F. (2002). Inter-annual 

variation in the diet, provisioning and growth of Cassin’s auklet at 

Triangle Island, British Columbia: responses to variation in ocean 

climate. Marine Ecology Progress Series, 229, 221-232. 

Heath, J. (2006). Quantifying temporal variability in population 

abundances. Oikos, 115, 573-581. doi: 10.1111/j.2006.0030-

1299.15067.x 

Inchausti, P., & Halley, J. (2003). On the relation between temporal 

variability and persistence time in animal populations. Journal of 

Animal Ecology, 72, 899-908. doi: j.1365-2656.2003.00767.x 

Likhvar, V., Honda, Y., & Ono, M. (2011). Relation between temperature 

and suicide mortality in Japan in the presence of other confounding 

factors using time-series analysis with a semiparametric approach. 

Environmental Health and Preventative Medicine, 16, 36-43. 

Montgomery, D.C., Peck, E.A., & Vining, G.G. (2012). Introduction to 

Linear Regression Analysis. Hoboken, NJ: John Wiley & Sons. 



Koch 
 

85 
 

Pyle, P. (2001). Age at first breeding and natal dispersal in a declining 

population of Cassin's Auklet. The Auk, 118, 996-1007.  

Rupert, D., Wand, M., & Carroll, R. (2003). Semiparametric Regression. 

New York, NY: Cambridge University Press. 

Siriwardena, G. M., Baillie, S. R., Buckland, S. T., Fewster, R. M., 

Marchant, J. H., & Wilson, J. D. (1998). Trends in the abundance of 

farmland birds; a quantitative comparison of smoothed Common Birds 

Census indices. Journal of Applied Ecology, 35, 24-43. 

Telford, R. J., Heegaard, E., & Birks, H. J. B. (2004). All age–depth models 

are wrong: but how badly? Quaternary Science Reviews, 23, 1-5. 

Vucetich, J. A., White, T. A., Qvarnemark, L., & Ibarguen, S. (2000). 

Population variability and extinction risk. Conservation Ecology, 14, 

1704-1714. doi: 10.1111/j.1523-1739.2000.99359.x 

Wahba, G. (1990). Spline Models for Observational Data (1
st
 ed.). 

Philadelphia, PA: SIAM.   

Wood S. N. (2006). Generalized Additive Models: An Introduction with R. 

Boca Raton, FL: Chapman and Hall/CRC Press.  

Wood, S. N. (2011). Fast stable restricted maximum likelihood and 

marginal likelihood estimation of semiparametric generalized linear 

models. Journal of the Royal Statistical Society: Series B, 73, 3-36. doi: 

j.1467-9868.2010.00749.x 

Yatchew, A. (1998). Nonparametric Regression Techniques in Economics. 

Journal of Economic Literature, 36, 669-721. 

 

Contact Information 

Dean Koch, from the Department of Mathematics and Statistics, can be 

reached at dk@uvic.ca. 

 

Acknowledgements 

An Undergraduate Student Research Award from the Natural Sciences and 

Engineering Research Council of Canada, as well as the Jamie Cassels 

Undergraduate Research Award, supported this research. I would like to 

thank Professor Laura Cowen for her invaluable guidance and 

encouragement in this project, as well as Dr.’s Doug Bertram and Anne 

Harfenist for providing the data and supporting the research.