Koch 63 A comparison of variability in two Cassin’s Auklet (Ptychoramphus aleuticus) colonies using spline-based nonparametric models Dean Koch Abstract: While variability in the reproductive performance of a population over time is a familiar and useful concept to ecologists, it can be difficult to capture mathematically. Commonly used ecological variability statistics, such as the standard deviation of the logarithm and coefficient of variation, discard the time- ordering of observations and consider only the unordered response variable values. We used a relatively new methodology, the cubic regression spline (a flexible curve fitted to a scatterplot of data), both to illustrate trends in reproductive performance over time and to explore the utility of the cubic regression spline roughness penalty (J) as a statistic for measuring variability while retaining time-ordering information. We concluded that although J measures variability in a mathematical sense, it can be inappropriate in a population ecology context because of sensitivity to small-scale fluctuations. To illustrate our methodology, we used the CRS approach in an analysis of historical data from two Cassin’s Auklet colonies located on Frederick and Triangle Islands in coastal BC, developing a model for the annual mean nestling growth rate on each island over seven contiguous years. Model selection indicated a complex (nonlinear) trend in growth rate on both islands. We report higher variability in the resident bird population of Triangle Island than Frederick Island, based on a comparison of the fitted curves, and the values of the coefficient of variation and population variability summary statistics. Key Terms: Cassin’s Auklets; Alcid ecology; reproductive performance; chick-growth rate; ecological variability; standard deviation of the logarithm; coefficient of variation; population variability; historical time series; population time series; population dynamics; semiparametric modeling; nonparametric modeling; splines; cubic regression splines; additive models; roughness penalty The Arbutus Review Vol. 4, No. 1 (2013) 64 Introduction Background Our interest in ecological variability statistics began while investigating data on two breeding populations of Cassin’s Auklet (Ptychoramphus aleuticus). This small, rarely seen seabird of the auk family spends most of its life out on the open sea, coming ashore only in the spring to breed in colonies (Bertram, Harfenist & Smith, 2005). Like other auks, the Cassin’s Auklet is an impressive diver, using its wings like flippers to propel itself through the water in search of food, regularly descending to depths of forty metres (Burger & Powell, 1989). During the breeding season, the Cassin’s Auklet spends its entire day foraging, returning onshore only at night to regurgitate food for its young in underground burrows (Bertram et al., 2005). As plankton is the main food source for the Cassin’s Auklet, changes in production at the lowest levels of the marine food chain can impact the bird in a very immediate way. Bertram, Mackas & McKinnell (2001) argued that because of this close linkage, large-scale shifts in oceanographic conditions (e.g., El Niño / La Niña cycles, and long-term climate change) are reflected in the reproductive successes and failures of auklets. In this way, the Cassin’s Auklet can be thought of as a bellwether for the marine ecosystem (Bertram et al., 2005). By studying its population dynamics, we gain insight not only into auklet ecology, but also into the state of the ocean. A large body of research on Cassin’s Auklets provides the groundwork for the present study, including examinations of adult survival and diet, colony density, and year-by-year measurements of reproductive performance (e.g. Bertram et al., 2000; Bertram et al., 2001; Pyle, 2001; Hedd et al., 2002; Bertram et al., 2005; Bertram, Harfenist & Hedd, 2009). In some cases, variability in reproductive performance is discussed qualitatively (by simultaneously comparing multiple point estimates, one Koch 65 from each year); however, we are aware of no study that has attempted to quantify it in a statistically rigorous way. The variability in a population time series is a concept of great interest to ecologists. It is widely accepted, for example, that higher temporal variability in abundance (more increases and decreases in the population size) translates into higher extinction risk (Vucetich et al., 2000; Inchausti & Halley, 2003). Not surprisingly, there are a number of different tools available for measuring variability in a population time series (see Fraterrigo & Rusak, 2008). However, the most common of these, including the standard deviation of the logarithm (SDL), coefficient of variation (CV), and Heath’s population variance (PV), are calculations on the response variable alone, and do not take into account the sequential ordering of observations available in a time series. We wished to improve on these statistics by finding a model that captures the trends in time, while also yielding a statistic for variability. We present a case study in which Cassin’s Auklet reproductive performance data is modeled using a statistical tool that is relatively new to population ecology, the cubic regression spline (CRS). Roughly speaking, a spline is a type of curve, and a CRS is a curve fitted to a plot of points to reveal patterns in a cloud of noisy data. We argue for the usefulness of the CRS approach and discuss the properties of J, a component of the CRS model, in the context of measuring variability. Cassin’s Auklet Data Every spring between 1994 and 2000, Cassin’s Auklet breeding colonies located on Frederick Island (off the west coast of Haida Gwaii, in northern BC) and Triangle Island (off the Northern tip of Vancouver Island, southern BC) were visited. During the spring nesting period of each year, hatch date and nestling mass at 5 and 25 days of age were recorded for a sample of nestlings. Sample sizes ranged from 28 to 143 chicks per year per island. The Arbutus Review Vol. 4, No. 1 (2013) 66 No sampling was done on Frederick Island in 1999. Both islands host large breeding colonies of seabirds (Triangle Island boasts the largest known colony of Cassin’s Auklet in the world), and have been the subject of ecological research for decades (Bertram et al., 2009). For this study, we use a consolidation of historical datasets (D. Bertram & A. Harfenist, personal communication, May 1 st , 2012). For each chick surveyed, an estimation of growth rate, , was computed as the scaled difference in mass over 20 days of chick-rearing: ⁄ , where is the mass of the i th chick at age a days. Because chick growth is, in reality, sigmoid (roughly S-shaped, if plotting weight against time), it is a simplification to summarize the rate of growth by a single number. However, this practice is common (Harfenist, 1995; Bertram et al., 2001; Hedd et al., 2002; Bertram et al., 2005) and justified by observations that the 5-25 day age range falls within a roughly linear (straight line) period of the sigmoid growth curve (Harfenist, 1995). In the Cassin’s Auklet literature, growth rates of individual chicks are commonly interpreted as proxies for the reproductive success of a colony as a whole. This is because chick-growth rate is a good predictor of hatching and fledging success, quality of food provisioning, and first-year survival (Harfenist, 1995; Bertram et al., 2001; Hedd et al., 2002; Bertram et al., 2005). In short, chick-growth rates tell us how healthy a population is, in terms of its capacity to grow and to rebound after disturbances (productivity and persistence). Anecdotal reports and qualitative examinations by seabird biologists suggest that Triangle Island exhibits much more variability in reproductive performance than Frederick Island (Bertram et al., 2001; Bertram et al., 2009). Our goal was to back this claim with quantitative evidence from the 1990s, and to find a statistical model illustrating the overall seven-year trend on both islands. Koch 67 In technical terms, we required a model explaining the colony-wide mean growth rate, ̅̅ ̅̅ , where c indicates colony (F=Frederick or T=Triangle) and t indicates year. To compare variability in reproductive performance we also required a measure of the variability of ̅̅ ̅̅ with respect to years. To introduce the trends on both islands, the ̅̅ ̅̅ alues connected by a year-to-year linear interpolation line are plotted in Figure 1. Figure 1. Linear interpolation of the annual mean growth rates on Frederick Island (solid line connecting ̅̅ ̅̅ ̅ values) and Triangle Island (dashed line c connecting ̅̅ ̅̅ ̅ values), overlaid on a scatterplot of the raw growth-rate data, , against time (one point per nestling). To distinguish data between the two islands, all points from Triangle Island have been shifted ahead by 0.1 years on this plot. Sample sizes within each year are listed above, nT and nF for Triangle and Frederick islands, respectively. *Note the absence of data for Frederick Island in 1999. Figure 1 reveals a large amount of spread in the growth rate data and hints at some year-to-year differences between islands. Some further exploration of the data revealed that hatch date is responsible for some of the variance in the growth rates. The biological explanation is that the nourishment of nestlings hinges on seasonal prey items only available during a small part of the year. Chicks reared too late in the season are at a The Arbutus Review Vol. 4, No. 1 (2013) 68 disadvantage when they miss this narrow window of high food quality (Bertram et al., 2001; Bertram et al., 2009). Figure 2 illustrates the correlation between growth rate and hatch date by island. Figure 2. Growth rate data plotted against Julian hatch date, by island (one point per nestling, all years combined). In both Frederick and Triangle Islands the correlation is significant ( ; and ). The dotted line indicates a straight line fit to the data, i.e., a simple linear regression. This relationship between and suggests that, by including hatch date in our analysis as a covariate, we may remove some of the noise from our dependent variable (chick-growth rate). Model Development: Parametric Regression The observation that the Triangle Island colony performed worse than Frederick Island during the 1990s as a whole, and in terms of year-by-year comparisons, has been reported already in previous research (Bertram et al., 2001; Bertram et al., 2005; Bertram et al., 2009). Statistical methods in these studies typically involved making point estimates (e.g., of average chick-growth rate, or average adult survival) for individual years, then comparing these numbers. However, our goals of testing for a variability Koch 69 difference between islands, and modeling reproductive performance patterns over the long term, forced us to take a different tack. The issue is how to model the relationship between chick-growth rate, ̅̅ ̅̅ , and time. This relationship does not appear linear (straight line) given the dips that we observed in in the years 1996 and 1998 (Figure 1). We therefore required a curvilinear relationship in time, t, which is referred to as the function . In the language of statistics, curve fitting, or finding the best form for , is called regression. Regression comes in two flavours, parametric and nonparametric. Parametric methods are simpler and yield results that are more directly interpreted, but come at the cost of assuming far more, at the outset, about the shape of the function (Faraway, 2006). For example, polynomial regression (a parametric method) is a very popular choice for curve fitting in ecology (Montgomery, Peck & Vining, 2012). However, one must specify the degree of the polynomial at the outset, which, among other things, restricts the number of upward and downward swings in the curve, and the variety of shapes that it may adopt. In our case, we believed it unwise to use parametric methods, because in selecting one for , we may exclude many other plausible forms, knowing as little as we do about the complex underlying year-dependent processes that generated our growth rate values. These processes might include shifts in marine production of zooplankton and yearly differences in the age-structure of the colony, the number of experienced nesters, etc. Model Development: Nonparametric Regression The philosophy of nonparametric regression is that one gains flexibility and convenience in fitting the curve by sacrificing some knowledge of what, in terms of an explicit equation, is the relationship between and t (Wood, The Arbutus Review Vol. 4, No. 1 (2013) 70 2006). While one may still write out a formula for , it would be convoluted (even to the statistician), and uninformative to the analysis. For our purposes, the main advantage of nonparametric regression is that we let the data speak for itself, rather than enforcing a preconceived shape for the long-term reproductive performance patterns. In nonparametric regression the method used to calculate is referred to as a “smoother” and the fitted curve is called a “smooth.” A variety of smoothers are available (see Rupert, Wand & Carroll, 2003, pp. 57-88, or Faraway 2006, pp. 211-288, for an overview). We used the cubic regression spline (CRS), originally developed by Wahba (1990), and recommended as a good all-purpose smoother by Faraway (2006) and Wood (2006). The flexibility of smoothers makes them especially useful in situations where the relationship being modeled is complex or nonlinear (since, compared with parametric methods, they impose far fewer constraints on the final shape of the fitted curve). They are also particularly well suited to data exploration in larger datasets (Faraway, 2006). Where traditional methods (e.g., a series of boxplots, or a linear interpolation plot of average values, as in Figure 1) become crowded and unclear, smoothers can deliver a clearer and more aesthetically pleasing result. Smoothing methods have gained widespread popularity in recent years, with applications as varied as: forecasting in economics (Yatchew, 1998); age-depth dating in archaeology and paleoecology (Telford, Heegaard & Birks, 2004); and understanding links between suicide rates and the weather (Likhvar, Honda & Ono, 2011). Examples in the same vein as our present work include Fleming et al. (2010), who used smooths to model the role of environmental factors on migratory songbird populations, and Siriwardena et al. (1998), who smoothed time series data on farmland birds to identify and compare long-term population trends. The present study is, to our Koch 71 knowledge, the first use of smoothing methods on Cassin’s Auklet population data. Sections 1.5 and 1.6 introduce some mathematical details of CRS models. Readers from non-mathematics/statistics backgrounds may wish to skip to the non-technical summary in section 1.7. Model Development: Properties of the CRS Smoother In spline-based methods (including the CRS), the observed range of the covariate (e.g., time, in our case) is partitioned into intervals whose endpoints are referred to as knots. Within each interval, a curve is fitted under the constraint that all of the pieces align end to end. In the case of the CRS, the sections of the curve are third-degree polynomials, there is a knot at each unique covariate value, and the complete spline is continuous up to second derivatives. If no further constraints are imposed, such a spline typically overfits the data. In the extreme case, this would result in obtaining a curve which passes through every data point; this is undesirable because the actual pattern in time is obscured by errors in our measurements which we seek to remove, not track. CRS smoothers avoid this problem by imposing a roughness penalty in the objective function. If the dataset contains n paired data points, ( ) for 1, 2, 3, …n, then the objective function is: ∑ ( ) (1) The first term is the sum of squared errors (quantifying variance), and the second is a roughness penalty, J, scaled by multiplication with , the smoothing parameter. The function (1) is minimised to obtain the fitted smooth . The minimiser of (1) keeps the prediction error low while maintaining a level of smoothness (the bias-variance trade-off), the smoothness being The Arbutus Review Vol. 4, No. 1 (2013) 72 tuned via a single number, . Figures 3a-f illustrate the role of with CRS smooths fitted to the Triangle Island data for a range of values. Figure 3. CRS smoother fitted to Triangle Island growth rates against year. A range of values are demonstrated for the smoothing parameter , as well as the resulting J values: (a) =0, J=151.7; (b) =0.1, J=107.3; (c) =0.5, J=41.3; (d) =1, J=19.8; (e) =5, J=2.5; (f) =100, J=0.1. The problem of selecting the smoothing parameter has been investigated at length (Wahba 1990, pp. 45-65; Wood, 2011). If is selected too high, we are underfitting (obtaining too simple a curve, as in Figure 3f), and if it is too low, we are overfitting. We used restricted maximum likelihood (REML) to automatically select . Automated methods are preferable in keeping with our desire to let the data speak for itself, and REML is less prone to underfitting than the more popular generalized cross validation (Wood, 2011). Roughness Penalty as a Measure of Variability J measures roughness in the fitted spline (i.e., it quantifies the number and magnitude of fluctuations in the curve, or as Wood (2006) calls it, the “wiggliness”). Mathematically, the usual definition is ∫ , where is the smallest interval containing all of the observed covariate values (Wood, 2006). A wonderful result emerges with J so defined: Of all functions with absolutely Koch 73 continuous first derivatives on , the natural cubic spline (a CRS with some additional boundary constraints) is the unique minimiser of the objective (1) (Green and Silverman, 1994, pp. 13-19). This means that if we accept J as a valid measure of wiggliness, and (1) a valid measure of model fit, a CRS curve is, mathematically speaking, the ideal fit for the data (out of any smooth curve imaginable). Given the above definition, J indeed reflects the variability in a time series fitted with a smooth. Unfortunately, because the concept of population variability has no single well-established mathematical definition in ecology (Heath, 2006), it is difficult to argue one way or another about the robustness of an estimator. However, given the following example, it appears that J can be misleading if used as a measure of variability for the purpose of evaluating population viability (i.e., risk of extinction). Consider a time series generated from the sinusoidal function over the time interval . This function has . Let us compare values for two sine waves (say and ) having different amplitude and frequency (say amplitude and frequency ). With , we find that whenever . This implies that if the frequency of is twice that of , then its amplitude must only exceed ¼ that of to achieve a higher value. To frame this in a population ecology context, suppose and are abundance indices for two populations. Now, suppose that the first population oscillates between 1% and 199% of its average size, once per year. Finally, we suppose the second population oscillates less dramatically, between 75% and 125%, but does so twice each year. Then in fact the population ( ) would have a higher J value (indicating higher variability). The oscillations in population 1 introduce far more of a The Arbutus Review Vol. 4, No. 1 (2013) 74 viability risk than those of population 2 (because of the periodic slumps to 1%), yet a comparison of the J values would lead the unwary ecologist to the opposite conclusion. Summary In order to model the complex relationship between our covariates (time and hatch date) and response variable (chick-growth rate), we used a nonparametric method known as the CRS smoother to obtain fitted curves known as smooths. We reported our final CRS-based model using smooths to illustrate reproductive performance trends in time on Frederick and Triangle Islands, as well as their dependence on hatch date. A component of the CRS, the roughness penalty, J, measures wiggliness in the fitted curve. We saw that, by definition, J is a variability measure, yet the mathematical investigation in Section 1.6 suggests it would be inappropriate for use in comparing viability and persistence of populations. We supported this claim with a simulation study, in which J values led us to an incorrect conclusion about population viability; whereas the well-established population variability statistics, CV and PV (Heath, 2006; Fraterrigo & Rusak, 2008), yielded the correct result. Finally, we compared variability in reproductive performance on Frederick and Triangle Islands using CV and PV. Methods Additive Models The additive model (AM) is a convenient statistical framework for employing CRS smoothers. For the Auklets dataset, we used a special case of the varying-coefficients AM (Hastie & Tibshirani, 1993), which permits smooth terms conditioned on a categorical predictor (allowing us to fit separate curves for each island). We used smoothing splines (a special type of CRS) and REML for selection of . We explored models that included hatch date and year smooths, both island-specific and common to both Koch 75 islands. To test for nonlinearity, we considered models where smooths are replaced by linear terms. We used approximate F-tests to rule out certain models. Then, we ranked the remaining ones using Akaike’s Information Criterion (AIC) to identify a preferred model (Wood, 2006). For all the AM analysis, we used the R package “mgcv” (Wood, 2011). We also considered J as a metric for variability in the chick-growth rates over time. We computed the roughness penalty for a simulated dataset to demonstrate that Jcan be misleading in a population ecology context. Finally, we compared the values of coefficient of variation (CV) and population variability (PV) of the within-year means of each island to identify quantitative differences in the variability of reproductive performance. Roughness Parameter Simulation Study To investigate the use of as a measure of variability we generated data from ⁄ at each of 13 values of t equally spaced over the time interval with two sets of parameter values, with the first having (a) , and the second having (b) ⁄ , . A CRS was then fitted to each dataset, using REML for smoothing parameter selection and the values computed for the fitted splines. This process was repeated 1000 times to produce histograms of the values and to approximate their distributions. In this example, the sinusoidal process generating data for (a) had amplitude six times that of (b), yet (b) had a higher value because it oscillated three times as often. The Arbutus Review Vol. 4, No. 1 (2013) 76 Results Model Selection The most complex model considered here (model i in Table 1) expresses the linear predictor for mean growth rate as the sum of two CRS smooths, one for year, , and one for hatch date, , conditioned on island. Defining an island indicator variable if observation i comes from Triangle Island, and zero otherwise, model i is: (3) where and Referring back to Figure 1, while the Triangle Island growth rate trend seems nonlinear, a flat straight line is plausible for Frederick. We tested for nonlinearity in the Frederick Island year trend by fitting model ii (Table 2), which has a linear term replacing the CRS. The approximate F-test rejected this latter model in favour of model i ( , ), indicating the Frederick Island data is generated from a nonlinear year effect. Similarly, we tested the hypothesis that the Triangle Island year term effect was linear with model iii, which the F-test rejected in favour of nonlinearity ( , ). Finally, we considered the hypothesis that a single year smooth representing a common trend on both islands is sufficient, using model iv. The much higher AIC score indicates that a single smooth is inadequate to model the trends on Triangle and Frederick Islands (note that because model iv is not nested inside model i, an F-test is not appropriate here) so we retained the island-specific year smooths. Similarly, we evaluated nonlinearity in the hatch date effects by replacing the appropriate smooth by linear terms (models v and vi). In both cases the smooth terms performed better, as indicated by AIC and approximate F-test scores ( , and , Koch 77 respectively).Finally, we considered whether one nonlinear hatch date effect was sufficient to explain the variability on both islands using model vii. Model i performed slightly better, although the difference in AIC was small. The hatch date term was significant in model vii ( ), confirming its usefulness in the AM. Model viii eliminates island effects, having year and hatch date effects fitted as smooths across the pooled data. This model performed the worst in AIC, and captured very little structure in the dataset (R 2 =7.8%). Table 1. Model selection results for the Cassin’s Auklet data. Effective degrees of freedom (EDF), AIC, and R 2 values for all candidate additive models are provided. Effect terms differing from those in model i are bolded for clarity. Models are listed in ascending order by AIC. # Model for EDF AIC R 2 (%) i 25.02 2693.7 37.1 vii 18.66 2698.6 36.4 vi 19.5 2721.2 35.2 v 18.2 2732.9 34.4 ii 22.3 2749.2 33.7 The Arbutus Review Vol. 4, No. 1 (2013) 78 iv 19.9 2859.4 26.7 iii 17.6 2879.7 25.3 viii 10.0 3107.9 7.8 The Full Model Based on AIC scores and approximate F-tests, model i, the sum of island- dependent smooths for both year and hatch date, was preferred. Figure 4 shows the fitted smooths from model i ( , , , and ) individually. Although our model yields a 3D response surface (briefly, the response surface is a pictorial representation of growth rate as a function of time and hatch date), the trend over years can be appreciated by looking at predicted values in a 2D slice obtained by fixing hatch date to some constant value (Figure 5). Because of the additive structure of the model, a change in hatch date results only in vertical translations of the two response curves plotted in Figure 5 relative to each other. Koch 79 Figure 4. Fitted smooths from model i along with predicted values and approximate 95% confidence bands. Year smooths (a) and (b) are fitted with a cubic regression spline (CRS) of 6 and 7 knots, respectively. Hatch date smooths (c) , and (d) are fitted with a CRS of 48 and 59 knots, respectively. Figure 5. Predicted values of Cassin’s Auklet growth rate from model i for all years sampled, at the median hatch date 139 (April 29th). Dotted lines indicate approximate 95% pointwise confidence intervals. The Arbutus Review Vol. 4, No. 1 (2013) 80 Wiggliness Simulation Results In simulations, the CRS fitted to series (b) had a higher Jvalue 46% of the time, indicating higher variability in (b) despite its less wiggly appearance. In contrast, CV and PV computed for this simulated data were higher in series (a) 100% of the time, indicating strongly that (a) is more variable. Figure 6. Scatterplots showing one example of the simulated data and fitted splines (solid lines) overlaid on the original sine curves (dotted lines) for parameter sets (a) , and (b) ⁄ , . Histograms of the values for (a) and (b) are plotted below. Discussion Model selection not only served to identify the best model, but also to test hypotheses about trends in the data. The approximate F-tests revealed that the year-wise trends on Triangle and Frederick Islands were not only different, but both nonlinear (i.e., straight lines are inadequate to represent these effects). Likewise, we found good evidence that the hatch date effect was nonlinear and island-specific. Koch 81 The hatch date smooths in Figures 4c-d are interesting to the seabird biologist as we see a marked difference between islands, with Triangle growth rates exhibiting more of a negative trend as the nesting season progresses, indicating that within-year timing is more of an issue in the Triangle population than for birds from Frederick. Note that in Figure 4c, the confidence bands widen dramatically towards the extremes of the hatch date range because of a scarcity of observations (fewer than 1% lie outside of [125, 166]). This smooth should therefore be interpreted cautiously near its endpoints. The year effects plots are most interesting as they illustrate the multi- year variability in growth rate means. Frederick is fairly flat throughout the study period; whereas, Triangle exhibits fluctuations which are larger and more frequent, with dips in the years 1996 and 1998. The simulations with the roughness penalty were informative as we had originally hoped to develop a test for a difference in variability between and by estimating the distribution of J (either by simulating from the posterior distribution of J given our data, or by bootstrapping). Such a test might still have uses, but with the aim of comparing population time series, the example in Section 1.6 suggests that it could be misleading in an ecological context. In the realm of population ecology, short-term low amplitude changes are of less importance to population viability than long- term high amplitude changes. The latter represent events that could threaten the persistence of the population, such as depressing it to levels at which recovery is hindered by small population size (e.g., small population levels can make it more difficult for a colony to fend off predators, or to cooperate in finding good foraging patches), or bolstering it to levels at which density- dependent effects could lead to a population crash (e.g., resulting from competition for food, or from parasite outbreaks; Boyce, 1992). In this light, if the goal is to compare the viability of populations through variability measurements, a good measure should not be overly sensitive to high- The Arbutus Review Vol. 4, No. 1 (2013) 82 frequency, low-amplitude oscillations, but rather pick up more on the higher amplitude swings. Qualitatively, the higher variability in the Triangle Island time series is obvious from Figures 4 and 5. We conclude with quantitative evidence to reinforce this observation. We computed CV and PV on the yearly averages on each island. As expected, Triangle Island had higher values of both statistics than Frederick Island (CVT =15.5 and PVT=0.17 versus CVF =5.5 and PVF=0.06). Conclusion Nonparametric regression has considerable value for conveying trends in an ecological time series. In terms of clarity and aesthetics, these models provide a large improvement over graphs of linear interpolation of the averages or yearly boxplots. They also provide a degree of flexibility, which is necessary when modeling nonlinear trends. For these reasons we expect to see more of the CRS and other smoothing methods in the future, not only in ecology, but in any field where exploratory data analysis is needed (e.g., economics, climate research, geography, and health sciences). While we failed to find a variability measure in the AM with CRS smooths, these models remained very useful for data exploration and illustration. To researchers using the CRS smoother, we advise caution in the interpretation of the roughness penalty as a statistic representing temporal variability. Despite its role as a wiggliness measure, J can be misleading in the context of population ecology, where variability is often a sign of extinction risk. We recommend established variability measures, such as PV or CV, as a simple and convenient option for investigators seeking a quantitative reading of variability. Future work will return to the CRS model to examine the possibility of formulating a variability statistic as a function of the fitted parameters, , since confidence intervals for such a statistic can be estimated through simulation from the posterior distribution Koch 83 of . One could for example devise a hypothesis test for a difference in CV( ̂) or PV( ̂) using this approach. This would offer an improvement over simply calculating CV or PV for the yearly means (as we did), since information about the temporal ordering of observations would be included as part of the process of fitting ̂. New statistics such as these could prove useful in any field in which the variability in a historical time series is of interest (such as research on climate change seeking to tease apart natural variability from that introduced by human activity). Our investigation showed that Cassin’s Auklets on Triangle Island deal with more year-to-year fluctuations in reproductive performance than those on Frederick Island. This identifies Triangle as a more at-risk population, which could be useful in advising future conservation efforts. Furthermore, the difference in population variability is suggestive of a more unstable pattern of zooplankton productivity in the local marine ecosystem near Triangle Island. A detailed discussion of this last point is the subject of forthcoming work (Bertram et al., in prep.). References Bertram, D. F., Jones, I. L., Cooch, E. G., Knechtel, H. A., & Cooke, F. (2000) Survival rates of Cassin's and Rhinoceros Auklets at Triangle Island, British Columbia. The Condor, 102, 155-162. Bertram, D. F., Mackas, D. L., & McKinnell, & S. M. (2001). The seasonal cycle revisited: interannual variation and ecosystem consequences. Progress in Oceanography, 49, 283-307. Bertram, D. F., Harfenist, A., & Smith, B. D. (2005). Ocean climate and El Niño impacts on survival of Cassin’s Auklets from upwelling and downwelling domains of British Columbia. Canadian Journal of Fisheries and Aquatic Sciences, 62, 2841-2853. doi: 10.1139/F05-190. Bertram, D. F., Harfenist, A., & Hedd, A. (2009). Seabird nestling diets reflect latitudinal temperature-dependent variation in availability of key zooplankton prey populations. Marine Ecology Progress Series, 393, 199-200. doi: 10.3354/meps08223. The Arbutus Review Vol. 4, No. 1 (2013) 84 Bertram, D. F., Harfenist, A., Cowen, L. L., Koch, D., Lemon, M. J. F., Hipfner, J. M., & Drever, M. (in prep.). Cassin’s Auklet nestling reproductive failure is related to latitudinal temperature dependent mismatches with copepod prey populations. Boyce, M. S. (1992). Population Viability Analysis. Annual Review of Ecology and Systematics, 23, 481-506. Burger, A.E., & Powell, D.W. (1989). Diving depths and diet of Cassin’s Auklet at Reef Island, British Columbia. Canadian Journal of Zoology, 68, 1572–1577. Faraway, J. J. (2006). Extending the Linear Model with R. Boca Raton, FL: Chapman and Hall/CRC Press. Fleming, J. M., Cantoni, E., Field, C., McLaren, I. (2010). Extracting long- term patterns of population changes from sporadic counts of migrant birds. Environmetrics, 21, 482-492. Fraterrigo, J. M., & Rusak, J. A. (2008). Disturbance-driven changes in the variability of ecological patterns and processes. Ecology Letters, 11, 756-770. doi: j.1461-0248.2008.01191.x Green, P. J., & Silverman, B. W. (1994). Nonparametric Regression and Generalized Linear Models. Suffolk, England: Chapman & Hall. Harfenist, A. (1995). Effects of Growth-Rate Variation on Fledging of Rhinoceros Auklets (Cerorhinca monocerata). The Auk, 112, 60-66. Hastie, T., & Tibshirani, R. (1993). Varying-coefficient models. Journal of the Royal Statistical Society: Series B, 55, 757-796. Hedd, A., Ryder, J. L., Cowen L. L., & Bertram, D. F. (2002). Inter-annual variation in the diet, provisioning and growth of Cassin’s auklet at Triangle Island, British Columbia: responses to variation in ocean climate. Marine Ecology Progress Series, 229, 221-232. Heath, J. (2006). Quantifying temporal variability in population abundances. Oikos, 115, 573-581. doi: 10.1111/j.2006.0030- 1299.15067.x Inchausti, P., & Halley, J. (2003). On the relation between temporal variability and persistence time in animal populations. Journal of Animal Ecology, 72, 899-908. doi: j.1365-2656.2003.00767.x Likhvar, V., Honda, Y., & Ono, M. (2011). Relation between temperature and suicide mortality in Japan in the presence of other confounding factors using time-series analysis with a semiparametric approach. Environmental Health and Preventative Medicine, 16, 36-43. Montgomery, D.C., Peck, E.A., & Vining, G.G. (2012). Introduction to Linear Regression Analysis. Hoboken, NJ: John Wiley & Sons. Koch 85 Pyle, P. (2001). Age at first breeding and natal dispersal in a declining population of Cassin's Auklet. The Auk, 118, 996-1007. Rupert, D., Wand, M., & Carroll, R. (2003). Semiparametric Regression. New York, NY: Cambridge University Press. Siriwardena, G. M., Baillie, S. R., Buckland, S. T., Fewster, R. M., Marchant, J. H., & Wilson, J. D. (1998). Trends in the abundance of farmland birds; a quantitative comparison of smoothed Common Birds Census indices. Journal of Applied Ecology, 35, 24-43. Telford, R. J., Heegaard, E., & Birks, H. J. B. (2004). All age–depth models are wrong: but how badly? Quaternary Science Reviews, 23, 1-5. Vucetich, J. A., White, T. A., Qvarnemark, L., & Ibarguen, S. (2000). Population variability and extinction risk. Conservation Ecology, 14, 1704-1714. doi: 10.1111/j.1523-1739.2000.99359.x Wahba, G. (1990). Spline Models for Observational Data (1 st ed.). Philadelphia, PA: SIAM. Wood S. N. (2006). Generalized Additive Models: An Introduction with R. Boca Raton, FL: Chapman and Hall/CRC Press. Wood, S. N. (2011). Fast stable restricted maximum likelihood and marginal likelihood estimation of semiparametric generalized linear models. Journal of the Royal Statistical Society: Series B, 73, 3-36. doi: j.1467-9868.2010.00749.x Yatchew, A. (1998). Nonparametric Regression Techniques in Economics. Journal of Economic Literature, 36, 669-721. Contact Information Dean Koch, from the Department of Mathematics and Statistics, can be reached at dk@uvic.ca. Acknowledgements An Undergraduate Student Research Award from the Natural Sciences and Engineering Research Council of Canada, as well as the Jamie Cassels Undergraduate Research Award, supported this research. I would like to thank Professor Laura Cowen for her invaluable guidance and encouragement in this project, as well as Dr.’s Doug Bertram and Anne Harfenist for providing the data and supporting the research.