Advances in Oceanography and Limnology, 2016; 7(1): 36-50
ARTICLE
DOI: 10.4081/aiol.2016.5791

Prediction of lake surface temperature using the air2water model: guidelines, challenges, and future perspectives

Sebastiano Piccolroaz
Department of Civil, Environmental and Mechanical Engineering, University of Trento, via Mesiano 77, I-38123 Trento, Italy
Corresponding author: s.piccolroaz@unitn.it

ABSTRACT

Water temperature plays a primary role in controlling a wide range of physical, geochemical and ecological processes in lakes, with considerable influences on lake water quality and ecosystem functioning. Being able to reliably predict water temperature is therefore a desired goal, which has stimulated the development of models of different type and complexity, ranging from simple regression-based models to more sophisticated process-based numerical models. However, both types of models suffer from some limitations: the former cannot address some fundamental physical processes, such as thermal stratification, while the latter generally require a large amount of input data, which are not always available. In this work, lake surface temperature (LST) is simulated by means of air2water, a hybrid physically-based/statistical model, which is able to provide a robust, predictive understanding of LST dynamics knowing air temperature only. This model showed performance comparable with that obtained by using process-based models (a root mean square error on the order of 1°C, at daily scale), while retaining the simplicity and parsimony of regression-based models, thus making it a good candidate for long-term applications. The aim of the present work is to provide the reader with useful and practical guidelines for proper use of the air2water model and for critical analysis of results. Two case studies have been selected for the analysis: Lake Superior and Lake Erie (USA). These are clear and emblematic examples of a deep and a shallow temperate lake characterized by markedly different thermal responses to external forcing, and are thus ideal for making the results of the analysis as general and comprehensive as possible. Particular attention is paid to assessing the influence of missing data on model performance, and to evaluating when an observed time series is sufficiently informative for proper model calibration or, conversely, when data are too scarce, leading to the risk of overfitting. The final aim of the work is to facilitate the use of the model also by scientists who do not necessarily have a solid background in modeling or physics. This work is also an attempt to foster communication and interaction among colleagues in a branch of science, limnology, which suffers from significant fragmentation. This is summarized in the future perspectives and challenges concerning potential improvements of the air2water model, with a particular emphasis on possible cross-sectoral applications.

Key words: Lake surface temperature; air2water; air temperature; thermal response; temperature modeling.

Received: February 2016. Accepted: March 2016.

INTRODUCTION

Water temperature in lakes is governed by a complex heat budget resulting from the combination of different heat flux components that are mainly exchanged between the lake and the atmosphere. Water temperature is the primary driver of vertical stratification in lakes, thus it significantly affects transport of mass (including nutrients and dissolved oxygen), energy, and momentum within the water column. It crucially controls several physical (e.g., thermal stratification, mixing processes), geochemical (e.g., chemical reaction rates, oxygen solubility), and ecological (e.g., metabolism, growth, and reproduction of organisms) processes, with considerable influences on the overall lake water quality, ecosystem functioning, and community composition (Wetzel, 2001; Gallina et al., 2013). It is therefore evident that any significant changes in water temperature may lead to alterations in the thermal regime of the lake and in the community structure of many freshwater habitats (Winder and Sommer, 2012; De Senerpont Domis et al., 2013; Schabhüttl et al., 2013; Butcher et al., 2015), with possible modifications of the biochemical compositions of some algae species (Flaim et al., 2014). This is particularly relevant considering that lakes have been demonstrated to be highly sensitive to changes in environmental conditions (Adrian et al., 2009; O’Reilly et al., 2015). In the light of the above considerations, large efforts have been directed towards the development of models able to predict water temperature, with particular attention to Lake Surface Temperature (LST).

Several models of different type and complexity have been proposed to simulate water temperature, ranging from simple regression models (McCombie, 1959; Webb, 1974; Livingstone and Lotter, 1998; Kettle et al., 2004; Sharma et al., 2008) to more complex process-based numerical models (Perroud et al., 2009; Martynov et al., 2010; Thiery et al., 2014). Regression models are attractive because they require little information, usually only air temperature, but generally they are not able to address some fundamental processes (e.g., the effect of thermal stratification), and their use may be questionable especially when it is necessary to extrapolate temperature values beyond the limits of the measured time series, as is typically the case in climate change studies. On the other hand, deterministic models are designed to provide an exhaustive description of the thermal behaviour of the lake, but they require detailed time series of meteorological variables, which are not always available for long periods and with sufficient accuracy.

In order to overcome the limitations of traditional approaches, Piccolroaz et al. (2013) recently developed air2water, a hybrid physically-based/statistical model, which is able to provide a robust, predictive understanding of LST dynamics knowing air temperature only. The hybrid formulation of the air2water model combines a physically based derivation of the governing equation with a stochastic calibration of the parameters. In this way, the information contained in the data is transferred directly to model parameters, whose calibrated values can provide significant information as to how the real system behaves (thanks to the physically based structure of the model). The rationale behind the development of this model is to take advantage of the fact that the governing laws of physics are generally well understood, and to introduce appropriate simplifications while retaining all the fundamental processes involved (and their physical meaning). The purpose is to minimize data requirements and computational effort, which still represent the most common limitations, and to develop a mathematical tool that is "as simple as possible, but not simpler" (citing a famous quote by Albert Einstein), able to provide a reliable description of a natural phenomenon on the basis of the data that are available.
The model has been successfully tested considering lakes characterized by different morphometric characteristics and using different sources of data (see e.g., Toffolon et al., 2014a, who applied the air2water model to 14 different lakes in the temperate region: 7 located in North America, 6 in Europe, and 1 in Asia). In all cases, air2water performed similarly to more complex process-based models (i.e., RMSE on the order of 1°C for daily temperatures), even though these latter models generally require a much larger amount of information. The model has been shown to satisfactorily capture seasonal variations and inter-annual dynamics of LST, and to provide key information to investigate the role of stratification in controlling the thermal response of lakes (Piccolroaz et al., 2015a).

This work provides the reader with practical guidelines for proper use of the air2water model and for critical analysis of results, with the final goal of facilitating the use of the model by scientists who do not necessarily have a solid background in modelling or physics. However, this work should not be considered simply as a collection of best practices, but also as an attempt to foster communication among colleagues from different disciplines with a common interest in aquatic science. The reader will find answers to questions like: What is the meaning of model parameters, how are they derived, and how should we select their a priori range of variation? What is the maximum allowable percentage of missing data to obtain reliable results? How long should the calibration period be? What version of the model should be used? Does lake depth affect model performance? Particular attention is given to analysing the effects of data scarcity on model performance in modelling LST. Finally, future directions and perspectives concerning possible improvements of the air2water model are discussed, with a particular emphasis on cross-sectoral applications.
METHODS

Study sites and available data

The air2water model is applied to two lakes characterized by significantly different morphological and thermal characteristics: Lake Superior and Lake Erie (USA) (Fig. 1). Lake Superior is the largest, deepest, and northernmost of the Great Lakes, while Lake Erie is the shallowest and southernmost, and the smallest by volume (Tab. 1). Long-term data of air and surface water temperature are available from different sources. In this work the following sources of data are used: GLSEA daily LST retrieved from satellite imagery (i.e., skin temperature) provided by the National Oceanic and Atmospheric Administration (NOAA) Great Lakes Environmental Research Laboratory (GLERL, webpage: http://www.glerl.noaa.gov/glsea/asc_1024/) and derived from NOAA polar-orbiting satellites equipped with AVHRR sensors, and daily air temperature at 2 meters above ground from the ERA-Interim reanalysis (provided by the European Centre for Medium-Range Weather Forecasts, ECMWF, and downloaded from http://apps.ecmwf.int/datasets/data/interim-full-daily/levtype=sfc/). Both datasets cover the 20-year period 1995-2014 and contain spatially distributed data (with resolution equal to about 1.3 km and 80 km in the two cases, respectively). The data have been post-processed in order to evaluate lake-average values, i.e., temperature values have been aggregated at the lake scale. Moreover, in order to allow the analyses presented in the Results section, the missing data in the LST series have been replaced by interpolation with a moving average filter of 10 days.

Fig. 1 shows the typical annual cycles of air and water temperature for the two lakes, and suggests the existence of markedly different thermal behaviours: i) due to the higher latitude, air temperature over Lake Superior is generally colder than over Lake Erie (annual mean, minimum, and maximum equal to about 3.9°C vs 9.6°C, -13.9°C vs -6.4°C, and 17.8°C vs 23.5°C, for the two lakes respectively) and the maximum air temperature occurs later (beginning of August vs middle of July); ii) the phase lag (hysteresis) between air and water temperature is more evident for Lake Superior than for Lake Erie, indicating a larger thermal inertia due to the larger water volume; iii) consequently, the onset of direct thermal stratification (i.e., when Tw≥4°C) in Lake Superior occurs later in the year (end of May vs middle of April), as does the period of maximum stratification (i.e., when Tw is maximum; end vs beginning of August); iv) the shape of the LST annual cycle deviates from the nearly sinusoidal pattern of air temperature in Lake Superior, contrary to what happens in the case of Lake Erie; and v) Lake Superior is generally colder than Lake Erie (mean annual LST equal to 6.5°C and 11.3°C in the two lakes, respectively).

The choice of these two case studies is motivated not only by the large amount of high-quality and freely available data, but also, and more importantly, by the fact that they are good examples of deep and shallow temperate lakes characterized by markedly different thermal responses to external forcing. This requisite is of major importance in order to write guidelines for the use of the air2water model that are as exhaustive and generally valid as possible.
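The gap-filling step mentioned above for the LST series can be sketched as follows. This is only an illustration under stated assumptions: the text does not specify how the 10-day moving average is applied, so the sketch assumes a centred window and replaces each missing value with the mean of the valid observations inside that window. The function name `fill_gaps_moving_average` is hypothetical and is not part of the air2water distribution.

```python
import numpy as np

def fill_gaps_moving_average(series, window=10):
    """Fill NaN gaps with the mean of valid neighbours in a centred window.

    Assumption: a centred window of about `window` days; the exact filter
    used for the GLSEA series is not specified in the text.
    """
    x = np.asarray(series, dtype=float)
    filled = x.copy()
    half = window // 2
    for i in np.flatnonzero(np.isnan(x)):
        lo, hi = max(0, i - half), min(x.size, i + half + 1)
        neighbours = x[lo:hi]
        # Leave the gap untouched if no valid observation is within reach.
        if np.any(~np.isnan(neighbours)):
            filled[i] = np.nanmean(neighbours)
    return filled
```

Observed values are never modified, and gaps longer than the window remain missing, which makes it easy to see where the series is still incomplete.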
Fig. 1. Geographical location of Lake Superior and Lake Erie in the Great Lakes region and in North America. Typical annual cycles (averaged over the period 1995 to 2014) of air and water temperature for the two lakes.

Tab. 1. Main morphological characteristics of the investigated lakes.

| Lake | Volume (km³) | Surface area (km²) | Maximum depth (m) | Average depth (m) | Geographic coordinates |
|---|---|---|---|---|---|
| Lake Superior | 12,000 | 82,100 | 406 | 147 | 47.7°N 87.5°W |
| Lake Erie | 480 | 25,667 | 64 | 19 | 42.2°N 81.2°W |

The air2water model

The air2water model is based on a lumped heat budget of the surface volume of the lake at daily time scale, and is derived from the following volume-integrated heat equation:

$$\rho\, c_p\, V_s \frac{dT_w}{dt} = H_{net}\, A \tag{1}$$

according to which the variation of water temperature (Tw) in time (t, hereafter expressed in days) is directly proportional to the product between the heat flux into the upper water volume (Hnet) and the surface area of the lake (A), and inversely proportional to the surface volume of water involved in the heat exchange with the atmosphere (Vs, hereafter also referred to as the reactive volume), water density (ρ = 1000 kg m⁻³), and specific heat capacity at constant pressure (cp = 4186 J kg⁻¹ °C⁻¹). Hnet can be expressed as the combination of several contributions entering and exiting the upper water volume Vs (see Fig. 2 for a schematic, and Supplementary Material A for details), which are primarily controlled by: the net shortwave (Hs) and longwave (Ha) radiation actually absorbed by the surface volume (i.e., accounting for water reflectivity), the longwave radiation emitted from the lake (Hw), the latent heat flux due to evaporation and condensation (Hl), and the sensible heat flux due to convection (Hc). Heat flux due to precipitation, the heat exchanged with inlets/outlets, and the heat exchanged between surface volume and deep water or sediments can be considered as insignificant factors, and are not explicitly included in the formulation of air2water. However, their contribution is indirectly accounted for in the calibration of parameters.

Following Livingstone and Padisák (2007), air temperature can be considered as a proxy for the integrated effect of the external forcing, and it can be assumed, together with LST, as the key factor controlling the heat balance of the surface layer of the lake. This is the central concept of the air2water model. In particular, Hnet is included in a linear form obtained by Taylor expansion in terms of both air (Ta) and water (Tw) temperatures, as follows:

$$H_{net} \cong H_{net}\Big|_{\bar{T}_a,\bar{T}_w} + \frac{\partial H_{net}}{\partial T_a}\Big|_{\bar{T}_a,\bar{T}_w}\left(T_a-\bar{T}_a\right) + \frac{\partial H_{net}}{\partial T_w}\Big|_{\bar{T}_a,\bar{T}_w}\left(T_w-\bar{T}_w\right) \tag{2}$$

where T̄a and T̄w are reference values (e.g., long-term averages of Ta and Tw, respectively), and Hnet,0 = Hnet evaluated at (T̄a, T̄w) is the part of Hnet that is independent of air and water temperatures. In general, however, Hnet,0 can vary in time. As a first approximation, this is accounted for by defining Hnet,0 as the sum of a constant value and a sinusoidal function of time with a period of 1 year, the latter term summarizing, albeit in a simplified form, the combined effect due to the variability of all meteorological variables other than air temperature (e.g., solar radiation, wind speed, air humidity, cloudiness) at annual time scale. Equation (1) can therefore be rewritten as follows:

$$\rho\, c_p\, V_s \frac{dT_w}{dt} = A\left\{\hat{a}_1 + \hat{a}_2 T_a - \hat{a}_3 T_w + \hat{a}_5 \cos\left[2\pi\left(\frac{t}{t_y}-\hat{a}_6\right)\right]\right\} \tag{3}$$

where t_y is the duration of one year in the units of t, and the definition of parameters âi, i=1, 2, 3, 5, 6 can be derived from equation (2) once the single heat flux terms are evaluated through suitable empirical relationships (Martin and McCutcheon, 1998). Refer to Supplementary Material A for details about the linearization of Hnet and the definition of parameters âi.

By introducing the dimensionless ratio δ=Vs/Vr (which can also be interpreted as the ratio between the average depth of the surface layer Ds=Vs/A and that of the reference layer Dr=Vr/A), eq. (3) can be rewritten as the following ordinary differential equation, representing the full version of the air2water model:

$$\frac{dT_w}{dt} = \frac{1}{\delta}\left\{a_1 + a_2 T_a - a_3 T_w + a_5 \cos\left[2\pi\left(\frac{t}{t_y}-a_6\right)\right]\right\} \tag{4}$$

where a6=â6 and parameters ai, i=1, 2, 3, 5 are defined as ai=âiA/(Vr ρcp)=âi/(Dr ρcp).
In this form, the geometrical characteristics of the lake (surface area, volume, and depth) are not required to be explicitly specified, since they are implicitly accounted for in the model parameters ai, which require calibration. In order to ensure proper model calibration excluding unrealistic solutions, the model parameters are allowed to vary within a physically plausible range, which can be easily estimated knowing (even approximately) the mean depth of the lake, as will be thoroughly discussed in the Results section. Equation (4) is numerically integrated with a daily time step (i.e., dt=1 day; see the subsection on numerical solution and model calibration for further details).

Finally, in order to account for the significant seasonal variability of the reactive volume as a consequence of thermal stratification, Piccolroaz et al. (2013) assumed that the dimensionless ratio δ is a function of the difference between LST and a reference value of the deep water temperature (Th), through the following empirical relationship:

$$\delta = \begin{cases} \exp\left(-\dfrac{T_w-T_h}{a_4}\right) & \text{for } T_w \ge T_h \\[2ex] \exp\left(-\dfrac{T_h-T_w}{a_7}\right) + \exp\left(-\dfrac{T_w}{a_8}\right) & \text{for } T_w < T_h \end{cases} \tag{5}$$

where Th can be assumed to be 4°C for dimictic lakes, and the minimum or maximum water temperature for warm and cold monomictic lakes, respectively, and a4, a7, and a8 are model parameters. From the first formula in equation (5) it is easy to see that the dimensionless ratio δ is theoretically defined in a range from 0 to 1, with δ decreasing for increasing thermal stratification (here represented by the difference Tw–Th), thus mimicking the fact that the surface water volume affected by the surface heat budget gets progressively thinner. Conversely, δ=1 when the lake is isothermal (i.e., Tw=Th), suggesting that the reference volume can be interpreted as the maximum water volume involved in the heat exchange with the atmosphere during the year. The same considerations apply to the second formula in equation (5), which is valid when the lake is inversely stratified (i.e., when Tw<Th).
In this case, however, the possible effect of heat flux reduction due to ice cover is also included by a fictitious increase of the effective volume (see the second term on the right-hand side). In order to simulate ice formation at the surface, a lower bound is imposed on Tw by introducing a threshold value. This threshold is generally 0°C when the water temperature is measured close to the surface, but it can be higher when temperature is measured at greater depths. Despite being simple, the parameterization of δ presented in equation (5) is suitable to reproduce seasonal and interannual patterns of thermal stratification, as has been clearly demonstrated for the cases of Lake Constance (Toffolon et al., 2014a) and Lake Superior (Piccolroaz et al., 2015a).

Fig. 2. Main heat fluxes involved in the heat budget of the surface layer. See Supplementary Material A for the description of the single terms.

Equations (4) and (5) taken together constitute the air2water model in its full, 8-parameter version. Two simplified versions of the model are also available: a 6-parameter version where δ=1 when the lake is inversely stratified; and a 4-parameter version which, beyond the above simplification, does not include the externally imposed sinusoidal forcing (i.e., a5=0). This latter version can be considered particularly appropriate when the annual cycles of Tw and/or of Ta are approximately sinusoidal: in fact, from basic principles of trigonometry, the sum of sinusoidal functions with the same period (i.e., 1 year) but different amplitude and phase yields another sinusoid with different amplitude and phase but the same period. Therefore, two sinusoids are enough, and the sinusoidal forcing term (with amplitude a5) can be removed.
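The model structure described above can be sketched in a few lines of Python. This is a minimal illustration, not the official Fortran implementation. It assumes the form of the governing equations reported in the air2water literature (Piccolroaz et al., 2013), namely dTw/dt = (1/δ){a1 + a2 Ta − a3 Tw + a5 cos[2π(t/ty − a6)]}, with δ = exp(−(Tw−Th)/a4) for Tw≥Th and δ = exp(−(Th−Tw)/a7) + exp(−Tw/a8) otherwise. The function names, the dictionary layout of the parameters, and the default year length are hypothetical choices made for this example.

```python
import numpy as np

def delta(tw, a4, a7=None, a8=None, th=4.0):
    """Dimensionless reactive-volume ratio (assumed form of equation 5)."""
    if tw >= th:
        # Direct stratification: the reactive layer thins as Tw - Th grows.
        return float(np.exp(-(tw - th) / a4))
    if a7 is None or a8 is None:
        # 4- and 6-parameter versions: delta = 1 under inverse stratification.
        return 1.0
    # 8-parameter version: the second term mimics the effect of ice cover.
    return float(np.exp(-(th - tw) / a7) + np.exp(-tw / a8))

def dtw_dt(t, tw, ta, p, th=4.0, ty=365.25):
    """Right-hand side of the assumed form of equation (4), in degC per day.

    p is a dict with keys a1..a4 (4-parameter version), optionally a5, a6
    (6-parameter version) and a7, a8 (8-parameter version).
    """
    forcing = p.get("a5", 0.0) * np.cos(2.0 * np.pi * (t / ty - p.get("a6", 0.0)))
    net = p["a1"] + p["a2"] * ta - p["a3"] * tw + forcing
    return net / delta(tw, p["a4"], p.get("a7"), p.get("a8"), th)
```

Switching between the three model versions only requires adding or omitting keys in the parameter dictionary, which mirrors how the simplified versions are obtained from the full one by setting δ=1 for Tw<Th and a5=0.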
For the reason given in the Results section (second paragraph), the whole analysis is performed considering only the 4- and 6-parameter versions of the air2water model, without loss of generality.

Numerical solution and model calibration

The second release of the air2water model is now available at https://github.com/spiccolroaz/air2water, where the source code (written in Fortran 90/95), the precompiled executable files (Linux/Windows), a readme file, and an example application are freely downloadable (the code is published under the Creative Commons Attribution-ShareAlike 3.0 license). In this new release, the main improvement concerns the numerical solution of the ordinary differential equation (4), which, together with equation (5), constitutes the air2water model. Users can now choose among Euler, Runge-Kutta 2nd order, Runge-Kutta 4th order, and Crank-Nicolson numerical schemes. The first three schemes are explicit, and in summer, when δ→0, it may happen that a daily time step is too large to adequately integrate equation (4), possibly generating numerical instabilities. In order to avoid this situation and provide an accurate prediction of Tw, an adaptive sub-stepping procedure has been implemented, in which the original integration time step of one day is divided into a number of equal sub-steps according to the stability conditions of the method (Butcher, 2008). Predictions of Tw are in any case provided at daily time scale. Conversely, the last numerical scheme is implicit, 2nd order accurate, and unconditionally stable: a sub-stepping procedure is not required and the daily time step is used for the whole simulation, making it generally faster (but less accurate than Runge-Kutta 4th order) than the previous schemes.
In this case, in order to obtain a closed-form analytical expression of equation (4), δ is handled explicitly, thus only the numerator of the right-hand side of equation (4) has been discretized according to the Crank-Nicolson scheme.

Model calibration is performed through a Monte Carlo-based optimization approach in which a large number of parameter sets are sampled and evaluated in terms of a given metric of model efficiency. Here, the Root Mean Square Error (RMSE) between observed and modelled values is considered as the optimization metric, meaning that at the end of the optimization loop the best set of parameters is identified as the one providing the smallest RMSE. The sampling procedure is performed through the Particle Swarm Optimization (PSO) algorithm, a simple and powerful population-based stochastic optimization technique first proposed by Kennedy and Eberhart (1995) for solving engineering problems, and subsequently applied to a variety of different fields, including hydrology (Gill et al., 2006; Piccolroaz et al., 2015b). For further details about this optimization procedure, the reader is referred to Supplementary Material B.

Numerical integration of equation (4) requires that the series of air temperature (i.e., the external forcing) be continuous and at daily resolution. Therefore gaps (if any) must be reconstructed, e.g., by replacing each missing value with the average of all air temperature measurements available in the data set for the same day of the year. Conversely, the time series of observed LST can contain missing data. In this case, missing data are not replaced, and they simply do not contribute to the evaluation of the prediction performance (e.g., through the evaluation of RMSE between observed and simulated LST).
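The Crank-Nicolson update and the treatment of missing LST observations described above can be sketched as follows. This is a hedged illustration, not a transcription of the Fortran source. It assumes the governing equation in the form dTw/dt = (1/δ){a1 + a2 Ta − a3 Tw + a5 cos[2π(t/ty − a6)]}, uses the 6-parameter choice δ=1 for Tw<Th, evaluates δ at the current step (explicit treatment), and assumes that the series starts on 1 January so that the day index can be used as t. The names `simulate_cn` and `rmse_ignore_missing` and the parameter dictionary are hypothetical.

```python
import numpy as np

def delta(tw, a4, th=4.0):
    """delta for the 4-/6-parameter versions (1 when inversely stratified)."""
    return float(np.exp(-(tw - th) / a4)) if tw >= th else 1.0

def simulate_cn(ta, p, tw0, th=4.0, ty=365.25, t_ice=0.0):
    """Daily Crank-Nicolson integration with delta frozen at step k.

    ta : continuous daily air temperature series (no gaps allowed)
    p  : dict with keys a1..a6 (set a5 = 0 for the 4-parameter version)
    """
    ta = np.asarray(ta, dtype=float)
    tw = np.empty(ta.size)
    tw[0] = tw0
    for k in range(ta.size - 1):
        c = 0.5 / delta(tw[k], p["a4"], th)          # dt / (2 * delta), dt = 1 day
        s_k = p["a5"] * np.cos(2 * np.pi * (k / ty - p["a6"]))
        s_k1 = p["a5"] * np.cos(2 * np.pi * ((k + 1) / ty - p["a6"]))
        known = 2 * p["a1"] + p["a2"] * (ta[k] + ta[k + 1]) + s_k + s_k1
        new = (tw[k] * (1 - c * p["a3"]) + c * known) / (1 + c * p["a3"])
        tw[k + 1] = max(new, t_ice)                  # lower bound mimicking ice formation
    return tw

def rmse_ignore_missing(obs, sim):
    """RMSE over the days where an observation exists (NaN = missing)."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    valid = ~np.isnan(obs)
    return float(np.sqrt(np.mean((obs[valid] - sim[valid]) ** 2)))
```

Because the update is linear in the unknown Tw at the new time level once δ is frozen, it can be solved in closed form without iterations, which is what makes the daily time step usable throughout the simulation.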
This allows air2water to be used with LST observational time series at any frequency (e.g., weekly, monthly, seasonal), not necessarily daily, or simply with irregular time series. The effect on model performance of the presence of missing LST data will be analysed in detail in the Results section.

As a final note, besides RMSE the user can choose between other metrics of model performance: the Nash-Sutcliffe Efficiency index (NSE, Nash and Sutcliffe, 1970) and the Kling-Gupta Efficiency index (KGE, Gupta et al., 2009). In addition, besides PSO, model calibration can be performed using Simple Random Sampling or the Latin Hypercube Sampling technique (McKay et al., 1979), which are computationally more expensive but explore the space of parameters more uniformly, allowing sensitivity analyses of model parameters to be conducted.

RESULTS

Evaluating the a priori range of model parameters

As mentioned in the Methods section, to ensure proper model calibration, model parameters are required to be defined within a physically consistent a priori range of variation. This range should be sufficiently wide to allow for the existence of an optimal and physically plausible set of parameters, and at the same time it should not be excessively large, to avoid convergence to unrealistic solutions. Suitable a priori ranges of variation for parameters ai, i=1, 2, 3, 5 can be evaluated on the basis of physical considerations, recalling that ai=âi/(Dr ρcp). Reliable estimates of âi and Dr are therefore required. The possible range of variation of parameters âi can be obtained from equations (A11)-(A15) in Supplementary Material A, considering all possible values and combinations of the physical coefficients that appear in these equations (Martin and McCutcheon, 1998).
Also the reference depth Dr (i.e., the mean depth of the largest water volume involved in the surface heat budget of the lake during the year, see Methods) can be assumed to vary within a range of possible values. Reasonably, Dr is bounded from above by the average depth of the lake (D=V/A, where V and A are volume and surface area of the lake, respectively), i.e., when Dr=D the whole lake participates in the heat exchange with the atmosphere when the water column is well mixed. However, for the case of very shallow lakes (e.g., having a mean depth on the order of a few meters), the effective volume participating in the heat budget may partially involve lake sediments, making the effective volume larger than the mere lake water volume (Toffolon et al., 2014a). This possibility is implicitly accounted for in the calibration of model parameters without the need of specifying any additional input information, but simply by setting the upper bound of Dr to be larger than D (10 m is a reasonable and safe choice). As for the lower bound of Dr, experience suggests that a simple option is to vary it linearly with D, from 1 m for the shallowest lakes to 50 m for the deepest ones, which is certainly a conservative underestimate. In fact, in Lake Baikal (Russia, the world’s deepest lake) D=744 m and 50 m only roughly represents the thickness of the epilimnion during strong thermal stratification (Piccolroaz and Toffolon, 2013), suggesting that Dr is certainly larger than this value.

Fig. 3. Estimate of the a priori range of variation of model parameters as a function of the mean depth of the lake D, and regression relationships as determined by Toffolon et al. (2014a) analyzing 14 lakes with different morphologies.
Parameter a6, which is the phase of the sinusoidal term with amplitude a5 summing up all contributions to the heat budget with the exception of the direct effect of air temperature, simply varies from 0 to 1. Parameter a4 controls the intensity of the stratification (thus the volume that is affected by the heat exchange), and, based on practical experience, its possible range of variation can be defined as in Fig. 3d.

Fig. 3 shows the range of variation of all parameters as a function of D, evaluated based on the above considerations and setting the coefficients in equations (A11)-(A15) according to typical values that they assume in the temperate region. Note that in principle this estimate is coherent with the 6- and 8-parameter versions of the model, while in the 4-parameter version the meaning of the parameters is slightly different, as parameter a5 is absorbed into parameters a1, a2, and a3. In general, however, experience suggests that the same range of parameters can be safely used for all versions of the model. Fig. 3 also shows the relationships between model parameters and lake average depth D as determined by Toffolon et al. (2014a), who analysed 14 temperate lakes characterized by significantly different morphologies, using the 4- and 8-parameter versions of the model (here the relationships obtained for the full 8-parameter version are assumed valid also for the 6-parameter version, given the strong similarity between the two versions of the model). The regressions between model parameters and D are reported in Tab. 2.

From the combined analysis of Fig. 3 and Tab. 2, two main comments can be made. First, the regression lines are well within the physical a priori ranges of parameters, suggesting that these ranges are properly defined. The only exception is parameter a1 in the 6-parameter version, whose regression line is beneath the lower physical bound for D>300 m.
However, for such deep lakes, previous re- sults suggest that this relationship is likely not significant (see e.g., the case of Lake Baikal in the original paper by Toffolon et al., 2014a), and in any case the overall de- pendence on D is weak. Second, and perhaps more im- portant, despite by definition parameters ai,i=1, 2, 3, 5, should depend inversely on depth, the regression lines do not simply scale with D–1 (see e.g., the exponents of the power laws in Tab. 1). This is indicative that air2water is able to suitably reproduce the complex thermal behaviour of a lake, by transferring the information contained in the observed data directly to model parameters, which, in turn, have a significant dependence on lake depth. Post-calibration analysis The optimal set of parameters resulting from the cali- bration procedure is required to be well centred within the a priori range of variation, in order to exclude any confine- ment effect due to bounds that are too narrow. This is ex- pected to always be the case when using the a priori range of parameters discussed in the previous section. However, it is always preferable to perform an a posteriori sensitivity analysis, aimed at excluding the eventuality of parameter ranges that are too narrow and at the same time evaluating parameters’ identifiability and significance. This analysis is easily done producing and analysing the shape of the so- called dotty plot”, which are projections of the measure of model performance (in this case expressed through RMSE) obtained after the calibration procedure within the hyper- space of parameters, onto single parameter axes (Beven and Freer, 2001; see Fig. 4 for a schematic). Preferably, dotty plots should be obtained using Simple Random Sampling or Latin Hypercube Sampling techniques for model cali- bration instead of PSO, to avoid clustering around the best solution. If a dotty plot is sharp and well defined (as in Fig. 
4a) it means that the parameter is significant and well iden- tifiable, while if it is flat and scattered (as in Fig. 4b) it means that the parameter is not significant or the model is overparameterized. Detailed discussions about parameters identifiability of the three versions of the air2water model can be found in Piccolroaz et al. (2013) and Toffolon et al. (2014a). Parameters are well identifiable for all versions of the model (being slightly higher in the 4-parameter version Tab. 2. Equations of the regression relationships between model parameters and the mean depth of the lake found by Toffolon et al. (2014a) analysing 14 lakes with different morphologies, and shown in Fig. 3. Parameter Regression equation 4 parameters 6(8) parameters a1 –0.042+0.017 log (D) 0.488–0.096 log (D) a2 0.223 D–0.635 0.207 D–0.672 a3 0.175 D–0.540 0.262 D–0.659 a4 35.4 D–0.360 31.3 D–0.330 a5 – 0.843 D–0.732 a6 – 0.628–0.030 log (D) No n- co mm er cia l u se on ly 43S. Piccolroaz due to lower number of parameters), with the only excep- tion of parameters a7 and a8 in the full, 8-parameter version. The main reason is that these parameters are not fully in- dependent, and may produce significant interactions. A more appropriate parameterization of δ during inverse strat- ification and ice formation periods is currently under de- velopment. Since a7 generally achieves relatively high values implying δ~1 for TW≤4°C (Toffolon et al., 2014a), the following analysis is performed considering only the 4- and 6-parameter versions, still retaining full generality. Results of the 4- and 6-parameter versions of the model for the cases of lakes Superior and Erie are presented in Fig. 5 and Fig. 6. In both cases, the calibration of the pa- rameters was performed using two-thirds of the data set (13 years, from 1995 to 2007) and leaving one-third for the val- idation (7 years, from 2008 to 2014). Fig. 
5 shows scatterplots for the two lakes and the two versions of air2water during the calibration period. No systematic deviation (bias) is observed, and the dispersion along the diagonal does not exhibit significant trends. Both characteristics are confirmed by the relatively small values of RMSE and by values of the coefficient of determination ($R^2$) close to one: RMSE = 1.00°C and $R^2$ = 0.97 (4-parameter version) and RMSE = 0.93°C and $R^2$ = 0.97 (6-parameter version) for Lake Superior, and RMSE = 0.87°C and $R^2$ = 0.99 (4-parameter version) and RMSE = 0.82°C and $R^2$ = 0.99 (6-parameter version) for Lake Erie. In Fig. 6, simulated LST is compared with observations during the validation period, showing close agreement overall. RMSEs in validation are 0.90°C and 0.79°C for Lake Superior (4- and 6-parameter versions), and 0.73°C and 0.68°C for Lake Erie (same model versions). Fig. 6 shows the ability of the model to capture seasonal dynamics and interannual variability. This suggests that air2water is a valuable tool for long-term predictions of LST, in both deep and shallow lakes. The model shows slightly weaker performance in the case of Lake Superior due to its more complex thermal behaviour, which is significantly controlled by stratification and thermal inertia (Piccolroaz et al., 2015a). Furthermore, the worsening of the 4-parameter version relative to the 6-parameter version is larger in this case (RMSE increases by 14% in validation, from 0.79°C to 0.90°C) than for Lake Erie (RMSE increases by 7%, from 0.68°C to 0.73°C). This suggests that the hypotheses underlying the derivation of the simplest, 4-parameter version of the air2water model (see Methods) are likely to be more appropriate for shallow lakes and, more generally, when the annual cycles of air and water temperature show a nearly sinusoidal pattern (see Methods and Fig. 1).

Effects of missing data on model performance

In this section, the effect of missing data in the time series of observed LST on model performance is analyzed and discussed.
In fact, long-term continuous observations of LST are only rarely available, which often limits their practical use. For example, in lakes that freeze, offshore monitoring buoys are generally removed during winter to prevent damage from ice. LST time series retrieved from satellite imagery, which are generally more continuous during the year, may also have gaps during periods of cloudiness. Finally, the constant and continuous in-situ monitoring of a lake requires sufficient funding and qualified personnel, which are not always available, especially over long periods.

The performance of the air2water model is evaluated by progressively increasing the fraction of gaps in the LST series, from 10% to 90%, in increments of 10%. Percentages of missing data of 95%, 97%, 99%, and 99.5% are also considered, which roughly correspond to the availability of 18, 11 (monthly), 4 (seasonal), and 2 measurements per year, on average. In order to perform a robust statistical analysis, for each of the considered missing data scenarios an ensemble of 100 series of LST is obtained from the original, continuous series of observations, by randomly excluding the corresponding number of data. Then, the model is calibrated on these artificially deteriorated 13-year series of data (1995 to 2007), and validated on the remaining 7-year period (2008 to 2014). In order to allow for a fair and unbiased comparison among the model performances obtained for the different scenarios and for the reference (i.e., continuous time series, no gaps) simulation, the validation period is not modified and the same continuous series shown in Fig. 6 is used in all cases.

Results of the analysis for both the 4- and the 6-parameter versions of the model are shown in Fig. 7. For each scenario, the RMSEs obtained for the ensemble of simulations are presented through a box plot, where the circle indicates the median value of the distribution.
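The gap-generation procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the author's script: the function names are hypothetical, the series is assumed to be daily (365 values per year, which is what the quoted per-year counts imply), and a constant placeholder stands in for real LST observations.

```python
import random

def degrade_series(values, missing_fraction, rng):
    """Return a copy of `values` in which a random subset is set to None."""
    n_missing = round(missing_fraction * len(values))
    missing = set(rng.sample(range(len(values)), n_missing))
    return [None if i in missing else v for i, v in enumerate(values)]

def make_ensemble(values, missing_fraction, n_members=100, seed=0):
    """Build an ensemble of independently degraded copies of one series."""
    rng = random.Random(seed)
    return [degrade_series(values, missing_fraction, rng) for _ in range(n_members)]

def mean_observations_per_year(missing_fraction, samples_per_year=365):
    """Average number of observations left per year (daily sampling assumed)."""
    return (1.0 - missing_fraction) * samples_per_year

# Placeholder 13-year daily series; the constant value is NOT real LST data.
calibration_series = [10.0] * (13 * 365)
ensemble = make_ensemble(calibration_series, missing_fraction=0.95)
```

Under the daily-sampling assumption, `mean_observations_per_year` rounds to 18, 11, 4, and 2 for 95%, 97%, 99%, and 99.5% of missing data, in line with the counts quoted above. Each ensemble member would then be passed to the calibration, while validation always uses the unmodified series.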
By comparing the median values with the RMSEs of the reference simulations (continuous lines), it is possible to conclude that, as a general tendency, no degradation of model performance occurs until a data gap of about 50%-60% for the 6-parameter version, and until a data gap of about 70% for the 4-parameter version. In any case, the whole box plot is within 10% of the reference value until a data gap of about 90%-95%. When the percentage of missing data is larger, model performance diminishes, and it does so faster for the 6-parameter version of the model and for the deeper lake. In fact, when the data gap is very large, the structure of the 6-parameter version of the model may become too complex (i.e., there are too many parameters) relative to the number of observations, thus running the risk of overfitting (Vapnik, 1999). This is more evident in deep lakes, which are characterized by more complex thermal dynamics due to the significant role played by stratification and thermal inertia (Piccolroaz et al., 2015a).

Fig. 4. Schematic of (a) a sharp and well defined dotty plot and (b) a flat and scattered dotty plot. Each black dot corresponds to one model simulation (one parameter set) and the red dot represents the optimal parameter set.

It is possible to conclude that the 6-parameter version of the model is preferable to the 4-parameter version when the amount of missing data is lower than 95% (i.e., when data are available at about bi-weekly resolution, on average). Up to 95% missing data, the model still performs reasonably well compared to the reference case in which the LST series used in calibration is complete.
With more than 95% of data missing, the air2water model should be used cautiously, assessing case by case whether results are reasonable compared to the expected behaviour of the lake, and preferring the simplest, 4-parameter version. In particular, this version of the model shows acceptable performance until the percentage of missing data reaches about 97% (i.e., when data are available at about monthly resolution, on average), particularly for the shallow Lake Erie.

As a final remark, note that some box plots in Fig. 7 extend partially (and to a minor extent) below the reference value of RMSE, which indicates that there are a few cases where the optimal set of parameters obtained with a less complete series of LST observations provides slightly better performance in validation. This is likely due to the specific time period considered in the analysis and to the quality of LST observations, and is not explored further here.

How length of the calibration period and percentage of missing data affect model performance

The analysis presented in the previous section is specific to a 13-year long calibration period. Here it is generalized by considering different lengths of the calibration period, to provide an overview of the consequences of data scarcity on model performance. The final aim is to provide the user of the air2water model with a criterion to assess whether the observational dataset used for model calibration is sufficiently informative to obtain a reliable calibration. The same analysis described above is therefore extended to different lengths of the calibration period: 1, 2, 3, 5, 8, and 13 years (as a tribute to Leonardo Fibonacci). In order not to introduce biases in the results, when testing calibration periods shorter than 13 years, the sequences of years are randomly extracted from the original 13-year long series ranging from 1995 to 2007.
Then, in analogy with the previous analysis, an ensemble of 100 artificially deteriorated series of LST is randomly generated for each combination of percentage of gaps and length of the calibration period. Six calibration period lengths and nine percentages of missing data are investigated, for a total of 54 different combinations (hereafter referred to as scenarios) and 5400 model runs.

Fig. 5. Scatter plot of observed against simulated LST during the calibration period (1995-2007) for (a) Lake Superior and (b) Lake Erie, and for the 4- and 6-parameter versions of the air2water model.

Results are presented in Fig. 8, which shows the relative deterioration of each scenario with respect to the best performing case (through the ratio $\mathrm{RMSE}_i / \min(\{\mathrm{RMSE}_i\}_{i=1}^{54})$, where $\mathrm{RMSE}_i$ is the median root mean square error of the $i$-th scenario in validation, and $i$ ranges from 1 to 54), for the two lakes and the two versions of the model. Results confirm and extend the previous analysis: a larger degradation (in relative terms) of model performance with increasing deterioration of the dataset is observed for the 6-parameter version (and, secondarily, for the deeper lake). In this case, at least 8 years of data with no more than 80% of missing data are required to avoid a worsening of more than 10% from the best scenario, for both Lake Superior and Lake Erie. Conversely, with the 4-parameter version a calibration period of 2 or 3 years with up to 80% or 90% missing data is sufficient to obtain the same deterioration in model performance (again in relative terms), for Lake Superior and Lake Erie, respectively. Furthermore, in general, similar model performance can be achieved with a lower number of total observations (i.e., a larger percentage of missing data) if the calibration period is longer.
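The relative-deterioration ratio used for Fig. 8 can be illustrated as follows. This is a minimal sketch, not the author's post-processing code: the function names are hypothetical, the RMSE values in `example` are invented purely for illustration (three ensemble members per scenario instead of 100), and daily sampling is assumed when counting valid observations.

```python
from statistics import median

CALIBRATION_LENGTHS = [1, 2, 3, 5, 8, 13]   # years, as listed in the text
N_GAP_PERCENTAGES = 9                       # nine percentages of missing data
N_SCENARIOS = len(CALIBRATION_LENGTHS) * N_GAP_PERCENTAGES
N_RUNS = N_SCENARIOS * 100                  # 100 ensemble members per scenario

def relative_deterioration(rmse_by_scenario):
    """Map each scenario to its median RMSE divided by the smallest median."""
    medians = {key: median(vals) for key, vals in rmse_by_scenario.items()}
    best = min(medians.values())
    return {key: m / best for key, m in medians.items()}

def valid_observations(n_years, missing_fraction, samples_per_year=365):
    """Number of observations left in calibration (daily sampling assumed)."""
    return round(n_years * samples_per_year * (1.0 - missing_fraction))

# Hypothetical validation RMSEs in °C, keyed by (years, missing fraction).
example = {
    (13, 0.10): [0.80, 0.79, 0.81],
    (8, 0.80): [0.84, 0.86, 0.85],
    (1, 0.90): [1.20, 1.10, 1.30],
}
ratios = relative_deterioration(example)
within_10_percent = {key for key, r in ratios.items() if r <= 1.10}
```

A scenario belongs to `within_10_percent` when its median RMSE is no more than 10% above the best scenario, which is the threshold adopted in the discussion of Fig. 8.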
In other words, a longer calibration period with fewer measurements may be more informative than a shorter calibration period with more data, suggesting the high value of having a series of data characterized by significant interannual variability. As an example, model performance is roughly the same when considering a 13-year long period with 95% of gaps (i.e., 237 valid data) or an 8-year long period with 80% of gaps (i.e., 584 valid data; see Fig. 8b).

Finally, the $\mathrm{RMSE}_i$ values obtained using the 4-parameter and 6-parameter versions of the model are compared for the two lakes, making it possible to draw a map of preference (in absolute terms) between the two versions of the model as a function of the different scenarios (see Fig. 9). In both cases, the 4-parameter version of the air2water model performs better, and is thus to be preferred over the 6-parameter version, when the calibration period is shorter than about 5 years, or when it is longer but with more than 97% of gaps. The same considerations about model overfitting discussed in the previous section also apply here.

Fig. 6. Comparison between simulated and observed surface water temperature during the validation period (2008-2014) for (a) Lake Superior and (b) Lake Erie, and for the 4- and 6-parameter versions of the air2water model. Observed air temperature data are also presented.

Fig. 7. Box plots of RMSE values obtained in validation considering different percentages of missing data in the calibration time series of LST, for (a) Lake Superior and (b) Lake Erie. The circle indicates the median value of the distributions. For each missing data scenario, an ensemble of 100 artificially deteriorated series of LST is randomly generated.

DISCUSSION

In previous works, Piccolroaz et al. (2013, 2015a) and Toffolon et al.
(2014a) have already demonstrated the high potential of the air2water model as a simple, yet effective, predictive tool for simulating LST when only air temperature data are available. The model is able to properly simulate the hysteresis loop between air and water temperature in both shallow and deep lakes, and to accurately capture seasonal and interannual fluctuations of LST. The model also allows for the simulation of stratification dynamics in lakes, without the need to introduce a complex description of the air-water interface processes based on a detailed quantification of the single heat flux components. Furthermore, it has been successfully applied using different sources of data (e.g., LST measured at buoys or retrieved from satellite, and air temperature from observations or re-analysis), suggesting a high degree of flexibility in the types of data that can be used as input. This is possible because the physically-based structure of the model allows information about the studied system to be acquired directly from the data, through the calibration of model parameters. This process is further facilitated by the extreme simplicity of the air2water model, which makes it particularly well suited to automatic calibration procedures within a Monte Carlo-like framework. In this way, model parameters assimilate the information contained in the observations, and in turn the user may learn how the real system behaves from the values of the parameters, identifying the most important processes controlling the thermal response of the lake.

Fig. 8. air2water model performance (in terms of increasing RMSE in validation) as a function of the amount of missing data and calibration period length, for Lake Superior and Lake Erie, and for the 4- and 6-parameter versions.
Informativeness of observations is a crucial aspect that should be considered carefully in order to avoid an improper calibration of the model parameters and an unreliable, or at least uncertain, prediction of LST. This critical detail is addressed in the Results section, where air2water model users can find some recommended best practices for a proper use of the model.

The simplicity and robustness of the air2water model suggest its possible use in different contexts and for different purposes, heading towards new challenges:

• The investigation of the response of lakes to air temperature variations under climate change scenarios. In this perspective, the air2water model represents a valuable alternative both to simpler regression models, which require the same input data but are not able to address some fundamental processes (e.g., the hysteresis cycle between air and water temperature), and to more complex process-based models, which require a significantly larger amount of input data without showing significantly better performance (see e.g., Results in Thiery et al., 2014).
• The direct coupling with atmospheric circulation and weather forecasting models. Recent attempts in this direction have adopted complex one-dimensional lake models (e.g., using a k-ε turbulence model as in Goyette and Perroud, 2012), but have inevitably shown some limitations, e.g., high computational cost and the need for ground-truth information. Simpler models have also been used for this purpose (dating back to Hostetler et al., 1993), but in any case they require the entire set of meteorological data. Again, the simplicity, parsimony, and robustness of air2water make it a good candidate for adoption as a lumped lake model integrated in meteorological models.
• The coupling with simple water quality, ecological and biogeochemical modules in order to investigate processes that are significantly controlled by water temperature, e.g., nutrient, dissolved oxygen, and aquatic ecosystem dynamics. This would be a good opportunity to cross the boundaries (according to Toffolon et al., 2014b) between the various disciplines of aquatic science, facilitating dialogue and collaboration between scientists from different backgrounds. Indeed, the fragmentation of limnology into expert, specialised fields with limited interaction is a well-known major issue of this branch of science (Peters, 1990; Lewis, 1995; Salmaso and Mosello, 2010).
• The definition of regionalization relationships between model parameters and morphological characteristics of lakes, with the final aim of applying the model to ungauged lakes. Expanding the analysis of Toffolon et al. (2014a), who analysed 14 temperate lakes characterized by different morphology, by including additional lakes, possibly at different latitudes (e.g., tropical and polar lakes), is particularly interesting. In this regard, the growing availability of collections of lake observational data at the global scale is particularly attractive (e.g., Global Lake Temperature Collaboration - GLTC, Sharma et al., 2015; Global Lake Ecological Observatory Network - GLEON, Weathers et al., 2013), also for testing the air2water model on lakes outside of the temperate zone. Furthermore, the application of air2water globally may provide interesting insights into how LST in lakes around the world is expected to respond to climate change in the future, possibly identifying some meaningful hotspots as in O’Reilly et al. (2015).

Fig. 9. Diagram of preference between the 4- and 6-parameter versions of the air2water model as a function of the amount of missing data and calibration period length, for Lake Superior and Lake Erie.
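As an illustration of the regionalization idea in the last bullet, the depth regressions of Tab. 2 could be used to obtain a first guess of the parameters for a lake with no LST observations. The sketch below is hedged in several ways: the function name is hypothetical, the mean depth is assumed to be expressed in metres, and the base of the logarithm is not stated in this section, so the natural logarithm is used here as an assumption (it can be swapped through the `log` argument). The 6-parameter column is the one headed "6(8) parameters" in Tab. 2.

```python
import math

def estimate_parameters(mean_depth, version=6, log=math.log):
    """First-guess air2water parameters from mean depth, following Tab. 2.

    mean_depth: lake mean depth D (assumed to be in metres).
    log: logarithm used in the a1 and a6 relations (natural log is an
         assumption; check the source of Tab. 2 before relying on it).
    """
    if mean_depth <= 0:
        raise ValueError("mean depth must be positive")
    D = mean_depth
    if version == 4:
        return {
            "a1": -0.042 + 0.017 * log(D),
            "a2": 0.223 * D ** (-0.635),
            "a3": 0.175 * D ** (-0.540),
            "a4": 35.4 * D ** (-0.360),
        }
    if version == 6:
        return {
            "a1": 0.488 - 0.096 * log(D),
            "a2": 0.207 * D ** (-0.672),
            "a3": 0.262 * D ** (-0.659),
            "a4": 31.3 * D ** (-0.330),
            "a5": 0.843 * D ** (-0.732),
            "a6": 0.628 - 0.030 * log(D),
        }
    raise ValueError("version must be 4 or 6")

# Hypothetical depths chosen only to contrast a shallow and a deep lake.
shallow = estimate_parameters(20.0, version=4)
deep = estimate_parameters(150.0, version=4)
```

Such estimates are best treated as a starting point, or as the centre of an a priori range, rather than as a substitute for calibration, since the regressions were derived from 14 temperate lakes only.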
CONCLUSIONS

The results of this work provide the reader with guidelines and best practices for using the air2water model as a simple tool to predict LST when only air temperature is available. After briefly recalling the derivation of the model and the meaning of its parameters, the model is used to simulate LST in two lakes characterized by significantly different depths: Lake Superior and Lake Erie (USA). These two case studies are chosen as clear and emblematic examples of a deep and a shallow temperate lake characterized by markedly different thermal responses to external forcing, with the aim of making the results of the analysis as general and comprehensive as possible. The whole analysis is carried out considering the 4- and 6-parameter versions of the model. The full, 8-parameter version is not considered here, due to the sub-optimal parameterization of δ during inverse stratification and ice formation periods, whose improvement is currently under development.

In this work, the prospective user of the air2water model is provided with all the fundamental information for a proper use of the model: from the initial definition of appropriate a priori ranges of variation of model parameters to an effective post-processing analysis of results, passing through a sensitivity analysis of the influence of missing data on model performance.
Particular attention is paid to this last point, which can be summarized as follows: i) longer calibration periods with fewer measurements overall are likely to be more informative than shorter calibration periods with more data (suggesting the high value of having time series with high interannual variability); ii) when the amount of missing data increases, model performance diminishes more for the 6-parameter version, suggesting the risk of model overfitting; iii) for short calibration time series (e.g., shorter than about 5 years in this case), the 4-parameter version of the model is likely to be preferable anyway; and iv) as a secondary effect, model performance diminishes more for deeper lakes than for shallow lakes when data are missing, due to their more complex thermal behaviour, which is chiefly influenced by lake depth.

Consistent with one of the main goals of this work, which is to foster dialogue among the several branches of aquatic science, a flowchart of the main modelling steps is shown in Fig. 10, which is intended to make the sequence of operational phases underlying the use of air2water clearer and easier to follow, also for users with different mathematical and/or technical backgrounds. Indeed, the air2water model has been developed with the clear intention of offering a simple tool that can be used equally by physicists and biologists, modellers and experimentalists, possibly generating new collaborations towards an integrated understanding of how LST responds to climate forcing and what the effects are on the ecological status of the lake.

In this perspective, anyone interested can collaborate to improve the model with comments, suggestions and contributions, which are very welcome and easy to share through https://github.com/spiccolroaz/air2water.
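The steps of Fig. 10 (input data, a priori parameter ranges, Monte Carlo runs, results) and the dotty-plot check recommended in the post-calibration analysis can be sketched as below. This is a generic illustration and not the air2water code: `toy_model` is a deliberately simple linear stand-in for the real model, all function names are hypothetical, the forcing and observations are synthetic, and Latin Hypercube Sampling is written by hand with the standard library only.

```python
import math
import random

def latin_hypercube(bounds, n_samples, rng):
    """Draw n_samples parameter sets with one point per stratum on each axis."""
    columns = []
    for low, high in bounds:
        # One uniform draw inside each of n_samples equal-width strata.
        strata = [(k + rng.random()) / n_samples for k in range(n_samples)]
        rng.shuffle(strata)  # decouple the ordering between axes
        columns.append([low + u * (high - low) for u in strata])
    return [tuple(col[i] for col in columns) for i in range(n_samples)]

def rmse(simulated, observed):
    """Root mean square error, skipping missing observations (None)."""
    pairs = [(s, o) for s, o in zip(simulated, observed) if o is not None]
    return math.sqrt(sum((s - o) ** 2 for s, o in pairs) / len(pairs))

def toy_model(params, air_temperature):
    """Stand-in model (NOT air2water): water temperature as a linear map."""
    offset, slope = params
    return [offset + slope * ta for ta in air_temperature]

def monte_carlo(model, bounds, forcing, observed, n_samples=500, seed=1):
    """Sample the a priori ranges, score every set, return the best index."""
    rng = random.Random(seed)
    samples = latin_hypercube(bounds, n_samples, rng)
    scores = [rmse(model(p, forcing), observed) for p in samples]
    best = min(range(n_samples), key=scores.__getitem__)
    return samples, scores, best

def dotty_axis(samples, scores, j):
    """(parameter value, RMSE) pairs for the dotty plot of parameter j."""
    return [(p[j], s) for p, s in zip(samples, scores)]

# Synthetic one-year forcing and "observations" generated by the toy model.
air = [10.0 + 10.0 * math.sin(2.0 * math.pi * d / 365.0) for d in range(365)]
obs = toy_model((2.0, 0.8), air)
bounds = [(0.0, 5.0), (0.0, 2.0)]  # hypothetical a priori ranges
samples, scores, best = monte_carlo(toy_model, bounds, air, obs)
```

Plotting `dotty_axis(samples, scores, j)` for each parameter `j` gives the projections described in the text; a best value sitting close to one of the bounds would suggest that the a priori range is too narrow. For the actual model, the README.txt in the repository cited above remains the reference.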
ACKNOWLEDGMENTS

The author is grateful to Marco Toffolon for discussions on an earlier version of the manuscript, to Elisa Calamita for preliminary analysis of the data, and to Ulrike Obertegger (Edmund Mach Foundation, Italy) for rewriting the post-processing script in R (available on https://github.com/spiccolroaz/air2water). The author is also thankful to NOAA (National Oceanic and Atmospheric Administration) for LST data used in this work (data can be downloaded from http://www.glerl.noaa.gov/glsea/asc_1024/) and to ECMWF (European Centre for Medium-Range Weather Forecasts) for daily air temperature (data can be downloaded from http://apps.ecmwf.int/datasets/data/interim-full-daily/levtype=sfc/). Finally, the author thanks the two anonymous Reviewers for their constructive comments, which helped to improve the manuscript.

Fig. 10. Flowchart of the main modelling steps: input data, definition of the a priori ranges of model parameters, run of air2water within a Monte Carlo optimization framework, results. For a more detailed description of how to use the model, please refer to the file README.txt in https://github.com/spiccolroaz/air2water.

REFERENCES

Adrian R, O’Reilly CM, Zagarese H, Baines SB, Hessen DO, Keller W, Livingstone DM, Sommaruga R, Straile D, Van Donk E, Weyhenmeyer GA, Winder M, 2009. Lakes as sentinels of climate change. Limnol. Oceanogr. 54:2283-2297.
Beven K, Freer J, 2001. A dynamic TOPMODEL. Hydrol. Process. 15:1992-2011.
Butcher JC, 2008. Numerical methods for ordinary differential equations. 2nd ed. J. Wiley & Sons, London.
Butcher JB, Nover D, Johnson TE, Clark CM, 2015. Sensitivity of lake thermal and mixing dynamics to climate change. Climatic Change. 129:295-305.
De Senerpont Domis LN, Elser JJ, Gsell AS, Huszar VLM, Ibelings BW, Jeppesen E, Kosten S, Mooij WM, Roland F, Sommer U, Van Donk E, Winder M, Lürling M, 2013. Plankton dynamics under different climatic conditions in space and time. Freshwater Biol. 58:463-482.
Flaim G, Obertegger U, Anesi A, Guella G, 2014. Temperature-induced changes in lipid biomarkers and mycosporine-like amino acids in the psychrophilic dinoflagellate Peridinium aciculiferum. Freshwater Biol. 59:985-997.
Gallina N, Salmaso N, Morabito G, Beniston M, 2013. Phytoplankton configuration in six deep lakes in the peri-Alpine region: are the key drivers related to eutrophication and climate? Aquat. Ecol. 47:177-193.
Gill MK, Kaheil YH, Khalil A, McKee M, Bastidas L, 2006. Multiobjective particle swarm optimization for parameter estimation in hydrology. Water Resour. Res. 42:W07417.
Goyette S, Perroud M, 2012. Interfacing a one-dimensional lake model with a single-column atmospheric model: Application to the deep Lake Geneva, Switzerland. Water Resour. Res. 48:W04507.
Gupta HV, Kling H, Yilmaz KK, Martinez GF, 2009. Decomposition of the mean squared error and NSE performance criteria: Implications for improving hydrological modelling. J. Hydrol. 377:80-91.
Hostetler SW, Bates GT, Giorgi F, 1993. Interactive coupling of a lake thermal model with a regional climate model. J. Geophys. Res. 98:5045-5057.
Kennedy J, Eberhart R, 1995. Particle swarm optimization, p. 1942-1948. Proc. IEEE Int. Conf. on Neural Networks, University of Western Australia, Perth, Australia.
Kettle H, Anderson RTNJ, Livingstone DM, 2004. Empirical modeling of summer lake surface temperatures in southwest Greenland. Limnol. Oceanogr. 49:271-282.
Lewis WM, 1995. Limnology, as seen by limnologists. J. Contemp. Water Res. Educ. 98:4-8.
Livingstone DM, Lotter AF, 1998. The relationship between air and water temperatures in lakes of the Swiss Plateau: a case study with palaeolimnological implications. J. Paleolimnol. 19:181-198.
Livingstone DM, Padisák J, 2007. Large-scale coherence in the response of lake surface-water temperatures to synoptic-scale climate forcing during summer. Limnol. Oceanogr. 52:896-902.
Martin JL, McCutcheon S, 1998. Hydrodynamics and transport for water quality modeling. CRC Press.
Martynov A, Sushama L, Laprise R, 2010. Simulation of temperate freezing lakes by one-dimensional lake models: performance assessment for interactive coupling with regional climate models. Boreal Environ. Res. 15:143-164.
McCombie AM, 1959. Some Relations Between Air Temperatures and the Surface Water Temperatures of Lakes. Limnol. Oceanogr. 4:252-258.
McKay MD, Beckman RJ, Conover WJ, 1979. A Comparison of Three Methods for Selecting Values of Input Variables in the Analysis of Output from a Computer Code. Technometrics. 21:239-245.
Nash JE, Sutcliffe JV, 1970. River flow forecasting through conceptual models 1. A discussion of principles. J. Hydrol. 10:282-290.
O’Reilly CM, Sharma S, Gray DK, Hampton SE, Read JS, Rowley RJ, Schneider P, Lenters JD, McIntyre PB, Kraemer BM, Weyhenmeyer GA, Straile D, Dong B, Adrian R, Allan MG, Anneville O, Arvola L, Austin J, Bailey JL, Baron JS, Brookes JD, de Eyto E, Dokulil MT, Hamilton DP, Havens K, Hetherington AL, Higgins SN, Hook S, Izmest’eva LR, Joehnk KD, Kangur K, Kasprzak P, Kumagai M, Kuusisto E, Leshkevich G, Livingstone DM, MacIntyre S, May L, Melack JM, Mueller-Navarra DC, Naumenko M, Noges P, Noges T, North RP, Plisnier PD, Rigosi A, Rimmer A, Rogora M, Rudstam LG, Rusak JA, Salmaso N, Samal NR, Schindler DE, Schladow SG, Schmid M, Schmidt SR, Silow E, Soylu ME, Teubner K, Verburg P, Voutilainen A, Watkinson A, Williamson CE, Zhang G, 2015. Rapid and highly variable warming of lake surface waters around the globe. Geophys. Res. Lett. 42:773-781.
Perroud MS, Goyette S, Martynov A, Beniston M, Anneville O, 2009. Simulation of multiannual thermal profiles in deep Lake Geneva: A comparison of one-dimensional lake models. Limnol. Oceanogr. 54:1574-1594.
Peters RH, 1990. Pathologies in limnology. Mem. Ist. Ital. Idrobiol. 47:181-217.
Piccolroaz S, Toffolon M, 2013. Deep water renewal in Lake Baikal: a model for long term analyses. J. Geophys. Res.-Oceans 118:6717-6733.
Piccolroaz S, Toffolon M, Majone B, 2013. A simple lumped model to convert air temperature into surface water temperature in lakes. Hydrol. Earth Syst. Sci. 17:3323-3338.
Piccolroaz S, Toffolon M, Majone B, 2015a. The role of stratification on lakes’ thermal response: the case of Lake Superior. Water Resour. Res. 51:7270-7288.
Piccolroaz S, Majone B, Palmieri F, Cassiani G, Bellin A, 2015b. On the use of spatially distributed, time-lapse microgravity surveys to inform hydrological modeling. Water Resour. Res. 51:7878-7894.
Salmaso N, Mosello R, 2010. Limnological research in the deep southern subalpine lakes: synthesis, directions and perspectives. Adv. Oceanogr. Limnol. 1:29-66.
Schabhüttl S, Hingsamer P, Weigelhofer G, Hein T, Weigert A, Striebel M, 2013. Temperature and species richness effects in phytoplankton communities. Oecologia 171:527-536.
Sharma S, Walker SC, Jackson DA, 2008. Empirical modelling of lake water-temperature relationships: a comparison of approaches. Freshwater Biol. 53:897-911.
Sharma S, Gray DK, Read JS, O’Reilly CM, Schneider P, Qudrat A, Gries C, Stefanoff S, Hampton SE, Hook S, Lenters JD, Livingstone DM, McIntyre PB, Adrian R, Allan MG, Anneville O, Arvola L, Austin J, Bailey J, Baron JS, Brookes J, Chen Y, Daly R, Dokulil M, Dong B, Ewing K, de Eyto E, Hamilton DP, Havens K, Haydon S, Hetzenauer H, Heneberry J, Hetherington AL, Higgins SN, Hixson E, Izmest’eva LR, Jones BM, Kangur K, Kasprzak P, Köster O, Kraemer BM, Kumagai M, Kuusisto E, Leshkevich G, May L, MacIntyre S, Müller-Navarra D, Naumenko M, Noges P, Noges T, Niederhauser P, North RP, Paterson AM, Plisnier PD, Rigosi A, Rimmer A, Rogora M, Rudstam L, Rusak JA, Salmaso N, Samal NR, Schindler DE, Schladow G, Schmidt SR, Schultz T, Silow EA, Straile D, Teubner K, Verburg P, Voutilainen A, Watkinson A, Weyhenmeyer GA, Williamson CE, Woo KH, 2015. A global database of lake surface temperatures collected by in situ and satellite methods from 1985-2009. Sci. Data 2:150008.
Thiery W, Stepanenko VM, Fang X, Jöhnk KD, Li Z, Martynov A, Perroud M, Subin ZM, Darchambeau F, Mironov D, van Lipzig NPM, 2014. LakeMIP Kivu: evaluating the representation of a large, deep tropical lake by a set of 1-dimensional lake models. Tellus Ser. A 66:21390.
Toffolon M, Piccolroaz S, Majone B, Soja AM, Peeters F, Schmid M, Wüest A, 2014a. Prediction of surface water temperature from air temperature in lakes with different morphology. Limnol. Oceanogr. 59:2185-2202.
Toffolon M, Piccolroaz S, Bouffard D, 2014b. Crossing the boundaries of physical limnology. Eos 95:403.
Vapnik VN, 1999. An Overview of Statistical Learning Theory. IEEE T. Neural Network 10:988-999.
Weathers KC, Hanson PC, Arzberger P, Brentrup J, Brookes JD, Carey CC, Gaiser E, Hamilton DP, Hong GS, Ibelings BW, Istvánovics V, Jennings E, Kim B, Kratz TK, Lin F-P, Muraoka K, O’Reilly C, Piccolo MC, Rose KC, Ryder E, Zhu G, 2013. The Global Lake Ecological Observatory Network (GLEON): the evolution of grassroots network science. Bull. Limnol. Oceanogr. 22:71-73.
Webb MS, 1974. Surface Temperatures of Lake Erie. Water Resour. Res. 10:199-210.
Wetzel RG, 2001. Limnology: Lake and River Ecosystems. 3rd ed. Academic Press.
Winder M, Sommer U, 2012. Phytoplankton response to a changing climate. Hydrobiologia 698:5-16.