Regression kriging to improve basal area and growing stock volume estimation based on remotely sensed data, terrain indices and forest inventory of black pine forests Ferhat Bolat*,#, Sinan Bulut#, Alkan Günlü, İlker Ercanlı, Muammer Şenyurt Deparment of Forest Engineering, Faculty of Forestry, Çankırı Karatekin University, Turkey *Corresponding author: fbolat@karatekin.edu.tr # These authors contributed equally to this work (Received for publication 5 April 2019; accepted in revised form 20 July 2020) Abstract Background: The use of satellite imagery to quantify forest metrics has become popular because of the high costs associated with the collection of data in the field. Methods: Multiple linear regression (MLR) and regression kriging (RK) techniques were used for the spatial interpolation of basal area (G) and growing stock volume (GSV) based on Landsat 8 and Sentinel-2. The performance of the models was tested using the repeated k-fold cross-validation method. Results: The prediction accuracy of G and GSV was strongly related to forest vegetation structure and spatial dependency. The nugget value of semivariograms suggested a moderately spatial dependence for both variables (nugget/sill ratio~70%). Landsat 8 and Sentinel-2 based RK explained approximately 52% of the total variance in G and GSV. Root-mean-square errors were 7.84 m2 ha-1 and 49.68 m3 ha-1 for G and GSV, respectively. Conclusions: The diversity of stand structure particularly at the poorer sites was considered the principal factor decreasing the prediction quality of G and GSV by RK. New Zealand Journal of Forestry Science Bolat et al. New Zealand Journal of Forestry Science (2020) 50:4 https://doi.org/10.33494/nzjfs502020x49x E-ISSN: 1179-5395 published on-line: 01/08/2020 © The Author(s). 2020 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. Research Article Open Access Nevertheless, there remain a number of issues like unequal or fragmented forest distribution, differing tree species and age classes, which can lead to difficulty when trying to maximise the spatial variance explained when modeling forest metrics (Chirici et al. 2008; Gebreslasie et al. 2008; Ingram et al. 2005; Lu et al. 2004). Because of these drawbacks, and in order to estimate forest metrics at an acceptable level of confidence and at a fine level of detail, scientists have combined remotely sensed data and ground measurements using methods such as ordinary least square, machine learning, and geo-statistic methods in the last ten years (Mallinis et al. 2004; Franco-Lopez et al. 2001; Ingram et al. 2005; Meng et al. 2009; dos Reis et al. 2018). For example, Meng et al. (2009) evaluated ordinary kriging, universal kriging, co- kriging and regression kriging methods combined with Landsat 7 ETM+ for spatially predicting G of pine forests. Introduction Forest inventory studies are conducted to quantify forest attributes such as basal area (G), growing stock volume (GSV), biomass and carbon sequestration that are providing essential information for forest managers. G has been often used as an important auxiliary variable to determine competition indices (Contreras et al. 2011), stand density (Curtis 1982), and diameter- height increment and mortality (Kuehne et al. 2016). Determination of the GSV (i.e. standing volume) is needed to assess silvicultural treatments for ensuring sustainable wood production in a managed forest (O’hehir & Nambiar 2010). Measuring G and GSV can be time consuming particularly for precision forestry. As a consequence, foresters are using remote sensing for the estimation of forest metrics in remote and difficult locations. Keywords: Forestry, k-fold cross-validation, Landsat 8, Sentinel-2, semi-arid region Bolat et al. New Zealand Journal of Forestry Science (2020) 50:4 Page 2 They found that using regression kriging resulted in the greatest precision and almost unbiased estimates of G for Loblolly pine and Slash pine. The authors concluded that kriging is robust for the interpolation of forest variables such as G, GSV, and other carbon and forestry metrics such as stand density, stand age and dominant height. Maselli and Chiesi (2006) evaluated three methods relying on remotely sensed data including k-nearest neighbour method, a locally calibrated regression analysis and a kriging method for forest volume estimation in a Mediterranean region. Results showed that all three methods were similar in terms of their correlation coefficient and root-mean-square error, but the kriging method was slightly better with regard to lower residual variance. dos Reis et al. (2018) assessed a variety of methods including a random forest algorithm and a kriging method integrated with Landsat Thematic Mapper (TM) data for the spatial prediction of G and GSV. Multiple linear regression and artificial neural network models had the poorest performance for the estimation of timber volume of Eucalyptus clonal stands. In forest inventories, a wide range of data types are gathered in terms of forest variables as well as geographical locations. The relationships between a response variable and explanatory variable(s) can differ from one sample unit to another because of sampling density, site index and stand age. In order to obtain unbiased residual estimates, local (random) effects should be introduced in a model (Fortin et al. 2007). If a model accounts for local effects adequately, it may predict the target variable at an acceptable level of accuracy (Magnussen et al. 2017). Geostatistical approaches have the ability to account for unknown random effects in a local forest area, providing approximately unbiased estimates of residuals (Isaaks & Srivastava 1989). Also, the forest metrics are spatially correlated within the same locations. Although multiple regression analysis is user-friendly and easy to use for forest managers, it does not account for spatial autocorrelation in data. Therefore, especially in case of a trend in the error variance across predictions, this method can be inappropriate in the prediction of forest metrics due to its biased parameter estimates. RK uses residuals from the least-square methods (i.e. multivariate regression). The improvements on the predictions can be important when taking spatial autocorrelation in the data into account. Although there are a variety of approaches to consider spatial autocorrelation in data, the development and implementation of these methods require a strong statistical background. In this context, RK is both user- friendly and relatively simple, which may suit the needs of forest managers. Therefore, the objective of this study was to evaluate the performance of Landsat 8 and Sentinel-2 based RK and MLR for improving the spatial predictions of G and GSV. Methods Study area The study area is within the Inner Anatolia semi-arid climate in the Black Sea - steppe transitional zone in Turkey (Figure 1). It is bounded by latitudes 40° 29′ 09″–40° 30′ 44″ N and longitudes 33° 25′ 47″–33° 27′ 19″ E (WGS 1984 UTM Zone 36N). In the study area, average annual temperature and total precipitation are approximately 11.3 °C and 412 mm, respectively, while elevations range from 1280 to 1642 m (Turkish State Meteorological Service n.d.). Terrain variables of the study area were obtained from a digital elevation map with a twelve-metre resolution (Table 1). The FIGURE 1. Location of the study area and its location in Turkey study area consists of 380 hectares, where Anatolian black pine (Pinus nigra Arnold. subsp. pallasiana Lamb Holmboe) is widely distributed. In this region, harsh ecological conditions such as low precipitation, high temperatures, and poor soils contribute towards poor forest growth (Barbati et al. 2014; Erşahin et al. 2016; Göl & Abay 2003). Such conditions adversely influence the distribution and abundance of conifer forests in this region (Çolak & Rotherham 2006). Anatolian black pine has been frequently planted because it adapts well to the semi-arid conditions of this region (Konukcu 2001). Field research The study area was afforested in the 1961-1965 period and have been thinned every ten years since establishment. A systematic sampling with 51 plots distributed within a 200 x 200 m grid were measured in 2016. Circular plot sizes ranged from 400 to 800 m2 depending on the crown closure as suggested by the forest management guidelines for Turkey (Anonymous 2008). We calculated descriptive statistics for basal area (G, m2 ha-1), and growing stock volume (GSV, m3 ha-1) using the SAS software®. (Table 2). G and GSV of each sample plot were calculated by Eq. 1 and Eq. 2 (Şenyurt 2017). In each sample plot, tree diameters at breast height (dbh) were measured with 0.1 cm-precision for each living tree (dbh ≥ 8 cm) using calipers. Trees with unsuppressed growth (due to lack of competition) representing potential site productivity were selected to assess the site index of each sample plot. Tree heights were measured with 1% precision using the Vertex IV ultrasound instrument. Top height was calculated as the mean height of the 100 tallest trees within a hectare. The mean age of each sample plot was measured using increment cores extracted from five trees that were selected based on the mean square diameter of the sample plot. Subsequently, the site index of each sample plot was assessed using the site index equation developed by Kalıpsız (1963) for black pine stands for a base age of 100 years. Relative density (RD) of each sample plot was calculated by the stand density index of Curtis (1982) (Eq.3). Satellite images, processing, and data Landsat 8 and Sentinel-2 satellite images were obtained from the United States Geological Survey Earth Explorer data portal (USGS 2000) and acquired on 02 and 20 August 2016, respectively. WGS 1984 (UTM Zone 36) projection system was used for orthorectification and georeferencing of the satellite images with first order nearest neighbourhood rules. The atmospheric correction was applied to Landsat 8 and Sentinel-2 images using the ATCOR module of QGIS® Software. Then, the geometric correction was applied using twenty ground control points such as crossings, bridges, and hill tops through a Global Positioning System (GPS). Inventory was carried out using circular plots of 400, 600 and 800 m2. A single pixel obtained from Landsat 8 covers an area of 900 m2 (30x30 m). Sentinel-2 has bands with a spatial resolution of 10, 20 and 60 m, which cover an area of 100, 200 and 3600 m2, respectively. Since Landsat 8 and Sentinel-2 spatial resolution (i.e. B9 and B10 bands of Sentinel-2) are higher than sample plot size, the position of some sample plots may not be the center of a pixel. In such a circumstance, the buffer zone was applied for obtaining more representative data. For those plots that did not coincide with the center of the corresponding pixel, the average of all those pixels comprised by a plot was calculated. Detail descriptions and formulae for vegetation indices calculated from Landsat 8 and Sentinel-2 bands can be found in Vescovo et al. (2012), Chrysafis et al. (2017), and Korhonen et al. (2017). Table 3 lists all bands and vegetation indices from both satellite sensors used in this study. The six bands with 30 m spatial resolution of Landsat 8 were used, i.e. B2 (minimum value= 450; maximum value=515 nm), B3 (525-600 nm), Bolat et al. New Zealand Journal of Forestry Science (2020) 50:4 Page 3 Variable Min. Max. Mean Std. Dev. Aspect (°) 0.00 354.09 221.98 120.30 Slope (°) 4.00 49.74 26.16 12.07 Elevation (m) 1264.00 1547.00 1434.53 71.66 TWI 3.00 16.17 5.70 2.19 STI 0.00 0.98 0.03 0.15 CI -50.00 32.38 1.71 13.05 TWI: Topographic Wetness Index, STI: Sediment Transport Index, CI: Convergence Index TABLE 1: Descriptive statistics of terrain attributes of the study area Variable #N Min. Max. Mean Std. Dev. CV (%) G (m2ha-1) 51 6.64 55.21 27.89 11.47 41.13 GSV (m3ha-1) 51 38.95 341.90 170.66 72.42 42.44 CV: coefficient of variation, #N: number of plots TABLE 2. Descriptive statistics of the forest inventory data B4 (630-680 nm), B5 (845-885 nm), B6 (1560-1660 nm) and B7 (2100-2300 nm). The twelve bands of Sentinel-2 were used, i.e. B2 (458-523 nm), B3 (543-578 nm), B4 (650-680 nm), B8 (785-900 nm) with 10 m resolution, B5 (698-713 nm), B6 (733-748 nm), B7 (773-793 nm), B8A (855-875 nm), B11 (1565-1655 nm), B12 (2100- 2280 nm) with 20 m resolution, and B9 (930-950 nm), and B10 (1365-1385 nm) with 60 m spatial resolution. Statistical and geostatistical analysis Multiple linear regression (MLR) MLR has been used to assess the relationships between forest structural attributes and remotely sensed data (Næsset 2002). We focused on the predictability of G and GSV using MLR and regression kriging (RK) based on Landsat 8, Sentinel-2 data, and terrain indices. For this purpose, the statistical significance of the terrain variables and of the variables obtained from Landsat 8 and Sentinel-2 were tested at 0.05 significance level using the forward variable selection technique in SAS® software. (4) Where; b0 is a constant coefficient, bi is the vector of independent variables, xi is an independent variable that accounts for variation of a dependent variable and e is the model residual. The model residuals are expected to have a normal distribution. Regression kriging (RK) The sample plots in the same location are inherently interdependent (spatially correlated). Therefore, in fitting a model, it is highly important to consider the spatial autocorrelation in data in order to improve the model performance. RK uses the spatial autocorrelation in the residuals from MLR, and therefore may improve the predictions, as MLR does not consider spatial autocorrelation in data. Kriging methods have been increasingly used for interpolating spatially dependent data in recent decades. RK consists of three steps. Initially, MLR is carried out to estimate the regression parameters. Then, the residuals from MLR are incorporated into ordinary kriging to account for prediction uncertainty using ArcGIS® software. Finally, the values of the target variable were calculated by adding the predictions of MLR and the kriged values of the residuals (Odeh et al. Bolat et al. New Zealand Journal of Forestry Science (2020) 50:4 Page 4 1995) (Figure 2). (5) Where: φ0 and φi are the estimated values by MLR, xi is an independent variable that accounts for dependent variable variation, n is the number of observations, λi is ordinary kriging weight and, ε is the kriged model residual at measurement locations. Predictive performance and validation of MLR and RK The repeated k-fold cross-validation with n repetition procedure was used to adequately exploit our small sample data (n=51) using SAS® software. Five-fold with 10 repetitions was applied in the cross-validation procedure, which allowed to reduce prediction biases. The data were randomly split into two groups (training and testing subsamples) using the unrestricted random sampling method at each repetition, and the different random subsamples were selected for the training and testing purposes. This process was repeated ten times for determining the best model parameters at the 0.05 significance level. Finally, the best predictive MLR and RK models were compared using goodness-of-fit statistics including the root-mean-square error (RMSE), the mean absolute error (MAE) and the adjusted coefficient of determination (R2adj) (Kozak & Kozak 2003) (Eq. 6 to 8). The similarities between the observed and the predicted values were assessed by the correlation coefficient (r). RK and MLR were further evaluated by plotting the residuals for each model. Also, the distribution of the relative error percent, RE (%), was used for model comparison. Satellite Indices Satellite Indices Satellite Indices Satellite Indices La nd sa t 8 B1-B7, B9 Se nt in el -2 S1-S12, S8a Se nt in el -2 MSRRE1 Se nt in el -2 NLINIRn2 NDVI NDVI MSRRE2 RSRRE1 EVI DVI MSRNIRn1 RSRRE2 SAVI MSRNIR MSRNIRn2 RSRNIRn1 MSAVI NLINIR NLIRE1 RSRNIRn2 NDMI RSRNIR NLIRE2 EVIRE1 NBR EVINIR NLINIRn1 EVIRE2 NBR2 CTVI TABLE 3. Bands (B) and vegetation indices of Landsat 8 and Sentinel-2 considered as explanatory variables for G and GSV estimation. For calculations of spectral indices see Chrysafis et al. (2017) Where; Di and Di are the observed and the predicted values, respectively, Di is the mean of the observed values, and nT and k are the total number of the observations and independent variables, respectively. Results We used multiple linear regression (MLR) to predict basal area (G) and growing stock volume (GSV) based on Landsat 8 and Sentinel-2 data and terrain indices choosing explanatory variable(s) at the 0.05 level of significance. The band values and vegetation indices of Landsat 8 and Sentinel-2 (Table 3) and terrain variables (Table 1) were analyzed to predict G and GSV. When Landsat 8 and the terrain variables were considered, EVI, elevation and STI independent variables were significant at the 0.05 level of significance. When Sentinel-2 and the terrain variables were considered, NLINIRn2, elevation and STI were found to be significant at P<0.05 (Table 5). Goodness-of-fit statistics for MLR and RK are given in Table 4. RK performed better than MLR for both prediction and validation datasets, particularly when using RK based on Landsat 8 data. In both data sets, the models based on Sentinel-2 data performed worse than those based on Landsat 8. In summary, the prediction and validation statistics suggested using RK based on Landsat 8 data to predict G and GSV (Table 4). Residual plots for G and GSV are displayed in Figure 3. The residual patterns of RK and MLR had no trend, suggesting that the predictions were unbiased for all cases. Figure 4a shows the relative errors (REs) for the model predictions against different values of G and GSV. The number of observed data used in the predictions is an important factor determining modeling performance or prediction quality. Interestingly, in general, Figure 4a indicated that the models performed better at G ≤20 and G ≥40 m2ha-1, while the number of observed data were lower at these Gs. However, we believe that those lower RE values for G predictions at ≤20 and ≥40 m2ha-1 could be biased. Figure 4b shows a similar case for the GSV models (the number of observed data are not shown in the figure). A graph of the observed versus modelled (predicted) Gs, are shown in Figure 5. These graphics suggest that G estimates obtained from Landsat 8 showed greater similarity to the observations than those obtained from Sentinel-2 (r=0.72). In contrast, GSV estimates obtained from Sentinel-2 were more precise than those obtained from Landsat 8 (r=0.75). The experimental semivariograms for the residuals of G and GSV are shown in Figure 6. Nugget values for Landsat 8 are lower than those for Sentinel-2. The nugget values are 70% for both G and GSV when using Landsat 8 and 70% for G and 65% for GSV when using Sentinel-2. Cambardella et al. (1994) proposed that a variable with a nugget ratio <25% is assumed to be strongly spatially dependent, between 25% and 75% moderately spatially Bolat et al. New Zealand Journal of Forestry Science (2020) 50:4 Page 5 ̭ ̵ FIGURE 2: Flowchart of regression kriging application used to spatially interpolate G and GSV dependent, and >75% weakly spatially dependent. The nugget value was quite high in our research field, suggesting small-scale variabilities in the study area, in other words, a weak autocorrelation in our data. The surface maps of G and GSV obtained by RK are shown in Figure 7. Discussion The goodness-of-fit-statistics suggested that RK models were adequate for both G and GSV predictions. Although the residual distributions showed that RK models were unbiased for G and GSV estimates, the relative error distributions suggested that RK models were biased for lower and higher values of G and GSV. Since plots within the same sampling unit are inherently correlated, the assumption of independence of observations is generally violated in the ordinary least square methods (e.g. multivariate regression) (Gregorie 1987). This feature results frequently in large error variance in residuals achieved by MLR (Fortin et al. 2007). In this study, the small error variance occurred due to low spatial dependence in the residuals Bolat et al. New Zealand Journal of Forestry Science (2020) 50:4 Page 6 Prediction set (n=30) Validation set (n=21) RMSE MAE RMSE MAE G Landsat 8 MLR 9.23 7.22 0.34 12.44 8.31 0.28 RK 7.84 5.89 0.52 10.30 8.98 0.28 Sentinel-2 MLR 9.08 7.36 0.36 17.11 9.44 0.15 RK 8.40 6.62 0.45 9.79 7.97 0.12 GSV Landsat 8 MLR 57.39 45.24 0.36 73.93 51.49 0.14 RK 49.68 37.54 0.52 62.86 53.32 0.31 Sentinel-2 MLR 59.55 46.85 0.31 103.37 58.77 0.15 RK 47.47 37.20 0.56 68.44 57.35 0.17 TABLE 4. The goodness-of-fit statistics of the fitted MLR and RK models for G (m2 ha-1) and GSV (m3 ha-1) in the Anatolian black pine forest. Note: Bold numbers indicate best G and GSV models Response Variable Parameter Estimate S.E. t-Value p-Value Ga Intercept 122.548 35.196 3.482 0.002 EVI 104.102 29.968 3.474 0.002 STI -22.835 8.472 -2.695 0.013 Elevation -0.091 0.024 -3.835 0.001 GSVa Intercept 766.719 221.901 3.455 0.002 EVI 690.130 188.942 3.653 0.001 STI -137.790 53.416 -2.580 0.016 Elevation -0.581 0.149 -3.891 0.001 Gb Intercept -49155.840 16092.504 -3.055 0.005 NLINIRn2 49282.437 16090.139 3.063 0.005 STI -18.741 10.135 -1.849 0.076 Elevation -0.055 0.026 -2.130 0.043 GSVb Intercept -212171.607 86662.410 -2.448 0.022 NLINIRn2 213198.437 86702.293 2.459 0.022 STI -136.718 59.549 -2.296 0.031 Elevation -0.532 0.166 -3.214 0.004 TABLE 5. Parameter estimates, standard errors (S.E.), t-values, and p-values of RK models predicting basal area (G, m2 ha-1) and growing stock volume (GSV, m3 ha-1) as a function of Landsat 8, Sentinel-2 data and terrain variables for the Anatolian black pine a Landsat 8 data; b Sentinel-2 data R2adj R 2 adj (Figure 6). While MLR may be more suitable in the case of low spatial dependency in the data, it does not take into account the spatial structure in data (Palmer et al. 2009). RK has the capability to account for these structures in continuous variables and has the advantages of the variable selection (Odeh et al. 1995). Therefore, we recommend the use of RK thus providing the surface map of the predicted values and accounting for measurement uncertainties. Bolat et al. New Zealand Journal of Forestry Science (2020) 50:4 Page 7 The study area is located in the Anatolian steppe transitional zone, covered by a semi-arid ecosystem (Göl & Abay 2003). In this region, tree (or stand) growth is limited due to the ecological conditions such as low precipitation, extreme temperature, and also poor soils. In this type of ecosystems, the General Directorate of Forestry of Turkey frequently recommends the Anatolian black pine for afforestation (Konukcu 2001). However, the harsh environmental conditions may lead FIGURE 3. Residuals for G and GSV against predicted values from the RK and MLR models based on Landsat 8 and Sentinel-2 data FIGURE 4. Line plots of relative errors, RE (%), of RK and MLR based on Landsat 8 and Sentinel-2 data for: (a) basal area (G); and (b) growing stock volume (GSV). TABLE 2: Confusion matrix Bolat et al. New Zealand Journal of Forestry Science (2020) 50:4 Page 8 FIGURE 5. Scatterplots of observed and predicted values of G and GSV using RK based on Landsat 8 and Sentinel-2 data FIGURE 6. Experimental (circles) and theoretical (lines) semivariograms for basal area (G) and growing stock volume (GSV) based on Landsat 8 and Sentinel-2 data to highly variable stands. Therefore, the explanatory variables (e.g. NDVI) are likely to be less correlated with the response variable, as suggested by our study. In our view, the forest vegetation structure limited the performance of RK, as the study site was composed of poor growth and thinned patches. In other words, sample plots with similar diameters at breast height may differ in tree heights ranging from 20 to 24 m (8.2%), 15 to 19 m (57.4%) and 10 to 14 m (34.4%). These conditions led to poor correlations between the spectral data and the ground measurements. Another reason for sub- optimal predictions was the thinning treatments in the study area. Since the stands were partially thinned, the sample plots with similar G showed the differences in terms of crown closure and stem density, which results in a discrepancy between the spectral data and the stand characteristics, leading to imprecise and biased models. Thinning treatments and harsh environmental conditions led to the spatial discontinuities, suggesting a weak spatial dependency of the G and GSV in the study area. In other words, the high short-range variability as evidenced by high nugget effect (~70%) occurred in the data (Dai et al. 2014; Gilbert & Lowell 1997; Gunnarsson et al. 1998; Maselli & Chiesi 2006; Viana et al. 2012). The nugget effect is one of the key factors decreasing the interpolation quality by kriging (Isaaks & Srivastava 1989). Our results suggest that a systematic sampling (200 x 200 m grid) resulted in a high nugget effect in a semi-arid region (nugget effect ~70% for our data). In this context, the results of this study are important, evidencing that the sampling distance and scheme used in National Forest Inventory is not proper for geostatistical studies and that additional random subsamples are needed in finer resolution for a successful geostatistical analysis and interpolation. In these types of forests, Destan et al. (2013) and Corona et al. (2014) advised that inventory should be carried out with smaller sampling distances and that including more auxiliary variables representing spatial variability (e.g. soil and topography) may improve the performance of the kriging methods. In this study, aspect, slope, elevation, TWI, STI, and CI were used as explanatory variables to improve the predictions. Along with EVI indices of Landsat 8, NLINIRn2 indices of Sentinel-2, elevation and STI contributed to explain the total variance of G and GSV at the 0.05 significance level. Consequently, our results are promising as RK explained approximately fifty percent of total variance observed in both G and GSV obtained from Landsat 8 and Sentinel-2. Conclusions EVI obtained from Landsat 8, NLINIRn2 obtained from Sentinel-2, elevation and STI were the best independent variables explaining G and GSV. RK performed adequately to predict G from Landsat 8 and GSV from Sentinel-2. RK and MLR had similar residual scatterplots. However, the relative error distributions showed that especially in GSV estimates RK based on Landsat 8 performed better than MLR based on both Landsat 8 and Sentinel-2. A highly random distribution of G and GSV occurred in our study site as shown by high nugget variance, suggesting a weak autocorrelation from a geostatistical perspective, which might be attributed to the coexistence in short distances of high-low productive patches, the coarse sampling scheme and undefined thinning practices. This suggests the sampling scheme may have been inadequate to detect the autocorrelation in G and GSV. Across such sites, a geostatistical sampling scheme should be performed with shorter sampling distances to improve modeling of semivariograms around the origin, which may help to decrease the nugget effect. Sampling sites having equal distributions of site index values can increase the performance of MLR and RK. The sampling sites with thinning practices should be excluded to improve the variance explained. In conclusion, we found that reasonably accurate predictions for forest planning can be achieved using Landsat 8 and Sentinel-2 through the RK method. List of abbreviations dbh: Diameter breast height G: Basal area GSV: Growing stock volume MAE: Mean absolute error MLR: Multiple linear regression R2adj: Adjusted coefficient of determination RD: Relative density RE: Relative error RK: Regression kriging RMSE: Root-mean-square error Competing interests The authors have no competing interests. Authors’ contributions All authors participated in the forest inventory. FB and SB analysed the data, interpreted the results, and wrote the entire paper. Bolat et al. New Zealand Journal of Forestry Science (2020) 50:4 Page 9 FIGURE 7. The surface maps of G and GSV predictions from the RK model based on Landsat 8 and Sentinel-2 data Acknowledgements The authors thank Dr. Sabit Erşahin and two anonymous reviewers for his valuable comments which helped to improve the quality of the manuscript. Also, the authors thank Nesrin Bolat and Rabia Kalkan for helping to collect data in the field. This study was supported by Çankırı Karatekin University (Project No.: OF090316B05). References Anonymous. (2008). Forest management guidelines, Republic of Turkey, General Directorate of Turkey, Forest Management and Planning Department. Ankara. Barbati, A., Marchetti, M., Chirici, G., & Corona, P. (2014). European forest types and forest Europe SFM indicators: tools for monitoring progress on forest biodiversity conservation. Forest Ecology and Management, 321, 145-157. https://doi. org/10.1016/j.foreco.2013.07.004 Cambardella, C., Moorman, T., Parkin, T., Karlen, D., Novak, J., Turco, R., & Konopka, A. (1994). Field- scale variability of soil properties in central Iowa soils. Soil Science Society of America Journal, 58(5), 1501-1511. https://doi.org/10.2136/ sssaj1994.03615995005800050033x Chirici, G., Barbati, A., Corona, P., Marchetti, M., Travaglini, D., Maselli, F., & Bertini, R. (2008). Non-parametric and parametric methods using satellite images for estimating growing stock volume in alpine and Mediterranean forest ecosystems. Remote Sensing of Environment, 112(5), 2686-2700. https://doi. org/10.1016/j.rse.2008.01.002 Chrysafis, I., Mallinis, G., Siachalou, S., & Patias, P. (2017). Assessing the relationships between growing stock volume and Sentinel-2 imagery in a Mediterranean forest ecosystem. Remote Sensing Letters, 8(6), 508-517. https://doi.org/10.1080/215070 4X.2017.1295479 Contreras, M.A., Affleck, D., & Chung, W. (2011). Evaluating tree competition indices as predictors of basal area increment in western Montana forests. Forest Ecology and Management, 262(11), 1939-1949. https://doi.org/10.1016/j.foreco.2011.08.031 Corona, P., Fattorini, L., Franceschi, S., Chirici, G., Maselli, F., & Secondi, L. (2014). Mapping by spatial predictors exploiting remotely sensed and ground data: A comparative design-based perspective. Remote Sensing of Environment, 152, 29-37. https://doi.org/10.1016/j.rse.2014.05.011 Curtis, R.O. (1982). A simple index of stand density for Douglas-fir. Forest Science, 28(1), 92-94. Çolak, A.H., & Rotherham, I.D. (2006). A review of the forest vegetation of Turkey: its status past and present and its future conservation. Paper presented at the Biology and Environment: Proceedings of the Royal Irish Academy. https://doi.org/10.3318/ BIOE.2006.106.3.343 Dai, F., Zhou, Q., Lv, Z., Wang, X., & Liu, G. (2014). Spatial prediction of soil organic matter content integrating artificial neural network and ordinary kriging in Tibetan Plateau. Ecological Indicators, 45, 184-194. https://doi.org/10.1016/j.ecolind.2014.04.003 Destan, S., Yilmaz, O., & Sahin, A. (2013). Making objective forest stand maps of mixed managed forest with spatial interpolation and multi-criteria decision analysis. iForest-Biogeosciences and Forestry, 6(5), 268. https://doi.org/10.3832/ifor0099-006 dos Reis, A.A., Carvalho, M.C., de Mello, J.M., Gomide, L.R., Ferraz Filho, A.C., & Junior, F.W.A. (2018). Spatial prediction of basal area and volume in Eucalyptus stands using Landsat TM data: an assessment of prediction methods. New Zealand Journal of Forestry Science, 48(1), 1. https://doi. org/10.1186/s40490-017-0108-0 Erşahin, S., Bilgili, B.C., Dikmen, Ü., & Ercanli, I. (2016). Net primary productivity of Anatolian forests in relation to climate, 2000–2010. Forest Science, 62(6), 698-709. https://doi.org/10.5849/ forsci.15-171 Fortin, M., Daigle, G., Ung, C.-H., Bégin, J., & Archambault, L. (2007). A variance-covariance structure to take into account repeated measurements and heteroscedasticity in growth modeling. European Journal of Forest Research, 126(4), 573-585. https://doi.org/10.1007/s10342-007-0179-1 Franco-Lopez, H., Ek, A.R., & Bauer, M.E. (2001). Estimation and mapping of forest stand density, volume, and cover type using the k-nearest neighbors method. Remote Sensing of Environment, 77(3), 251-274. https://doi.org/10.1016/S0034- 4257(01)00209-7 Gebreslasie, M., Ahmed, F., & van Aardt, J. (2008). Estimating plot-level forest structural attributes using high spectral resolution ASTER satellite data in even-aged Eucalyptus plantations in southern KwaZulu-Natal, South Africa. Southern Forests: a Journal of Forest Science, 70(3), 227-236. https:// doi.org/10.2989/SF.2008.70.3.6.667 Gilbert, B., & Lowell, K. (1997). Forest attributes and spatial autocorrelation and interpolation: effects of alternative sampling schemata in the boreal forest. Landscape and Urban Planning, 37(3-4), 235-244. https://doi.org/10.1016/S0169-2046(97)80007- 2 Göl, C., & Abay, G. (2003). Researches on General Soil Properties and Nature Plant Species of Plantatian Area of Cankın Forestry Faculty of Ankara University in Turkish. Journal of South-West Anatolia Forest Research Institute, 5(211). Gregorie, T.G. (1987). Generalized error structure for forestry yield models. Forest Science, 33(2), 423- 444. Gunnarsson, F., Holm, S., Holmgren, P., & Thuresson, T. (1998). On the potential of kriging for forest management planning. Scandinavian Journal of Bolat et al. New Zealand Journal of Forestry Science (2020) 50:4 Page 10 Forest Research, 13(1-4), 237-245. https://doi. org/10.1080/02827589809382981 Ingram, J.C., Dawson, T.P., & Whittaker, R.J. (2005). Mapping tropical forest structure in southeastern Madagascar using remote sensing and artificial neural networks. Remote Sensing of Environment, 94(4), 491-507. https://doi.org/10.1016/j. rse.2004.12.001 Isaaks, E.H., & Srivastava, R.M. (1989). An introduction to applied geostatistics. Oxford University Press, 575 p. Kalıpsız, A. (1963). Türkiye’de karaçam meşcerelerinin tabii bünyesi ve verim kudreti üzerine araştırmalar. OGM Yayınları, İstanbul. Konukcu, M. (2001). Ormanlar ve Ormancılığımız, Devlet Planlama Teşkilatı Yayını, No; 2630, Ankara. Korhonen, L., Packalen, P., & Rautiainen, M. (2017). Comparison of Sentinel-2 and Landsat 8 in the estimation of boreal forest canopy cover and leaf area index. Remote Sensing of Environment, 195, 259-274. https://doi.org/10.1016/j. rse.2017.03.021 Kozak, A., & Kozak, R. (2003). Does cross validation provide additional information in the evaluation of regression models? Canadian Journal of Forest Research, 33(6), 976-987. https://doi. org/10.1139/x03-022 Kuehne, C., Weiskittel, A.R., Wagner, R.G., & Roth, B.E. (2016). Development and evaluation of individual tree-and stand-level approaches for predicting spruce-fir response to commercial thinning in Maine, USA. Forest Ecology and Management, 376, 84-95. https://doi.org/10.1016/j. foreco.2016.06.013 Lu, D., Mausel, P., Brondızio, E., & Moran, E. (2004). Relationships between forest stand parameters and Landsat TM spectral responses in the Brazilian Amazon Basin. Forest Ecology and Management, 198(1-3), 149-167. https://doi.org/10.1016/j. foreco.2004.03.048 Magnussen, S., Mauro, F., Breidenbach, J., Lanz, A., & Kändler, G. (2017). Area-level analysis of forest inventory variables. European journal of forest research, 136(5-6), 839-855. https://doi. org/10.1007/s10342-017-1074-z Mallinis, G., Koutsias, N., Makras, A., & Karteris, M. (2004). Forest parameters estimation in a European Mediterranean landscape using remotely sensed data. Forest Science, 50(4), 450-460. Maselli, F., & Chiesi, M. (2006). Evaluation of statistical methods to estimate forest volume in a Mediterranean region. IEEE Transactions on Geoscience and Remote Sensing, 44(8), 2239-2250. https://doi.org/10.1109/TGRS.2006.872074 Meng, Q., Cieszewski, C., & Madden, M. (2009). Large area forest inventory using Landsat ETM+: a geostatistical approach. ISPRS Journal of Photogrammetry and Remote Sensing, 64(1), 27-36. https://doi.org/10.1016/j.isprsjprs.2008.06.006 Næsset, E. (2002). Predicting forest stand characteristics with airborne scanning laser using a practical two-stage procedure and field data. Remote Sensing of Environment, 80(1), 88-99. https://doi. org/10.1016/S0034-4257(01)00290-5 O’hehir, J., & Nambiar, E. (2010). Productivity of three successive rotations of P. radiata plantations in South Australia over a century. Forest Ecology and Management, 259(10), 1857-1869. https://doi. org/10.1016/j.foreco.2009.12.004 Odeh, I.O., McBratney, A., & Chittleborough, D. (1995). Further results on prediction of soil properties from terrain attributes: heterotopic cokriging and regression-kriging. Geoderma, 67(3-4), 215-226. https://doi.org/10.1016/0016-7061(95)00007-B Palmer, D.J., Höck, B., Kimberley, M.O., Watt, M.S., Lowe, D.J., & Payn, T.W. (2009). Comparison of spatial prediction techniques for developing Pinus radiata productivity surfaces across New Zealand. Forest Ecology and Management, 258(9), 2046-2055. https://doi.org/10.1016/j.foreco.2009.07.057 Şenyurt, M. (2017). ÇNÜ Araştırma ormanı Karaçam meşcereleri için artım ve büyüme tahminlerinin yapay sinir ağları tekniği ile elde edilmesi. Project No: OF090316B05. Turkish State Meteorological Service. (n.d.) Retrieved from https://mgm.gov.tr/eng/forecast-cities.aspx Vescovo, L., Wohlfahrt, G., Balzarolo, M., Pilloni, S., Sottocornola, M., Rodeghiero, M., & Gianelle, D. (2012). New spectral vegetation indices based on the near-infrared shoulder wavelengths for remote detection of grassland phytomass. International Journal of Remote Sensing, 33(7), 2178-2195. https://doi.org/10.1080/01431161.2011.607195 Viana, H., Aranha, J., Lopes, D., & Cohen, W.B. (2012). Estimation of crown biomass of Pinus pinaster stands and shrubland above-ground biomass using forest inventory data, remotely sensed imagery and spatial prediction models. Ecological Modelling, 226, 22-35. https://doi.org/10.1016/j. ecolmodel.2011.11.027 Bolat et al. New Zealand Journal of Forestry Science (2020) 50:4 Page 11