Agricultural and Food Science, Vol. 18(2009): 302-316



© Agricultural and Food Science 
Manuscript received February 2009

Role of benchmark technology in sustainable  
value analysis 

An application to Finnish dairy farms

Timo Kuosmanen 1,2 and Natalia Kuosmanen1
1Economic Research Unit, MTT Agrifood Research Finland, Luutnantintie 13, FI-00410 Helsinki, Finland 

2Department of Business Technology, Helsinki School of Economics, PO Box 1210, FI-00101 Helsinki, Finland, 
e-mail: firstname.lastname@mtt.fi

Sustainability is a multidimensional concept that entails economic, environmental, and social aspects. The 
sustainable value (SV) method is one of the most promising attempts to quantify the sustainability performance 
of firms. SV compares the performance of a firm to a benchmark, which must be estimated in one way or 
another. This paper examines alternative parametric and nonparametric methods for estimating the benchmark 
technology from empirical data. The reviewed methods are applied to empirical data on 332 Finnish dairy 
farms. The application yields four conclusions. First, the greater flexibility of the nonparametric 
methods is evident from their better empirical fit. Second, negative skewness of the regression residuals of 
both parametric OLS and nonparametric CNLS speaks against the average-practice benchmark technology 
in this application. Third, high positive correlations across a wide spectrum of methods suggest that the 
findings are relatively robust. Fourth, the stochastic decomposition of the disturbance term to filter out the noise 
component from the inefficiency term yields more realistic efficiency estimates and performance targets.

Key-words: benchmarking, eco-efficiency, environmental performance, productive efficiency analysis, 
stochastic frontier estimation, sustainable value analysis, sustainable development.



Introduction

Measuring corporate contributions to sustainability 
has attracted increasing attention in recent years. 
A number of different practical approaches have 
been suggested (see e.g. Tyteca 1998). One of the 
most promising developments is sustainable value 
(SV), introduced by Figge and Hahn (2004, 2005). 
SV is a systematic economic approach for measur-
ing sustainable value creation of firms.1 A firm is 
said to create sustainable value whenever it uses its 
bundle of resources more efficiently than another 
firm would have used it. In principle, reallocating 
resources from firms that create negative sustainable 
value to firms that create positive sustainable value 
can increase the economic welfare while keeping 
all stocks of capital in the economy at a constant 
level. Thus, firms creating sustainable value would 
be able to compensate for any rebound effects that 
might occur. 

The recent study by Kuosmanen and Kuosmanen 
(2009) criticizes Figge and Hahn’s original SV 
estimator for making strong, unrealistic 
assumptions about a linear benchmark technology 
that is identified by just a single data point. 
Building an explicit link between the SV method and 
the frontier approach to environmental performance 
assessment,2 Kuosmanen and Kuosmanen propose 
using a more general benchmark technology, 
which can be estimated from empirical data using 
established econometric methods such as stochastic 
frontier analysis (SFA) or data envelopment 
analysis (DEA) (see e.g. Fried et al. 2007 for an 
up-to-date review of these methods).

The purpose of this paper is to provide a de-
tailed examination and classification of alternative 
methods available for estimating the benchmark 

1  By “firm” we refer to any productive unit, which 
may be a private or public organization, or an aggregated 
entity such as an industry, sector or country.  
2  The frontier approach to environmental perform-
ance has a large and growing literature: see e.g. Färe et 
al. (1996), Tyteca (1996, 1997, 1998), Callens and Tyteca 
(1999), Zaim (2004), Kuosmanen and Kortelainen (2005, 
2007a), Cherchye and Kuosmanen (2006) and Kortelain-
en and Kuosmanen (2007), and references therein.

technology in the context of SV analysis. On the 
one hand, methods can be classified according to 
whether an average-practice or best-practice tech-
nology is estimated. The best-practice technolo-
gies can be further classified as deterministic and 
stochastic technologies, depending on whether a 
stochastic noise term is included or not. On the 
other hand, the methods can be classified as being 
parametric and nonparametric in their orientation. 
Parametric methods assume a specific functional 
form of the production function, which is usually 
linear in its parameters. Nonparametric methods 
do not assume a particular functional form, but es-
timate the benchmark technology based on some 
minimal set of axioms. In this paper we restrict 
attention to the standard monotonicity and concavity 
axioms; other possible sets of axioms fall beyond the 
scope of this paper.3 

In addition to reviewing the theoretical prop-
erties and practical implementation of alternative 
methods, a critical examination of advantages and 
disadvantages of alternative methods is presented. 
To this end, we apply the alternative methods 
included in the review to data from a sample of 332 
Finnish dairy farms. The data are obtained from the 
Farm Accountancy Data Network (FADN) database, 
and they can be seen as a typical data set used 
in SV assessments in the agricultural sector. The 
results of the empirical analysis reveal the critical 
role of parametric functional form assumptions on 
one hand, and the importance of accounting for 
stochastic noise on the other. 

It should be noted that this study is one of the 
first empirical applications of the recently 
developed stochastic nonparametric envelopment of 
data (StoNED: Kuosmanen 2006, Kuosmanen and 
Kortelainen 2007b) and corrected convex 
nonparametric least squares (C2NLS: Kuosmanen and 
Johnson 2009) methods. These two 
methods are based on nonparametric least squares 
estimation subject to shape constraints 
(monotonicity, concavity) on the benchmark technology. 

3  There are also other nonparametric methods such as 
the kernel estimation, which are based on local averaging 
(e.g. Fan et al. 1996). Such non-axiomatic methods fall 
beyond the scope of this paper.



Kuosmanen, T. & Kuosmanen, N. Benchmark technology in sustainable value analysis


The purpose of these methods is to bridge the gap 
between the existing parametric and nonparametric 
methods by combining the appealing characteris-
tics of both approaches. Therefore, comparing the 
results of different estimation techniques in a rela-
tively large sample of farms is also of methodo-
logical interest, providing insights and guidance 
for further methodological development.   

Sustainable Value 

Consider a production process where R resources 
(including natural, physical, human, and intellectual 
capital) are transformed into economic output (e.g. 
gross or net value added, or some physical output). 
The resource use by firm i  is characterized by vec-
tor xi=(xi1...xiR)', and the economic output of firm 
i is denoted by yi . According to Figge and Hahn 
(2004, 2005), firm i creates sustainable value if 
the economic output exceeds the opportunity cost 
of resource use. Thus, the SV measure is defined 
as the difference of output yi and the opportunity 
cost of xi. It is worth emphasizing that this relative 
measure does not tell if a particular firm is sustain-
able or not. It measures sustainability performance 
in a relative sense: by reallocating resources from 
firms with negative SV to those with positive value, 
a higher economic welfare could be achieved without 
increasing the total resource use of the economy. 

To calculate SV, we need to know the opportu-
nity cost of resources. Kuosmanen and Kuosmanen 
(2009) argue that the opportunity cost is not direct-
ly observable, but must be estimated from data in 
one way or another. In economics, the opportunity 
cost of using a resource for a specific activity refers 
to the income foregone by not using the resource 
in the best alternative activity. However, the best 
alternative use is not always self-evident. Kuos-
manen and Kuosmanen argue that the best alterna-
tive use of a resource depends on both the available 
technology and the other resources available for the 
alternative activity. 

To develop a rigorous definition of SV, Kuos-
manen and Kuosmanen (2009) characterize the 

benchmark technology as a production function 
$f: \mathbb{R}_+^R \rightarrow \mathbb{R}_+$, which indicates the maximum amount 
of output that the benchmark technology can pro-
duce using the given amounts of input resources. 
They interpret the numerical value of the produc-
tion function f(x) as the total opportunity cost of re-
source bundle x, and the partial derivative ∂f(x)/∂xr 
as the marginal opportunity cost of resource r in 
point x. In general, the marginal opportunity cost 
need not be constant but depends on the amount of 
other resources available. 

The production function development of Kuos-
manen and Kuosmanen (2009) implies the follow-
ing general definition of SV:

SVi = yi  – f(xi).  (1)

The rationale behind identity (1) comes from the 
conceptual definition by Figge and Hahn, but 
Kuosmanen and Kuosmanen’s definition is more general 
because it does not assume linearity or any other 
particular functional form of f. More specifically, 
Figge and Hahn’s (2004) original measure of SV 
is a special case of (1), obtained by specifying f 
as a linear function $f(x)=\sum_{r=1}^{R}\beta_r x_r$, where coefficients 
$\beta_r = y^*/x_r^*$ represent eco-efficiency of a pre-defined 
benchmark unit $(y^*, x^*)$ in terms of resource r. 

Interestingly, identity (1) defines SV as a 
residual between the observed output and the 
production function. Simply reorganizing identity (1) and 
introducing a random disturbance vi, we obtain the 
regression equation

yi = f(xi) + εi = f(xi) + SVi + vi ,  (2)

where εi represents a composite disturbance term 
that consists of differences in sustainability perfor-
mance across firms (i.e., sustainable value SVi ), and 
(optionally) the effects of measurement errors, differ-
ences in unobserved or omitted variables, and other 
deviations from the production function f, captured 
by the random noise term vi. From this perspective, 
the generalized SV formulation (1) conforms with 
the classic approach to measuring performance dif-
ferences across firms based on regression residuals 
(e.g. Timmer 1971, Richmond 1974).
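
As a minimal sketch of identity (1), the following Python fragment computes SVi = yi − f(xi) for an arbitrary benchmark function f and illustrates it with a linear benchmark. The function names, the coefficient values, and the two firms are hypothetical and serve only to show the arithmetic.

```python
def linear_benchmark(betas):
    """Return f(x) = sum_r beta_r * x_r for given (hypothetical) coefficients."""
    def f(x):
        return sum(b * x_r for b, x_r in zip(betas, x))
    return f


def sustainable_value(y, x, f):
    """Identity (1): observed output minus the opportunity cost f(x)."""
    return y - f(x)


# Hypothetical benchmark coefficients for R = 2 resources.
f = linear_benchmark([2.0, 0.5])

# Two hypothetical firms: (output y, resource vector x).
firms = {"A": (35.0, (10.0, 20.0)), "B": (30.0, (8.0, 40.0))}
sv = {name: sustainable_value(y, x, f) for name, (y, x) in firms.items()}
print(sv)  # firm A lies above the benchmark, firm B below it
```

Any estimated benchmark technology could be passed in place of the linear f, since the function only needs a callable that maps a resource vector to an output level.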



Estimating benchmark technology

Classification

We next review alternative methods available 
for estimating the benchmark technology. Taking 
equation (2) as a starting point, we will classify 
the methods into six categories according to how the 
production function f and the composite disturbance 
εi  are specified. 

Firstly, methods can be classified as parametric 
or nonparametric depending on the specification 
of the production function f. Parametric methods 
postulate a priori some specific functional form 
for f (e.g. Cobb-Douglas, translog, or others) and 
subsequently estimate its unknown parameters. By 
contrast, nonparametric methods do not restrict f to 
any single functional form, but assume only that f 
satisfies certain regularity axioms (e.g. monotonicity 
and concavity). 

Secondly, methods can be classified according 
to the interpretation of the composite disturbance εi 
and the sustainable value SVi as average-practice 
or best-practice approaches. In the average-practice 
approaches, εi may be positive or negative, and no 

attempt is made to isolate the differences in sustain-
ability performance SVi from the noise term vi. As 
a result, the estimated technology f represents the 
average practice in the sample. The best-practice 
approaches generally estimate the frontier (i.e., the 
maximum output that can be produced with the giv-
en resources). The best-practice approaches can be 
further classified into deterministic and stochastic 
methods. The deterministic best-practice approaches 
assume away the noise term (vi = 0 ∀i = 1,...,n) and 
assign all deviations from the benchmark to one-sided 
inefficiency, implying SVi ≤ 0 ∀i = 1,...,n. In 
stochastic best-practice approaches we interpret εi as 
a composite error term, from which the subcomponents 
of inefficiency and noise (vi ≠ 0, SVi ≤ 0) can be 
estimated and isolated. 

Combining the above described criteria gives 
us six different categories, as described in Table 
1 together with some canonical references. In the 
following sub-sections, each of these six types of 
methods is described in more detail. We start from 
the parametric ordinary least squares (OLS), and 
its best-practice variants parametric programming 
(PP), corrected ordinary least squares (COLS), 
and the stochastic frontier analysis (SFA). We then 
proceed to the nonparametric approaches: convex 

Table 1. Classification of methods

                             | Parametric (f linear in parameters)          | Nonparametric (f increasing and concave)
Average-practice             | OLS: Cobb and Douglas (1928)                 | CNLS: Hildreth (1954); Hanson and Pledger (1976)
Best-practice, deterministic | PP: Aigner and Chu (1968); Timmer (1971)     | DEA: Farrell (1957); Charnes, Cooper, Rhodes (1978)
                             | COLS: Winsten (1957); Greene (1980)          | C2NLS: Kuosmanen and Johnson (2009)
Best-practice, stochastic    | SFA: Aigner, Lovell, and Schmidt (1977); Meeusen and van den Broeck (1977) | StoNED: Kuosmanen (2006); Kuosmanen and Kortelainen (2007b)

Abbreviations: OLS = ordinary least squares, PP = parametric programming, COLS = corrected ordinary least squares, SFA = stochastic 
frontier analysis, CNLS = convex nonparametric least squares, DEA = data envelopment analysis, C2NLS = corrected convex 
nonparametric least squares, StoNED = stochastic nonparametric envelopment of data.



nonparametric least squares (CNLS), data envel-
opment analysis (DEA), corrected convex non-
parametric least squares (C2NLS), and stochastic 
nonparametric envelopment of data (StoNED).

Ordinary Least Squares (OLS)
OLS is the most standard and traditional estima-
tion technique in econometrics and statistics (e.g. 
Greene 2007). OLS can be a useful method for 
estimating average practice benchmarks when one 
is only interested in the composite εi term, and 
the distinction between the pure sustainable value 
(SVi) and the stochastic noise (vi) can be ignored. 
However, consistency of OLS requires that the 
composite εi has a symmetric distribution. More 
specifically, estimation of SV by OLS requires 
the following Gauss-Markov assumptions: 1) εi 
are random variables that are uncorrelated with 
the resource use xi and across observations, 2) the 
conditional distribution of εi has zero mean (i.e. 
E(εi|xi)=0), and 3) the production function f(xi) is 
linear (as implicitly assumed by Figge and Hahn). 
By assumption 3), equation (2) can be expressed 
as 

yi=α+β'xi+ εi .     (3)

Minimizing the sum of squares of the disturbances 
εi, we obtain the closed form solution 

$\hat{\varepsilon}_i = y_i - (1, x_i')(X'X)^{-1}X'y$,

where X is the matrix whose i-th row is (1, xi') for 
i = 1,...,n, and vector y = (y1...yn)'. 
Under assumptions 1)–3), the OLS estimator is 
unbiased, consistent, and has the smallest variance 
among all linear unbiased estimators (i.e., OLS is the 
best linear unbiased estimator (BLUE)) (Greene 2007). 
Moreover, if we further assume that εi are normally 
distributed, then OLS is the maximum likelihood 
estimator, and the conventional methods of 
statistical inference apply. 
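
The estimator can be sketched for the single-resource case (R = 1), where the closed form reduces to the familiar slope and intercept formulas. The function name and the five data points below are illustrative assumptions; the residuals play the role of the composite term εi in an average-practice analysis.

```python
def ols_fit(x, y):
    """Fit y = alpha + beta * x by ordinary least squares (one regressor)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((x_i - x_bar) ** 2 for x_i in x)
    s_xy = sum((x_i - x_bar) * (y_i - y_bar) for x_i, y_i in zip(x, y))
    beta = s_xy / s_xx
    alpha = y_bar - beta * x_bar
    # Residuals: observed output minus the fitted average-practice benchmark.
    residuals = [y_i - alpha - beta * x_i for x_i, y_i in zip(x, y)]
    return alpha, beta, residuals


# Hypothetical sample: resource use x and output y of five firms.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.5, 5.0, 8.5, 9.0]
alpha, beta, residuals = ols_fit(x, y)
print(alpha, beta, residuals)
```

Because the model includes an intercept, the residuals sum to zero, so some firms necessarily receive a positive and others a negative estimate.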

Let us take a closer look at the OLS assump-
tions. Firstly, assumption 3) of linear functional 
form can be easily relaxed; OLS can be applied as 
long as f is a linear function of the unknown param-
eters α,β, which does not mean that f is necessarily 
a linear function of resources xi . For example, the 
log-linear Cobb-Douglas function is nonlinear in 
xi but linear in parameters α,β. Still, the functional 

form of f must be assumed a priori, which intro-
duces a risk of specification error. 

Secondly, assumption 2) implies that the bench-
mark technology represents the average practice:  εi  
can be positive or negative, with the expected value 
zero. Importantly, if the composite disturbance εi  
contains an asymmetric inefficiency component  
SVi ≤0, as commonly assumed in the frontier ap-
proach, the assumption 2) will be violated. If that 
is the case, the OLS estimator will be inconsistent 
and biased (see Kuosmanen and Fosgerau 2009). 
Introducing an asymmetric inefficiency term SVi ≤0  
leads us to the best-practice methods, PP, COLS, 
and SFA, to be considered next. 
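
One informal way to probe assumption 2) is to look at the skewness of the residuals: a one-sided term SVi ≤ 0 tends to pull the residual distribution to the left. The sketch below computes the sample skewness coefficient; the function name and the example residuals are hypothetical, and the check is a diagnostic assumed here for illustration rather than a formal test.

```python
def sample_skewness(residuals):
    """Third central moment divided by the 1.5 power of the second central moment."""
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((e - mean) ** 2 for e in residuals) / n
    m3 = sum((e - mean) ** 3 for e in residuals) / n
    return m3 / m2 ** 1.5


# Hypothetical residuals with one large negative deviation.
residuals = [-3.0, 1.0, 1.0, 1.0]
skew = sample_skewness(residuals)
print(skew)  # negative: the left tail is longer than the right tail
```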

Thirdly, violations of assumption 1) have been 
extensively studied in econometrics and there are 
methods for dealing with problems of endogene-
ity and serial correlation (see e.g. Greene 2007). 
For brevity, we here abstract from violations of 
assumption 1).

Parametric programming (PP)
Aigner and Chu (1968) were the first to estimate a 
best-practice frontier with parametric regression 
techniques. Their parametric programming (PP) 
model can be seen as a deterministic frontier 
variant of the regression model (3), obtained by setting 
vi = 0 and SVi ≤ 0 ∀i = 1,...,n. The PP problem is

$\min_{\alpha,\beta,SV} \left\{ \sum_{i=1}^{n} SV_i^2 \;\middle|\; SV_i \le 0 \ \forall i=1,\dots,n;\ y_i = \alpha + \beta' x_i + SV_i \ \forall i=1,\dots,n \right\}$   (4)

An alternative specification is to minimize the 
sum $-\sum_{i=1}^{n} SV_i$, leading to a linear programming 
problem. Whichever specification is used, the 
constrained programming problem (4) does not 
merely shift the OLS regression line upwards to 
the frontier; it also influences the coefficients: the 
estimated intercept and slope coefficients obtained 
by PP model (4) generally differ from the OLS 
estimates of (3). 
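
For a single resource, the linear programming variant (minimizing the sum of −SVi) can be sketched without a solver: when the resource values are not all identical, an optimal line passes through two observations, so it is enough to enumerate pairs of points and keep the feasible line with the smallest total shortfall. The function name and data are hypothetical, and the enumeration is a didactic shortcut assumed here, not the procedure of Aigner and Chu.

```python
from itertools import combinations


def pp_frontier(x, y, tol=1e-9):
    """Linear-programming variant of PP for one regressor."""
    best = None
    for i, j in combinations(range(len(x)), 2):
        if abs(x[i] - x[j]) < tol:
            continue  # two points with equal x do not define a regression line
        beta = (y[j] - y[i]) / (x[j] - x[i])
        alpha = y[i] - beta * x[i]
        sv = [y_k - alpha - beta * x_k for x_k, y_k in zip(x, y)]
        if max(sv) > tol:
            continue  # an observation lies above the line: SV_i <= 0 is violated
        loss = -sum(sv)  # total shortfall from the candidate frontier
        if best is None or loss < best[0]:
            best = (loss, alpha, beta, sv)
    if best is None:
        raise ValueError("need at least two observations with distinct x values")
    return best[1], best[2], best[3]


# Hypothetical sample: resource use x and output y of five firms.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.5, 5.0, 8.5, 9.0]
alpha, beta, sv = pp_frontier(x, y)
print(alpha, beta, sv)
```

On these illustrative numbers the frontier runs through the second and fourth observations, and both its intercept and its slope differ from what a least squares fit of the same data would give.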

Corrected Ordinary Least Squares (COLS) 
Another deterministic best-practice approach (often 
confused with PP) is corrected ordinary least squares 
(COLS). The basic idea of COLS was first suggested 



by Winsten (1957); consistency of COLS estimator 
was formally shown by Greene (1980). 

COLS is a two-stage procedure: in the first 
stage, the frontier is estimated by ordinary least 
squares (OLS) regression; in the second stage, 
the frontier is shifted upwards such that the 
resulting COLS frontier envelops all data. Note that 
the OLS residuals $\hat{\varepsilon}_i^{OLS}$ take both positive and 
negative values. In the COLS model, these error 
terms are attributed to inefficiency, thus the COLS 
estimator of SV is obtained as

$SV_i^{COLS} = \hat{\varepsilon}_i^{OLS} - \max_h \hat{\varepsilon}_h^{OLS}$    (5)

Values of $SV_i^{COLS}$ lie in the interval $(-\infty, 0]$, with 0 
indicating efficient performance. Similarly, because 
the frontier is shifted upwards by the largest 
residual, we adjust the intercept term as 

$\hat{\alpha}^{COLS} = \hat{\alpha}^{OLS} + \max_h \hat{\varepsilon}_h^{OLS}$.         (6)

Slope coefficients $\hat{\beta}^{COLS}$ are obtained directly from 
(3) as $\hat{\beta}^{COLS} = \hat{\beta}^{OLS}$.
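
The second stage is a simple shift and can be sketched as follows. The function name and the numbers are hypothetical; the code assumes that an OLS intercept and residuals are already available and raises the intercept by the largest residual so that the shifted line envelops all observations.

```python
def cols_shift(alpha_ols, ols_residuals):
    """Shift an OLS line upwards so that it envelops all observations."""
    shift = max(ols_residuals)                # largest OLS residual
    sv = [e - shift for e in ols_residuals]   # equation (5): all values <= 0
    alpha_cols = alpha_ols + shift            # intercept raised by the same amount
    return alpha_cols, sv


# Hypothetical OLS intercept and residuals (the residuals sum to zero).
alpha_cols, sv = cols_shift(0.4, [-0.2, 0.5, -0.8, 0.9, -0.4])
print(alpha_cols, sv)
```

The firm with the largest OLS residual defines the frontier and receives SV = 0; every other firm is ranked exactly as in the OLS analysis, which is why COLS changes the levels but not the ordering of the estimates.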

Stochastic frontier analysis (SFA)
The SFA model developed by Aigner et al. (1977) and 
Meeusen and van den Broeck (1977) is nowadays 
the most frequently used parametric regression 
technique for estimating best-practice technologies. 
SFA differs from the deterministic approaches PP 
and COLS in that it includes a stochastic noise term 
vi that captures the effects of measurement errors, 
outliers, and other stochastic disturbances in the 
data. Filtering out the effects of stochastic noise vi 
from SV is an attractive feature of SFA. 

The estimation of the SFA model requires 
certain distributional assumptions: the typical 
approach is to assume that the noise term is normally 
distributed with zero mean and unknown finite 
variance [i.e., $v_i \sim N(0, \sigma_v^2)$], and the pure sustainable 
value is half-normally distributed on the non-positive 
side with an unknown variance parameter 
[i.e., $SV_i \sim -|N(0, \sigma_{SV}^2)|$].4 In practice, the SFA 

4  Alternative distributional assumptions about the 
inefficiency term are sometimes used (e.g. truncated 
normal, exponential, or gamma). However, the distribu-
tion does not influence the relative performance ranking 
of the firms.

frontiers are usually estimated by maximum likeli-
hood techniques. The maximum likelihood prob-
lem can be stated as

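$\max_{\alpha,\beta,\sigma,\lambda} \; -n\ln\sigma + \sum_{i=1}^{n}\left[\ln\Phi\left(\frac{-\varepsilon_i\lambda}{\sigma}\right) - \frac{1}{2}\left(\frac{\varepsilon_i}{\sigma}\right)^2\right]$, (7)

where $\varepsilon_i = y_i - \alpha - \beta' x_i$ as in (3), $\lambda = \sigma_{SV}/\sigma_v$, 
$\sigma^2 = \sigma_{SV}^2 + \sigma_v^2$, and $\Phi$ is 
the cumulative distribution function of the standard 
normal distribution. The sustainable values must be 
inferred indirectly, using the conditional distribution 
at a given εi. Given the estimated $\hat{\sigma}_{SV}, \hat{\sigma}_v$ from 
(7), Jondrow et al. (1982) have shown that the 
conditional expected value of the inefficiency $-SV_i$ 
of firm i is obtained as 

$E(-SV_i \mid \hat{\varepsilon}_i) = -\frac{\hat{\varepsilon}_i \hat{\sigma}_{SV}^2}{\hat{\sigma}_{SV}^2 + \hat{\sigma}_v^2} + \frac{\hat{\sigma}_{SV}\hat{\sigma}_v}{\hat{\sigma}} \left[ \frac{\phi(\hat{\varepsilon}_i\hat{\lambda}/\hat{\sigma})}{1 - \Phi(\hat{\varepsilon}_i\hat{\lambda}/\hat{\sigma})} \right]$  (8)

where $\phi$ is the density function of the standard 
normal distribution, $\hat{\sigma}^2 = \hat{\sigma}_{SV}^2 + \hat{\sigma}_v^2$ and 
$\hat{\lambda} = \hat{\sigma}_{SV}/\hat{\sigma}_v$. The conditional expected value 
(8), with its sign reversed, is an unbiased but 
inconsistent estimator of SVi: irrespective of the 
sample size n, the variance of the 
estimator does not converge to zero.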
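
The Jondrow et al. (1982) point estimator can be sketched with the standard library alone. The code assumes the normal/half-normal specification described above, with εi = vi + SVi and SVi ≤ 0, and returns the conditional mean of the inefficiency −SVi; the function names and parameter values are hypothetical.

```python
import math


def std_normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_inefficiency(eps, sigma_sv, sigma_v):
    """Conditional mean of -SV given the composite residual eps = v + SV."""
    sigma2 = sigma_sv ** 2 + sigma_v ** 2
    # Given eps, -SV follows a normal distribution with these parameters,
    # truncated from below at zero.
    mu_star = -eps * sigma_sv ** 2 / sigma2
    sigma_star = sigma_sv * sigma_v / math.sqrt(sigma2)
    z = mu_star / sigma_star
    return mu_star + sigma_star * std_normal_pdf(z) / std_normal_cdf(z)


# Hypothetical standard deviations and three composite residuals.
for eps in (-1.0, 0.0, 1.0):
    print(eps, expected_inefficiency(eps, sigma_sv=1.0, sigma_v=1.0))
```

The estimate is positive for every residual and falls as the residual rises, so a firm observed above the estimated frontier is still assigned some inefficiency, with the positive residual attributed mostly to noise.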

While filtering the noise out is a convenient fea-
ture, SFA still requires the prior assumption about 
the functional form of f (similar to OLS, PP, and 
COLS). However, we often have no good reason 
to prefer one functional form to another. 
Unfortunately, imposing a wrong functional form can be a 
source of specification errors that result in biased 
and inconsistent estimates. Sometimes, different 
functional forms have almost equally good empirical 
fit, but the rankings of firms according to sustainable 
value are dramatically different. Dependence 
on the functional form imposed a priori is the main 
disadvantage of SFA compared to the nonparametric 
approaches to be discussed next.

Convex Nonparametric Least Squares (CNLS)
If the functional form of the regression function is not 
known beforehand, we can resort to nonparametric 
regression techniques. CNLS is the oldest approach 
in that literature, dating back to the work by Hildreth 
(1954). CNLS requires the same Gauss-Markov as-
sumptions on the disturbance term εi as imposed in 
OLS. In contrast to OLS, however, CNLS does not 
assume linearity or any other parametric functional 



form for f. Rather, it postulates that f belongs to the 
set of continuous, monotonic increasing and globally 
concave functions, denoted henceforth by F2 . 

The CNLS problem finds f∈F2 that minimizes 
the sum of squares of the deviations, formally:5

$\min_{f,\varepsilon} \left\{ \sum_{i=1}^{n} \varepsilon_i^2 \;\middle|\; y_i = f(x_i) + \varepsilon_i \ \forall i=1,\dots,n;\ f \in F_2 \right\}$ (9)

The CNLS problem (9) identifies the best-fit 

function f from the family F2, which includes an 
infinite number of possible functions. This makes 
problem (9) generally hard to solve. Existing single 
regressor algorithms (e.g. Meyer 1999) require that 
the data are sorted in ascending order according 
to the scalar valued regressor x. However, such a 
sorting is not possible in the general multiple re-
gression setting where x is a vector. 

To estimate the CNLS problem in the general 
multi-input setting, Kuosmanen (2008) has trans-
formed the infinite dimensional problem (9) into an 
equivalent finite-dimensional quadratic program-
ming (QP) problem that can be solved by standard 
mathematical programming algorithms: 

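$\min_{\alpha,\beta,\varepsilon} \left\{ \sum_{i=1}^{n}\varepsilon_i^2 \;\middle|\; \begin{array}{l} y_i = \alpha_i + \beta_i' x_i + \varepsilon_i \ \forall i = 1,\dots,n; \\ \alpha_i + \beta_i' x_i \le \alpha_h + \beta_h' x_i \ \forall h, i = 1,\dots,n; \\ \beta_i \ge 0 \ \forall i = 1,\dots,n \end{array} \right\}$   (10)
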
Note that in contrast to models (3) and (4), in 

problem (10) the intercept and slope coefficients  
αi, βi can differ from one firm to another. Instead 
of fitting one regression line to the cloud of ob-
served points as in OLS, we fit n different regres-
sion lines that can be interpreted as tangent lines to 
the unknown production function f. In this respect, 
Kuosmanen’s QP representation (10) applies in-
sights from the celebrated Afriat’s Theorem (Afriat 
1967, 1972). The slope coefficients βi  represent the 
marginal products of inputs (i.e., the sub-gradients 
∇f (xi)). The second constraint imposes concavity 
through a system of inequality constraints on tan-
gent lines, known as the Afriat inequalities: these 

5  For statistical properties of the CNLS estima-
tors, see  e.g. Groeneboom et al. (2001) and references 
therein.

inequalities are the key to modeling concavity con-
straints in the general multiple regressor setting. 
The third constraint imposes monotonicity. 

Given the estimated coefficients αi, βi from 
(10), we can construct the following piece-wise 
estimator of the benchmark technology:

   
$f^{CNLS}(x) = \min_{i \in \{1,\dots,n\}} \left\{ \alpha_i + \beta_i' x \right\}$    (11)

In principle, estimator f CNLS  consists of n hy-
perplane segments. In practice, however, the esti-
mated coefficients αi, βi are clustered to a relatively 
small number of alternative values: the number of 
different hyperplane segments is usually much 
lower than n (see Kuosmanen 2008).

Analogous to OLS, CNLS estimates the aver-
age-practice benchmark. It hence shares the same 
problem as OLS: if the composite disturbance εi  
contains an asymmetric component SVi ≤0, the 
exogeneity assumption E(εi│xi)=0  will be vio-
lated. If that is the case, the CNLS estimator will 
be inconsistent and biased. Introducing an asym-
metric inefficiency term SVi ≤0 leads us to the best-
practice methods, DEA, C2NLS, and StoNED, to 
be considered next.
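
Solving the QP problem (10) itself requires a quadratic programming solver and is not shown here. Given coefficients αi, βi, however, the shape constraints and the piece-wise linear estimator (11) are easy to express. In the sketch below the function names are hypothetical and the coefficients are not estimates: they are tangent lines of the concave function √x at x = 1, 4, 9, chosen only so that the constraints hold by construction.

```python
def dot(b, x):
    return sum(b_r * x_r for b_r, x_r in zip(b, x))


def satisfies_shape_constraints(alphas, betas, xs, tol=1e-9):
    """Check monotonicity (beta_i >= 0) and the Afriat concavity inequalities."""
    n = len(xs)
    for i in range(n):
        if any(b_r < -tol for b_r in betas[i]):
            return False  # monotonicity violated
        own = alphas[i] + dot(betas[i], xs[i])
        for h in range(n):
            # Firm i's own hyperplane must be the lowest one at x_i.
            if own > alphas[h] + dot(betas[h], xs[i]) + tol:
                return False  # concavity violated
    return True


def cnls_frontier(alphas, betas, x):
    """Lower envelope of the n hyperplanes evaluated at x."""
    return min(a + dot(b, x) for a, b in zip(alphas, betas))


# Hypothetical coefficients: tangents of sqrt(x) at the observed points.
xs = [(1.0,), (4.0,), (9.0,)]
alphas = [0.5, 1.0, 1.5]
betas = [(0.5,), (0.25,), (1.0 / 6.0,)]

ok = satisfies_shape_constraints(alphas, betas, xs)
value_at_4 = cnls_frontier(alphas, betas, (4.0,))
print(ok, value_at_4)
```

Taking the minimum over hyperplanes is what makes the fitted function concave: each hyperplane lies on or above the function everywhere and touches it at its own observation.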

Data envelopment analysis (DEA)
Data envelopment analysis (DEA) (Charnes et al. 
1978) is the most widely used nonparametric frontier 
approach. DEA is a deterministic linear program-
ming method. DEA does not require any prior 
assumptions about the functional form of function 
f, but only assumes that f belongs to the family of 
monotonic increasing and globally concave 
functions (F2), similar to CNLS. An important advantage 
of DEA is that it does not require any statistical 
assumptions about the composite disturbance term εi. 
However, assuming away noise (i.e., vi = 0 ∀i = 1,...,n) 
is a strong assumption in itself.  

DEA estimator of production function f can be 
expressed as6 

6  Formulation (12) was first presented by Afriat 


A G R I C U L T U R A L  A N D  F O O D  S C I E N C E

Kuosmanen, T. & Kuosmanen, N. Benchmark technology in sustainable value analysis

308

A G R I C U L T U R A L  A N D  F O O D  S C I E N C E

Vol. 18(2009): 302–316.

309

  
   (12)

This yields a continuous, piece-wise linear frontier 
that envelopes the observed data from above. If 
yi=fDEA(xi), then SVi =0 , and the firm is diagnosed 
as efficient. If yi<fDEA(xi) , then SVi <0 , and the firm 
is said to be inefficient. In standard DEA, outcome 
yi>fDEA(xi)  is not possible. Given a resource vector 
x, the values of this production function are easy to 
compute by linear programming. 

DEA estimator (12) can be interpreted as a non-
parametric counterpart to Aigner and Chu’s (1968) 
PP method described above (see Kuosmanen and 
Jonson 2009). The main advantage of DEA is its 
more general and flexible specification of f. How-
ever, DEA assumes away the stochastic noise term 
v, similar to the PP and COLS methods reviewed 
above.  

Corrected convex nonparametric least squares 
(C2NLS)
Kuosmanen and Johnson (2009) have recently 
proposed to combine the classic idea of COLS 
estimation to the CNLS method described above, 
referring to the new method as C2NLS. The practical 
implementation of C2NLS method consists of two 
steps. In step 1), we estimate the average-practice 
frontier using CNLS (equation (10). To estimate 
the deterministic best-practice frontier, we shift the 
frontier in step 2) directly analogous to the COLS 
procedure (equations (5) and (6)): the C2NLS esti-
mator for SV can be formally stated as  

 (13)

If the sustainable values SVi  are independently 
distributed across firms, C2NLS provides a more 
efficient estimator than DEA. However, DEA does 
not require any independence assumption, and it 

(1972), who formally proved that (12) is the minimal 
function that envelops all observed data points and satis-
fies monotonicity and concavity.  

is more robust to heteroskedasticity. On the other 
hand, the regression interpretation of the C2NLS 
method enables one to introduce contextual vari-
ables that explain differences in sustainability 
performance in the same regression model, thus 
avoiding the pitfalls of the two-stage semiparamet-
ric estimation (see Johnson and Kuosmanen 2009, 
for details). It also paves a way for introducing a 
stochastic noise term v to the nonparametric fron-
tier estimation.

Stochastic nonparametric envelopment of data 
(StoNED)
Stochastic Nonparametric Envelopment of Data 
(StoNED) is a new estimation method developed by 
Kuosmanen (2006) and Kuosmanen and Kortelainen 
(2007b). Like CNLS, DEA, and C2NLS, StoNED 
does not require any prior functional form assump-
tion about f, but only assumes that f belongs to the 
family F2. The StoNED method differs from DEA 
and C2NLS in that it decomposes the deviations of 
yi from f(xi) into two sources: the pure sustainable 
value SVi  and the stochastic noise term vi, similar 
to SFA. In other words, StoNED combines the 
deterministic part of DEA with the stochastic part 
of SFA, thus combining the key advantages of both 
methods. 

The practical estimation of the StoNED model 
is conducted in two stages. In the first stage, the con-
ditional expected value of y is estimated by CNLS 
regression (equation (10)). Given the CNLS residu-
als from problem (10), we subsequently filter out 
the noise from the sustainable values. This requires 
some distributional assumptions, e.g. the standard 
SFA assumptions vi~N(0, σ v  

2 ) and SVi~│N(0, σ SV  
2  )│. 

Parameters σSV ,σv can be estimated by the method 
of moments or maximum pseudolikelihood tech-
niques (see Kuosmanen and Kortelainen 2007b for 
details). The conditional expectation of the sustain-
able value is then computed using the Jondrow et 
al. formula (8).   

StoNED offers a general framework that en-
compasses both DEA and SFA as its special cases. 
Specifically, if we restrict the noise component vi 
equal to zero, StoNED falls back to the standard 
DEA. On the other hand, if we impose some par-
ticular functional form on f, then StoNED boils 

 10

values: the number of different hyperplane segments is usually much lower than n (see 

Kuosmanen 2008) 

 Analogous to OLS, CNLS estimates the average-practice benchmark. It hence 

shares the same problem as OLS: if the composite disturbance iε  contains an 

asymmetric component 0iSV ≤ , the exogeneity assumption ( ) 0i iE ε =x  will be 

violated. If that is the case, the CNLS estimator will be inconsistent and biased. 

Introducing an asymmetric inefficiency term 0iSV ≤  leads us to the best-practice 

methods, DEA, C2NLS, and StoNED, to be considered next. 

 
Data envelopment analysis (DEA) 

Data envelopment analysis (DEA) (Charnes et al. 1978) is the most widely used 

nonparametric frontier approach. DEA is a deterministic linear programming method. 

DEA does not require any prior assumptions about the functional form of function f, but 

only assumes that f belongs to the family of monotonic increasing and globally concave 

functions (F2), similar to CNLS. An important advantage of DEA is that it does not 

require any statistical assumptions about the composite disturbance term iε . However, 

assuming away noise (i.e., 0 1,...,iv i n= ∀ = ) is a strong assumption as such.   

 DEA estimator of production function f can be expressed as1  

0
1 1 1

( ) max ; 1
n n n

DEA i i i i i
i i i

f y
λ

λ λ λ
≥

= = =

 
= ≥ = 

 
  x x x .      (12) 

This yields a continuous, piece-wise linear frontier that envelopes the observed data from 

above. If ( )i DEA iy f= x , then 0iSV = , and the firm is diagnosed as efficient. If 

( )i DEA iy f< x , then 0iSV < , and the firm is said to be inefficient. In standard DEA, 

outcome ( )i DEA iy f> x  is not possible. Given a resource vector x, the values of this 

production function are easy to compute by linear programming.  

 DEA estimator (12) can be interpreted as a nonparametric counterpart to Aigner 

and Chu’s (1968) PP method described above (see Kuosmanen and Jonson 2009). The 

                                                 
1 Formulation (12) was first presented by Afriat (1972), who formally proved that (12) is the minimal function that 
envelops all observed data points and satisfies monotonicity and concavity.   

 11

main advantage of DEA is its more general and flexible specification of f. However, DEA 

assumes away the stochastic noise term v, similar to the PP and COLS methods reviewed 

above.   

 
Corrected convex nonparametric least squares (C2NLS) 

Kuosmanen and Johnson (2009) have recently proposed to combine the classic idea of 

COLS estimation to the CNLS method described above, referring to the new method as 

C2NLS. The practical implementation of C2NLS method consists of two steps. In step 1), 

we estimate the average-practice frontier using CNLS (equation (10). To estimate the 

deterministic best-practice frontier, we shift the frontier in step 2) directly analogous to 

the COLS procedure (equations (5) and (6)): the C2NLS estimator for SV can be formally 

stated as   
2 ˆ ˆmaxC NLS CNLS CNLSi i hh

SV ε ε= − .        (13) 

 If the sustainable values iSV  are independently distributed across firms, C
2NLS 

provides a more efficient estimator than DEA. However, DEA does not require any 

independence assumption, and it is more robust to heteroskedasticity. On the other hand, 

the regression interpretation of the C2NLS method enables one to introduce contextual 

variables that explain differences in sustainability performance in the same regression 

model, thus avoiding the pitfalls of the two-stage semiparametric estimation (see Johnson 

and Kuosmanen 2009, for details). It also paves a way for introducing a stochastic noise 

term v to the nonparametric frontier estimation. 

 
Stochastic nonparametric envelopment of data (StoNED) 

Stochastic Nonparametric Envelopment of Data (StoNED) is a new estimation method 

developed by Kuosmanen (2006) and Kuosmanen and Kortelainen (2007b). Like CNLS, 

DEA, and C2NLS, StoNED does not require any prior functional form assumption about 

f, but only assumes that f belongs to the family F2. The StoNED method differs from 

DEA and C2NLS in that it decomposes the deviations of yi from f(xi) into two sources: 

the pure sustainable value iSV  and the stochastic noise term vi, similar to SFA. In other 


A G R I C U L T U R A L  A N D  F O O D  S C I E N C E

Kuosmanen, T. & Kuosmanen, N. Benchmark technology in sustainable value analysis

310

A G R I C U L T U R A L  A N D  F O O D  S C I E N C E

Vol. 18(2009): 302–316.

311

down to SFA. The main advantage of StoNED to 
the parametric SFA is the independence of the ad 
hoc assumptions about the functional form of the 
benchmark technology. On the other hand, the main 
advantage of StoNED to the nonparametric DEA 
is the better robustness to outliers, data errors, and 
other stochastic noise in the data. While in DEA the 
benchmark technology is spanned by a relatively 
small number of efficient firms, in StoNED all ob-
servations influence the benchmark. 
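The last step of the stochastic methods, computing the conditional expectation of the sustainable value from a composite residual, can be sketched as follows. This is the textbook Jondrow et al. (1982) conditional mean for the normal/half-normal case, written under the sign convention $SV_i \le 0$ used in the text; it may be parametrized differently from formula (8) of this paper, and the function name and argument values are assumptions for illustration.

```python
import math


def expected_sv(eps, sigma_sv, sigma_v):
    """Conditional mean E[SV | eps] for eps = SV + v.

    Assumes SV = -|N(0, sigma_sv^2)| (half-normal, non-positive) and
    v ~ N(0, sigma_v^2), independent of each other.
    """
    s2 = sigma_sv ** 2 + sigma_v ** 2
    mu_star = -eps * sigma_sv ** 2 / s2
    sigma_star = sigma_sv * sigma_v / math.sqrt(s2)
    z = mu_star / sigma_star
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    # Note: cdf can underflow to zero for extremely large positive residuals.
    return -(mu_star + sigma_star * pdf / cdf)


# Hypothetical parameter values; residuals of -1, 0 and +1 output units.
for eps in (-1.0, 0.0, 1.0):
    print(eps, round(expected_sv(eps, 1.0, 1.0), 4))
```

The estimate is always negative, but it shrinks towards zero as the residual grows: a large positive residual is attributed mostly to noise rather than to superior performance, which is why the stochastic methods never assign exactly SV = 0 on the basis of a single favourable observation alone.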

Application to Finnish dairy farms

Objectives
We next apply the eight estimation methods described and classified in the previous section to the empirical data of 332 Finnish dairy farms. The objectives of this exercise are three-fold. First, the application illustrates the results and information obtainable with alternative methods. Second, application of different methods enables us to compare the SV estimates and analyze their correlations. Third, the results enable us to critically evaluate the advantages and disadvantages of alternative methods. Thus, this analysis sheds further light on the choice of the benchmark technology in SV analysis. Although the sustainability performance of dairy farms is of considerable interest per se (see, e.g. van Passel et al. 2007), our main focus is on the comparison of alternative estimation methods in a typical empirical data set.

Data

The data set is obtained from the FADN database. The output and the resource use are measured on a per hectare basis. The economic output is the total revenue from milk and other products, expressed in € ha−1. Economic resources include labor (hr ha−1) and farm capital (€ ha−1). Unfortunately, farm level data on environmental and social resources of dairy farms are extremely limited for the purposes of sustainability assessment. As the two environmental resources extractable from the FADN data, we include the total energy cost (€ ha−1) and the net nitrogen use (kg N ha−1). The net nitrogen use has been calculated based on the farm gate nitrogen surplus method (Nevens et al. 2006, Virtainen and Nousiainen 2005). The limited scope of the sustainability indicators is an obvious shortcoming of this data set, but similar data problems arise in virtually all farm-level environmental efficiency or SV analyses (see e.g. Reinhard et al. 1999, van Passel et al. 2007, and references therein).

Descriptive statistics of the sample data are reported in Table 2. The output varies from 561 € ha−1 up to 6,691 € ha−1. With a mean of 1,948 € ha−1, most farms are concentrated at the lower end of this range, and the distribution has a long right tail. The labor, capital, and energy intensities exhibit similarly large variance and skewness. Net nitrogen surplus is positive at all farms included in the sample, with an average value of 72 kg N ha−1.

Table 2. Descriptive statistics of the data set for the year 2004 (sample of 332 Finnish dairy farms)

variable                  mean     standard deviation   minimum   maximum
Total output, € ha−1      1,948    760                  561       6,691
Labor, hr ha−1            124      61                   25        421
Farm capital, € ha−1      5,282    2,573                1,172     20,493
Energy, € ha−1            125      56.5                 42        433
Net N surplus, kg ha−1    72       25.5                 5.4       210

We next applied the eight methods described in the previous section and classified in Table 1. For the parametric methods, the log-linear Cobb-Douglas functional form was used. For DEA, the output-oriented variable returns to scale specification was used. For the stochastic SFA and StoNED methods, we assumed the half-normal SV distribution and normally distributed noise. The variance decomposition of SFA and StoNED was conducted by the method of moments. In SFA, residuals of the log-linear OLS model were used. In StoNED, the residuals of the CNLS model were divided by the output per hectare to circumvent heteroskedasticity, and the standardized residuals thus obtained were used in the variance decomposition.
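The method-of-moments decomposition mentioned above can be sketched as follows. The formulas are the standard second and third central moment conditions for normal noise and a half-normal, non-positive SV term; the function name, the simulated residuals, and the parameter values are assumptions for illustration, and the standardization by output per hectare used for StoNED is not reproduced here.

```python
import math
import random


def moment_decomposition(residuals):
    """Method-of-moments estimates (sigma_sv, sigma_v) from regression residuals,
    assuming half-normal SV <= 0 and normally distributed noise."""
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((e - mean) ** 2 for e in residuals) / n   # second central moment
    m3 = sum((e - mean) ** 3 for e in residuals) / n   # third central moment
    if m3 >= 0.0:
        # Non-negative skewness: no evidence of an asymmetric SV component.
        return 0.0, math.sqrt(m2)
    # m3 = sqrt(2/pi) * (1 - 4/pi) * sigma_sv^3, a negative quantity.
    sigma_sv = (m3 / (math.sqrt(2.0 / math.pi) * (1.0 - 4.0 / math.pi))) ** (1.0 / 3.0)
    # m2 = (1 - 2/pi) * sigma_sv^2 + sigma_v^2.
    sigma_v2 = m2 - (1.0 - 2.0 / math.pi) * sigma_sv ** 2
    return sigma_sv, math.sqrt(max(sigma_v2, 0.0))


# Simulated residuals with known (hypothetical) parameters, for illustration.
rng = random.Random(0)
true_sv, true_v = 1.0, 0.5
sample = [rng.gauss(0.0, true_v) - abs(rng.gauss(0.0, true_sv))
          for _ in range(100000)]
sigma_sv_hat, sigma_v_hat = moment_decomposition(sample)
print(round(sigma_sv_hat, 2), round(sigma_v_hat, 2))
```

The estimator relies entirely on the negative skewness of the residuals: when the third moment has the wrong sign, the sketch falls back to attributing all residual variation to noise.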

Results

Table 3 reports the summary statistics of the SV estimates obtained with each method, together with the coefficient of determination R2 = 1 – (SSE/SST), where SSE is the sum of squares of residuals and SST is the sum of squares of output y around its mean. For comparability, we calculated the SSE for all methods using the difference y – f(x), attributing both the SV and noise components to the unexplained variance. Comparing the R2 statistics of the parametric methods and their nonparametric counterparts, we observe that the latter group of methods yields a somewhat better empirical fit in all specifications. This is an expected result, given the greater flexibility of the nonparametric specification.
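As a minimal sketch of the fit statistic defined above, the following computes R2 = 1 – (SSE/SST) from observed outputs and fitted benchmark values. The function name and the four data points are hypothetical and are not taken from the farm sample.

```python
def r_squared(y, fitted):
    """Coefficient of determination R2 = 1 - SSE / SST."""
    mean_y = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # unexplained
    sst = sum((yi - mean_y) ** 2 for yi in y)                # total around mean
    return 1.0 - sse / sst


# Hypothetical outputs and fitted benchmark values (illustration only).
y = [2.0, 4.0, 6.0, 8.0]
fitted = [3.0, 3.0, 7.0, 7.0]
r2 = r_squared(y, fitted)   # SSE = 4, SST = 20
print(round(r2, 3))
```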

Comparing the SV estimates of the average-practice and best-practice methods, we note that the latter indicate a non-positive SV value for all farms: it is not possible to perform better than the best practice. The positive mean SV of the OLS estimates is due to the log-transformation. The deterministic methods COLS and C2NLS indicate the smallest mean and median SV statistics, suggesting the highest degree of inefficiency. The stochastic methods SFA and StoNED have a smaller variance in SV statistics. This is because the stochastic methods filter out the noise component from the composite disturbance term.
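The relation between the average-practice and the deterministic best-practice estimates follows directly from the shift in equation (13). The sketch below uses a hypothetical three-farm residual vector constructed to match the CNLS mean, minimum, and maximum reported in Table 3 (the middle value 405.55 is a filler chosen so that the residuals sum to zero); the function name is likewise an assumption.

```python
import statistics


def corrected_sv(avg_residuals):
    """Shift average-practice residuals so that the best farm has SV = 0."""
    top = max(avg_residuals)
    return [e - top for e in avg_residuals]


# Hypothetical residuals matching the CNLS min (-1581.72), max (1176.17)
# and mean (0.00) of Table 3; the middle value is an illustrative filler.
cnls_residuals = [-1581.72, 405.55, 1176.17]
c2nls_sv = corrected_sv(cnls_residuals)

print(round(statistics.mean(c2nls_sv), 2))   # mean shifts down by the maximum
print(round(min(c2nls_sv), 2), max(c2nls_sv))
# The shift leaves the dispersion unchanged.
print(round(statistics.pstdev(cnls_residuals), 2),
      round(statistics.pstdev(c2nls_sv), 2))
```

This reproduces the pattern in Table 3: the C2NLS mean (–1176.17) equals the CNLS mean minus the CNLS maximum, the C2NLS minimum (–2757.89) equals the CNLS minimum minus the same constant, and the standard deviations of the two methods coincide. The same holds for the OLS and COLS pair.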

It is also interesting to compare the correlations in SV estimates across farms. Table 4 reports the Pearson product moment correlation coefficients for the SV estimates and the Spearman rank correlation coefficients (in parentheses). By construction, SV estimates obtained by the OLS and COLS methods exhibit perfect correlation. The same is true for the nonparametric CNLS and C2NLS methods. In general, the SV estimates and rankings obtained from most methods are highly correlated. A notable exception is DEA, which yields SV estimates that are essentially uncorrelated with those of the other methods: the Pearson coefficients are slightly negative (between –0.053 and –0.008) for all other methods, except for a small positive correlation (0.013) with StoNED. Except for DEA, the nonparametric SV estimates are highly correlated with each other, and the same is true for the parametric estimates. The CNLS and C2NLS estimates are relatively highly correlated with the parametric estimates, SFA and PP in particular. We may interpret the positive correlations across the spectrum of methods (except for DEA) as evidence for robustness in the SV estimates and rankings.
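The two correlation measures can be sketched in a few lines. The helper functions below are illustrative assumptions rather than the software used by the authors; the SV values are those reported in Table 5 for the ten listed farms only, so the resulting coefficients are not comparable to the full-sample figures in Table 4.

```python
import math


def pearson(a, b):
    """Pearson product moment correlation coefficient."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def ranks(values):
    """Ascending ranks 1..n, with ties receiving their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    out = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        for k in order[pos:end + 1]:
            out[k] = (pos + end) / 2.0 + 1.0
        pos = end + 1
    return out


def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))


# SV estimates of farms no. 1-5 and 328-332 as reported in Table 5.
ols = [-557.3, -233.5, -374.9, -210.1, -35.0, 1220.0, 1098.4, 1477.8, 1283.9, 1127.9]
cols = [-2035.1, -1711.3, -1852.6, -1687.8, -1512.8, -257.8, -379.4, 0.0, -193.9, -349.8]
stoned = [-172.7, -14.8, 0.0, -1.9, -16.4, -131.5, -1158.5, -360.8, -738.9, -1142.5]

print(pearson(ols, cols), spearman(ols, cols))      # shift only: perfect up to rounding
print(pearson(ols, stoned), spearman(ols, stoned))  # ten extreme farms only
```

Because COLS differs from OLS only by a constant shift, both coefficients equal one up to the rounding of the tabulated values, which is the "by construction" property noted in the text.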

Differences in the SV performance at the farm level are lost in the summary statistics and correlation tables. However, reporting the SV statistics for all 332 farms is not practical. To shed some light on sustainability performance at the farm level, Table 5 reports the SV estimates and relative ranks for the five farms with the lowest and the highest output per hectare in the sample, respectively, labeled as farms no. 1–5 and 328–332. The upper part of Table 5 reports the SV estimates (left column) and the relative rankings (right column) of these ten farms obtained by using the parametric methods (OLS, PP, COLS, and SFA). Analogously, the lower part of Table 5 reports the corresponding SV statistics and farm rankings obtained by using the nonparametric methods (CNLS, DEA, C2NLS, and StoNED). Despite the high correlations in SV estimates and rankings across methods at the level of the entire sample, the farm-level SV estimates and rankings exhibit substantial differences across methods. For example, farm no. 332 performs relatively well according to OLS and COLS (rank 10), whereas this farm is one of the worst performers according to StoNED (rank 326 out of 332). On the other hand, farm no. 3 is the best farm according to the StoNED model (rank 1), but OLS and COLS rank it as 283. For an individual farm, different methods can show a dramatically different picture of the relative performance, let alone the absolute improvement potential.



Table 3. Descriptive statistics of sustainable value (SV) estimates (€ ha−1) and coefficients of determination (R2)

                                parametric    non-parametric

average-practice                OLS           CNLS
  mean                          45.36         0.00
  median                        46.03         40.23
  st. dev.                      430.44        341.30
  min                           –1114.00      –1581.72
  max                           1477.77       1176.17
  R2                            0.679         0.798

best-practice, deterministic    PP            DEA
  mean                          –1057.68      –776.82
  median                        –1028.38      –773.92
  st. dev.                      491.24        461.86
  min                           –3045.42      –2163.73
  max                           0             0
  R2                            0.582         0.631

best-practice, deterministic    COLS          C2NLS
  mean                          –1432.41      –1176.17
  median                        –1431.73      –1135.94
  st. dev.                      430.44        341.30
  min                           –2591.76      –2757.89
  max                           0             0
  R2                            0.679         0.798

best-practice, stochastic       SFA           StoNED
  mean                          –327.27       –310.60
  median                        –292.07       –256.35
  st. dev.                      267.47        283.22
  min                           –1547.50      –1741.11
  max                           0             0
  R2                            0.679         0.798

Abbreviations: OLS = ordinary least squares, PP = parametric programming, COLS = corrected ordinary least squares, SFA = stochastic frontier analysis, CNLS = convex nonparametric least squares, DEA = data envelopment analysis, C2NLS = corrected convex nonparametric least squares, StoNED = stochastic nonparametric envelopment of data.

Table 4. Correlation matrix of SV estimates; Pearson product moment correlation coefficients (Spearman rank correlation coefficients in parentheses)

          OLS       PP        COLS      SFA       CNLS      DEA       C2NLS     StoNED
OLS       1         0.819     1         0.882     0.642     –0.053    0.642     0.328
                    (0.792)             (0.935)   (0.619)   (–0.024)  (0.619)   (0.341)
PP                  1         0.819     0.819     0.752     –0.035    0.752     0.648
                              (0.792)   (0.946)   (0.724)   (–0.039)  (0.724)   (0.637)
COLS                          1         0.882     0.642     –0.053    0.642     0.328
                                        (0.935)   (0.619)   (–0.024)  (0.619)   (0.341)
SFA                                     1         0.778     –0.022    0.778     0.619
                                                  (0.748)   (–0.025)  (0.748)   (0.565)
CNLS                                              1         –0.008    1         0.902
                                                            (0.007)             (0.914)
DEA                                                         1         –0.008    0.013
                                                                      (0.007)   (0.018)
C2NLS                                                                 1         0.902
                                                                                (0.914)
StoNED                                                                          1

Abbreviations: OLS = ordinary least squares, PP = parametric programming, COLS = corrected ordinary least squares, SFA = stochastic frontier analysis, CNLS = convex nonparametric least squares, DEA = data envelopment analysis, C2NLS = corrected convex nonparametric least squares, StoNED = stochastic nonparametric envelopment of data.



Table 5. Sustainable value (SV) statistics (€ ha−1) and relative rankings of the five least productive (no. 1–5) and most productive (no. 328–332) farms in terms of output per hectare

parametric methods     OLS               PP                COLS              SFA
farm no.   output      SV        rank    SV        rank    SV        rank    SV        rank
           (€ ha−1)
1          561         –557.3    311     –1272.9   239     –2035.1   311     –560.9    281
2          748         –233.5    260     –814.8    98      –1711.3   260     –336.3    187
3          775         –374.9    283     –1185.4   209     –1852.6   283     –459.9    250
4          788         –210.1    253     –875.7    123     –1687.8   253     –323.6    180
5          817         –35.0     194     –541.9    44      –1512.8   194     –175.4    113
…
328        4033        1220.0    6       –382.9    24      –257.8    6       –8.8      24
329        4727        1098.4    11      –605.5    52      –379.4    11      –24.2     49
330        4769        1477.8    1       –295.1    16      0.0       1       –9.3      29
331        5052        1283.9    4       –651.9    61      –193.9    4       –18.7     42
332        6691        1127.9    10      –1728.4   305     –349.8    10      –159.0    102

nonparametric methods  CNLS              DEA               C2NLS             StoNED
farm no.   output      SV        rank    SV        rank    SV        rank    SV        rank
           (€ ha−1)
1          561         –107.1    220     –425.1    78      –1283.3   220     –172.7    133
2          748         124.0     129     –1548.7   314     –1052.2   129     –14.8     36
3          775         244.0     73      –167.3    39      –932.1    73      0.0       1
4          788         190.1     91      –825.2    179     –986.1    91      –1.9      10
5          817         135.1     121     –911.4    204     –1041.1   121     –16.4     40
…
328        4033        600.5     9       –538.1    100     –575.7    9       –131.5    110
329        4727        –536.6    314     –716.7    151     –1712.8   314     –1158.5   327
330        4769        456.4     21      –1013.6   235     –719.8    21      –360.8    208
331        5052        43.0      164     –840.6    182     –1133.2   164     –738.9    308
332        6691        –145.3    233     –615.5    119     –1321.5   233     –1142.5   326

Abbreviations: OLS = ordinary least squares, PP = parametric programming, COLS = corrected ordinary least squares, SFA = stochastic frontier analysis, CNLS = convex nonparametric least squares, DEA = data envelopment analysis, C2NLS = corrected convex nonparametric least squares, StoNED = stochastic nonparametric envelopment of data.



Concluding discussion

Starting from a generalized formulation of sustainable value that is consistent with nonlinear benchmark technologies and facilitates estimation of the benchmarks from empirical data, we have reviewed eight alternative methods for estimating benchmark technologies and sustainable value scores. We distinguished between parametric and nonparametric approaches, depending on the assumed functional form of the benchmark technology. We also drew a distinction between average-practice and best-practice approaches, further classifying the best-practice approaches into deterministic and stochastic methods. For each of the six categories, there are sound estimation methods that can be applied in empirical SV analysis.

To shed further light on the choice of the esti-
mation method, the eight approaches reviewed in 
this paper were applied to the empirical production 
data of 332 Finnish dairy farms. Based on the re-
sults of the application, the following conclusions 
can be drawn. 

Firstly, the nonparametric methods achieved a better empirical fit than their parametric counterparts in terms of the coefficient of determination (R2). This is one of the benefits of the greater flexibility of the nonparametric specification, which does not force the benchmark technology into the rigid structure of some parametric functional form. The nonparametric approaches considered in this paper build upon the monotonicity and concavity axioms, which ensure that the estimated benchmark technology conforms with the regularity conditions of microeconomic theory. However, possible violations of the regularity conditions can be detrimental for the nonparametric methods. Except for DEA, the nonparametric methods are also computationally demanding. For example, computing the CNLS problem (10) with GAMS software required more than 3.2 million iterations. The benefits of nonparametric estimation do not come without cost.

Secondly, significant negative skewness in the regression residuals of both parametric OLS and nonparametric CNLS speaks against using the average-practice benchmarks in this application. As we have noted, estimating an average-practice benchmark technology in the presence of an asymmetric inefficiency component in the disturbance term yields biased and inconsistent estimates.

Thirdly, high positive correlations across a 
wide spectrum of methods (except for DEA) in 
both the SV estimates and the relative rankings 
suggest that the findings from the regression based 
approaches are relatively robust to possible specifi-
cation errors, sampling errors, and data problems. 
The results from DEA analysis are likely perturbed 
by measurement errors, outliers, and other noise 
in data, to which DEA estimates are known to be 
sensitive.

Fourthly, the deterministic best-practice bench-
marks indicate enormous improvement potential 
in sustainability performance, but it is question-
able whether such performance targets are realistic. 
While we have used the best data available for a 
typical SV analysis at the farm level, the data are 
far from perfect. There are a number of omitted 
factors and sources of error that must be acknowl-
edged. For these reasons, the stochastic frontier 
methods SFA and StoNED, which filter out the 
noise component from the inefficiency term and 
attribute only a part of the deviations from the fron-
tier to the SV estimate, are likely to provide more 
realistic estimates of the sustainable improvement 
potential, which translates into more realistic per-
formance targets at the farm level.   

In conclusion, a number of methods for esti-
mating benchmark technologies are available. The 
choice of the estimation method depends on the 
quality and coverage of data, the sample size, the 
number of resources, among other considerations. 
Since there is no single superior method for all ap-
plications, reporting estimates of several alterna-
tive methods can shed some light on the robustness 
of results.  

Acknowledgements. This paper has benefited from the comments and suggestions of two anonymous reviewers of this journal as well as participants of the SVAPPAS project (http://www.svappas.ugent.be). Financial support from the 6th Framework Programme of the EU for this project is gratefully acknowledged (project code: SSPE–CT–2006–44215). The usual disclaimer applies: the views expressed in this paper are those of the authors and not necessarily those of their organizations or sponsors, nor of those who have commented on the earlier draft versions.

References
Afriat S.N. 1967. The construction of a utility function from expenditure data. International Economic Review 8: 67–77.

Afriat S.N. 1972. Efficiency estimation of production functions. International Economic Review 13: 568–598.

Aigner D. & Chu S. 1968. On estimating the industry pro-
duction function. American Economic Review 58: 826–
839.

Aigner D., Lovell C.A.K. & Schmidt P. 1977. Formulation 
and estimation of stochastic frontier production function 
models. Journal of Econometrics 6:  21–37.

Callens I. & Tyteca D. 1999. Towards indicators of sustain-
able development for firms: a productive efficiency per-
spective. Ecological Economics 28:  41–53.

Charnes A., Cooper W.W. & Rhodes E. 1978. Measuring the efficiency of decision making units. European Journal of Operational Research 2: 429–444.

Cherchye L. & Kuosmanen T. 2006. Benchmarking sus-
tainable development: A synthetic meta-index ap-
proach. Ch. 7 in McGillivray M. & Clarke M. (eds): Un-
derstanding Human Well-being. United Nations Univer-
sity Press, Tokyo.

Cobb C.W. & Douglas P.H. 1928. A Theory of production. 
American Economic Review 18:  139–165.

Fan Y., Li Q. & Weersink A. 1996. Semiparametric estima-
tion of stochastic production frontier models. Journal of 
Business and Economic Statistics 14:  460–468.

Färe R., Grosskopf S. & Tyteca D. 1996. An activity anal-
ysis model of the environmental performance of firms: 
Application to fossil-fuel-fired electric utilities. Ecologi-
cal Economics 18:  161–175.

Farrell M.J. 1957. The measurement of productive efficiency. Journal of the Royal Statistical Society, Series A 120: 253–281.

Figge F. & Hahn T. 2004. Sustainable value added – measuring corporate contributions to sustainability beyond eco-efficiency. Ecological Economics 48: 173–187.

Figge F. & Hahn T. 2005. The cost of sustainability capital and the creation of sustainable value by companies. Journal of Industrial Ecology 9: 47–58.

Fried H., Lovell C.A.K. & Schmidt S. (eds) 2007. The 
measurement of productive efficiency and productivity 
change. Oxford University Press, New York.

Greene W.H. 1980. Maximum likelihood estimation of 
econometric frontier functions. Journal of Economet-
rics 13:  26–57.

Greene W.H. 2007. Econometric analysis. 6th Edition, Pear-
son Education International, New Jersey.

Groeneboom P., Jongbloed G. & Wellner J.A. 2001. A canonical process for estimation of convex functions: The "invelope" of integrated Brownian motion plus t(4). Annals of Statistics 29: 1620–1652.

Hanson D.L. & Pledger G. 1976. Consistency in concave regression. Annals of Statistics 4: 1038–1050.

Hildreth C. 1954. Point estimates of ordinates of concave functions. Journal of the American Statistical Association 49: 598–619.

Johnson A.L. & Kuosmanen T. 2009. How operation-
al conditions and practices effect productive perfor-
mance? Efficient nonparametric one-stage estimators, 
paper presented at the European Workshop on Effi-
ciency and Productivity Analysis (EWEPA), June 23–
26  2009, Pisa, Italy. 

Jondrow J., Lovell C.A.K., Materov I.S. & Schmidt P. 1982. 
On estimation of technical inefficiency in the stochas-
tic frontier production function. Journal of Economet-
rics 19:  233–238.

Kortelainen M. & Kuosmanen T. 2007. Eco-efficiency analy-
sis of consumer durables using absolute shadow prices. 
Journal of Productivity Analysis 28:   57–69.

Kuosmanen T. 2006. Stochastic nonparametric envelop-
ment of data: combining virtues of SFA and DEA in a uni-
fied framework. MTT Discussion Paper 3/2006.

Kuosmanen T. 2008. Representation theorem for con-
vex nonparametric least squares. Econometrics Jour-
nal 11:   308–325.

Kuosmanen T. & Fosgerau M. 2009. Neoclassical versus 
frontier production models? Testing for the presence of 
inefficiencies in the regression residuals. Scandinavian 
Journal of Economics 111:  317–333.

Kuosmanen T. & Johnson A.L. 2009. Data envelopment 
analysis as nonparametric least squares regression. 
Operations Research: forthcoming.

Kuosmanen T. & Kortelainen M. 2005. Measuring eco-ef-
ficiency of production with data envelopment analysis. 
Journal of Industrial Ecology 9:  59–72.

Kuosmanen T. & Kortelainen M. 2007a. Valuing environ-
mental factors in cost-benefit analysis using data envel-
opment analysis. Ecological Economics 62:  56–65.

Kuosmanen T. & Kortelainen M. 2007b. Stochastic non-
parametric envelopment of data: Cross-sectional fron-
tier estimation subject to shape constraints. University 
of Joensuu, Economics DP No. 46.

Kuosmanen T. & Kuosmanen N. 2009. How not to mea-
sure sustainable value (and how one might). Ecological 
Economics: forthcoming.

Meeusen W. & van den Broeck J. 1977. Efficiency esti-
mation from Cobb-Douglas production functions with 
composed error. International Economic Review 18:  
435–445.

Meyer M.C. 1999. An extension of the mixed primal-dual bases algorithm to the case of more constraints than dimensions. Journal of Statistical Planning and Inference 81: 13–31.

Nevens F., Verbruggen I., Reheul D. & Hofman G. 2006. 
Farm gate nitrogen surpluses and nitrogen use efficien-
cy of specialized dairy farms in Flanders: Evolution and 
future goals. Agricultural Systems 88:  142–155.

Reinhard S., Lovell C.A.K. & Thijssen G. 1999. Economet-
ric estimation of technical and environmental efficiency: 
An application to Dutch dairy farms. American Journal 
of Agricultural Economics 81: 44–60.

Richmond J. 1974. Estimating the efficiency of production. 
International Economic Review 15: 515–521.


Timmer C.P. 1971. Using a probabilistic frontier production 
function to measure technical efficiency. Journal of Po-
litical Economy 79: 767–794.

Tyteca D. 1996. On the measurement of the environmen-
tal performance of firms – A literature review and a pro-
ductive efficiency perspective. Journal of Environmen-
tal Management 46: 281–308.

Tyteca D. 1997. Linear programming models for the mea-
surement of environmental performance of firms – Con-
cepts and empirical analysis. Journal of Productivity 
Analysis 8: 183–197.

Tyteca D. 1998. Sustainability indicators at the firm level: 
Pollution and resource efficiency as a necessary con-
dition toward sustainability. Journal of Industrial Ecol-
ogy 2: 61–77.

van Passel S., Nevens F., Mathijs E. & van Huylenbroeck 
G. 2007. Measuring farm sustainability and explaining 
differences in sustainable efficiency. Ecological Eco-
nomics 62: 149–161.

Virtanen H. & Nousiainen J. 2005. Nitrogen and phospho-
rus balances on Finnish dairy farms. Agricultural and 
Food Science 14: 166–180.

Winsten C.B. 1957. Discussion on Mr. Farrell’s paper. 
Journal of the Royal Statistical Society Series A 120: 
282–284.

Zaim O. 2004. Measuring environmental performance of 
state manufacturing through changes in pollution in-
tensities: A DEA framework. Ecological Economics 
48: 37–47.

SELOSTUS (Summary)

The role of benchmark technology in assessing the sustainability of production

An application to Finnish dairy farms
Timo Kuosmanen and Natalia Kuosmanen

Helsinki School of Economics and MTT Agrifood Research Finland

Sustainable development is a multifaceted concept that encompasses the economic, ecological, and social dimensions of sustainability. Measuring the sustainability of business activity has been recognized as an important but challenging problem. One of the most noteworthy approaches to measuring sustainability is the sustainable value (SV) method developed by Figge and Hahn. In their earlier research, the authors of this article criticized the restrictive linearity assumptions of the original SV estimator. These assumptions can be avoided with the generalized SV method developed by the authors, which allows nonlinear benchmark technologies to be used and estimated empirically.

The purpose of this study is to assess alternative parametric and nonparametric methods for estimating the benchmark technology within the framework of generalized SV analysis. Eight different methods commonly used for estimating the benchmark technology were applied to empirical data on 332 Finnish dairy farms. Based on the results obtained with each method, SV index values were calculated for each farm. The alternative estimation methods were critically evaluated in light of both econometric theory and the empirical results. The following conclusions can be drawn from the results:

1) The correlations between the SV indices obtained with different methods are high and positive, so the sustainability rankings of the farms produced by the different methods are largely consistent.

2) Owing to their flexibility, the nonparametric methods explain a larger share of the variation in (per-hectare) output across farms than the corresponding parametric methods.

3) The distributions of the residuals of the regression models are skewed. The average-practice benchmark technology therefore gives a biased picture of the farms' production possibilities in this data set.

4) Taking random errors into account when assessing sustainability improves the reliability of the results.
