CUBO A Mathematical Journal
Vol. 13, No. 01, (103–123). March 2011

Engineering Design under Imprecise Probabilities: Computational Complexity

Vladik Kreinovich
Department of Computer Science, University of Texas at El Paso, 500 W. University, El Paso, Texas 79968, USA.
email: vladik@utep.edu

ABSTRACT

In engineering design problems, we want to make sure that a certain quantity c of the designed system lies within given bounds – or at least that the probability of this quantity being outside these bounds does not exceed a given threshold. We may have several such requirements – thus the requirement can be formulated as bounds $[\underline{F}_c(x), \overline{F}_c(x)]$ on the cumulative distribution function $F_c(x)$ of the quantity c; such bounds are known as a p-box. The value of the desired quantity c depends on the design parameters a and the parameters b characterizing the environment: c = f(a, b). To achieve the design goal, we need to find the design parameters a for which the distribution $F_c(x)$ for c = f(a, b) is within the given bounds for all possible values of the environmental variables b. The problem of computing such a is called backcalculation. For b, we also have ranges with different probabilities – i.e., also a p-box. Thus, we have a backcalculation problem for p-boxes. For p-boxes, there exist efficient algorithms for finding a design a that satisfies the given constraints. The next natural question is to find a design that satisfies additional constraints: on the cost, on the efficiency, etc. In this paper, we prove that, in general, the problem of finding such a design is computationally difficult (NP-hard). We show that this problem is NP-hard already in the simplest possible linearized case, when the dependence c = f(a, b) is linear. We also provide an example when an efficient algorithm is possible.
Keywords: Engineering design, imprecise probability, computational complexity, p-boxes, NP-hard

AMS Subject Classification: 65K10, 65J99, 49M15, 49J53, 47J20, 47H04, 90C30, 90C33.

1 Engineering Design Problems and the Notion of Backcalculation: Deterministic Case

One of the main objectives of engineering design is to guarantee that the value of a certain quantity (or several quantities) c is within a given range $[\underline{c}, \overline{c}]$. For example, when we design a car engine, we must make sure:
• that its power is at least as much as needed for the loaded car to climb the steepest mountain roads,
• that the concentration of undesirable substances in the exhaust does not exceed the required threshold, etc.
The value of the quantity c usually depends on the parameters a describing the design and on the parameters b of the environment: c = f(a, b). For example, the concentration of a substance in a car exhaust depends:
• on the parameter(s) that describe the design of the car exhaust filters, and
• on the concentration of the chemicals in the original fuel.
We need to select a design a in such a way that $c = f(a, b) \in [\underline{c}, \overline{c}]$ for all possible values of the environmental parameter(s) b. In this paper, we consider the simplest case when:
• the design of each system is characterized by a single parameter a, and
• the environment is also characterized by a single parameter b.
We will show that already in this simple case, the design problem is, in general, computationally difficult (NP-hard). We also show that in some cases, a feasible algorithm is possible. (Some of our results first appeared in a conference paper [5].)
To be able to find a design that satisfies the given constraint on c for all possible values of the environmental parameter b, we need to know which values of b are possible, i.e., we need to know the range $[\underline{b}, \overline{b}]$ of possible values of b. Thus, we arrive at the following problem:
• we know the desired range $[\underline{c}, \overline{c}]$;
• we know the dependence c = f(a, b);
• we know the range $[\underline{b}, \overline{b}]$ of possible values of b;
• we want to describe the set of all values of a for which $f(a, b) \in [\underline{c}, \overline{c}]$ for all $b \in [\underline{b}, \overline{b}]$.
This design-related problem is sometimes called a backcalculation problem, to emphasize its difference from the forward calculation problem, when
• we are given a design a and
• we want to estimate the value of the desired characteristic c = f(a, b).

2 Linearized Problem

In many engineering situations, the intervals of possible values of a and b are reasonably narrow: $a \approx \tilde{a}$ and $b \approx \tilde{b}$ for some $\tilde{a}$ and $\tilde{b}$. In such situations, we can expand the dependence c = f(a, b) in a Taylor series and ignore terms which are quadratic and of higher order in $\Delta a \stackrel{\text{def}}{=} a - \tilde{a}$ and $\Delta b \stackrel{\text{def}}{=} b - \tilde{b}$. As a result, we get a simple linear dependence
$$c \approx c_0 + k_a \cdot a + k_b \cdot b. \qquad (2.1)$$
We can simplify this expression even further if we take into account that the numerical value of each of the quantities a and b depends on the choice of the starting point and on the choice of a measuring unit. If we change the starting point and the measuring unit, then the new numerical value can be obtained from the original one by an appropriate linear transformation. For example, if we know the temperature $t_C$ in Celsius, then we can compute the temperature $t_F$ in the Fahrenheit scale as $t_F = 32 + 1.8 \cdot t_C$.
We can use this possibility to simplify the above expression for c. Specifically, we can change the starting points and the measuring units in such a way that:
• the new numerical value for a is described by the linear expression $c_0 + k_a \cdot a$, and
• the new numerical value for b is described by the linear expression $k_b \cdot b$.
In these new scales, the dependence of c on a and b takes the simplest form c = a + b. We will show that the design problem becomes computationally difficult (NP-hard) already for this simplest case.
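Written out explicitly, this rescaling is simply a regrouping of the terms in (2.1) (a short derivation sketch; the primed symbols $a'$ and $b'$ are auxiliary notation used only here):
$$c = c_0 + k_a \cdot a + k_b \cdot b = \underbrace{(c_0 + k_a \cdot a)}_{=:\,a'} + \underbrace{k_b \cdot b}_{=:\,b'} = a' + b',$$
so after renaming the rescaled quantities $a'$ and $b'$ back into $a$ and $b$, the dependence indeed becomes $c = a + b$.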
3 From Guaranteed Bounds to p-Boxes

Ideally, it is desirable to provide a 100% guarantee that the quantity c never exceeds the threshold $\overline{c}$. In practice, however, too many unpredictable factors affect the performance of a system and thus, such a guarantee is not realistically possible. What we can realistically guarantee is that the probability of exceeding $\overline{c}$ is small enough. In other words, we set some threshold $\varepsilon_c > 0$ and we require that $\text{Prob}(c \le \overline{c}) \ge 1 - \varepsilon_c$.
In addition to this requirement, we can also require that the excess of c over $\overline{c}$ be not too large. This can be done, e.g., by requiring that for some value $c_1 > \overline{c}$, the probability $\text{Prob}(c \le c_1)$ is bounded from below by the value $1 - \varepsilon_1$ for some smaller $\varepsilon_1 < \varepsilon_c$. We can have several such requirements for different values $c_i$ and $\varepsilon_i$.
Similarly, instead of the idealized exact inequality $c \ge \underline{c}$, in practice, we can only require that $\text{Prob}(c < \underline{c}) \le \delta$ for some small probability $\delta > 0$.
From the mathematical viewpoint, all such constraints are lower or upper bounds on the values of the cumulative distribution function $F_c(x) \stackrel{\text{def}}{=} \text{Prob}(c \le x)$. By combining the bounds corresponding to all the constraints, we can thus conclude that the cdf $F_c(x)$ must satisfy, for every x, the inequalities
$$\underline{F}_c(x) \le F_c(x) \le \overline{F}_c(x), \qquad (3.1)$$
where $\underline{F}_c(x)$ is the largest of all the lower bounds on $F_c(x)$ and $\overline{F}_c(x)$ is the smallest of all the upper bounds on $F_c(x)$. In other words, for every x, the corresponding value $F_c(x)$ must belong to the interval $[\underline{F}_c(x), \overline{F}_c(x)]$. This x-dependent interval is known as a probability box, or a p-box, for short; see, e.g., [2].
Similarly, for the environmental parameter b, we rarely know guaranteed bounds $\underline{b}$ and $\overline{b}$. At best, we know that for a given bound $\overline{b}$, the probability of exceeding this bound is small, i.e., that $\text{Prob}(b \le \overline{b}) \ge 1 - \varepsilon_b$ for some small $\varepsilon_b$. So here too, instead of a single bound, in effect, we have a p-box $[\underline{F}_b(x), \overline{F}_b(x)]$.
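As a small numerical illustration of how such a p-box arises (the numbers are made up purely for this example): suppose we require $\text{Prob}(c \le 10) \ge 0.95$, $\text{Prob}(c \le 12) \ge 0.99$, and $\text{Prob}(c \le 5) \le 0.10$. The first two requirements are lower bounds on $F_c(x)$ and the third one is an upper bound, so the resulting p-box is
$$\underline{F}_c(x) = \begin{cases} 0, & x < 10,\\ 0.95, & 10 \le x < 12,\\ 0.99, & x \ge 12,\end{cases} \qquad \overline{F}_c(x) = \begin{cases} 0.10, & x \le 5,\\ 1, & x > 5.\end{cases}$$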
4 Towards Formulating the Design (Backcalculation) Problem for p-Boxes

In the deterministic approach to design, we assume that we can manufacture an object with the exact value a of the corresponding parameter – or at least a value which is guaranteed to be within the given bounds $[\underline{a}, \overline{a}]$. In manufacturing, however, it is not practically possible to always guarantee that the value a is within the given interval. At best, we can guarantee that, e.g., the probability of $a \le \overline{a}$ is greater than or equal to $1 - \varepsilon_a$ for some small value $\varepsilon_a$. In other words, the design restriction on a can also be formulated in terms of p-boxes. Thus, we arrive at the following problem.

5 Backcalculation Problem for p-Boxes

We are given:
• the desired p-box $[\underline{F}_c(x), \overline{F}_c(x)]$ for c;
• the dependence c = f(a, b); and
• the p-box $[\underline{F}_b(x), \overline{F}_b(x)]$ describing b.
Our objective is to find a p-box $[\underline{F}_a(x), \overline{F}_a(x)]$ for which:
• for every probability distribution $F_a(x) \in [\underline{F}_a(x), \overline{F}_a(x)]$,
• for every probability distribution $F_b(x) \in [\underline{F}_b(x), \overline{F}_b(x)]$, and
• for all possible correlations between a and b,
the distribution of c = f(a, b) is within the given p-box $[\underline{F}_c(x), \overline{F}_c(x)]$.

6 Reminder: Forward Calculation for p-Boxes

In order to analyze the backcalculation problem for p-boxes, let us first describe how the corresponding forward calculation problem is solved. Let us assume that we know:
• the p-box $[\underline{F}_a(x), \overline{F}_a(x)]$ for a; and
• the p-box $[\underline{F}_b(x), \overline{F}_b(x)]$ describing b.
The objective of forward calculation is to find the range $[\underline{F}_c(x), \overline{F}_c(x)]$ of possible values of $F_c(x)$ for c = f(a, b) for all possible distributions $F_a(x) \in [\underline{F}_a(x), \overline{F}_a(x)]$ and $F_b(x) \in [\underline{F}_b(x), \overline{F}_b(x)]$ and all possible correlations between a and b.
It turns out that these calculations are best done in terms not of the original cdfs and p-boxes, but rather in terms of their inverses – quantile functions. For a cdf $F_a(x)$, the quantiles $a_0, \ldots, a_n$ are defined as the values for which $F_a(a_i) = \frac{i}{n}$. Since the cdf is monotonic, the quantiles are also monotonic: if $i < j$, then $a_i \le a_j$.
When instead of the exact cdf, we only know a p-box $[\underline{F}_a(x), \overline{F}_a(x)]$, then instead of the exact quantiles $a_i$ we only know interval bounds $[\underline{a}_i, \overline{a}_i]$ for these quantiles:
• the lower bounds $\underline{a}_i$ are the quantiles of the function $\overline{F}_a(x)$, and
• the upper bounds $\overline{a}_i$ are the quantiles of the function $\underline{F}_a(x)$.
These bounds are also monotonic: if $i < j$, then $\underline{a}_i \le \underline{a}_j$ and $\overline{a}_i \le \overline{a}_j$.
For the function c = f(a, b) = a + b, forward calculation can be easily described in terms of the quantile bounds: once we know the bounds $[\underline{a}_i, \overline{a}_i]$ and $[\underline{b}_i, \overline{b}_i]$ corresponding to a and b, we can compute the quantile bounds for c as follows:
$$\underline{c}_i = \max_j \left(\underline{a}_j + \underline{b}_{i-j}\right); \qquad (6.1)$$
$$\overline{c}_i = \min_j \left(\overline{a}_{j-i} + \overline{b}_{n-j}\right). \qquad (6.2)$$
These formulas were first described in [8].
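The following short Python sketch implements formulas (6.1) and (6.2) as written above (the function name and the representation of the bounds as plain lists are my own choices for this illustration; the index ranges are restricted so that all subscripts stay within $0, \ldots, n$):

def forward_sum_bounds(a_lo, a_hi, b_lo, b_hi):
    """Quantile bounds for c = a + b under unknown correlation between a and b.

    a_lo[i] and a_hi[i] are the lower/upper bounds on the i-th quantile of a
    (similarly for b); all four lists have length n + 1.
    """
    n = len(a_lo) - 1
    # formula (6.1): lower bounds, with j restricted so that i - j >= 0
    c_lo = [max(a_lo[j] + b_lo[i - j] for j in range(i + 1))
            for i in range(n + 1)]
    # formula (6.2): upper bounds, with j restricted so that j - i >= 0
    c_hi = [min(a_hi[j - i] + b_hi[n - j] for j in range(i, n + 1))
            for i in range(n + 1)]
    return c_lo, c_hi

For instance, for n = 1 with lower bounds $\underline{a} = (0, 2)$ and $\underline{b} = (0, 1)$, formula (6.1) gives $\underline{c} = (0, 2)$ – the same numbers as in the numerical example of Section 9 below.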
7 Formulation of the Problem

In terms of quantile bounds, the backcalculation problem takes the following form:
• we know the quantile intervals $[\underline{b}_i, \overline{b}_i]$ corresponding to the environmental variable b;
• we are given the intervals $[\underline{\tilde{c}}_i, \overline{\tilde{c}}_i]$ that should contain the quantiles for c = a + b;
• we must find the bounds $\underline{a}_i$ and $\overline{a}_i$ for which, with the values $\underline{c}_i$ and $\overline{c}_i$ determined by the formulas (6.1) and (6.2), we have $[\underline{c}_i, \overline{c}_i] \subseteq [\underline{\tilde{c}}_i, \overline{\tilde{c}}_i]$.

8 In Effect, We Have Two Different Problems: Finding $\underline{a}_i$ and Finding $\overline{a}_i$

An important consequence of the formulas (6.1) and (6.2) is that:
• the lower bounds $\underline{c}_i$ for c are determined only by the lower bounds $\underline{a}_i$ and $\underline{b}_i$ for a and b, and
• the upper bounds $\overline{c}_i$ for c are determined only by the upper bounds $\overline{a}_i$ and $\overline{b}_i$ for a and b.
Thus, in effect, we can formulate the problem of finding the values $\underline{a}_i$ and the problem of finding the values $\overline{a}_i$ as two separate yet similar problems. Without losing generality, in the following text, we will only consider the problem of finding the lower bounds (to simplify notations, we drop the underlines and write simply $a_i$, $b_i$, and $\tilde{c}_i$):
• we know the values $b_i$;
• we are given the values $\tilde{c}_i$;
• we must find the values $a_0 \le \ldots \le a_n$ for which
$$\tilde{c}_i \le \max_j \left(a_j + b_{i-j}\right). \qquad (8.1)$$

9 A Simple Example

Before we start describing how to solve the general problem, let us first describe a simple example. In the general p-box case, to describe the uncertainty of a variable x, we use n + 1 quantile intervals $[\underline{x}_i, \overline{x}_i]$ for $i = 0, 1, \ldots, n$.
P-box uncertainty is a generalization of the case of interval uncertainty, in which the uncertainty in each variable x is characterized by a single interval $[\underline{x}, \overline{x}]$. This case corresponds to n = 0. In this case, the inequality for $a_0$ takes the form $\tilde{c}_0 \le b_0 + a_0$ or, equivalently, $\tilde{c}_0 - b_0 \le a_0$ – the same form as for interval uncertainty.
We are looking for the simplest possible example which is different from interval uncertainty. After a single interval, the next simplest example is when we use two intervals, i.e., when n = 1. In this case, we need to find two values $a_0 \le a_1$ that satisfy the following inequalities:
$$\tilde{c}_0 \le b_0 + a_0; \qquad (9.1)$$
$$\tilde{c}_1 \le \max(b_0 + a_1, b_1 + a_0). \qquad (9.2)$$
Depending on which term in the max expression is the largest, we have two cases: $b_0 + a_1 \le b_1 + a_0$ and $b_0 + a_1 \ge b_1 + a_0$.
In the first case, the following inequalities must be satisfied:
$$\tilde{c}_0 \le b_0 + a_0; \quad b_0 + a_1 \le b_1 + a_0; \quad \tilde{c}_1 \le b_1 + a_0. \qquad (9.3)$$
The first of these three inequalities from (9.3) is equivalent to
$$\tilde{c}_0 - b_0 \le a_0. \qquad (9.4)$$
Similarly, the third inequality from (9.3) is equivalent to
$$\tilde{c}_1 - b_1 \le a_0. \qquad (9.5)$$
Thus, these two inequalities are equivalent to
$$\max(\tilde{c}_0 - b_0, \tilde{c}_1 - b_1) \le a_0. \qquad (9.6)$$
The second inequality from (9.3) is equivalent to
$$a_1 - a_0 \le b_1 - b_0. \qquad (9.7)$$
Thus, in the first case, the values $a_0$ and $a_1$ must satisfy the following two inequalities:
$$a_0 \ge \max(\tilde{c}_0 - b_0, \tilde{c}_1 - b_1); \quad 0 \le a_1 - a_0 \le b_1 - b_0. \qquad (9.8)$$
Graphically, the values that satisfy these inequalities fill the following area:
[Figure: the region in the $(a_0, a_1)$-plane described by (9.8); the values marked on the axes are $v_0$ and $v_1$]
where $v_0 \stackrel{\text{def}}{=} \max(\tilde{c}_0 - b_0, \tilde{c}_1 - b_1)$ and $v_1 \stackrel{\text{def}}{=} v_0 + b_1 - b_0$.
In the second case, the following inequalities must be satisfied:
$$\tilde{c}_0 \le b_0 + a_0; \quad b_0 + a_1 \ge b_1 + a_0; \quad \tilde{c}_1 \le b_0 + a_1. \qquad (9.9)$$
By moving the unknowns to one side and all the other terms to the other side, we conclude that we must satisfy the following inequalities:
$$a_0 \ge \tilde{c}_0 - b_0; \quad a_1 - a_0 \ge b_1 - b_0; \quad a_1 \ge \tilde{c}_1 - b_0. \qquad (9.10)$$
(Since $b_1 - b_0 \ge 0$, the second inequality automatically implies that $a_1 \ge a_0$.) Graphically, we have the following representation:
[Figure: the region in the $(a_0, a_1)$-plane described by (9.10); the values marked on the axes are $w_0$, $w_1$, and $w_2$]
where $w_0 \stackrel{\text{def}}{=} \tilde{c}_0 - b_0$, $w_1 \stackrel{\text{def}}{=} w_0 + (b_1 - b_0)$, and $w_2 \stackrel{\text{def}}{=} \tilde{c}_1 - b_0$.
Overall, the set of all possible design values is the union of these two sets. Let us illustrate this union on a simple numerical example when $b_0 = \tilde{c}_0 = 0$, $b_1 = 1$, and $\tilde{c}_1 = 2$. In this case, the first case corresponds to the following inequalities
$$a_0 \ge 1; \quad 0 \le a_1 - a_0 \le 1, \qquad (9.11)$$
and the second case leads to the following inequalities:
$$a_0 \ge 0; \quad a_1 - a_0 \ge 1; \quad a_1 \ge 2. \qquad (9.12)$$
Thus, the union of the two corresponding sets has the following form:
[Figure: the union of the two regions in the $(a_0, a_1)$-plane for this numerical example; the values marked on the axes are 1, 1, and 2]
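This union can also be checked numerically; the following small Python sketch (the function name and the choice to hard-code the example data as default arguments are mine) tests whether a candidate pair $(a_0, a_1)$ satisfies the constraints (9.1)–(9.2) for this numerical example:

def feasible(a0, a1, b0=0.0, b1=1.0, c0=0.0, c1=2.0):
    """Check (9.1)-(9.2) for the example b_0 = c~_0 = 0, b_1 = 1, c~_1 = 2."""
    return (a0 <= a1
            and c0 <= b0 + a0
            and c1 <= max(b0 + a1, b1 + a0))

print(feasible(1, 1))    # True:  corner point of the first region
print(feasible(0, 2))    # True:  corner point of the second region
print(feasible(0, 1.5))  # False: lies outside both regions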
10 A Designed System Usually Consists of Several Subsystems

A designed system usually consists of several subsystems. So, instead of selecting a single p-box for a single design parameter a, we need to design p-boxes corresponding to all these subsystems.
Let S denote the number of these subsystems, and let $a^{(s)}_i$, $b^{(s)}_i$, and $\tilde{c}^{(s)}_i$ denote the quantile bounds corresponding to the s-th subsystem, $s = 1, 2, \ldots, S$. Thus, we arrive at the following problem:
• we know the values $b^{(s)}_i$;
• we are given the values $\tilde{c}^{(s)}_i$;
• we must find, for each $s = 1, \ldots, S$, the values $a^{(s)}_0 \le \ldots \le a^{(s)}_n$ for which
$$\tilde{c}^{(s)}_i \le \max_j \left(a^{(s)}_j + b^{(s)}_{i-j}\right). \qquad (10.1)$$

11 There Exist Efficient Algorithms for Backcalculation

For p-boxes, there are efficient algorithms for solving the backcalculation problem; see, e.g., [1, 3, 4, 7].

12 Need for Additional Cost Constraints

In general, as we can see, the backcalculation problem has many possible solutions. Some design solutions require less effort, some require more. It is therefore desirable not just to find a solution, but rather to find a solution which satisfies given constraints on the manufacturing effort, such as cost, energy expenses, etc.
The values $a^{(s)}_i$ are the lower bounds on the design parameters. The smaller the lower bounds, the easier it is to maintain them. Thus, the cost of maintaining a lower bound increases with the value $a^{(s)}_i$. In this paper, we show that the problem is NP-hard even for the simplest case when the corresponding effort is simply proportional to the value $a^{(s)}_i$, and thus, the overall effort of maintaining all the characteristics $a^{(s)}_i$ is equal to the weighted linear combination
$$E = \sum_{s=1}^{S} \sum_{i=0}^{n} w^{(s)}_i \cdot a^{(s)}_i. \qquad (12.1)$$
The corresponding constraint is that this effort should not exceed a given value e. As we have mentioned, we may have several (C) constraints corresponding to different types of effort – cost, energy consumption, etc. Thus, in general, the constrained backcalculation problem takes the following form.

13 Formulation of the Problem in Precise Mathematical Terms

We are given:
• positive integers n, S, and C;
• the values $b^{(s)}_i$ corresponding to different $s = 1, \ldots, S$ and $i = 0, \ldots, n$;
• the values $\tilde{c}^{(s)}_i$ corresponding to different $s = 1, \ldots, S$ and $i = 0, \ldots, n$;
• the values $e_c$ corresponding to different $c = 1, \ldots, C$; and
• the values $w^{(s)}_{c,i}$ corresponding to different $s = 1, \ldots, S$, $c = 1, \ldots, C$, and $i = 0, \ldots, n$.
We must find, for each $s = 1, \ldots, S$, the values $a^{(s)}_0 \le \ldots \le a^{(s)}_n$ for which the following two sets of inequalities are satisfied:
$$\tilde{c}^{(s)}_i \le \max_j \left(a^{(s)}_j + b^{(s)}_{i-j}\right); \qquad (13.1)$$
$$\sum_{s=1}^{S} \sum_{i=0}^{n} w^{(s)}_{c,i} \cdot a^{(s)}_i \le e_c. \qquad (13.2)$$
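To make this formulation concrete, here is a small Python sketch (all names are mine, chosen only for this illustration) that checks whether candidate values $a^{(s)}_i$ satisfy the constraints (13.1) and (13.2); the hard part, as we show next, is finding such values:

def satisfies_constraints(a, b, c_tilde, w, e):
    """Check (13.1)-(13.2) for a candidate design.

    a[s][i], b[s][i], c_tilde[s][i]: values for s = 0..S-1 and i = 0..n;
    w[c][s][i] and e[c]: weights and bounds for c = 0..C-1.
    """
    S, n = len(a), len(a[0]) - 1
    for s in range(S):
        # monotonicity a_0 <= ... <= a_n
        if any(a[s][i] > a[s][i + 1] for i in range(n)):
            return False
        # inequalities (13.1), with j restricted so that i - j >= 0
        for i in range(n + 1):
            if c_tilde[s][i] > max(a[s][j] + b[s][i - j] for j in range(i + 1)):
                return False
    # effort constraints (13.2)
    for w_c, e_c in zip(w, e):
        if sum(w_c[s][i] * a[s][i] for s in range(S) for i in range(n + 1)) > e_c:
            return False
    return True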
14 Our Main Result

Our main result is that the above problem is NP-hard.

15 Proof

Main idea. Formally, NP-hard means that an arbitrary problem from a certain class NP can be reduced to this problem; see, e.g., [6]. Thus, to prove that a problem is NP-hard, it is sufficient to prove that a known NP-hard problem can be reduced to it. Indeed,
• by definition of NP-hardness, every problem from the class NP can be reduced to the known NP-hard problem;
• since this known problem can be reduced to our problem,
• we can therefore conclude that every problem from the class NP can be reduced to our problem;
• in other words, we can conclude that our problem is NP-hard.
In our proof, as such a known NP-hard problem, we take the knapsack problem; see, e.g., [6]. In this problem, we are given a set of S objects, for each of which we know its volume $v_s > 0$ and its price $p_s > 0$. We also know the total volume V of a knapsack and the threshold price P. Within the restriction on the volume, we must select some of the S objects in such a way that the total price of all the selected objects is at least P.
To describe this problem in precise terms, for each object s, we define a new variable $x_s$ such that $x_s = 1$ if the s-th object is taken and $x_s = 0$ if the s-th object is not taken. In terms of these new variables, the overall volume of all the selected objects is equal to $\sum_{s=1}^{S} v_s \cdot x_s$, and the overall price of all the selected objects is equal to $\sum_{s=1}^{S} p_s \cdot x_s$. Thus, the knapsack problem takes the following form: find the values $x_s \in \{0, 1\}$ for which
$$\sum_{s=1}^{S} v_s \cdot x_s \le V \qquad (15.1)$$
and
$$\sum_{s=1}^{S} p_s \cdot x_s \ge P. \qquad (15.2)$$
We will prove that this problem can be reduced to the above backcalculation problem, i.e., that for each instance $v_1, \ldots, v_S, p_1, \ldots, p_S, V, P$ of the knapsack problem there is an instance of the backcalculation problem whose solution can effectively lead to the solution of the original knapsack problem.

Towards the reduction: selection of p-boxes. In our reduction, we will use the same pair of p-boxes b, c for all S subsystems; the only difference will be in the weights. Let us denote the common value of $b^{(s)}_i$ for all $s = 1, \ldots, S$ by $b_i$ and the common value of $\tilde{c}^{(s)}_i$ by $c_i$. In these notations, the constraints on the unknowns $a^{(s)}_i$ take a simplified form
$$c_i \le \max_j \left(a^{(s)}_j + b_{i-j}\right). \qquad (15.3)$$
In our reduction, we take n = 1. For n = 1, for every s, the inequalities (15.3) lead to the following constraints on the corresponding two unknowns $a^{(s)}_0 \le a^{(s)}_1$:
$$c_0 \le a^{(s)}_0 + b_0; \qquad (15.4)$$
$$c_1 \le \max\left(a^{(s)}_0 + b_1, a^{(s)}_1 + b_0\right). \qquad (15.5)$$
Specifically, we take $b_0 = c_0 = 0$, $b_1 = 1$, and $c_1 = 2$. For these values, the above inequalities take the following form:
$$a^{(s)}_0 \ge 0; \qquad (15.6)$$
$$2 \le \max\left(a^{(s)}_0 + 1, a^{(s)}_1\right). \qquad (15.7)$$
The largest of the two values is greater than or equal to 2 if and only if (at least) one of these two values is greater than or equal to 2. Thus, the second constraint (15.7) means that:
• either $2 \le a^{(s)}_0 + 1$ and thus $a^{(s)}_0 \ge 1$,
• or $a^{(s)}_1 \ge 2$.

Analysis of the selected p-boxes. Let us show that out of all possible solutions $a^{(s)}_0 \le a^{(s)}_1$ satisfying these two inequalities, only two solutions satisfy the additional constraint $a^{(s)}_0 + a^{(s)}_1 \le 2$:
• $a^{(s)}_0 = a^{(s)}_1 = 1$, and
• $a^{(s)}_0 = 0$ and $a^{(s)}_1 = 2$.
Indeed, we know that for each solution, we have:
• either $a^{(s)}_0 \ge 1$
• or $a^{(s)}_1 \ge 2$.
In the first case $a^{(s)}_0 \ge 1$, we have
$$a^{(s)}_0 + a^{(s)}_1 = 2a^{(s)}_0 + \left(a^{(s)}_1 - a^{(s)}_0\right). \qquad (15.8)$$
The first term in the right-hand side is $\ge 2$, and the second term is always non-negative – since $a^{(s)}_1 \ge a^{(s)}_0$. Thus, the only possibility for the right-hand side sum to be $\le 2$ is when the value $2a^{(s)}_0$ is exactly 2, and the difference $a^{(s)}_1 - a^{(s)}_0$ is exactly 0. In this case, we have $a^{(s)}_0 = a^{(s)}_1 = 1$.
In the second case $a^{(s)}_1 \ge 2$, due to $a^{(s)}_0 \ge 0$ (condition (15.6)), the only possibility for the sum $a^{(s)}_0 + a^{(s)}_1$ to be $\le 2$ is when $a^{(s)}_0 = 0$ and $a^{(s)}_1 = 2$.

Reduction and the final part of the proof. We will reduce the given instance of the knapsack problem to the following system with 3 constraints:
• In the first constraint, we take $w^{(s)}_{1,i} = 1$ for all s and i, and we take $e_1 = 2 \cdot S$.
• In the second constraint, we take $w^{(s)}_{2,0} = v_s$, $w^{(s)}_{2,1} = 0$, and $e_2 = V$.
• In the third constraint, we take $w^{(s)}_{3,0} = 0$, $w^{(s)}_{3,1} = p_s$, and
$$e_3 = 2 \cdot \sum_{s=1}^{S} p_s - P. \qquad (15.9)$$
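In code, this construction can be written down as the following small Python sketch (the function name and data layout are mine; it simply records the p-box values $b_0 = c_0 = 0$, $b_1 = 1$, $c_1 = 2$ and the three constraints listed above):

def knapsack_to_backcalculation(v, p, V, P):
    """Build the constrained backcalculation instance (n = 1, C = 3)
    used in the reduction, from a knapsack instance (v, p, V, P)."""
    S = len(v)
    b = [[0.0, 1.0] for _ in range(S)]        # b_0 = 0, b_1 = 1 for every s
    c_tilde = [[0.0, 2.0] for _ in range(S)]  # c_0 = 0, c_1 = 2 for every s
    w = [
        [[1.0, 1.0] for _ in range(S)],       # first constraint:  w_{1,i} = 1
        [[v[s], 0.0] for s in range(S)],      # second constraint: w_{2,0} = v_s
        [[0.0, p[s]] for s in range(S)],      # third constraint:  w_{3,1} = p_s
    ]
    e = [2 * S, V, 2 * sum(p) - P]            # e_1 = 2S, e_2 = V, e_3 = 2*sum(p) - P
    return b, c_tilde, w, e

In a solution of this instance, the values $x_s = a^{(s)}_0$ solve the original knapsack instance, and conversely a knapsack solution $x_s$ yields the design $a^{(s)}_0 = x_s$, $a^{(s)}_1 = 2 - x_s$; this is exactly the correspondence established below.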
Let us first analyze the first constraint
$$\sum_{s=1}^{S} \left(a^{(s)}_0 + a^{(s)}_1\right) \le 2 \cdot S. \qquad (15.10)$$
As we have shown in the previous section, each simple sum $a^{(s)}_0 + a^{(s)}_1$ is at least 2, and it is only equal to 2 when either $a^{(s)}_0 = 0$ or $a^{(s)}_0 = 1$. Thus, the only possibility for the sum $\sum_{s=1}^{S} \left(a^{(s)}_0 + a^{(s)}_1\right)$ of S such simple sums to not exceed 2S is when each of these sums is equal to exactly 2, i.e., when for every s,
• either $a^{(s)}_0 = 0$ or $a^{(s)}_0 = 1$, and
• $a^{(s)}_1 = 2 - a^{(s)}_0$.
For these values $a^{(s)}_i$, the second constraint takes the form
$$\sum_{s=1}^{S} v_s \cdot a^{(s)}_0 \le V, \qquad (15.11)$$
and the third constraint takes the form
$$\sum_{s=1}^{S} p_s \cdot a^{(s)}_1 \le 2 \cdot \sum_{s=1}^{S} p_s - P. \qquad (15.12)$$
Substituting the expression $a^{(s)}_1 = 2 - a^{(s)}_0$ into this inequality, we get
$$\sum_{s=1}^{S} p_s \cdot \left(2 - a^{(s)}_0\right) \le 2 \cdot \sum_{s=1}^{S} p_s - P, \qquad (15.13)$$
i.e., equivalently,
$$2 \sum_{s=1}^{S} p_s - \sum_{s=1}^{S} p_s \cdot a^{(s)}_0 \le 2 \sum_{s=1}^{S} p_s - P, \qquad (15.14)$$
which, in its turn, is equivalent to
$$\sum_{s=1}^{S} p_s \cdot a^{(s)}_0 \ge P. \qquad (15.15)$$
Thus, for every solution to the constrained backcalculation problem, the values $x_s \stackrel{\text{def}}{=} a^{(s)}_0$ form a solution to the knapsack problem: each of them is equal to 0 or 1, and they satisfy the corresponding inequalities (15.11) and (15.15).
Vice versa, one can easily check that if the values $x_s$ form a solution to the knapsack problem, then the corresponding values $a^{(s)}_0 = x_s$ and $a^{(s)}_1 = 2 - x_s$ form a solution to the constrained backcalculation problem. The reduction is proven, so the backcalculation problem is indeed, in general, NP-hard.

16 Situation in Which a Feasible Algorithm Is Possible

Let us describe an example of an optimization problem in which a feasible algorithm is possible.

Optimal backcalculation: towards the formulation of the problem in precise terms. In general, there are many combinations of the values $a_i$ that satisfy the desired constraints. Which combination should we choose? In the design case, a natural requirement is to select the values $a_i$ which will be the easiest to implement. To describe this idea in precise terms, we must analyze how easy it is to implement different values of $a_i$.
In general, the value $a_i$ is the value for which $\overline{F}(a_i) = \frac{i}{n}$, i.e., for which we must guarantee that
$$\text{Prob}(a \le a_i) = F(a_i) \le \overline{F}(a_i) = \frac{i}{n}. \qquad (16.1)$$
In particular:
• once we select the value $a_0$, we must guarantee that the probability $\text{Prob}(a \le a_0)$ of the actual value a being below $a_0$ is 0: $\text{Prob}(a \le a_0) = 0$;
• once we select $a_1$, we must guarantee that the probability $\text{Prob}(a \le a_1)$ of the actual value a being below $a_1$ does not exceed 1/n: $\text{Prob}(a \le a_1) \le 1/n$;
• once we select $a_2$, we must guarantee that the probability $\text{Prob}(a \le a_2)$ of the actual value a being below $a_2$ does not exceed 2/n: $\text{Prob}(a \le a_2) \le 2/n$;
• etc.
When we motivated the need to take into account partial information about the probabilities, we mentioned that the most difficult task is to guarantee that the actual a never gets below a threshold. The corresponding restriction is related to the value $a_0$: we must guarantee that $a \ge a_0$. The smaller this value $a_0$, the weaker this constraint and thus, the easier it is to satisfy. Thus, it is reasonable to select the smallest possible value $a_0$.
Once this value is selected and the corresponding bound is guaranteed, we must guarantee that the inequalities $a \ge a_i$ hold with probabilities $\ge 1 - i/n$. The closer this probability is to 1, the more stringent the corresponding requirement, and thus, the more difficult the corresponding task. So, after we have selected $a_0$, the most difficult of the remaining tasks is to select the value $a_1$, the value for which the guaranteed probability of violating the restriction $a \ge a_1$ is the smallest (probability = 1/n). Thus, to make the design as easy to implement as possible, we should make this restriction the least difficult to implement – i.e., we should select the value $a_1$ as small as possible.
Once we have fixed $a_0$ and $a_1$, the most difficult of the remaining tasks is to guarantee that $a \ge a_2$ with probability $\ge 1 - (2/n)$. Thus, we must select the corresponding threshold $a_2$ to be as small as possible. As a result, we arrive at the following formulation of the backcalculation problem.

Formulation of the optimal backcalculation problem in precise mathematical terms. Out of all the tuples $a_0 \le \ldots \le a_n$ that satisfy the inequalities (8.1),
• we first select all the tuples for which the value $a_0$ is the smallest possible;
• out of the selected tuples, we select all the tuples for which the value $a_1$ is the smallest possible;
• then, out of the newly selected tuples, we select those for which the value $a_2$ is the smallest possible;
• etc.
In mathematical terms, we can say that a tuple $a = (a_0, \ldots, a_n)$ is better than a tuple $a' = (a'_0, \ldots, a'_n)$ if one of the following conditions holds:
• either $a_0 < a'_0$;
• or $a_0 = a'_0$ and $a_1 < a'_1$;
• or $a_0 = a'_0$, $a_1 = a'_1$, and $a_2 < a'_2$;
• ...
• or $a_0 = a'_0$, ..., $a_{i-1} = a'_{i-1}$, and $a_i < a'_i$;
• ...
• or $a_0 = a'_0$, ..., $a_{n-1} = a'_{n-1}$, and $a_n < a'_n$.
In computer science, this relation is known as a lexicographic (alphabetic) order, since this is exactly how words are placed in a dictionary or in a lexicon: a word $w = \ell_0 \ell_1 \ldots$ consisting of the letters $\ell_0, \ell_1, \ldots$ is placed before a word $w' = \ell'_0 \ell'_1 \ldots$ consisting of the letters $\ell'_0, \ell'_1, \ldots$ if one of the following conditions holds:
• either the letter $\ell_0$ precedes the letter $\ell'_0$ (e.g., "apple" goes before "zebra");
• or $\ell_0 = \ell'_0$ and the next letter $\ell_1$ precedes the corresponding letter $\ell'_1$ (e.g., "abuse" goes before "alpha");
• ...
• or $\ell_0 = \ell'_0$, ..., $\ell_{i-1} = \ell'_{i-1}$, and the next letter $\ell_i$ precedes the corresponding letter $\ell'_i$;
• ...
In these terms, we must select the tuple which is the smallest in the lexicographic order.

Towards a solution to the optimal backcalculation problem. To find the optimal solution, let us start with the value $a_0$. For i = 0, the condition (8.1) becomes $\tilde{c}_0 \le a_0 + b_0$, i.e., equivalently, $a_0 \ge \tilde{c}_0 - b_0$. The smallest real number that satisfies this inequality is the value $a_0 = \tilde{c}_0 - b_0$. Thus, we should take
$$a_0 = \tilde{c}_0 - b_0. \qquad (16.2)$$
Let us now assume that we have already selected the values $a_0, \ldots, a_{i-1}$, and that we are now selecting the value $a_i$. The i-th condition (8.1) has the form
$$\tilde{c}_i \le \max_j \left(b_{i-j} + a_j\right) = \max\left[\max_{j \le i-1} \left(b_{i-j} + a_j\right),\; b_0 + a_i\right]. \qquad (16.3)$$
If
$$\tilde{c}_i \le \max_{j \le i-1} \left(b_{i-j} + a_j\right), \qquad (16.4)$$
then the condition (16.3) is already satisfied. In this case, the only restriction on $a_i$ is that $a_i \ge a_{i-1}$; thus, the smallest possible value of $a_i$ is $a_i = a_{i-1}$.
If the inequality (16.4) is not satisfied, then to satisfy (16.3), we must satisfy the inequality $\tilde{c}_i \le b_0 + a_i$, i.e., equivalently, $a_i \ge \tilde{c}_i - b_0$. Thus, the smallest possible value here is $a_i = \tilde{c}_i - b_0$.
So, we arrive at the following algorithm:

Algorithm for optimal backcalculation. ([1, 3, 4, 7]) We know:
• the interval quantiles $[\underline{b}_i, \overline{b}_i]$ for the environmental variable b; and
• the interval quantiles $[\underline{\tilde{c}}_i, \overline{\tilde{c}}_i]$ for c.
We want to find the lexicographically optimal interval quantiles $[\underline{a}_i, \overline{a}_i]$ for a for which c satisfies the given constraints. In this algorithm, we compute the values $a_0, a_1, \ldots$ one by one.
• First, we compute $a_0 = \tilde{c}_0 - b_0$.
• Once the values $a_0, \ldots, a_{i-1}$ are computed, we check whether
$$\tilde{c}_i \le \max_{j \le i-1} \left(b_{i-j} + a_j\right). \qquad (16.5)$$
If this inequality is satisfied, we take $a_i = a_{i-1}$; otherwise, we take $a_i = \tilde{c}_i - b_0$.
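In code, the algorithm for the lower bounds looks as follows (a short Python sketch; the function name and the representation of the bounds as plain lists are mine):

def optimal_lower_bounds(b_lo, c_tilde_lo):
    """Lexicographically smallest a_0 <= ... <= a_n satisfying (8.1).

    b_lo[i]:       lower quantile bounds b_i for the environmental variable b;
    c_tilde_lo[i]: required lower quantile bounds c~_i for c = a + b.
    """
    n = len(b_lo) - 1
    a = [c_tilde_lo[0] - b_lo[0]]                     # a_0 = c~_0 - b_0
    for i in range(1, n + 1):
        if c_tilde_lo[i] <= max(b_lo[i - j] + a[j] for j in range(i)):
            a.append(a[i - 1])                        # condition (16.5) holds
        else:
            a.append(c_tilde_lo[i] - b_lo[0])         # a_i = c~_i - b_0
    return a

print(optimal_lower_bounds([0.0, 1.0], [0.0, 2.0]))   # [0.0, 2.0]

The printed tuple (0, 2) is exactly the result obtained in the illustration that follows.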
Comment. In the derivation of the algorithm, we, in fact, proved that the result a of applying this algorithm always satisfies the inequalities (8.1): indeed, we have selected each value $a_i$ in such a way that the i-th inequality (8.1) is satisfied. We have also shown that no tuple satisfying (8.1) can be (lexicographically) better than the result a of using this algorithm – and thus, this result a is indeed optimal.

Algorithm illustrated on the above simple example. In the above example when $b_0 = \tilde{c}_0 = 0$, $b_1 = 1$, and $\tilde{c}_1 = 2$, we do the following:
• First, we compute $a_0 = \tilde{c}_0 - b_0 = 0 - 0 = 0$.
• Next, we check whether
$$\tilde{c}_1 \le \max_{j \le 0} \left(b_{1-j} + a_j\right) = b_1 + a_0. \qquad (16.6)$$
Here, we have $2 = \tilde{c}_1 \not\le b_1 + a_0 = 1 + 0 = 1$, so we take $a_1 = \tilde{c}_1 - b_0 = 2 - 0 = 2$.
One can easily check that the resulting tuple $(a_0, a_1) = (0, 2)$ is indeed lexicographically optimal:
• it has the smallest possible value of $a_0 = 0$, and
• out of all the tuples with this value of $a_0$, it has the smallest possible value of $a_1$.
[Figure: the union region in the $(a_0, a_1)$-plane from the example of Section 9, with the optimal tuple $(a_0, a_1) = (0, 2)$ marked; the values marked on the axes are 1, 1, and 2]

Acknowledgments

This work was supported in part by NSF grant HRD-0734825 and by Grant 1 T36 GM078000-01 from the National Institutes of Health.

Received: June 2009. Revised: November 2009.

References

[1] S. Ferson, Using approximate deconvolution to estimate cleanup targets in probabilistic risk analyses, In: P. T. Kostecki, E. J. Calabrese, and M. Bonazountas (eds.), Hydrocarbon Contaminated Soils, Amherst Scientific Publishers, Amherst, Massachusetts, 1995, pp. 245–254.
[2] S. Ferson, RAMAS Risk Calc 4.0, CRC Press, Boca Raton, Florida, 2002.
[3] S. Ferson, V. Kreinovich, and W. T. Tucker, Untangling equations involving uncertainty: deconvolutions, updates, and backcalculations, Proceedings of the NSF Workshop on Reliable Engineering Computing, Savannah, Georgia, September 15–17, 2004.
[4] S. Ferson and T. F. Long, Deconvolution can reduce uncertainty in risk analyses, In: M. Newman and C. Strojan (eds.), Risk Assessment: Measurement and Logic, Ann Arbor Press, Ann Arbor, Michigan, 1997.
[5] V. Kreinovich, Expert knowledge is needed for design under uncertainty: for p-boxes, backcalculation is, in general, NP-hard, Proceedings of the 2009 Annual Conference of the North American Fuzzy Information Processing Society NAFIPS'09, Cincinnati, Ohio, June 14–17, 2009.
[6] C. H. Papadimitriou, Computational Complexity, Addison Wesley, 1993.
[7] W. T. Tucker and S. Ferson, Setting cleanup targets in a probabilistic assessment, In: S. Mishra (ed.), Groundwater Quality Modeling and Management under Uncertainty, American Society of Civil Engineers, Reston, Virginia, 2003.
[8] R. C. Williamson and T. Downs, Probabilistic arithmetic I: Numerical methods for calculating convolutions and dependency bounds, International Journal of Approximate Reasoning, 1990, Vol. 4, pp. 89–158.