HUNGARIAN JOURNAL OF INDUSTRIAL CHEMISTRY VESZPRÉM Vol. 29, pp. 61-65 (2001)

TOWARD GLOBAL OPTIMIZATION OF A DIFFICULT OPTIMAL CONTROL PROBLEM

R. LUUS (Department of Chemical Engineering, University of Toronto, Toronto, ON M5S 3E5, CANADA)

Received: July 5, 2001

The optimal control problem introduced in 1980 by Yeo [1] exhibits numerous local optima which have not been reported in the literature. When 50 time stages of equal length are used, there are only two local optima. When 80 time stages are taken, there are over 50 local optima. It was found that there are at least 15 local optima which are better than the one reported in the literature as the global optimum. With the use of 100 time stages of equal length, the number of local optima increases substantially, to over 150. With 100 time stages of equal length, the global optimum solution found with iterative dynamic programming (IDP) gives I = 0.119044. This value of the performance index is 0.2% better than the value that had been accepted in the literature as the global optimum. IDP provides a good means of obtaining the global optimum solution for this problem, but numerous runs are necessary due to the presence of a large number of local optima.

Keywords: optimal singular control, iterative dynamic programming, IDP, global optimization, sensitivity

Introduction

Establishing the optimal control policy for nonlinear control problems is usually quite difficult, especially when the system exhibits low sensitivity of the performance index to a change in the control policy. When the number of time stages is small, the optimal control policy can usually be readily obtained either by iterative dynamic programming [2,3] or with the LJ optimization procedure [4]. For many problems a small number of flexible stage lengths, where the stage lengths are incorporated into the optimization, can be used to give very accurate results by enabling the optimal switching times to be accurately determined. However, there are some problems where the structure of the optimal control policy changes quite substantially when the number of time stages is increased. This is especially true of the system introduced by Yeo [1]. The goal of this paper is to illustrate the interesting features presented by this system and to show how the optimal control policy can be established in a reliable way.

Yeo's optimal control problem

One of the earliest examples tried with iterative dynamic programming [5,6] is the optimal control problem used by Yeo [1] to illustrate the use of quasilinearization to solve nonlinear singular control problems. The system is described by

    dx1/dt = x2                                                      (1)
    dx2/dt = -x3 u + 16 x5 - 8                                       (2)
    dx3/dt = u                                                       (3)
    dx4/dt = x1^2 + x2^2 + 0.0005 (x2 + 16 x5 - 8 - 0.1 x3 u^2)^2    (4)
    dx5/dt = 1                                                       (5)

with x(0) = [0, -1, -√5, 0, 0]^T and final time tf = 1. The constraints on the control are

    -4 <= u <= 10                                                    (6)

It is required to find the control policy that minimizes the performance index

    I = x4(tf)                                                       (7)

This system was used for global optimization by Rosen and Luus [7] with the introduction of a line search into sequential quadratic programming (SQP). With the use of P = 80 time stages of equal length they obtained I = 0.119258, and concluded by extrapolation that the minimum value of the performance index for the continuous case (infinite number of stages) should be I = 0.11923. This system was recently investigated by Esposito and Floudas [8], with P = 10 time stages, who found two local optima, as was found by Luus [3].
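For readers who wish to experiment with this system, the following Python sketch integrates the five state equations for a piecewise-constant control policy and returns the performance index I = x4(tf). It is a minimal illustration under stated assumptions, not the double-precision DVERK setup used for the results reported below: the fixed-step RK4 integrator, the substep count n_sub, and the names rhs and simulate are choices made here for simplicity.

    import numpy as np

    def rhs(x, u):
        """Right-hand sides of Eqs.(1)-(5); x4 accumulates the cost, x5 is time."""
        x1, x2, x3, x4, x5 = x
        return np.array([
            x2,                                                          # Eq.(1)
            -x3 * u + 16.0 * x5 - 8.0,                                   # Eq.(2)
            u,                                                           # Eq.(3)
            x1**2 + x2**2
                + 0.0005 * (x2 + 16.0 * x5 - 8.0 - 0.1 * x3 * u**2)**2,  # Eq.(4)
            1.0,                                                         # Eq.(5)
        ])

    def simulate(u_stages, n_sub=20):
        """Integrate over P equal stages with piecewise-constant control,
        using fixed-step RK4, and return the performance index I = x4(tf)."""
        h = 1.0 / (len(u_stages) * n_sub)                  # tf = 1
        x = np.array([0.0, -1.0, -np.sqrt(5.0), 0.0, 0.0])  # x(0)
        for u in u_stages:
            for _ in range(n_sub):
                k1 = rhs(x, u)
                k2 = rhs(x + 0.5 * h * k1, u)
                k3 = rhs(x + 0.5 * h * k2, u)
                k4 = rhs(x + h * k3, u)
                x = x + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        return x[3]

    # Constant control u = 7 over P = 50 stages, just to exercise the model.
    print(simulate(np.full(50, 7.0)))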
The differential equations are well behaved, but to obtain six-figure accuracy for the performance index it was necessary to use double precision and a tight local error tolerance of 10^-8 in the subroutine DVERK [9]. The optimal value of the performance index depends very much on the number of time stages that are used. The goal here is to illustrate the nature of the optimal solution when the number of time stages of equal length is increased.

Results and Discussion

In using iterative dynamic programming (IDP), we used the multi-pass approach as outlined by Luus [3, p. 106], with 20 iterations per pass. The region contraction factor was chosen as γ = 0.95 and the region restoration factor as η = 0.85. A single grid point was used at each time stage, and 3 candidates for control, chosen as the best value u* from the previous iteration and u* ± r, were used at each grid point. The initial value for control was chosen as 7.0 and the initial region size r = 4 was used. The computational results, carried out in double precision on a 400 MHz Pentium personal computer, after 50 passes for different numbers of time stages P, are given in Table 1.

Table 1 Effect of the number of time stages P on the value of the performance index after 50 passes, each consisting of 20 iterations

    Number of time stages P    Performance index I    CPU time, s
    10                         0.120114                 5.9
    20                         0.119438                 9.0
    40                         0.119292                28.9
    50                         0.119277                47.1
    60                         0.119272                61.1
    80                         0.119134               110.5
    100                        0.119085               180.7

It is noted that IDP yields the optimum substantially faster than the global optimization procedure of Esposito and Floudas [8]. Even with P = 100 the computation time is almost negligible. The use of P = 60 provides a very small refinement to the control policy obtained with P = 40 and P = 50, for which the optimal control policy is shown in Fig.1.

[Fig.1 Optimal control policy for P = 50, giving I = 0.119277]

[Fig.2 Optimal control policy for P = 80, giving I = 0.1191325]

[Fig.3 Local optima as a function of the location of the dip in the control profile for P = 80]

[Fig.4 Control profile for a local optimum for P = 80, giving I = 0.1191603]

Since there is a significant improvement in the performance index for P = 80 stages, the run was repeated using 40 iterations per pass and allowing 100 passes. The performance index was marginally decreased to I = 0.119133. As is shown in Fig.2, the use of P = 80 yields a totally different optimal control policy. By using a greater number of time stages, the shorter stage lengths allow sudden changes to be made in the control policy. These abrupt changes in the control policy improve the performance index quite significantly.

[Fig.5 Control profile for a local optimum for P = 100, giving I = 0.119250]

[Fig.6 Control profile for a local optimum for P = 100, giving I = 0.119092]

There is a dip in the control policy to -4 at stage 59, and in the time interval (0.8, 1.0) there is a dip to -4 at stage 75, with a significant improvement in the resulting performance index over the value of 0.119258, for which there are no dips to the minimum value of control. As is shown in Fig.3, the position of the latter dip affects the performance index. With P = 80 we can also get local optima with two dips in the time interval (0.8, 1.0), as is shown in Fig.4, where dips at stages 70 and 76 give a performance index I = 0.119160.
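As a rough illustration of the multi-pass procedure described above, the following sketch applies the same ingredients (a single grid point, candidates u* and u* ± r, region contraction by γ within a pass, region restoration by η between passes) to the simulate function given earlier. It is a simplified caricature for experimentation, not the implementation of [3]: the function name idp, the backward sweep order, and the exact region-restoration bookkeeping are assumptions made here, and a real IDP code avoids re-integrating the whole trajectory for every candidate.

    import numpy as np
    # assumes rhs() and simulate() from the earlier sketch

    def idp(P=50, n_pass=50, n_iter=20, gamma=0.95, eta=0.85,
            u_init=7.0, r_init=4.0, u_min=-4.0, u_max=10.0):
        """Single-grid-point, multi-pass IDP-style search (simplified sketch)."""
        u = np.full(P, u_init)        # current best piecewise-constant policy
        best = simulate(u)
        r_pass = r_init
        for _ in range(n_pass):
            r = r_pass
            for _ in range(n_iter):
                for k in range(P - 1, -1, -1):         # sweep stages backwards
                    for cand in (u[k] + r, u[k] - r):  # u[k] itself is the incumbent
                        cand = min(max(cand, u_min), u_max)  # clip to control bounds
                        trial = u.copy()
                        trial[k] = cand
                        I = simulate(trial)
                        if I < best:
                            best, u = I, trial
                r *= gamma            # contract the search region each iteration
            r_pass *= eta             # restored (reduced) region for the next pass
        return u, best

    # Small run for a quick test; the paper's runs use far more stages/passes.
    u_opt, I_opt = idp(P=10, n_pass=5, n_iter=10)
    print(I_opt)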
[Fig.7 Local optima as a function of the location of a single dip in the control profile with P = 100]

[Fig.8 Control profile for a local optimum with P = 100, giving I = 0.119045]

The global optimum for P = 80, with I = 0.1191325, occurs with a single dip at stage 75. With the use of P = 100 time stages of equal length the number of local optima is quite large. The local optimum control policy in which there are no dips, yielding I = 0.119250, is shown in Fig.5. The best local optimum with a single dip at stage 94, giving I = 0.119092, is shown in Fig.6. As is shown in Fig.7, the position of the single dip affects the value of the performance index. As can be seen, 14 local optima with a single dip give a better value for the performance index than the local optimum without any dips.

[Fig.9 Optimal control policy for P = 100, giving I = 0.1190437]

[Fig.10 Control policy for P = 150, giving I = 0.118977]

What makes this problem challenging is the existence of double dips in the time interval (0.8, 1.0), many of which give better values for the performance index. In fact, the best local optimum with dips at stages 88 and 95, giving I = 0.119045, is shown in Fig.8. Further improvement, to I = 0.1190437, is obtained with the control policy shown in Fig.9, where there are an additional 3 dips in the time interval (0.2, 0.8). There are over 150 local optima for this system with P = 100 stages of equal length. Here, with IDP, only the most likely candidates for the global optimum were scanned, to yield the global optimum of I = 0.119044.

[Fig.11 State trajectories corresponding to the control policy in Figure 10]

The question arises of what happens when the number of time stages is further increased. It is obvious that the number of local optima will increase. In fact, a run was performed with P = 150, yielding I = 0.118977, with the control policy shown in Fig.10. This result was obtained with the conditions used for Table 1, so it is expected to be close to the global optimum. It is interesting to note that now there are three dips in the time interval (0.8, 1.0). The corresponding trajectories for the first three state variables are given in Fig.11. There is no doubt that further noticeable improvement of the performance index can be obtained by using an even larger number of time stages, such as 200. Alternatively, one could use time stages of varying length to refine the results obtained here. Research in this area is continuing.
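The scans behind Fig.3 and Fig.7 can be reproduced in the same experimental spirit: starting from a converged no-dip policy, force the control to the lower bound at one stage, re-polish locally, and record the resulting performance index. The sketch below assumes the simulate function from the first sketch; u_base stands for a converged smooth local optimum (for example, the output of the idp sketch), and refine is only a crude stand-in for the IDP passes used in the paper.

    import numpy as np
    # assumes rhs() and simulate() from the first sketch

    def refine(u, n_iter=40, r=1.0, gamma=0.9, u_min=-4.0, u_max=10.0):
        """Crude local polish: IDP-style stage sweeps starting from a seed policy."""
        best = simulate(u)
        for _ in range(n_iter):
            for k in range(len(u)):
                for cand in (u[k] + r, u[k] - r):
                    c = min(max(cand, u_min), u_max)
                    trial = u.copy()
                    trial[k] = c
                    I = simulate(trial)
                    if I < best:
                        best, u = I, trial
            r *= gamma
        return u, best

    def scan_single_dip(u_base, stages):
        """For each candidate stage, force a dip to u = -4, re-polish, record I."""
        results = {}
        for k in stages:
            seed = u_base.copy()
            seed[k] = -4.0            # dip to the lower control bound
            _, results[k] = refine(seed)
        return results

    # e.g. scan dip positions 85..99 for a P = 100 base policy u_base:
    # I_by_stage = scan_single_dip(u_base, range(85, 100))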
Conclusions

By increasing the number of time stages used to solve the optimal control problem presented by Yeo [1], some very interesting features of this system were noted. With the use of 80 time stages, numerous local optima that had been observed for the first time were found. A large number of these local optima were better than the best local optimum that was obtained by SQP. The use of 100 time stages yielded an even larger number of local optima. By scanning numerous local optima, the global optimum with 100 stages of equal length was determined as I = 0.119044. This value is 0.2% better than the best value reported in the literature. With the use of 150 stages, the performance index was further improved to 0.118977, showing that further increases in the number of time stages, or the use of stages of varying length, should yield even better values for the performance index.

This example illustrates the difficulty of determining with absolute certainty the global optimum of some nonlinear systems. IDP proved to be an effective tool in the establishment of local optima. Through the interpretation of the results obtained for the local optima, the global optimum for 100 time stages was readily obtained. However, given the complexity of the problem, the results obtained here provide only a step toward obtaining the global optimum of the problem for which, conceptually, an infinite number of time stages could be used.

Acknowledgement

Financial support from the Natural Sciences and Engineering Research Council of Canada is gratefully appreciated.

SYMBOLS

I    performance index to be minimized
P    number of time stages of equal length
r    region size
t    time
tf   final time
u    control variable
xi   ith state variable
x    state vector

Greek letters

γ    region contraction factor by which the region size is reduced after every iteration
η    restoration factor by which the region size is restored after every pass

Abbreviations

IDP  iterative dynamic programming
SQP  sequential quadratic programming

REFERENCES

1. YEO B.P.: Int. J. Control, 1980, 32, 723-730.
2. LUUS R.: Control and Intelligent Systems, 1998, 26, 1-8.
3. LUUS R.: Iterative Dynamic Programming, Chapman & Hall/CRC, London, UK, 2000.
4. LUUS R. and HENNESSY D.: Ind. Eng. Chem. Res., 1999, 38, 1948-1955.
5. LUUS R.: Hung. J. Ind. Chem., 1989, 17, 523-543.
6. LUUS R.: Int. J. Control, 1990, 51, 995-1013.
7. ROSEN O. and LUUS R.: J. Optimization Theory & Applic., 1992, 73, 547-562.
8. ESPOSITO W.R. and FLOUDAS C.A.: J. Global Optimization, 2000, 17, 97-126.
9. HULL T.E., ENRIGHT W.D. and JACKSON K.R.: User Guide to DVERK - a Subroutine for Solving Nonstiff ODEs, Report 100, 1976, Department of Computer Science, University of Toronto, Canada.