AP06_1.vp


1 Introduction
The dynamic properties of a real plant are usually identi-

fied by making a model – choosing a model structure and
estimating the unknown parameters of the model using data
measured on the real plant [1]. The first goal of this paper is
to compare a set of parameter estimations of an ARX model
where each estimation is obtained by minimizing the p-norm
(1� p � 2). The measurement of the system output is consid-
ered to be damaged by a number of outliers.

Another problem is optimal control of dynamic systems.
Model predictive control (MPC) strategies are very popular
[2, 4]. Optimal predictive control of an ARX or state space
model is usually obtained by minimizing the quadratic cri-
terion. If a non-quadratic norm is used in the optimality
criterion, different results are obtained. For example, for
p � 1 dead beat control is obtained. Minimizing the l1 norm
using linear programming in MPC control has been consid-
ered by many authors (e.g. [5, 6, 7]). A connection between
linear programming and optimal control is shown for exam-
ple in [8, 9]. In this paper, optimal predictive control utilizing
p-norm minimization of the criterion is shown, and the results
are illustrated by simple examples.

The paper is organized as follows: The second section
shows identification of an ARX model using the p-norm. The
algorithm for minimizing the p-norm where 1< p < 2 is reca-
pitulated in section 3. This algorithm is known as Iteratively
Reweighted Least Squares (IRLS). In section 4, the predictive
control strategy for state space models is reviewed. In sections
5 and 6, examples of system identification and control in
p-norm are shown.

2 Identification of an ARX model
using p-norm
The ARX model of a dynamic system can be described by

the equation
y t a y t a y t n

b u t b u t n
n

n

( ) ( ) ( )
( ) ( )

� � � � � � �

� � � �
�

�

1 1

0 1

1 1
1

�

� � e t( )
(1)

where y(t) is the system output in time t and u(t) is the system
input, e(t) is an equation error and ai, bj are system parame-
ters. Note that only a single input – single output (SISO) sys-
tem is considered in this paper. The parameter estimation
problem from a given set of data

�
t y t y t u t u t� � �{ ( ), ( ), , ( ), ( ), }1 1� �

leads to minimization of error vector � �� � �e t e t T( ), ( ),1 � . We
are looking for the parameter vector

� �x � a a b b T1 2 0 1, , , , ,� �

that ensures the best approximation of the output vector
� �b � � � �y t y t y t m T( ), ( ), , ( )1 1� by a vector Ax, where

A �
� �

� � �
	

�
�
�

y t y t u t
y t y t u t
( ) ( ) ( )
( ) ( ) ( )

1 2
2 3 1

� �

� �

� � � � �

�


�
�
�

. (2)

Both vector b and matrix A are formed from the measured
data � t. The most usual solution of the parameter estimation
problem involves minimization of the quadratic norm of the
error vector, known as the Euclidean norm. In such a case, the
optimization problem is

min
x

Ax b� 2 . (3)

The solution is known as Least Squares (LS) [10]. In reality
the noise of the output measurement is also found in ele-
ments of data matrix A. The solution of such a problem leads
to Total Least Squares (TLS) [11]. If the input measurement is
noise free, the problem can be solved by Mixed LS and TLS
[11]. In some applications, we can use the more general
p-norm instead of the quadratic norm [10, 12]. The p-norm is
defined as

� �y
y

p
p

n
p p

i

y y p

y y

� � � � � �

�
�

1

1

1 2

1�

�

,

max , , .( )
(4)

Then the problem of parameter estimation is defined as
follows

x Ax b
x

* arg min� � p
p . (5)

For 1� p � 2 p-norm minimization can be done by iterative
solution of the LS problem. For p � 1 and p � � the solution
can be obtained by linear programming.

3 p-norm minimization
The general p-norm can be used in the optimization prob-

lem defined by (5). If the p-norm is restricted by 1� p � 2 the

©  Czech Technical University Publishing House http://ctn.cvut.cz/ap/ 33

Czech Technical University in Prague Acta Polytechnica Vol. 46  No. 1/2006

Identification and Predictive Control by
p-norm Minimization

J. Pekař, J. Štecha

Real time system parameter estimation from the set of input-output data is usually solved by minimization of quadratic norm errors of system
equations – known in the literature as least squares (LS) or its modification as total least squares (TLS) or mixed LS and TLS. It is known
that the utilization of the p-norm (1�p�2) instead of the quadratic norm suppresses the wrong measurements (outliers) in the data. This
property is shown for different norms, and it is shown that the influence of outliers is suppressed if p � 1. Also optimal predictive control
utilizing p-norm minimization of the criterion is developed, and the simulation results show the properties of such control.

Keywords: identification, predictive control, p-norm, ARX model, iteratively reweighted least squares, linear programming.


minimization problem is convex and the solution is unique.
This problem can be solved by an iterative algorithm known
as Iteratively Reweighted Least Squares (IRLS) [10]. Note that
the optimization problem with 1-norm or �-norm can be
solved by linear programming.

3.1 Iteratively reweighted least squares
Let us solve the approximation problem

� �min ( )
x

x Ax b� � � � �p
p p1 2. (6)

We assume, that all coordinates of the residuum
�( )x b Ax� � are nonzero. Then the function �(x) can be de-
fined as

�( ) ( ) ( ) ( )x x x x� �
�

�

�
� ��i p
i

m

i
p

i
i

m

1

2 2

1

� � . (7)

The previous problem is weighted Least Squares:

min ( ) ( ) , ( ) ( )
x

D b Ax D� � �
p�

� �
2

2

2

diag . (8)

Because of the dependency of the diagonal weighting ma-
trix D(�) on the unknown solution x, the problem must be
solved by an iterative algorithm:
1. x(0) is an initial solution, set the iteration counter k � 0.
2. calculate �(k) by

�
( ) ( )k k� �b Ax , D( ) ( )k k

p

�
�

�

�
�

�

�

�
�

�

diag �
2

2 . (9)

3. utilizing the weighted LS algorithm, �x( )k is obtained by
solving

� � �
�

x D A x
x

( ) ( ) ( )arg min ( )k k k� �
2
. (10)

4. the next iteration x(k+1) is obtained as

x x x( ) ( ) ( )k k k� � �1 � . (11)

5. if the convergence criterion is satisfied, then stop and
x(k+1) is the solution, else set k k� � 1 and go to 2.

3.2 Minimization of 1-norm by linear
programming

In order to minimize the norm p � 1 linear programming
(LP) can be used. The following problems are equivalent

min min{ : , }
x y

Ax b y Ax b y Ax b y� � � � � � �1 1
T . (12)

Introducing the augmented vector z
x
y

�
	

�
�


�, the standard

form of the LP problem is obtained

min{ : },
z

c z Az bT � (13)

cT � [ ]0 0 1 1, , , , ,� � ,

A
A
A

b
b
b

�
�

� �
	

�

�


� � �

	
�

�


�

I
I

, .

The only drawback of such computation is that the LP
problem can have more than one solution.

3.3 Minimization of �-norm by linear
programming

In order to minimize the norm p � �, linear programming
(LP) can also be used. Similarly as for 1-norm, the following
two problems are equivalent
min min{ : , }

x
Ax b Ax b Ax b� � � � � � �� y

y y y1 1 , (14)

where 1 � [ , , , ]1 1 1� T is the unit vector. Introducing the aug-

mented vector z
x
y

�
	

�
�


�, the standard form of the LP problem is

obtained:
min{ : },

z
c z Az bT � (15)

where

c �

	
�
�
�
�

�


�
�
�
�

0

0
1

�
, A

A

A

�

�

�

�

�

�

	
�
�
�
�
�
�
�

�


�
�
�
�
�
�
�

1

1
1

1

�

�

, b
b
b

�
�
	

�

�


� .

4 Predictive control strategy
Suppose the state and output development model is given

in the form
x Ax Bu v

y CX Du e
( ) ( ) ( ) ( ),

( ) ( ) ( ) ( ).
t t t t

t t t t
� � � �

� � �

1 (16)

where x(t), y(t) and u(t) are the state, output and input of the
system and A, B, C and D are matrices of appropriate di-
mensions, v(t) and e(t) are state and measurement noises
respectively with zero means and covariances Rv and Re, inde-
pendent of the state and input of the process. For the optimal
predictive strategy the quality criterion is usually given in the
quadratic form

� � � �J y t w t r u t
t

K
� � �

�
 
!

"!

#
$
!

%!�
�� ( ) ( ) ( ) ( )2 2

1

1x , (17)

where K is the horizon of the predictive control strategy, w(t)
is the reference and r is the weighting coefficient. Let us
introduce the augmented vectors

� �Y � y y K T( ), , ( )1 � , � �W � w w K T( ), , ( )1 � ,
� �U � u u K T( ), , ( )1 � , � �V � v v K T( ), , ( )1 � ,
� �E � e e K T( ), , ( )1 � and augmented matrices,

� �P C A C A C� �T T T T K T
T

, , , ( )� 1 ,

S

D
CB D

CA CA D

�

	
�
�
�
�

�


�
�
�
�� �

0 0
0

�

�

� � � �

�
K K2 3

, Q
C

CA CA

�

	
�
�
�
�

�


�
�
�
�� �

0 0 0
0 0

0

�

�

� � � �

�
K K2 3

.

Then the criterion with the quadratic cost function can be
written in the form

� �J rT T� � � �� ( ) ( ) ( )Y W Y W U U x 1 , (18)
where the augmented vector Y � Px(1) � SU � QV � E. The cri-
terion can be minimized by completing the squares if the

34 ©  Czech Technical University Publishing House http://ctn.cvut.cz/ap/

Acta Polytechnica Vol. 46  No. 1/2006 Czech Technical University in Prague


constraints are not considered. After simple manipulation the
optimal control results

� �U S S I S Px W* ( ) ( )� � � ��T Tr 1 1 (19)

The state and output noises only enlarge the optimal
value of the criterion. Often only differences of input signal
are considered. The criterion then has the form

� � � �J y t w t r u t
t

K
� � �

�
 
!

"!

#
$
!

%!�
�� ( ) ( ) ( ) ( )2 2

1

1� x , (20)

where &u(t) � u(t) � u(t � 1). In such a case, the optimal predic-
tive control strategy is

� �� �U S S R R S Px W R U* ( ) ( ) ( )� � � � ��T T T Tr r1 1 0 , (21)
where � �U( ) ( ), , ,0 0 0 0� u T� and matrix R equals

R �
�

�

	
�
�
�
�
�
�

�


�
�
�
�
�
�

1 0 0 0
1 1 0 0

0 0 1 0
0 0 1 1

�

�

� �

�

.

This way achives integral action of control. Also for predic-
tive control a different norm minimization of the quality
criterion can be used

J y i w i r u i
p
p

p
p

k

k
� � �

� �
� ( ) ( ) ( )�

� 1

(22)

According to weighting coefficient r, the quadratic norm
suppresses large control errors and large input signals. If the
norm p � 2 is used, small members in the criterion have
greater influence, and if p � 1 the control law approaches to
dead beat control.

5 Example I – suppression of outliers
by p-norm minimization
This section shows an example of system parameter iden-

tification. The experiment clearly shows the influence of out-
liers on the results of identification for utilization of different
norms. Consider a discrete-time system described by the
transfer function

Y z
z z

z z
( )

. . .
. .

�
� �

� �

0 002870 0 009882 0 002126
2 444 2 008

2

3 2 z
U z

z z z
E z

�
�

� � �

0 5488
1

2 444 2 008 0 54883 2

.
( )

. . .
( ).

(23)

Using this model, the data for our experiments was gen-
erated. Fig. 1 shows the system input and output time
trajectories. 1000 samples of input-output data are corrupted
by 11 outliers (completely wrong measurements).

Fig. 2 and Fig. 3 show the step responses of the identified
models together with the step responses of the nominal
model, where the each identification was done by minimizing
a different norm.

It is not convenient to use the p � � norm for system pa-
rameter identification from data corrupted by outliers – see
Fig. 4.

The Euclidean norm of the parameter estimation error
can be used to demonstrate the accuracy of the estimation.
The norm is

J � �x x* 2 , (24)
where x is the estimation and x* is the vector of the true pa-
rameters of the nominal system. Fig. 5 shows the dependency
of the norm of the estimation error on the p-norm of the crite-
rion (i.e., on p).

©  Czech Technical University Publishing House http://ctn.cvut.cz/ap/ 35

Czech Technical University in Prague Acta Polytechnica Vol. 46  No. 1/2006

Fig. 1: Data for experiments

0 20 40 60
0

0.2

0.4

0.6

0.8

1

1.2

Step Response

Time (sec)

A
m

p
lit

u
d

e

true model
identification

0 20 40 60
0

0.2

0.4

0.6

0.8

1

1.2

Time (sec)

A
m

p
lit

u
d

e

true model
identification

Step Response

Fig. 2: Results of identification – step responses (left p � 2, right p � 1.7)


36 ©  Czech Technical University Publishing House http://ctn.cvut.cz/ap/

Acta Polytechnica Vol. 46  No. 1/2006 Czech Technical University in Prague

0 20 40 60
0

0.2

0.4

0.6

0.8

1

1.2

Step Response

Time (sec)

A
m

p
lit

u
d

e

true model
identification

Step Response

Time (sec)
0 20 40 60

0

0.2

0.4

0.6

0.8

1

1.2

true model
identification

A
m

p
lit

u
d

e

Fig. 3: Results of identification – step responses (left p � 1.3, right p � 1)

0 10 20 30 40 50 60
�0.2

0

0.2

0.4

0.6

0.8

1

1.2

Step Response

Time (sec)

A
m

p
lit

u
d

e

true model
identification

Fig. 4: Results of identification - step response for p � �

Fig. 5: Dependency of the norm of the estimation error on the p-norm (i.e., on p)


6 Example II – predictive control by
p-norm minimization
Consider the second order continuous-time system de-

scribed by a transfer function

G s
s s

( )
. .

�
� �

1
0 7 0 932

(25)

that is sampled by a sampling period T ss � 01. . Note that
the discrete-time system model was transformed to the state

space. In the first experiment, three different MPC control-
lers were designed:

� The first MPC minimizes the 1-norm. Such a minimization
problem can be converted to linear programming (see
section 3).

� The second MPC minimizes the p-norm where p is consid-
ered to be 1 � p � 2. In this case, the minimization problem
can be solved by the IRLS algorithm (see section 3).

� The third case is standard MPC minimizing the quadratic
norm of the error vector. This optimization can be solved

©  Czech Technical University Publishing House http://ctn.cvut.cz/ap/ 37

Czech Technical University in Prague Acta Polytechnica Vol. 46  No. 1/2006

0 2 4 6 8 10
�20

0

20

40

60

80

time [s]

sy
st

e
m

o
u

tp
u

t
[-

]

reference
system output

0 2 4 6 8 10

0

20

40

60

80

time [s]

sy
st

e
m

in
p

u
t
[-

]

System input

�20

0 2 4 6 8 10
�20

0

20

40

60

80

time [s]

sy
st

e
m

o
u

tp
u

t
[-

]

reference
system output

0 2 4 6 8 10

0

20

40

60

80

time [s]

sy
st

e
m

in
p

u
t
[-

]

System input

�20

Fig. 7: Simulation results for 1.5 (left) and 2 (right) norm (with weighting coefficient r � 1)

0 2 4 6 8 10
�20

0

20

40

60

80

time [s]

sy
st

e
m

o
u

tp
u

t
[-

]

reference
system output

0 2 4 6 8 10

0

20

40

60

80

time [s]

sy
st

e
m

in
p

u
t
[-

]

System input

�20

0 2 4 6 8 10
�20

0

20

40

60

80

time [s]

sy
st

e
m

o
u

tp
u

t
[-

]
0 2 4 6 8 10

0

20

40

60

80

time [s]
sy

st
e

m
in

p
u

t
[-

]

System input

reference

system output

�20

Fig. 6: Simulation results for 1 (left) and 1.1 (right) norm (with weighting coefficient r � 1)


as a least squares problem or, with constraints, as a qua-
dratic programme.

The only parameter that is different for each MPC is the
kind of p-norm. All other parameters are not changed (pre-
diction horizon N � 4s and weighting coefficient r � 1). Fig. 6
and Fig. 7 show the simulation results for the first experiment.
The performances of reference tracking and system input are
shown for p � 1, 1.1, 1.5, 2. Minimizing the 1-norm of the
quality criterion achieves dead beat control.

The second experiment shows the influence of the weight-
ing coefficient r in the criterion (22) on the time length of
dead beat control (for 1 norm). The simulation results for
weighting coefficient r � 1 are shown in Fig. 6 (left), for r � 0.1
and r � 10 in the Fig. 8. The relatively large penalty r does not
allow big changes in differences of control variable &u and
therefore the time of dead beat control is longer.

7 Conclusions
The purpose of this paper is to show the influence of dif-

ferent p-norm selection on ARX model identification and
control. When the measurements of the system output were
damaged by outliers, it is shown that the Euclidean norm
gives worse results than the p-norm for 1 � p � 2. The best
results were achieved with the p-norm for p � 1.

Predictive control realized by the p-norm of quality crite-
rion minimization shows interesting results from LQ control
(Linear system and Quadratic criterion) to dead beat control.

8 Acknowledgment
This work was partly supported by grants 102/05/0903,

405/05/2173 and 102/05/2075 of the Grant Agency of the
Czech Republic.

References
A shorter version of this paper was presented in [3].
[1] Ljung, L.: System Identification: Theory for the User. Pren-

tice Hall, Englewood Cliffs, NJ, 1987.
[2] Aström, K. J., Wittenmark, B.: Computer Controlled Sys-

tems: Theory and design. Prentice Hall, Inc., Upper Saddle
River, NJ, 1997.

[3] Pekař, J., Štecha, J., Havlena, V.: “Identification and
Predictive Control of ARX Model by p-norm Minimiz-
ation.” Proceedings of the 23rd IASTED International
Conference on Modelling, Identification, and Control.
Anaheim 2004. ISBN 0-88986-387-3.

[4] Findeisen, R., Imsland, L., Allgöwer, F., Foss, B. A.:
“State and Output Feedback Nonlinear Model Predic-
tive Control: An Overview.” European Journal of Control,
Vol. 9 (2003), p. 190–206.

[5] Rao, C. V., Rawlings, J. B.: “Linear Programming and
Model Predictive Control.” Journal of Process Control,
Vol. 10 (2000), p. 283–289.

[6] Allwright, J. C., Papavasiliou, G. C.: “On Linear Pro-
gramming and Robust Model Predictive Control Using
Impulse Response.” Sys. Cont. Let., Vol. 18 (1992),
p. 159–164.

[7] Genceli, H., Nikolau, M.: “Robust Stability Analysis of
Constrained -Norm Model Predictive Control. AIChE J.,
Vol. 39 (1993), No. 12, p. 1954–1965.

[8] Change, T. S., Seborg, D. E.: A Linear Programming
Approach for Multivariable Feedback Control with
Inequality Constraints. Int. J. Control, Vol. 37 (1983),
p. 583–597.

[9] Zadeh, L. A., Whalen, J. H.: “On Optimal Control and
Linear Programming.” IRE Trans. Auto, Cont. Vol. 7
(1962), p. 45– 46.

38 ©  Czech Technical University Publishing House http://ctn.cvut.cz/ap/

Acta Polytechnica Vol. 46  No. 1/2006 Czech Technical University in Prague

0 2 4 6 8 10
�20

0

20

40

60

80

time [s]

sy
st

e
m

o
u

tp
u

t
[-

]

reference
system output

0 2 4 6 8 10
�100

0

100

200

time [s]

sy
st

e
m

in
p

u
t
[-

]

System input

0 2 4 6 8 10
-20

0

20

40

60

80

time [s]

sy
st

e
m

o
u

tp
u

t
[-

]

reference
system output

0 2 4 6 8 10
-20

0

20

40

60

80

time [s]
sy

st
e

m
in

p
u

t
[-

]

System input

Fig. 8: The simulation results for 1 norm for weighting coefficient r � 0.1 (left) and r � 10 (right)


[10] Björck, .: Numerical Methods for Least Squares Problems.
Siam, Philadelphia, 1996.

[11] Van Huffel, S., Vandewalle, J.: The Total Least Squares
Problem. Computational Aspects and Analysis. Siam, Phila-
delphia, 1991.

[12] Boyd, S., Vandenberghe, L.: “Introduction to Convex
Optimization with Engineering Applications.” Lecture
notes, Stanford University, 2002.
Available at: isl.stanford.edu/pub/boyd/.

Ing. Jaroslav Pekař, Ph.D.
phone: +420 266 052 325
fax: +420 286 890 555
e-mail: jaroslav.pekar@honeywell.com

Honeywell Prague Laboratory
Pod vodárenskou věží 4
182 08 Prague 8, Czech Republic

Prof. Ing. Jan Štecha, CSc.
phone: +420 224 357 238
fax: +420 224 918 646
e-mail: stecha@control.felk.cvut.cz

Department of Control Engineering

Czech Technical University in Prague
Karlovo nám. 13
12 135 Prague 2, Czech Republic

©  Czech Technical University Publishing House http://ctn.cvut.cz/ap/ 39

Czech Technical University in Prague Acta Polytechnica Vol. 46  No. 1/2006