HUNGARIAN JOURNAL OF INDUSTRIAL CHEMISTRY VESZPRÉM Vol. 29, pp. 67-70 (2001)

LEAST SQUARES AND THE MATRIX RICCATI EQUATION

C. STOREY

(The Institute of Mathematics and Simulation Sciences, Faculty of Computing Sciences and Engineering, De Montfort University, The Gateway, Leicester, LE1 9BH, UK)

Received: July 26, 2001

There are a number of methods for least squares and total least squares solution of linear equations. One such method involves solving an appropriate algebraic matrix Riccati equation. This approach is investigated here using an algorithm based on Newton's method with an infinite series solution of the resulting linear matrix equations. Some numerical examples, from the recent literature, are used to illustrate the techniques.

Keywords: Matrix Riccati equation; Newton's method; infinite series solution; overdetermined linear systems; least squares and total least squares.

Introduction

When a system of linear equations

$$LX = M \qquad (1)$$

with L and M, $m \times n$ and $m \times p$ matrices respectively, does not have a unique solution X, it is usual to seek a solution that is "best" in some sense by imposing appropriate constraints. This paper is concerned with computing the least squares (LS) solution, which imposes the constraint that X should minimize the norm $\|LX - M\|_F^2$, and the total least squares (TLS) solution, which requires X to minimize $\|L - V\|_F^2 + \|M - W\|_F^2$ subject to the constraint

$$VX = W. \qquad (2)$$

The main difference between LS and TLS is that the former assumes the matrix L is exact whereas the latter allows for errors in both L and M. Perhaps the most important approach to solving these problems is singular value decomposition (SVD), the use of which is fully discussed in [1] and [2] for LS and TLS respectively. Another possible approach is the following. By using Lagrange multipliers, in the usual way, it can be shown [3], [4] that X satisfies the algebraic matrix Riccati equation (AMRE)

$$XM^TLX + L^TLX - XM^TM - L^TM = 0. \qquad (3)$$

When Newton's method is applied to the general AMRE

$$XPX + QX + XF + G = 0 \qquad (4)$$

a linear matrix equation is obtained whose solutions $X_i$ converge, under appropriate conditions, to the solution X of Eqn. (4). There are many possible ways of solving this linear matrix equation (see, e.g., [5]) and one of these, which involves infinite series, is studied in [6]. Methods for offsetting the difficulties involved if a poor starting approximation is made to the solution, and for accelerating convergence, are also investigated in [6]. The solution techniques are implemented by two algorithms, A1 and A2, depending on whether X is unsymmetric or symmetric. A simplified form of A1 is given in the Appendix for convenience.

The aim of the present paper is to illustrate the use of A1 in solving LS and TLS problems by means of a number of numerical examples from the literature.

Numerical Results

Example 1

The first example is taken from [7] and is concerned with modelling the biological activity of fifteen different compounds with respect to a set of eight variables which describe the various morphological and physiochemical properties of the compounds. The data used is given in Table 1 of [7]. Setting the initial approximation $X_0 = 0$ in A1 gave the same least squares solution as in [7] (when rounded),

$$X_{LS} = [0.636, 0.080, 0.095, -0.308, 0.169, 0.241, 0.278, 0.238]^T$$

and, of course, the same value, 0.61, for the residual sum of squares SS. $X_{LS}$ also agrees with $L^+M$, where $L^+$ is the pseudo-inverse of L.
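Since Table 1 of [7] is not reproduced here, the following sketch (Python with NumPy, an illustration added to this edition rather than part of the original computation) checks the identity $X_{LS} = L^+M$ quoted above on randomly generated stand-in matrices; only the 15 x 8 shape is taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
L = rng.standard_normal((15, 8))      # stand-in for the 15 x 8 data matrix of [7]
M = rng.standard_normal((15, 1))      # stand-in for the biological activity vector

X_lstsq = np.linalg.lstsq(L, M, rcond=None)[0]   # least squares solution of LX = M
X_pinv = np.linalg.pinv(L) @ M                   # X_LS = L+ M, as quoted above

print(np.allclose(X_lstsq, X_pinv))   # True: the two LS computations agree
```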
The smallest singular value of L is 0.337365 and that of [L; M] is 0.335713, so that the conditions for a unique TLS solution, [2], are (just) satisfied. If A1 is now continued from the solution $X_{LS}$ given above it converges to

$$X = [0.716, 0.259, 0.124, -0.336, 0.186, 0.531, -0.435, 0.337]^T$$

with $e = 0.16$ before accuracy begins to deteriorate. (Using a line search shows a slight improvement in accuracy before deterioration.) When this model is applied to the whole of the data the resulting sum of squares is 41.893.

The smallest singular value of [L; M], for this example, is as given above, and the SVD formula

$$X = [L^TL - (0.335713)^2 I]^{-1} L^TM$$

gives

$$X = [-4.28, 1.20, -2.21, -1.29, -2.36, 11.00, -14.60, 2.10]^T$$

which has a residual error $e = 6 \times 10^{-4}$ and which is much better than the previous result from A1. If now, however, the above X is used as the starting approximation in A1 the result is refined to give a better X with a residual error less than $10^{-6}$.

Algorithm A1 was also used to compute the sums of squares, denoted by "PRESS" in [7], for both LS and TLS; these measure the predictive properties of the models used. The same groupings of the data were used as in [7] and Table 1 shows the results obtained.

Table 1 Predictive Sums of Squares for Example 1

            LS       TLS
Group 1     1.172     1.319
Group 2     1.183     1.540
Group 3     0.804    34.832
Total       3.159    37.691

The total for LS agrees with that given in [7]. The total for TLS is inflated by the very large value from Group 3, which presumably accounts for the comment, "when full rank is approached, it behaved differently," made regarding TLS in [7], and for the use of rank 3 TLS rather than full rank. (Before making the calculations all nine sets of data were scaled to have zero mean and unit variance.)

Example 2

This example is taken from [1] and involves finding a linear fit to data given in Table 26.1 of that reference. In [1] five different "candidate" solutions are obtained and one of them,

$$X = [-2.486, -0.529, -0.194, 1.616, 3.455]^T, \qquad (5)$$

is proposed as the best LS solution according to a number of statistical criteria. Now if A1 is used to find a LS and then a TLS solution the result is

$$X = [-73.3, 98.4, -78.0, 90.7, -78.2]^T \qquad (6)$$

which is very different from the solution given in Eqn. (5) despite the fact that their residual norms are quite close. It is also interesting that the norm of (5) is only about 4.6 compared with 188 for that of (6), and that smallness of norm is one of the criteria mentioned above for preferring one solution to another. The TLS solution would not have found much favour from this particular viewpoint.

Example 3

This example, with

$$L = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \quad \text{and} \quad M = \begin{pmatrix} 1 \\ 1 \end{pmatrix},$$

is of a type often used to show that the TLS problem may not have a solution unless extra constraints are imposed [2]. Applying A1, using $X_0 = 0$, gives the LS solution $[1, 0]^T$; continuing until convergence now gives the TLS solution $[1.61803, 0]^T$.

Another much used method for solving problems of the kind considered in this paper involves some form of direct minimization (see, e.g., [8]), as in the following treatment. Let

$$\begin{pmatrix} x_1 & x_2 \\ x_3 & x_4 \end{pmatrix}, \quad \begin{pmatrix} x_5 \\ x_6 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} x_7 \\ x_8 \end{pmatrix}$$

be estimates of L, X and M respectively; form

$$f_1 = (1 - x_1)^2 + x_2^2 + x_3^2 + x_4^2 + (1 - x_1x_5 - x_2x_6)^2 + (1 - x_3x_5 - x_4x_6)^2 \qquad (7)$$

or alternatively, with more variables involved,

$$f_2 = (1 - x_1)^2 + x_2^2 + x_3^2 + x_4^2 + (1 - x_7)^2 + (1 - x_8)^2 + (x_7 - x_1x_5 - x_2x_6)^2 + (x_8 - x_3x_5 - x_4x_6)^2 \qquad (8)$$

and minimize $f_1$ or $f_2$ with respect to all of the variables involved in each case.
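This direct-minimization route is easy to reproduce numerically. The sketch below (Python with SciPy, added here as an illustration; the text itself uses MATHEMATICA's FindMinimum, discussed next) minimizes $f_1$ of Eqn. (7) for Example 3, starting from the exact-L least squares point $(x_5, x_6) = (1, 0)$; the choice of starting point and optimizer are assumptions of this sketch.

```python
import numpy as np
from scipy.optimize import minimize

# f1 from Eqn. (7): x1..x4 estimate L, (x5, x6) estimates X, for Example 3
def f1(x):
    x1, x2, x3, x4, x5, x6 = x
    return ((1 - x1)**2 + x2**2 + x3**2 + x4**2
            + (1 - x1*x5 - x2*x6)**2
            + (1 - x3*x5 - x4*x6)**2)

# start at the LS point for exact L: (x1..x4) = (1,0,0,0), (x5,x6) = (1,0)
res = minimize(f1, [1, 0, 0, 0, 1, 0], method="BFGS")
print(res.x[4:])   # should converge to approx (1.61803, 0), the TLS vector from A1
```

Note that the gradient of $f_1$ vanishes identically in $x_2$, $x_4$ and $x_6$ on the subspace where all three are zero, so the iteration stays there and effectively solves a three-variable problem in $x_1$, $x_3$ and $x_5$.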
So, for example, using FindMinimum from MATHEMATICA on $f_1$ the solution

$$(x_1, \ldots, x_6) = (0.723607,\ 0,\ 0.447214,\ 0,\ 1.61803,\ 0) \qquad (9)$$

is obtained, and using $f_2$ a somewhat different solution is obtained. Notice that the solution given in Eqn. (9) is the same as that found by A1.

The similar, but simpler, example with $L = [a, b]^T$ and $M = [1, 1]^T$ can be analysed algebraically to quantify the difference between using $f_1$ and $f_2$. Thus, for example, taking $L = [1, -0.95]^T$ gives solutions $X = 2.37165$ and $X \approx 0.0553169$ using $f_1$ and $f_2$ respectively. (Again, using $f_1$ gives the same solution as A1 and as the SVD formula given in Example 1.)

Example 4

This is similar to Example 3 but rather more sophisticated; it is taken from [2], where it is analysed using SVD. Here L is a $3 \times 2$ matrix and M a 3-vector whose entries, involving the surds $\sqrt{6}$, $\sqrt{2}/4$, $\sqrt{2}/2$ and $\sqrt{2}$, are as given in [2]. Using $X_0 = 0$ as the starting approximation in A1 gives the LS solution

$$X_{LS} = [0.252633, 0.252633]^T$$

and then proceeding to convergence gives the TLS solution

$$X_{TLS} = [0.408248, 0.408248]^T.$$

Notice that this is the same as the solution $[1/\sqrt{6}, 1/\sqrt{6}]^T$ given in [2]. In the same manner as for the previous example, setting up the appropriate $f_1$ and using FindMinimum gives $X = [0.408245, 0.408245]^T$, and with the appropriate $f_2$ the solution $X = [0.34186, 0.34186]^T$ is obtained.

Example 5

The final example is concerned with parameter fitting and is taken from [8]. Here

$$L^T = \begin{pmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 0 & 0.9 & 1.8 & 2.6 & 3.3 & 4.4 & 5.2 & 6.1 & 6.5 & 7.4 \end{pmatrix}$$

$$M = [5.9, 5.4, 4.4, 4.6, 3.5, 3.7, 2.8, 2.8, 2.4, 1.5]^T.$$

In this example the first column of L is fixed, and so one method of proceeding is as follows [2]: [L; M] is multiplied by a suitable Householder matrix to give the new matrix

$$\begin{pmatrix}
3.16228 & 12.0799 & 11.7004 \\
0 & -4.68666 & 2.71745 \\
0 & -3.78666 & 1.71745 \\
0 & -2.98666 & 1.91745 \\
0 & -2.28666 & 0.817446 \\
0 & -1.18666 & 1.01745 \\
0 & -0.386656 & 0.117446 \\
0 & 0.513344 & 0.117446 \\
0 & 0.913344 & -0.282554 \\
0 & 1.81334 & -1.18255
\end{pmatrix}$$

and algorithm A1 is now applied to the submatrix obtained by deleting the first row and column of the above matrix, to give the second component $x_2$ of the solution vector X. The first component $x_1$ of X is then obtained from

$$3.16228\, x_1 = 11.7004 - 12.0799\, x_2;$$

in this way $X = [5.78405, -0.545562]^T$ is obtained. Direct minimization can also be used, as in Examples 3 and 4.

SYMBOLS

A, B, C, R, S, α, β, M, T   matrices defined in the Appendix
A^T                          transpose of matrix A
A^+                          pseudo-inverse of matrix A
A1, A2                       algorithms from [6]
SVD                          singular value decomposition
x_i                          components of vector or matrix X
LS                           least squares
TLS                          total least squares
i                            running index
f_1, f_2                     sums of squares defined in Eqns. (7) and (8)
||A||                        Euclidean norm of A
e                            Euclidean norm of residual error in solution of AMRE
m, n, p                      integers used to give order of matrices

REFERENCES

1. LAWSON C.L. and HANSON R.J.: Solving Least Squares Problems, Prentice-Hall, Englewood Cliffs, New Jersey, 1974.
2. VAN HUFFEL S. and VANDEWALLE J.: The Total Least Squares Problem: Computational Aspects and Analysis, SIAM, Philadelphia, 1991.
3. DE MOOR B. and DAVID J.: Systems and Control Letters, 1992, 18, 329-337.
4. LANCASTER P. and RODMAN L.: Algebraic Riccati Equations, Clarendon Press, Oxford, 1995.
5. BARNETT S. and STOREY C.: Matrix Methods in Stability Theory, Thomas Nelson and Sons, 1970.
6. STOREY C.: Modified Newton Methods for the Algebraic Matrix Riccati Equation, Part 1: Hung. J. Ind. Chem., 1997, 25, 305-308; Part 2: ibid., 1998, 26, 309-314; Part 3: ibid., 1999, 27, 301-305; Part 4: ibid., 2000, 28, 277-280.
7. WOLD S., RUHE A., WOLD H. and DUNN III W.J.: SIAM J. Sci. Stat. Comput., 1984, 5, 735-743.
8. LUUS R. and HERNAEZ H.: Hung. J. Ind. Chem., 2000, 28, 201-207.
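Example 5 above is also straightforward to reproduce. The sketch below (Python with NumPy, an illustration added to this edition) carries out the Householder reduction of [L; M] and then, in place of algorithm A1, solves the reduced one-parameter TLS problem with the closed-form SVD formula quoted in Example 1; the data are those of Example 5.

```python
import numpy as np

t = np.array([0, 0.9, 1.8, 2.6, 3.3, 4.4, 5.2, 6.1, 6.5, 7.4])
y = np.array([5.9, 5.4, 4.4, 4.6, 3.5, 3.7, 2.8, 2.8, 2.4, 1.5])
L = np.column_stack([np.ones_like(t), t])
C = np.column_stack([L, y])                 # the compound matrix [L; M]

# Householder vector mapping the first column (all ones) onto sqrt(10) e1
v = C[:, 0].copy()
v[0] -= np.linalg.norm(C[:, 0])             # v = u - ||u|| e1
H = np.eye(10) - 2.0 * np.outer(v, v) / (v @ v)
T = H @ C                                   # transformed matrix; T[1:, 0] ~ 0

# TLS on the 9 x 2 submatrix (first row and column deleted)
Lp, Mp = T[1:, 1], T[1:, 2]
sigma = np.linalg.svd(T[1:, 1:])[1][-1]     # smallest singular value
x2 = (Lp @ Mp) / (Lp @ Lp - sigma**2)       # scalar TLS closed form
x1 = (T[0, 2] - T[0, 1] * x2) / T[0, 0]     # back-substitute the first row
print(x1, x2)   # expect approx 5.78405, -0.545562
```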
APPENDIX

Algorithm A1

To solve the general AMRE

$$XPX + QX + XF + G = 0.$$

Let $X_0$ be an approximate solution. Let $i = 0$.

Compute:

$$A_i = X_iP + Q$$
$$B_i = PX_i + F$$
$$C_i = X_iPX_i - G$$
$$R_i = (I_n - A_i)^{-1}$$
$$S_i = (I_n - B_i)^{-1}$$
$$\alpha_i = 2R_i - I_n$$
$$\beta_i = 2S_i - I_n$$
$$M_i = -2R_iC_iS_i$$
$$T_1^{(i)} = M_i$$
$$T_2^{(i)} = M_i + \alpha_i T_1^{(i)} \beta_i$$

and, in general,

$$T_{k+1}^{(i)} = M_i + \alpha_i T_k^{(i)} \beta_i, \qquad k = 1, 2, \ldots$$

Set $X_{i+1} = \lim_{k \to \infty} T_k^{(i)}$, the series being truncated when successive terms agree to the required accuracy; then replace $i$ by $i + 1$ and repeat until the residual norm of the AMRE is sufficiently small.
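For readers who wish to experiment, the following is a minimal sketch of Algorithm A1 in Python with NumPy, under the assumptions that the matrices have compatible sizes and that the starting approximation is good enough for the inner series to converge; the safeguards and acceleration devices of [6] are omitted, and the function name and tolerances are illustrative rather than from the paper.

```python
import numpy as np

def algorithm_a1(P, Q, F, G, X0, newton_steps=30, series_terms=500, tol=1e-12):
    """Sketch of Algorithm A1: Newton's method for XPX + QX + XF + G = 0,
    each Newton step being solved by the infinite-series recurrence
    T_{k+1} = M_i + alpha_i T_k beta_i of the Appendix."""
    n, p = X0.shape
    I_n, I_p = np.eye(n), np.eye(p)
    X = X0.astype(float).copy()
    for _ in range(newton_steps):
        A = X @ P + Q                       # A_i = X_i P + Q
        B = P @ X + F                       # B_i = P X_i + F
        C = X @ P @ X - G                   # C_i = X_i P X_i - G
        R = np.linalg.inv(I_n - A)          # R_i = (I - A_i)^{-1}
        S = np.linalg.inv(I_p - B)          # S_i = (I - B_i)^{-1}
        alpha = 2 * R - I_n                 # alpha_i
        beta = 2 * S - I_p                  # beta_i
        M = -2 * R @ C @ S                  # M_i
        T_next = M.copy()                   # T_1 = M_i
        for _ in range(series_terms):       # T_{k+1} = M_i + alpha T_k beta
            T, T_next = T_next, M + alpha @ T_next @ beta
            if np.linalg.norm(T_next - T) < tol:
                break
        X = T_next                          # X_{i+1} = limit of the series
        if np.linalg.norm(X @ P @ X + Q @ X + X @ F + G) < tol:
            break                           # AMRE residual small enough
    return X
```

For the TLS problem of Eqn. (3) one would take $P = M^TL$, $Q = L^TL$, $F = -M^TM$ and $G = -L^TM$. Whether the inner series converges depends on the spectra of $\alpha_i$ and $\beta_i$, which is precisely why [6] introduces line searches and other modifications for poor starting approximations.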