HUNGARIAN JOURNAL OF INDUSTRIAL CHEMISTRY VESZPRÉM Vol. 29, pp. 67-70 (2001)

LEAST SQUARES AND THE MATRIX RICCATI EQUATION

C. STOREY

(The Institute of Mathematics and Simulation Sciences, Faculty of Computing Sciences and Engineering, De Montfort University, The Gateway, Leicester, LE1 9BH, UK)

Received: July 26, 2001

There are a number of methods for least squares and total least squares solution of linear equations. One such method involves solving an appropriate algebraic matrix Riccati equation. This approach is investigated here using an algorithm based on Newton's method with an infinite series solution of the resulting linear matrix equations. Some numerical examples, from the recent literature, are used to illustrate the techniques.

Keywords: Matrix Riccati equation; Newton's method; infinite series solution; overdetermined linear systems; least squares and total least squares.

Introduction

When a system of linear equations

$$LX = M \qquad (1)$$

with L and M, $m \times n$ and $m \times p$ matrices respectively, does not have a unique solution X, it is usual to seek a solution that is "best" in some sense by imposing appropriate constraints. This paper is concerned with computing the least squares (LS) solution, which imposes the constraint that X should minimize the norm $\|LX - M\|_F^2$, and the total least squares (TLS) solution, which requires X to minimize $\|L - V\|_F^2 + \|M - W\|_F^2$ subject to the constraint

$$VX = W. \qquad (2)$$

The main difference between LS and TLS is that the former assumes the matrix L is exact whereas the latter allows for errors in both L and M. Perhaps the most important approach to solving these problems is singular value decomposition (SVD), the use of which is fully discussed in [1] and [2] for LS and TLS respectively. Another possible approach is the following. By using Lagrange multipliers, in the usual way, it can be shown [3], [4] that X satisfies the algebraic matrix Riccati equation (AMRE)

$$XM^TLX + L^TLX - XM^TM - L^TM = 0. \qquad (3)$$

When Newton's method is applied to the general AMRE

$$XPX + QX + XF + G = 0 \qquad (4)$$

a linear matrix equation is obtained whose solutions $X_i$ converge, under appropriate conditions, to the solution X of Eqn. (4). There are many possible ways of solving this linear matrix equation (see, e.g., [5]) and one of these, which involves infinite series, is studied in [6]. Methods for offsetting the difficulties involved if a poor starting approximation is made to the solution, and for accelerating convergence, are also investigated in [6]. The solution techniques are implemented by two algorithms, A1 and A2, depending on whether X is unsymmetric or symmetric. A simplified form of A1 is given in the Appendix for convenience.

The aim of the present paper is to illustrate the use of A1 in solving LS and TLS problems by means of a number of numerical examples from the literature.

Numerical Results

Example 1

The first example is taken from [7] and is concerned with modelling the biological activity of fifteen different compounds with respect to a set of eight variables which describe the various morphological and physiochemical properties of the compounds. The data used is given in Table 1 of [7]. Setting the initial approximation $X_0 = 0$ in A1 gave the same least squares solution as in [7] (when rounded),

$$X_{LS} = [0.636, 0.080, 0.095, -0.308, 0.169, 0.241, 0.278, 0.238]^T$$

and, of course, the same value, 0.61, for the residual sum of squares SS. $X_{LS}$ also agrees with $L^+M$, where $L^+$ is the pseudo-inverse of L.
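Since Table 1 of [7] is not reproduced here, the following sketch (Python with NumPy, an illustration added to this edition rather than part of the original computation) checks the identity $X_{LS} = L^+M$ quoted above on randomly generated stand-in matrices; only the 15 x 8 shape is taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
L = rng.standard_normal((15, 8))      # stand-in for the 15 x 8 data matrix of [7]
M = rng.standard_normal((15, 1))      # stand-in for the biological activity vector

X_lstsq = np.linalg.lstsq(L, M, rcond=None)[0]   # least squares solution of LX = M
X_pinv = np.linalg.pinv(L) @ M                   # X_LS = L+ M, as quoted above

print(np.allclose(X_lstsq, X_pinv))   # True: the two LS computations agree
```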
The smallest singular value of L is 0.337365 and that of [L; M] is 0.335713, so that the conditions for a unique TLS solution, [2], are (just) satisfied. If A1 is now continued from the solution $X_{LS}$ given above it converges to

$$X = [0.716, 0.259, 0.124, -0.336, 0.186, 0.531, -0.435, 0.337]^T$$

with $e = 0.16$ before accuracy begins to deteriorate. (Using a line search shows a slight improvement in accuracy before deterioration.) When this model is applied to the whole of the data the resulting sum of squares is 41.893.

The smallest singular value of [L; M], for this example, is as given above, and the SVD formula

$$X = [L^TL - (0.335713)^2 I]^{-1} L^TM$$

gives

$$X = [-4.28, 1.20, -2.21, -1.29, -2.36, 11.00, -14.60, 2.10]^T$$

which has a residual error $e = 6 \times 10^{-4}$ and which is much better than the previous result from A1. If now, however, the above X is used as the starting approximation in A1 the result is refined to give a better X with a residual error less than $10^{-6}$.

Algorithm A1 was also used to compute the sums of squares, denoted by "PRESS" in [7], for both LS and TLS; these measure the predictive properties of the models used. The same groupings of the data were used as in [7] and Table 1 shows the results obtained.

Table 1 Predictive Sums of Squares for Example 1

            LS       TLS
Group 1     1.172     1.319
Group 2     1.183     1.540
Group 3     0.804    34.832
Total       3.159    37.691

The total for LS agrees with that given in [7]. The total for TLS is inflated by the very large value from Group 3, which presumably accounts for the comment, "when full rank is approached, it behaved differently," made regarding TLS in [7], and for the use of rank 3 TLS rather than full rank. (Before making the calculations all nine sets of data were scaled to have zero mean and unit variance.)

Example 2

This example is taken from [1] and involves finding a linear fit to data given in Table 26.1 of that reference. In [1] five different "candidate" solutions are obtained and one of them,

$$X = [-2.486, -0.529, -0.194, 1.616, 3.455]^T, \qquad (5)$$

is proposed as the best LS solution according to a number of statistical criteria. Now if A1 is used to find a LS and then a TLS solution the result is

$$X = [-73.3, 98.4, -78.0, 90.7, -78.2]^T \qquad (6)$$

which is very different from the solution given in Eqn. (5) despite the fact that their residual norms are quite close. It is also interesting that the norm of (5) is only about 4.6 compared with 188 for that of (6), and that smallness of norm is one of the criteria mentioned above for preferring one solution to another. The TLS solution would not have found much favour from this particular viewpoint.

Example 3

This example, with

$$L = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \quad \text{and} \quad M = \begin{pmatrix} 1 \\ 1 \end{pmatrix},$$

is of a type often used to show that the TLS problem may not have a solution unless extra constraints are imposed [2]. Applying A1, using $X_0 = 0$, gives the LS solution $[1, 0]^T$; continuing until convergence now gives the TLS solution $[1.61803, 0]^T$.

Another much used method for solving problems of the kind considered in this paper involves some form of direct minimization (see, e.g., [8]), as in the following treatment. Let

$$\begin{pmatrix} x_1 & x_2 \\ x_3 & x_4 \end{pmatrix}, \quad \begin{pmatrix} x_5 \\ x_6 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} x_7 \\ x_8 \end{pmatrix}$$

be estimates of L, X and M respectively; form

$$f_1 = (1 - x_1)^2 + x_2^2 + x_3^2 + x_4^2 + (1 - x_1x_5 - x_2x_6)^2 + (1 - x_3x_5 - x_4x_6)^2 \qquad (7)$$

or alternatively, with more variables involved,

$$f_2 = (1 - x_1)^2 + x_2^2 + x_3^2 + x_4^2 + (1 - x_7)^2 + (1 - x_8)^2 + (x_7 - x_1x_5 - x_2x_6)^2 + (x_8 - x_3x_5 - x_4x_6)^2 \qquad (8)$$

and minimize $f_1$ or $f_2$ with respect to all of the variables involved in each case.
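This direct-minimization route is easy to reproduce numerically. The sketch below (Python with SciPy, added here as an illustration; the text itself uses MATHEMATICA's FindMinimum, discussed next) minimizes $f_1$ of Eqn. (7) for Example 3, starting from the exact-L least squares point $(x_5, x_6) = (1, 0)$; the choice of starting point and optimizer are assumptions of this sketch.

```python
import numpy as np
from scipy.optimize import minimize

# f1 from Eqn. (7): x1..x4 estimate L, (x5, x6) estimates X, for Example 3
def f1(x):
    x1, x2, x3, x4, x5, x6 = x
    return ((1 - x1)**2 + x2**2 + x3**2 + x4**2
            + (1 - x1*x5 - x2*x6)**2
            + (1 - x3*x5 - x4*x6)**2)

# start at the LS point for exact L: (x1..x4) = (1,0,0,0), (x5,x6) = (1,0)
res = minimize(f1, [1, 0, 0, 0, 1, 0], method="BFGS")
print(res.x[4:])   # should converge to approx (1.61803, 0), the TLS vector from A1
```

Note that the gradient of $f_1$ vanishes identically in $x_2$, $x_4$ and $x_6$ on the subspace where all three are zero, so the iteration stays there and effectively solves a three-variable problem in $x_1$, $x_3$ and $x_5$.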
So, for example, using FindMinimum from MATHEMATICA on $f_1$ the solution

$$(x_1, \ldots, x_6) = (0.723607,\ 0,\ 0.447214,\ 0,\ 1.61803,\ 0) \qquad (9)$$

is obtained, and using $f_2$ a somewhat different solution is obtained. Notice that the solution given in Eqn. (9) is the same as that found by A1.

The similar, but simpler, example with $L = [a, b]^T$ and $M = [1, 1]^T$ can be analysed algebraically to quantify the difference between using $f_1$ and $f_2$. Thus, for example, taking $L = [1, -0.95]^T$ gives solutions $X = 2.37165$ and $X \approx 0.0553169$ using $f_1$ and $f_2$ respectively. (Again, using $f_1$ gives the same solution as A1 and as the SVD formula given in Example 1.)

Example 4

This is similar to Example 3 but rather more sophisticated; it is taken from [2], where it is analysed using SVD. Here L is a $3 \times 2$ matrix and M a 3-vector whose entries, involving the surds $\sqrt{6}$, $\sqrt{2}/4$, $\sqrt{2}/2$ and $\sqrt{2}$, are as given in [2]. Using $X_0 = 0$ as the starting approximation in A1 gives the LS solution

$$X_{LS} = [0.252633, 0.252633]^T$$

and then proceeding to convergence gives the TLS solution

$$X_{TLS} = [0.408248, 0.408248]^T.$$

Notice that this is the same as the solution $[1/\sqrt{6}, 1/\sqrt{6}]^T$ given in [2]. In the same manner as for the previous example, setting up the appropriate $f_1$ and using FindMinimum gives $X = [0.408245, 0.408245]^T$, and with the appropriate $f_2$ the solution $X = [0.34186, 0.34186]^T$ is obtained.

Example 5

The final example is concerned with parameter fitting and is taken from [8]. Here

$$L^T = \begin{pmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 0 & 0.9 & 1.8 & 2.6 & 3.3 & 4.4 & 5.2 & 6.1 & 6.5 & 7.4 \end{pmatrix}$$

$$M = [5.9, 5.4, 4.4, 4.6, 3.5, 3.7, 2.8, 2.8, 2.4, 1.5]^T.$$

In this example the first column of L is fixed, and so one method of proceeding is as follows [2]: [L; M] is multiplied by a suitable Householder matrix to give the new matrix

$$\begin{pmatrix}
3.16228 & 12.0799 & 11.7004 \\
0 & -4.68666 & 2.71745 \\
0 & -3.78666 & 1.71745 \\
0 & -2.98666 & 1.91745 \\
0 & -2.28666 & 0.817446 \\
0 & -1.18666 & 1.01745 \\
0 & -0.386656 & 0.117446 \\
0 & 0.513344 & 0.117446 \\
0 & 0.913344 & -0.282554 \\
0 & 1.81334 & -1.18255
\end{pmatrix}$$

and algorithm A1 is now applied to the submatrix obtained by deleting the first row and column of the above matrix, to give the second component $x_2$ of the solution vector X. The first component $x_1$ of X is then obtained from

$$3.16228\, x_1 = 11.7004 - 12.0799\, x_2;$$

in this way $X = [5.78405, -0.545562]^T$ is obtained. Direct minimization can also be used, as in Examples 3 and 4.

SYMBOLS

A, B, C, R, S, α, β, M, T   matrices defined in the Appendix
A^T                          transpose of matrix A
A^+                          pseudo-inverse of matrix A
A1, A2                       algorithms from [6]
SVD                          singular value decomposition
x_i                          components of vector or matrix X
LS                           least squares
TLS                          total least squares
i                            running index
f_1, f_2                     sums of squares defined in Eqns. (7) and (8)
||A||                        Euclidean norm of A
e                            Euclidean norm of residual error in solution of AMRE
m, n, p                      integers used to give order of matrices

REFERENCES

1. LAWSON C.L. and HANSON R.J.: Solving Least Squares Problems, Prentice-Hall, Englewood Cliffs, New Jersey, 1974.
2. VAN HUFFEL S. and VANDEWALLE J.: The Total Least Squares Problem: Computational Aspects and Analysis, SIAM, Philadelphia, 1991.
3. DE MOOR B. and DAVID J.: Systems and Control Letters, 1992, 18, 329-337.
4. LANCASTER P. and RODMAN L.: Algebraic Riccati Equations, Clarendon Press, Oxford, 1995.
5. BARNETT S. and STOREY C.: Matrix Methods in Stability Theory, Thomas Nelson and Sons, 1970.
6. STOREY C.: Modified Newton Methods for the Algebraic Matrix Riccati Equation, Part 1: Hung. J. Ind. Chem., 1997, 25, 305-308; Part 2: ibid., 1998, 26, 309-314; Part 3: ibid., 1999, 27, 301-305; Part 4: ibid., 2000, 28, 277-280.
7. WOLD S., RUHE A., WOLD H. and DUNN III W.J.: SIAM J. Sci. Stat. Comput., 1984, 5, 735-743.
8. LUUS R. and HERNAEZ H.: Hung. J. Ind. Chem., 2000, 28, 201-207.
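Example 5 above is also straightforward to reproduce. The sketch below (Python with NumPy, an illustration added to this edition) carries out the Householder reduction of [L; M] and then, in place of algorithm A1, solves the reduced one-parameter TLS problem with the closed-form SVD formula quoted in Example 1; the data are those of Example 5.

```python
import numpy as np

t = np.array([0, 0.9, 1.8, 2.6, 3.3, 4.4, 5.2, 6.1, 6.5, 7.4])
y = np.array([5.9, 5.4, 4.4, 4.6, 3.5, 3.7, 2.8, 2.8, 2.4, 1.5])
L = np.column_stack([np.ones_like(t), t])
C = np.column_stack([L, y])                 # the compound matrix [L; M]

# Householder vector mapping the first column (all ones) onto sqrt(10) e1
v = C[:, 0].copy()
v[0] -= np.linalg.norm(C[:, 0])             # v = u - ||u|| e1
H = np.eye(10) - 2.0 * np.outer(v, v) / (v @ v)
T = H @ C                                   # transformed matrix; T[1:, 0] ~ 0

# TLS on the 9 x 2 submatrix (first row and column deleted)
Lp, Mp = T[1:, 1], T[1:, 2]
sigma = np.linalg.svd(T[1:, 1:])[1][-1]     # smallest singular value
x2 = (Lp @ Mp) / (Lp @ Lp - sigma**2)       # scalar TLS closed form
x1 = (T[0, 2] - T[0, 1] * x2) / T[0, 0]     # back-substitute the first row
print(x1, x2)   # expect approx 5.78405, -0.545562
```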
APPENDIX

Algorithm A1

To solve the general AMRE

$$XPX + QX + XF + G = 0.$$

Let $X_0$ be an approximate solution. Let $i = 0$.

Compute:

$$A_i = X_iP + Q$$
$$B_i = PX_i + F$$
$$C_i = X_iPX_i - G$$
$$R_i = (I_n - A_i)^{-1}$$
$$S_i = (I_n - B_i)^{-1}$$
$$\alpha_i = 2R_i - I_n$$
$$\beta_i = 2S_i - I_n$$
$$M_i = -2R_iC_iS_i$$
$$T_1^{(i)} = M_i$$
$$T_2^{(i)} = M_i + \alpha_i T_1^{(i)} \beta_i$$

and, in general,

$$T_{k+1}^{(i)} = M_i + \alpha_i T_k^{(i)} \beta_i, \qquad k = 1, 2, \ldots$$

Set $X_{i+1} = \lim_{k \to \infty} T_k^{(i)}$, the series being truncated when successive terms agree to the required accuracy; then replace $i$ by $i + 1$ and repeat until the residual norm of the AMRE is sufficiently small.
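For readers who wish to experiment, the following is a minimal sketch of Algorithm A1 in Python with NumPy, under the assumptions that the matrices have compatible sizes and that the starting approximation is good enough for the inner series to converge; the safeguards and acceleration devices of [6] are omitted, and the function name and tolerances are illustrative rather than from the paper.

```python
import numpy as np

def algorithm_a1(P, Q, F, G, X0, newton_steps=30, series_terms=500, tol=1e-12):
    """Sketch of Algorithm A1: Newton's method for XPX + QX + XF + G = 0,
    each Newton step being solved by the infinite-series recurrence
    T_{k+1} = M_i + alpha_i T_k beta_i of the Appendix."""
    n, p = X0.shape
    I_n, I_p = np.eye(n), np.eye(p)
    X = X0.astype(float).copy()
    for _ in range(newton_steps):
        A = X @ P + Q                       # A_i = X_i P + Q
        B = P @ X + F                       # B_i = P X_i + F
        C = X @ P @ X - G                   # C_i = X_i P X_i - G
        R = np.linalg.inv(I_n - A)          # R_i = (I - A_i)^{-1}
        S = np.linalg.inv(I_p - B)          # S_i = (I - B_i)^{-1}
        alpha = 2 * R - I_n                 # alpha_i
        beta = 2 * S - I_p                  # beta_i
        M = -2 * R @ C @ S                  # M_i
        T_next = M.copy()                   # T_1 = M_i
        for _ in range(series_terms):       # T_{k+1} = M_i + alpha T_k beta
            T, T_next = T_next, M + alpha @ T_next @ beta
            if np.linalg.norm(T_next - T) < tol:
                break
        X = T_next                          # X_{i+1} = limit of the series
        if np.linalg.norm(X @ P @ X + Q @ X + X @ F + G) < tol:
            break                           # AMRE residual small enough
    return X
```

For the TLS problem of Eqn. (3) one would take $P = M^TL$, $Q = L^TL$, $F = -M^TM$ and $G = -L^TM$. Whether the inner series converges depends on the spectra of $\alpha_i$ and $\beta_i$, which is precisely why [6] introduces line searches and other modifications for poor starting approximations.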