SQU Journal for Science, 17 (1) (2012) 63-79
© 2012 Sultan Qaboos University

Preconditioning Large Indefinite Linear Systems

Giovanni Fasano* and Massimo Roma**

*Department of Management, University Ca' Foscari Venezia, Venice, Italy, and The Italian Ship Model Basin - INSEAN, Rome, Italy. E-mail: fasano@unive.it.
**Dipartimento di Ingegneria Informatica Automatica e Gestionale "A. Ruberti", SAPIENZA Università di Roma, Roma, Italy. E-mail: roma@dis.uniroma1.it.

ABSTRACT: After briefly recalling some relevant approaches for preconditioning large symmetric linear systems, we describe a novel class of preconditioners. Our proposal is tailored to large indefinite linear systems, which arise very frequently in many different contexts of numerical analysis and nonlinear optimization. Our preconditioners are built as a byproduct of the Krylov subspace method used to solve the system. We describe the theoretical properties of the proposed class of preconditioners, namely their capability of both shifting some eigenvalues of the system matrix to controlled values, and reducing the modulus of the remaining ones. The results of a numerical experimentation give evidence of the good performance of our proposal.

KEYWORDS: Krylov subspace methods, Large indefinite linear systems, Large scale optimization, Preconditioners, Quasi-Newton updates.
1. Introduction

The efficient solution of large linear systems (or of a sequence of slowly varying linear systems) is of fundamental importance in many contexts of numerical analysis and nonlinear optimization. In this paper we first recall a few relevant approaches for preconditioning large indefinite linear systems. Observe that in many contexts of numerical analysis (see e.g. http://math.nist.gov/MatrixMarket) and nonlinear optimization, the efficient iterative solution of linear systems and of sequences of linear systems is sought. Truncated Newton methods in unconstrained optimization, KKT systems arising in constrained optimization, interior point methods, and PDE constrained optimization are just some examples from optimization. We first show that using information from quasi-Newton updates may often provide effective preconditioners. The latter are sometimes endowed with theoretical properties related to the spectrum and the condition number of the preconditioned matrix. Then, we describe a novel class of preconditioners for the solution of large indefinite linear systems, without assuming any sparsity pattern of the system matrix. In particular, the class of preconditioners we propose uses information collected from Krylov subspace methods, in order to assess the structural properties of the system matrix. We iteratively construct our preconditioners either by indirectly using (but not performing) a factorization of the system matrix (see, e.g. (Fasano and Roma, 2007; Golub and Van Loan, 1996; Stoer, 1983)), obtained as a byproduct of Krylov subspace methods, or by computing a Jordan canonical form of a very small matrix.
We address our preconditioners using a general Krylov subspace method; then, we prove theoretical properties for such preconditioners, and describe results which indicate how to possibly select the parameters which characterize their definition. The basic idea of our approach is to apply a Krylov subspace method to generate a positive definite approximation of the inverse of the system matrix. The latter is then used to build our preconditioners, which require storing just a few vectors, without any product of matrices. We assume that the entries of the system matrix are not known, and that the information necessary to build the preconditioner is gained by a routine which computes the product of the system matrix times a vector. In the paper (Fasano and Roma, 2011b) we test our preconditioners in both numerical analysis and nonconvex optimization frameworks. In particular, we test our proposal on significant linear systems from the literature. Then, we focus on the so-called Newton-Krylov methods, also known as Truncated Newton methods (see (Nash, 2000) for a survey). In these contexts, both positive definite and indefinite linear systems are considered. We recall that in case the optimization problem at hand is nonconvex, i.e. the Hessian matrix of the objective function is possibly indefinite with at least one negative eigenvalue, the solution of Newton's equation within Truncated Newton schemes may require some care. Indeed, the Krylov subspace method used to solve Newton's equation should be suitably applied, considering that optimization frameworks require the computation of descent directions, which have to satisfy additional properties (Dennis and Schnabel, 1983; Nocedal and Wright, 2000). In this regard our proposal provides a tool to preserve the latter properties. The paper is organized as follows. In Section 2 we briefly recall relevant approaches from the literature.
Then, in Section 3 we describe our class of preconditioners for large indefinite linear systems, obtained by using a general Krylov subspace method. We detail some theoretical properties of our proposal, along with some hints on its computation. Finally, a section of conclusions and future work completes the paper.

Regarding the notation, for an $n \times n$ real matrix $M$ we denote by $\sigma[M]$ the spectrum of $M$; $I_k$ is the identity matrix of order $k$, and we use the capital letter $T$ to indicate a tridiagonal matrix. Finally, with $C \succ 0$ we indicate that the matrix $C$ is positive definite, and $\|\cdot\|$ denotes the Euclidean norm.

2. Some approaches for preconditioning large symmetric systems

Let us consider the following linear system

$$Ax = b, \qquad A \in \mathbb{R}^{n \times n}, \; b \in \mathbb{R}^n, \qquad (1)$$

where $A$ is symmetric and nonsingular, and $n$ is large. We assume that the structure of the matrix $A$ is unknown, as is its sparsity pattern. We recall that in case of special structures of the matrix $A$, suitable preconditioners may be built for solving (1) (see (Higham, 2002)). As is well known, the main rationale behind the idea of using a preconditioner to solve the linear system (1) consists in introducing a nonsingular matrix $M$, such that solving

$$MAx = Mb \qquad (2)$$

is possibly simpler, in some sense, than solving (1). Of course, for the latter purpose the extreme choices for $M$ are $M = I$ and $M = A^{-1}$ (the latter being the ideal choice). In most cases (always, when $n$ is large) there is no chance to compute $A^{-1}$ (or computing $A^{-1}$ is no cheaper than solving (1)). Notwithstanding this difficulty, the preconditioner $M$ can be chosen according to the following alternative guidelines:

- the linear system (2) should be easy to solve thanks to the structure of the matrix $M$ (e.g. preconditioners for linear systems from PDE discretizations often have a suitable block structure which is suggested by the problem at hand);
- the condition number $\kappa(MA)$ should be relatively small.
The latter fact may be helpful when attempting to solve the preconditioned system (2) with a technique sensitive to $\kappa(MA)$ (e.g. the Krylov subspace methods);
- the eigenvalues in the spectrum $\sigma[MA]$ should be as clustered as possible (see e.g. (Nocedal and Wright, 2000)). The latter fact may again be helpful whenever Krylov subspace methods are adopted to solve (2).

Since we want to deal with the large scale case, without any assumption on the sparsity pattern of $A$, the main approaches in the literature for building and applying a preconditioner to (1) often gain information on the system matrix by computing the matrix-vector product $Av$, with $v \in \mathbb{R}^n$, or rely on numerical differentiation. In particular, among the approaches proposed we have the following:

- Approximate Inverse Preconditioners based on the BFGS update method (see also (Nash, 1985; Benzi et al., 2000), and references therein). Here, the main idea is that a BFGS update may be suitably used to compute an approximate inverse $A^\#$ of the matrix $A$. Then, the matrix $A^\#$ is applied as a preconditioner.
- Preconditioners based on the L-BFGS method (see also (Morales and Nocedal, 2000; 2001), and the class of Limited Memory Preconditioners (LMPs) by Giraud and Gratton (2006)), which pursue an idea similar to the Approximate Inverse Preconditioners in the previous item.
- Approximate Inverse Preconditioners based on the use of Krylov subspace methods (see also (Simoncini and Szyld, 2007; Fasano and Roma, 2009)), where a Krylov subspace method is used to determine the solution of (1) and to provide information for building a preconditioner for (1).
- Band Preconditioners based on matrix scaling/balancing (see also (Roma, 2005)).
- Preconditioners based on numerical differentiation (see e.g. (Higham, 2002)).
- Band Preconditioners based on the BFGS method (see e.g. (Luksan et al., 2010)), where a BFGS update is partially modified, so that suitable band preconditioners are defined for linear systems.
This approach was mainly tested within truncated Newton frameworks.

For each of the preconditioning strategies mentioned above, both a theoretical analysis and a numerical experience are available for validation. In this paper we follow and generalize the approach proposed in Fasano and Roma (2009), where at any step an iterative Krylov subspace method is used to compute the approximate inverse $A^\#$ over a subspace of increasing dimension. Note that, as detailed in Section 3, our approach also encompasses diagonal and block-diagonal preconditioners.

3. Our class of preconditioners

In this section we first introduce some preliminaries, then we propose our class of preconditioners. Consider the indefinite linear system

$$Ax = b, \qquad (3)$$

where $A \in \mathbb{R}^{n \times n}$ is symmetric, $n$ is large and $b \in \mathbb{R}^n$. Suppose any Krylov subspace method is used for the solution of (3), e.g. the Lanczos process or the CG method (Golub and Van Loan, 1996) (but MINRES (Saad, 2003) or Planar-CG methods (Hestenes, 1980; Fasano, 2005) may also be alternative choices). They are equivalent as long as $A \succ 0$, whereas the CG, though cheaper, in principle may not cope with the indefinite case. In the next Assumption 3.1 we suppose that a finite number of steps, say $h \le n$, of the adopted Krylov subspace method are performed.

Assumption 3.1 Let us consider any Krylov subspace method to solve the symmetric linear system (3).
Suppose that at step $h$ of the Krylov subspace method, with $1 \le h \le n$, the matrices $R_h \in \mathbb{R}^{n \times h}$, $T_h \in \mathbb{R}^{h \times h}$ and the vector $u_{h+1} \in \mathbb{R}^n$ are generated, such that

$$A R_h = R_h T_h + \rho_{h+1} u_{h+1} e_h^T, \qquad \rho_{h+1} \in \mathbb{R}, \qquad (4)$$

$$T_h = \begin{cases} V_h B_h V_h^T, & \text{if } T_h \text{ is indefinite}, \\ L_h D_h L_h^T, & \text{if } T_h \text{ is positive definite}, \end{cases} \qquad (5)$$

where
- $R_h = (u_1 \cdots u_h)$, with $u_i^T u_j = 0$ and $\|u_i\| = 1$, for $1 \le i \ne j \le h$;
- $u_{h+1}^T u_i = 0$ and $\|u_{h+1}\| = 1$, for $1 \le i \le h$;
- $T_h$ is irreducible and nonsingular, with eigenvalues $\mu_1, \ldots, \mu_h$ not all coincident;
- $B_h = \mathrm{diag}_{1 \le i \le h}\{\mu_i\}$, $V_h = (v_1 \cdots v_h) \in \mathbb{R}^{h \times h}$ is orthogonal, and $(\mu_i, v_i)$ is an eigenpair of $T_h$;
- $D_h \succ 0$ is diagonal, and $L_h$ is unit lower bidiagonal.

Remark 3.1 Note that most of the common Krylov subspace methods for the solution of symmetric linear systems may easily satisfy Assumption 3.1 at iteration $h$ (Saad, 2003; Stoer, 1983). In particular, observe also that from (4) we have $T_h = R_h^T A R_h$, so that whenever $A \succ 0$, then $T_h \succ 0$. Since the Jordan form of $T_h$ in (5) is required only when $T_h$ is indefinite, it is important to check whether $T_h \succ 0$ without computing the eigenpairs of $T_h$, if unnecessary. For this purpose, note that the adopted Krylov subspace method always provides the relation $T_h = L_h D_h L_h^T$, with $L_h$ nonsingular and $D_h$ block diagonal (blocks can be $1 \times 1$ or at most $2 \times 2$), even when $T_h$ is indefinite (Saad, 2003; Stoer, 1983; Fasano and Roma, 2007). Thus, checking the entries of $D_h$ will reveal whether the Jordan form $T_h = V_h B_h V_h^T$ is really needed for $T_h$, i.e. whether $T_h$ is indefinite. Furthermore, the matrix $T_h$ captures much information on the eigenvalues of $A$ corresponding to eigenvectors in $\mathrm{span}\{u_1, \ldots, u_h\}$. Indeed, let $v \in \mathbb{R}^n$; then $R_h v$ belongs to the Krylov subspace $\mathrm{span}\{u_1, \ldots, u_h\}$. Now, considering that $T_h = R_h^T A R_h$ from (4), and recalling that $R_h^T R_h = I_h$, the relation $m \|v\|^2 \le v^T T_h v \le M \|v\|^2$ implies

$$m \|R_h v\|^2 = m \|v\|^2 \le (R_h v)^T A (R_h v) \le M \|v\|^2 = M \|R_h v\|^2.$$

Thus, $m$ and $M$ are respectively a lower bound and an upper bound on the eigenvalues of $A$ corresponding to eigenvectors in $\mathrm{span}\{u_1, \ldots, u_h\}$. Observe also that from Assumption 3.1 the parameter $\rho_{h+1}$ may possibly be nonzero, i.e. the subspace $\mathrm{span}\{u_1, \ldots, u_h\}$ is possibly not an invariant subspace under the transformation performed by the matrix $A$ (thus, in this paper we consider a more general case with respect to (Baglama et al., 1998)).

Remark 3.2 The adopted Krylov subspace method may in general perform $m \ge h$ iterations, generating the orthonormal vectors $u_1, \ldots, u_m$. Then, we can set $R_h = (u_{j_1}, \ldots, u_{j_h})$, with $\{j_1, \ldots, j_h\} \subseteq \{1, \ldots, m\}$, and change relations (4)-(5) accordingly; i.e. Assumption 3.1 may hold selecting any $h$ out of the $m$ vectors (among $u_1, \ldots, u_m$) computed by the Krylov subspace method.

Remark 3.3 For relatively small values of the parameter $h$ in Assumption 3.1 (say $h \le 20$, as often suffices in the applications), the computation of the eigenpairs $(\mu_i, v_i)$, $i = 1, \ldots, h$, of $T_h$ when $T_h$ is indefinite may be extremely fast with standard codes. For example, if the CG is the Krylov subspace method used in Assumption 3.1 to solve (3), then the Matlab (2011) (general) function eigs() requires as little as $4 \times 10^{-4}$ seconds to fully compute all the eigenpairs of $T_h$, for $h = 20$, on a commercial laptop. In the case where the CG is the Krylov subspace method of choice, the matrix $T_h$ is tridiagonal. Nonetheless, in a separate paper (Fasano and Roma, 2009) we consider a special case where the request (5) on $T_h$ may be considerably weakened under mild assumptions. Moreover, in the paper (Fasano and Roma, 2011b) we also prove that for a special choice of the parameter $a$ used in our class of preconditioners (see below), strong theoretical properties may be stated.

On the basis of the latter assumption, we can now define our preconditioners and show their properties.
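As an illustration of how the quantities in Assumption 3.1 arise, the following sketch (ours, not from the paper; NumPy is assumed) runs a plain Lanczos process on a random symmetric matrix and checks relation (4), the orthonormality of $u_1, \ldots, u_h$, and the identity $T_h = R_h^T A R_h$ of Remark 3.1:

```python
import numpy as np

def lanczos(A, r0, h):
    """Plain Lanczos process: after h steps returns R_h (orthonormal columns),
    the tridiagonal T_h, the next vector u_{h+1} and the scalar rho_{h+1},
    so that A R_h = R_h T_h + rho_{h+1} u_{h+1} e_h^T (relation (4))."""
    n = r0.size
    R = np.zeros((n, h))
    alpha = np.zeros(h)          # diagonal of T_h
    beta = np.zeros(h)           # off-diagonals of T_h
    u = r0 / np.linalg.norm(r0)
    u_prev, b_prev = np.zeros(n), 0.0
    for i in range(h):
        R[:, i] = u
        w = A @ u - b_prev * u_prev       # three-term recurrence
        alpha[i] = u @ w
        w -= alpha[i] * u
        b = np.linalg.norm(w)
        u_prev, b_prev, u = u, b, w / b
        if i < h - 1:
            beta[i] = b
    T = np.diag(alpha) + np.diag(beta[:h - 1], 1) + np.diag(beta[:h - 1], -1)
    return R, T, u, b_prev               # u = u_{h+1}, b_prev = rho_{h+1}

rng = np.random.default_rng(0)
n, h = 60, 8
A = rng.uniform(-10, 10, (n, n)); A = (A + A.T) / 2   # symmetric test matrix, as in (15)
R, T, u_next, rho = lanczos(A, rng.standard_normal(n), h)

e_h = np.zeros(h); e_h[-1] = 1.0
assert np.allclose(R.T @ R, np.eye(h), atol=1e-8)      # orthonormal u_1, ..., u_h
assert np.allclose(A @ R, R @ T + rho * np.outer(u_next, e_h), atol=1e-6)  # relation (4)
assert np.allclose(T, R.T @ A @ R, atol=1e-6)          # T_h = R_h^T A R_h
```

In this sketch $\rho_{h+1}$ is simply the last Lanczos off-diagonal coefficient; it is nonzero here because the Krylov subspace has not yet become invariant under $A$.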
To this aim, considering for the matrix $T_h$ the expression (5), we define (see also (Gill et al., 1992))

$$|T_h| \stackrel{def}{=} \begin{cases} V_h |B_h| V_h^T, \quad |B_h| = \mathrm{diag}_{1 \le i \le h}\{|\mu_i|\}, & \text{if } T_h \text{ is indefinite}, \\ T_h, & \text{if } T_h \text{ is positive definite}. \end{cases}$$

As a consequence, when $T_h$ is indefinite we have

$$|T_h|^{-1} T_h = T_h |T_h|^{-1} = V_h \hat{I}_h V_h^T,$$

where the $h$ nonzero diagonal entries of the matrix $\hat{I}_h$ are in the set $\{-1, +1\}$. Furthermore, it can be easily seen that $|T_h|$ is positive definite for any $h$, and the matrix $|T_h|^{-1/2}\,|T_h|\,|T_h|^{-1/2} = I_h$ is the identity matrix. Now let us introduce the following $n \times n$ matrix, which depends on the real parameter $a$:

$$M_h \stackrel{def}{=} (I_n - R_h R_h^T) + R_h |T_h| R_h^T + a\,(u_{h+1} u_h^T + u_h u_{h+1}^T), \qquad 1 \le h \le n-1,$$

$$\phantom{M_h} = \big(R_h \,|\, u_{h+1} \,|\, R_{n,h+1}\big) \begin{pmatrix} |T_h| & a e_h & 0 \\ a e_h^T & 1 & 0 \\ 0 & 0 & I_{n-(h+1)} \end{pmatrix} \begin{pmatrix} R_h^T \\ u_{h+1}^T \\ R_{n,h+1}^T \end{pmatrix}, \qquad (6)$$

$$M_n \stackrel{def}{=} (I_n - R_n R_n^T) + R_n |T_n| R_n^T = R_n |T_n| R_n^T, \qquad (7)$$

where $R_h$ and $T_h$ satisfy relations (4)-(5), $a \in \mathbb{R}$, and the matrix $R_{n,h+1} \in \mathbb{R}^{n \times [n-(h+1)]}$ is such that $R_{n,h+1}^T R_{n,h+1} = I_{n-(h+1)}$ and $\big(R_h \,|\, u_{h+1} \,|\, R_{n,h+1}\big)$ is orthogonal. Observe that of course the matrix $R_{n,h+1}$ in (6) always exists, with

$$R_{n,h+1} R_{n,h+1}^T = I_n - \big(R_h \,|\, u_{h+1}\big)\big(R_h \,|\, u_{h+1}\big)^T.$$

Using the parameter dependent matrix $M_h$ in (6)-(7), we are now ready to introduce our class of preconditioners:

$$M_h^\#(a, \delta, D) \stackrel{def}{=} D\left[I_n - \big(R_h \,|\, u_{h+1}\big)\big(R_h \,|\, u_{h+1}\big)^T\right]D + \big(R_h \,|\, D u_{h+1}\big) \begin{pmatrix} |T_h| & a e_h \\ a e_h^T & \delta^2 \end{pmatrix}^{-1} \big(R_h \,|\, D u_{h+1}\big)^T, \qquad (8)$$

$$M_n^\#(a, \delta, D) \stackrel{def}{=} R_n |T_n|^{-1} R_n^T. \qquad (9)$$

Theorem 3.1 Consider any Krylov subspace method to solve the symmetric linear system (3). Suppose that Assumption 3.1 holds and the Krylov method performs $h \le n$ iterations. Let $a \in \mathbb{R}$, $\delta \ne 0$, and let the matrix $D \in \mathbb{R}^{n \times n}$ be such that $\big(R_h \,|\, u_{h+1} \,|\, D R_{n,h+1}\big)$ is nonsingular, where $R_{n,h+1} R_{n,h+1}^T = I_n - \big(R_h \,|\, u_{h+1}\big)\big(R_h \,|\, u_{h+1}\big)^T$. Then, we have the following properties:
a) the matrix $M_h^\#(a,\delta,D)$ is symmetric.
Furthermore,
- when $h < n$, for any $a \ne \pm\,\delta\,\big(e_h^T |T_h|^{-1} e_h\big)^{-1/2}$, $M_h^\#(a,\delta,D)$ is nonsingular;
- when $h = n$ the matrix $M_h^\#(a,\delta,D)$ is nonsingular;
b) the matrix $M_h^\#(a,\delta,D)$ coincides with $M_h^{-1}$ as long as either $D = I_n$ and $\delta = 1$, or $h = n$;
c) for $|a| < \delta\,\big(e_h^T |T_h|^{-1} e_h\big)^{-1/2}$ the matrix $M_h^\#(a,\delta,D)$ is positive definite. Moreover, if $D = I_n$ the spectrum $\sigma[M_h^\#(a,\delta,I_n)]$ is given by

$$\sigma\big[M_h^\#(a,\delta,I_n)\big] = \sigma\left[\begin{pmatrix} |T_h| & a e_h \\ a e_h^T & \delta^2 \end{pmatrix}^{-1}\right] \cup \sigma\big[I_{n-(h+1)}\big];$$

d) when $1 \le h < n$:
- if $D$ is nonsingular, then $M_h^\#(a,\delta,D)\,A$ has at least $h-3$ singular values equal to $1/\delta^2$;
- if $D$ is nonsingular and $a = 0$, then the matrix $M_h^\#(a,\delta,D)\,A$ has at least $h-2$ singular values equal to $1/\delta^2$;
e) when $h = n$, then $M_n^\#(a,\delta,D) = M_n^{-1}$, $\sigma[M_n] = \sigma[|T_n|]$ and $\sigma[M_n^{-1} A] = \sigma[A M_n^{-1}] \subseteq \{-1, +1\}$, i.e. the $n$ eigenvalues of the preconditioned matrix $M_n^\#(a,\delta,D)\,A$ are either $+1$ or $-1$.

Proof. See (Fasano and Roma, 2011b) for the proof. □

The corollary which follows considers the important particular case obtained by setting $a = 0$, $\delta = 1$ and $D = I_n$ in the definition of the preconditioner $M_h^\#(a,\delta,D)$.

Corollary 3.2 Consider any Krylov subspace method to solve the symmetric linear system (3). Suppose that Assumption 3.1 holds and the Krylov subspace method performs $h \le n$ iterations. Then, the preconditioner

$$M_h^\#(0,1,I_n) = I_n - \big(R_h \,|\, u_{h+1}\big)\big(R_h \,|\, u_{h+1}\big)^T + \big(R_h \,|\, u_{h+1}\big) \begin{pmatrix} |T_h|^{-1} & 0 \\ 0 & 1 \end{pmatrix} \big(R_h \,|\, u_{h+1}\big)^T, \qquad (10)$$

$$M_n^\#(0,1,I_n) = R_n |T_n|^{-1} R_n^T, \qquad (11)$$

is such that
a) the matrix $M_h^\#(0,1,I_n)$ is symmetric and nonsingular, for any $h \le n$;
b) the matrix $M_h^\#(0,1,I_n)$ coincides with $M_h^{-1}$, for any $h \le n$;
c) the matrix $M_h^\#(0,1,I_n)$ is positive definite. Moreover, its spectrum $\sigma[M_h^\#(0,1,I_n)]$ is given by

$$\sigma\big[M_h^\#(0,1,I_n)\big] = \sigma\big[|T_h|^{-1}\big] \cup \sigma\big[I_{n-h}\big];$$

d) when $h < n$, then the matrix $M_h^\#(0,1,I_n)\,A$ has at least $h-2$ singular values equal to $+1$;
e) when $h = n$, then $\sigma[M_n] = \sigma[|T_n|]$ and $\sigma[M_n^\#(0,1,I_n)\,A] = \sigma[M_n^{-1} A] = \sigma[A M_n^{-1}] \subseteq \{-1, +1\}$, i.e.
the $n$ eigenvalues of $M_n^\#(0,1,I_n)\,A$ are either $+1$ or $-1$.

Proof. The result is directly obtained from (6)-(7) and Theorem 3.1, with $a = 0$, $\delta = 1$ and $D = I_n$. □

Remark 3.4 As stated in the comments to relation (6), the matrix $R_{n,h+1}$ in the statement of Theorem 3.1 always exists, and is such that $\big(R_h \,|\, u_{h+1} \,|\, R_{n,h+1}\big)$ is orthogonal. However, $R_{n,h+1}$ is neither built nor used in (8)-(9), and it is introduced only for theoretical purposes. Furthermore, it is easy to see that since $\big(R_h \,|\, u_{h+1} \,|\, R_{n,h+1}\big)$ is orthogonal, any nonsingular diagonal matrix $D$ may be used in order to satisfy the hypotheses of Theorem 3.1.

Remark 3.5 Observe that the case $h \approx n$ in Theorem 3.1 is of scarce interest for large scale problems. Indeed, in the literature on preconditioners the values of $h$ typically do not exceed 10-20 (Morales and Nocedal, 2001; Gratton et al., 2011). Moreover, for small values of $h$ in (8), the computation of the inverse of the matrix

$$\begin{pmatrix} |T_h| & a e_h \\ a e_h^T & \delta^2 \end{pmatrix}, \qquad (12)$$

needed to provide $M_h^\#(a,\delta,I_n)$ or $M_h^\#(a,\delta,D)$, may be cheaply performed, whether $T_h$ is indefinite or positive definite. Indeed, after a brief computation we have

$$\begin{pmatrix} |T_h| & a e_h \\ a e_h^T & \delta^2 \end{pmatrix}^{-1} = \begin{pmatrix} |T_h|^{-1} + \hat{a}\, a^2\, |T_h|^{-1} e_h e_h^T |T_h|^{-1} & -\hat{a}\, a\, |T_h|^{-1} e_h \\ -\hat{a}\, a\, e_h^T |T_h|^{-1} & \hat{a} \end{pmatrix}, \qquad (13)$$

with

$$\hat{a} = \frac{1}{\delta^2 - a^2\, e_h^T |T_h|^{-1} e_h}. \qquad (14)$$

Thus, when $T_h$ is indefinite, Remark 3.3 and relation (13) provide the result. On the other hand, in case $T_h \succ 0$, it suffices to use (13). Finally, a proper setting of the parameter $a$ allows us to easily control the condition number of the matrix (12).

Figure 1. The condition number of the matrix $A$ (Cond(A)) along with the condition number of the matrix $M_h^\#(0,1,I_n)\,A$ (Cond(M^{-1}A)), when $h \in \{10, 20, 30, 40, 50, 60, 70, 80, 90\}$, and $A$ is randomly chosen with entries from the uniform distribution $U[-10, 10]$.

4.
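The computations in Remark 3.5 can be checked numerically. The sketch below (ours; a small indefinite tridiagonal matrix stands in for $T_h$, its diagonal chosen via Gershgorin discs so that it is provably indefinite and nonsingular) forms $|T_h|$ from the spectral decomposition and verifies both the $\pm 1$ eigenvalues of $|T_h|^{-1} T_h$ and our reconstruction of the block inverse (13)-(14):

```python
import numpy as np

h, a, delta = 6, 0.7, 1.0
# Indefinite, nonsingular tridiagonal stand-in for T_h (Gershgorin: |mu_i| >= 2)
d = np.array([-4., -4., -4., 4., 4., 4.])
T = np.diag(d) + np.diag(np.ones(h - 1), 1) + np.diag(np.ones(h - 1), -1)

mu, V = np.linalg.eigh(T)                  # spectral form T_h = V_h B_h V_h^T of (5)
absT = V @ np.diag(np.abs(mu)) @ V.T       # |T_h| = V_h |B_h| V_h^T

assert np.all(np.linalg.eigvalsh(absT) > 0)             # |T_h| is positive definite
signs = np.linalg.eigvals(np.linalg.solve(absT, T)).real
assert np.allclose(np.sort(signs), np.sort(np.sign(mu)))  # |T_h|^{-1} T_h has eigenvalues +-1

# Closed-form inverse (13)-(14) of the 2x2-block matrix (12)
e_h = np.zeros(h); e_h[-1] = 1.0
W = np.block([[absT, a * e_h[:, None]],
              [a * e_h[None, :], np.array([[delta ** 2]])]])
invT = np.linalg.inv(absT)
a_hat = 1.0 / (delta ** 2 - a ** 2 * (e_h @ invT @ e_h))            # relation (14)
w = invT @ e_h
W_inv = np.block([[invT + a_hat * a ** 2 * np.outer(w, w), -a_hat * a * w[:, None]],
                  [-a_hat * a * w[None, :], np.array([[a_hat]])]])  # relation (13)
assert np.allclose(W @ W_inv, np.eye(h + 1), atol=1e-10)
```

Note that (13)-(14) is the standard $2 \times 2$ block inversion with Schur complement $\delta^2 - a^2 e_h^T |T_h|^{-1} e_h$; only a tridiagonal solve and one vector $|T_h|^{-1} e_h$ are really needed, consistent with the cheapness claim of Remark 3.5.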
Preliminary numerical results

In order to preliminarily test our proposal in a general framework, without any assumption on the sparsity pattern of the matrix $A$, we used our parameter dependent class of preconditioners $M_h^\#(a,\delta,D)$, setting $\delta = 1$ and $D = I_n$. We anticipate that in our numerical experience we obtained very interesting results concerning the correspondence between theoretical and numerical results. Indeed, all the results stated in Theorem 3.1 for the singular values of the (possibly) unsymmetric matrix $M_h^\#(a,\delta,D)\,A$ seem to hold in practice also for the eigenvalues of $M_h^\#(a,\delta,D)\,A$ (it is worth recalling that, since $M_h^\#(a,\delta,D) \succ 0$, then $\sigma[M_h^\#(a,\delta,D)\,A] \equiv \sigma[M_h^\#(a,\delta,D)^{1/2}\, A\, M_h^\#(a,\delta,D)^{1/2}]$), so that $M_h^\#(a,\delta,D)\,A$ has only real eigenvalues.

Regarding the numerical investigation, we used 3 different sets of test problems. First, we considered a set of symmetric linear systems as in (3), where the number of unknowns is set as $n = 1000$, and the matrix $A$ also has a moderate condition number. We simply wanted to see how our class of preconditioners affects the condition number of $A$. In particular (see also (Geman, 1980)), a possible choice for the latter class of matrices is given by

$$A = \{a_{i,j}\}, \qquad a_{i,j} \in U[-10, 10], \qquad i, j = 1, \ldots, n, \qquad (15)$$

where $a_{i,j} = a_{j,i}$ are random entries from the uniform distribution $U[-10, 10]$, between $-10$ and $+10$. Then, the vector $b$ in (3) is also computed randomly, with entries in $U[-10, 10]$. We computed the preconditioners (8)-(9) by using the Conjugate Gradient (CG) method (Higham, 2002), which is one of the most popular Krylov subspace iterative methods to solve (3) (Golub and Van Loan, 1996). We remark that the CG is also often used in case the matrix $A$ is indefinite, though it can stop prematurely.
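The limiting case $h = n$ of item e) of Theorem 3.1 can be checked directly on matrices of type (15). In the sketch below (ours), a full spectral decomposition replaces the $n$ Krylov iterations; this yields the same matrix $M_n^\#(0,1,I_n) = R_n |T_n|^{-1} R_n^T = |A|^{-1}$, since $|T_n|$ is basis-independent, and the preconditioned matrix indeed has all eigenvalues equal to $\pm 1$:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 40
A = rng.uniform(-10, 10, (n, n)); A = (A + A.T) / 2  # indefinite symmetric, as in (15)

# Stand-in for a full (h = n) Krylov run: A = Q diag(lam) Q^T,
# so M_n^#(0,1,I_n) = R_n |T_n|^{-1} R_n^T = Q |diag(lam)|^{-1} Q^T
lam, Q = np.linalg.eigh(A)
Mhash = Q @ np.diag(1.0 / np.abs(lam)) @ Q.T

eigs = np.linalg.eigvals(Mhash @ A).real
assert np.allclose(np.sort(np.abs(eigs)), np.ones(n), atol=1e-8)   # all moduli equal 1
assert set(np.round(np.sort(eigs)).astype(int)) <= {-1, 1}         # eigenvalues in {-1,+1}
```

For $h < n$ only part of the spectrum is shifted in this way, which is exactly the clustering effect discussed around Figure 2.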
As an alternative choice, in order to satisfy Assumption 3.1 with $A$ indefinite, we can use the Lanczos (1950) process, the MINRES method (Paige and Saunders, 1975) or Planar-CG methods (Fasano, 2005). In (8) we set the parameter $h$ in the range $h \in \{20, 30, 40, 50, 60, 70, 80, 90\}$, and we preliminarily chose $a = 0$ (though other choices of the parameter $a$ yield similar results), which satisfied items a) and c) of Theorem 3.1. We generated several matrices like (15), obtaining very similar results. In particular, given one instance of $A$ as in (15), we plotted in Figure 1 the condition number $\kappa(A)$ of $A$ (Cond(A)), along with the condition number $\kappa(M_h^\#(0,1,I_n)\,A)$ of $M_h^\#(0,1,I_n)\,A$ (Cond(M^{-1}A)): in both cases the condition number was calculated by first computing the eigenvalues $\lambda_1, \ldots, \lambda_n$ (using the Matlab (2011) routine eigs()) of $A$ and $M_h^\#(0,1,I_n)\,A$ respectively, and then forming the ratio

$$\kappa = \frac{\max_i |\lambda_i|}{\min_i |\lambda_i|}.$$

Evidently, the numerical results confirm that the order of the condition number of $A$ is pretty similar to that of the condition number of $M_h^\#(0,1,I_n)\,A$. This indicates that if the preconditioners (8) are used as a tool to solve (3), then most preconditioned iterative methods which are sensitive to the condition number (e.g. the Krylov subspace methods) are, on average, not expected to perform worse with respect to the unpreconditioned case. However, it is important to remark that the spectrum $\sigma[M_h^\#(0,1,I_n)\,A]$ tends to be shifted with respect to $\sigma[A]$, inasmuch as the eigenvalues in $\sigma[A]$ whose absolute value is larger than $+1$ tend to be scaled in $\sigma[M_h^\#(0,1,I_n)\,A]$ (see Figure 2). The latter property is an appealing result, as described in Section 1, since the eigenvalues of $M_h^\#(0,1,I_n)\,A$ will be 'more clustered'. The latter phenomenon was better investigated by introducing other sets of test problems, described hereafter.

Table 1.
The adopted linesearch-based truncated Newton method.

Figure 2. Comparison between the full/detailed spectra (left/right figures) $\sigma[A]$ (Unprecond) and $\sigma[M_h^\#(0,1,I_n)\,A]$ (Precond), with $A$ randomly chosen (eigenvalues are sorted for simplicity); without loss of generality we show the results for the values $h_5 = 20$ and $h_6 = 30$. The intermediate eigenvalues in the spectrum $\sigma[M_h^\#(0,1,I_n)\,A]$ whose absolute value is larger than 1 are in general smaller than the corresponding eigenvalues in $\sigma[A]$. The eigenvalues in $\sigma[M_h^\#(0,1,I_n)\,A]$ are more clustered near $+1$ or $-1$ than those in $\sigma[A]$.

In a second experiment we generated a set of matrices $A$ such that

$$A = H D H, \qquad (16)$$

where $H \in \mathbb{R}^{n \times n}$, with $n = 500$, is a Householder transformation given by $H = I - 2vv^T$, with $v \in \mathbb{R}^n$ a randomly chosen unit vector. The matrix $D \in \mathbb{R}^{n \times n}$ is diagonal (so that its nonzero entries are also eigenvalues of $A$, while each column of $H$ is also an eigenvector of $A$). The matrix $D$ is such that $perc \cdot n$ of its eigenvalues are larger (by about one order of magnitude) than the remaining $(1 - perc) \cdot n$ eigenvalues (we set $perc = 0.3$). Finally, we again computed the preconditioners (8)-(9) by using the CG, choosing the starting point $x_0$ so that the initial residual $b - Ax_0$ was a linear combination (with randomly chosen coefficients $-1$ and $+1$) of all the $n$ eigenvectors of $A$. We strongly highlight that the latter choice of $x_0$ is expected to be unfavorable when applying the CG in order to build our preconditioners. In the latter case the CG method is indeed expected to perform exactly $n$ iterations before stopping (see also (Nocedal and Wright, 2000; Saad, 2003)), so that the preconditioners (8)-(9) may be significant for testing effectiveness in the case of small values of $h$ (broadly speaking, a small $h$ implies that the preconditioner contains correspondingly little information on the inverse matrix $A^{-1}$).
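The test matrices (16) can be generated as follows (a sketch under our illustrative choices: the "large" eigenvalues are drawn from $U[10,20]$ and the remaining ones from $U[1,2]$, roughly one order of magnitude apart as in the text):

```python
import numpy as np

rng = np.random.default_rng(4)
n, perc = 500, 0.3
v = rng.standard_normal(n); v /= np.linalg.norm(v)
H = np.eye(n) - 2.0 * np.outer(v, v)         # Householder matrix: symmetric and orthogonal

k = int(perc * n)                             # perc*n "large" eigenvalues ...
d = np.concatenate([rng.uniform(10, 20, k),   # ... about one order of magnitude above
                    rng.uniform(1, 2, n - k)])  # the remaining (1 - perc)*n ones
A = H @ np.diag(d) @ H                        # relation (16): A = H D H

# Since H^2 = I, we have A H = H D: the columns of H are eigenvectors of A,
# and the entries of D are its eigenvalues
assert np.allclose(A @ H[:, 0], d[0] * H[:, 0])
assert np.allclose(np.sort(np.linalg.eigvalsh(A)), np.sort(d))
```

This construction gives full control of the spectrum of $A$ while keeping $A$ dense and unstructured, which is exactly what is needed to study the clustering effect of the preconditioner.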
We compared the spectra $\sigma[A]$ and $\sigma[M_h^\#(0,1,I_n)\,A]$, in order to verify again how well the preconditioners (8) are able to cluster the eigenvalues of $A$. Following the guidelines in (Morales and Nocedal, 2001), in order to test our proposal also on a different range of values for the parameter $h$, we set $h \in \{4, 8, 12, 16, 20\}$. The results are given in Figure 3 (full comparison), which includes all the 500 eigenvalues, and Figure 4 (detail), which includes only the eigenvalues from the 410-th to the 450-th. Observe that our preconditioners are able to shift the largest (in absolute value) eigenvalues of $A$ towards $-1$ or $+1$, so that the clustering of the eigenvalues is enhanced as the parameter $h$ increases. For each value of $h$ the matrix $A$ is (randomly) recomputed from scratch, according to relation (16). This explains why in the five plots of Figures 3-4 the spectrum of $A$ changes. Again, a behavior very similar to that of Figures 3-4 is obtained also using different values for the parameter $a$.

Figure 3. Comparison between the full spectra $\sigma[A]$ (Unprecond) and $\sigma[M_h^\#(0,1,I_n)\,A]$ (Precond), with $A$ nonsingular and given by (16) (eigenvalues are sorted for simplicity); we used different values of $h$ ($h_1 = 4$, $h_2 = 8$, $h_3 = 12$, $h_4 = 16$, $h_5 = 20$), setting $n = 500$. The large eigenvalues in the spectrum $\sigma[M_h^\#(0,1,I_n)\,A]$ are in general smaller (in modulus) than the corresponding large eigenvalues in $\sigma[A]$. A 'flatter' piecewise-line of the eigenvalues in $\sigma[M_h^\#(0,1,I_n)\,A]$ indicates that the eigenvalues tend to cluster around $-1$ and $+1$, in accordance with the theory.

Figure 4.
Comparison between a detail of the spectra $\sigma[A]$ (Unprecond) and $\sigma[M_h^\#(0,1,I_n)\,A]$ (Precond), with $A$ nonsingular and given by (16) (eigenvalues are sorted for simplicity); we used different values of $h$ ($h_1 = 4$, $h_2 = 8$, $h_3 = 12$, $h_4 = 16$, $h_5 = 20$), setting $n = 500$. The large eigenvalues in the spectrum $\sigma[M_h^\#(0,1,I_n)\,A]$ are in general smaller (in modulus) than the corresponding large eigenvalues in $\sigma[A]$. A 'flatter' piecewise-line of the eigenvalues in $\sigma[M_h^\#(0,1,I_n)\,A]$ indicates that the eigenvalues tend to cluster around $-1$ and $+1$, in accordance with the theory.

To complete our preliminary experience, we tested our class of preconditioners in optimization frameworks. In particular, we considered the standard linesearch-based truncated Newton method in Table 1, where for any $k \ge 0$ the solution of the symmetric linear system (Newton's equation)

$$\nabla^2 f(x_k)\, d = -\nabla f(x_k)$$

is required. We considered several unconstrained optimization problems from the CUTEr collection (Gould et al., 2003), and for each problem we applied the truncated Newton method in Table 1. At the outer iteration $\bar{k}$ we computed the preconditioner $M_h^\#(0,1,I_n)$, with $h \in \{4, 8, 12, 16, 20\}$, by using the CG to solve the equation $\nabla^2 f(x_{\bar{k}})\, d = -\nabla f(x_{\bar{k}})$. Then, we adopted $M_h^\#(0,1,I_n)$ as a preconditioner for the solution of Newton's equation at the subsequent iteration,

$$\nabla^2 f(x_{\bar{k}+1})\, d = -\nabla f(x_{\bar{k}+1}).$$

The iteration index $\bar{k}$ was the first index such that both the relations

$$\frac{\|x_{\bar{k}+1} - x_{\bar{k}}\|}{\|x_{\bar{k}}\|} \le 10^{-3} \qquad \text{and} \qquad \alpha_{\bar{k}} \ge 0.95 \qquad (17)$$

hold (the first relation implies that $x_{\bar{k}+1} \approx x_{\bar{k}}$, while the second holds when the search direction $(x_{\bar{k}+1} - x_{\bar{k}})/\alpha_{\bar{k}}$ approaches Newton's step). Thus, the index $\bar{k}$ was chosen in order to have $\|x_{\bar{k}+1} - x_{\bar{k}}\|$ small, i.e. the entries of the Hessian matrices $\nabla^2 f(x_{\bar{k}})$ and $\nabla^2 f(x_{\bar{k}+1})$ are not expected to differ significantly. For simplicity, among all the optimization problems tested we report the results on just two test problems, using $n = 1000$.
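Under our reconstruction of the switching rule (17) (relative step at most $10^{-3}$ and steplength $\alpha_{\bar k} \ge 0.95$), selecting $\bar k$ amounts to scanning the outer iterates; the helper below and its toy data are ours, not from the paper:

```python
import numpy as np

def select_kbar(xs, alphas, tol=1e-3, alpha_min=0.95):
    """Return the first outer-iteration index k satisfying (17):
    ||x_{k+1} - x_k|| / ||x_k|| <= tol  and  alpha_k >= alpha_min.
    (Our reconstruction of the rule; returns None if no index qualifies.)"""
    for k in range(len(xs) - 1):
        rel_step = np.linalg.norm(xs[k + 1] - xs[k]) / np.linalg.norm(xs[k])
        if rel_step <= tol and alphas[k] >= alpha_min:
            return k
    return None

# Toy iterates: steps shrink geometrically, unit steplengths after a warm-up phase
xs = [np.ones(3) + 0.5 ** k * np.array([1.0, 0.0, 0.0]) for k in range(25)]
alphas = [0.5] * 3 + [1.0] * 22
kbar = select_kbar(xs, alphas)
```

Once $\bar k$ is found, the preconditioner built at iteration $\bar k$ is reused at iteration $\bar k + 1$, which is justified precisely because (17) guarantees the two Hessians are close.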
Very similar results were obtained for almost all the test problems.

Figure 5. The condition number of the matrix $\nabla^2 f(x_{\bar{k}+1})$ (Cond(A)) along with the condition number of the matrix $M_h^\#(0,1,I_n)\,\nabla^2 f(x_{\bar{k}+1})$ (Cond(M^{-1}A)), for the optimization problem NONCVXUN, when $1 \le h \le 17$. The condition number of $\nabla^2 f(x_{\bar{k}+1})$ is close to the condition number of $M_h^\#(0,1,I_n)\,\nabla^2 f(x_{\bar{k}+1})$, for any value of the parameter $h$. The value $\bar{k} = 175$ was chosen as in (17), and it was $\|x_{176} - x_{175}\| \approx 0.083$.

In Figures 5-6 we consider the problem NONCVXUN. For the sake of brevity we only show the numerical results using $h = 16$ in (8). Observe that since $x_{\bar{k}+1}$ is close to $x_{\bar{k}}$ (i.e. we are eventually converging to a local minimum), the Hessian matrix $\nabla^2 f(x_{\bar{k}+1})$ is positive semidefinite. Furthermore, again the eigenvalues larger than $+1$ in $\sigma[\nabla^2 f(x_{\bar{k}+1})]$ are scaled in $\sigma[M_h^\#(0,1,I_n)\,\nabla^2 f(x_{\bar{k}+1})]$. Similarly, we show in Figures 7-8 the results for the test function NONDQUAR from the CUTEr collection. The test problems in this optimization framework, where the preconditioner $M_h^\#(0,1,I_n)$ is computed at the outer iteration $\bar{k}$ and used at the outer iteration $\bar{k}+1$, confirm that the properties of Theorem 3.1 may hold also when $M_h^\#(0,1,I_n)$ is used on a sequence of linear systems $A_k x = b_k$, when $A_k$ changes slightly with $k$.

Figure 6. Comparison between the full/detailed spectra (left figure/right figure) of $\nabla^2 f(x_{\bar{k}+1})$ (Unprecond) and $M_h^\#(0,1,I_n)\,\nabla^2 f(x_{\bar{k}+1})$ (Precond), for the optimization problem NONCVXUN, with $h_4 = 16$. The eigenvalues in $\sigma[M_h^\#(0,1,I_n)\,\nabla^2 f(x_{\bar{k}+1})]$ larger than $+1$ are evidently scaled, so that $\sigma[M_h^\#(0,1,I_n)\,\nabla^2 f(x_{\bar{k}+1})]$ is more clustered.

Figure 7.
The condition number of the matrix $\nabla^2 f(x_{\bar{k}+1})$ (Cond(A)) along with the condition number of the matrix $M_h^\#(0,1,I_n)\,\nabla^2 f(x_{\bar{k}+1})$ (Cond(M^{-1}A)), for the optimization problem NONDQUAR, when $1 \le h \le 17$. The condition number of $\nabla^2 f(x_{\bar{k}+1})$ is now slightly larger than the condition number of $M_h^\#(0,1,I_n)\,\nabla^2 f(x_{\bar{k}+1})$ (though they are both $\approx 10^{10}$). The value $\bar{k} = 40$ was chosen as in (17), and it was $\|x_{41} - x_{40}\| \approx 0.203$.

5. Conclusions

We have given theoretical and numerical results for a class of parameter dependent preconditioners. The preconditioners in our proposal can be built by using any Krylov subspace method for the symmetric linear system (3), provided that it is able to satisfy the general conditions (4)-(5) in Assumption 3.1. The latter property may be appealing in several real problems, where a few iterations of the adopted Krylov subspace method may suffice to compute an effective preconditioner.

Figure 8. Comparison between the full/detailed spectra (upper figure/lower figures) $\sigma[\nabla^2 f(x_{\bar{k}+1})]$ (Unprecond) and $\sigma[M_h^\#(0,1,I_n)\,\nabla^2 f(x_{\bar{k}+1})]$ (Precond), for the optimization problem NONDQUAR, with $h_4 = 16$. Some nearly-zero eigenvalues in the spectrum $\sigma[\nabla^2 f(x_{\bar{k}+1})]$ are shifted to nonzero values in $\sigma[M_h^\#(0,1,I_n)\,\nabla^2 f(x_{\bar{k}+1})]$. Since many eigenvalues in $\sigma[\nabla^2 f(x_{\bar{k}+1})]$ are zero or nearly zero, the preconditioner $M_h^\#(0,1,I_n)$ may be of slight effect, unless large values of the parameter $h$ are considered.

Our proposal also seems tailored for those cases where a sequence of linear systems of the form $A_k x = b_k$, $k = 1, 2, \ldots$, requires a solution (e.g., see (Morales and Nocedal, 2001) for details), where $A_k$ changes slightly with the index $k$. In the latter case, the preconditioner $M_h^\#(a,\delta,D)$ in (8)-(9) can be computed by applying the Krylov subspace method to the first linear system $A_1 x = b_1$. Then, $M_h^\#(a,\delta,D)$ can be used to efficiently solve $A_k x = b_k$, with $k = 2, 3, \ldots$
Along these lines, in future work we plan to compare our proposal with the other preconditioners described in Section 2. In particular, we believe that a comparison with the proposals in (Gratton et al., 2011; Morales and Nocedal, 2000) could be noteworthy. Finally, the class of preconditioners in this paper also seems to be an interesting tool for the solution of linear systems arising in financial frameworks. In particular, in future work we want to focus on the symmetric linear systems arising when imposing the KKT conditions in portfolio selection problems with a large number of assets in the portfolio, along with linear equality constraints (see also (Al-Jeiroudi et al., 2008)).

6. Acknowledgment

The first author wishes to thank the Italian Ship Model Basin - INSEAN, CNR institute, for its support.

7. References

AL-JEIROUDI, G., GONDZIO, J. and HALL, J. 2008. Preconditioning indefinite systems in interior point methods for large scale linear optimization. Optimization Methods and Software, 23: 345–363.
BAGLAMA, J., CALVETTI, D., GOLUB, G. and REICHEL, L. 1998. Adaptively preconditioned GMRES algorithms. SIAM Journal on Scientific Computing, 20: 243–269.
BENZI, M., CULLUM, J. and TUMA, M. 2000. Robust approximate inverse preconditioning for the conjugate gradient method. SIAM Journal on Scientific Computing, 22: 1318–1332.
DENNIS, J. and SCHNABEL, R. 1983. Numerical Methods for Unconstrained Optimization and Nonlinear Equations. Prentice-Hall, Englewood Cliffs.
FASANO, G. 2005. Planar-conjugate gradient algorithm for large-scale unconstrained optimization, Part 1: Theory. Journal of Optimization Theory and Applications, 125: 523–541.
FASANO, G. and ROMA, M. 2007. Iterative computation of negative curvature directions in large scale optimization. Computational Optimization and Applications, 38: 81–104.
FASANO, G. and ROMA, M. 2009. Preconditioning Newton-Krylov methods in nonconvex large scale optimization. Submitted to Computational Optimization and Applications.
FASANO, G. and ROMA, M. 2011a. A class of preconditioners for large indefinite linear systems, as by-product of Krylov-based methods: Part 1. Technical Report n. 4, Department of Management, University Ca'Foscari, Venice, Italy.
FASANO, G. and ROMA, M. 2011b. A class of preconditioners for large indefinite linear systems, as by-product of Krylov-based methods: Part 2. Technical Report n. 5, Department of Management, University Ca'Foscari, Venice, Italy.
GEMAN, S. 1980. A limit theorem for the norm of random matrices. The Annals of Probability, 8: 252–261.
GILL, P.E., MURRAY, W., PONCELEÓN, D.B. and SAUNDERS, M.A. 1992. Preconditioners for indefinite systems arising in optimization. SIAM Journal on Matrix Analysis and Applications, 13: 292–311.
GIRAUD, L. and GRATTON, S. 2006. On the sensitivity of some spectral preconditioners. SIAM Journal on Matrix Analysis and Applications, 27: 1089–1105.
GOLUB, G. and VAN LOAN, C. 1996. Matrix Computations. The Johns Hopkins University Press, Baltimore, Third edition.
GOULD, N.I.M., ORBAN, D. and TOINT, P.L. 2003. CUTEr and SifDec: a constrained and unconstrained testing environment, revisited. ACM Transactions on Mathematical Software, 29: 373–394.
GRATTON, S., SARTENAER, A. and TSHIMANGA, J. 2011. On a class of limited memory preconditioners for large scale linear systems with multiple right-hand sides. SIAM Journal on Optimization, 21: 912–935.
HESTENES, M. 1980. Conjugate Direction Methods in Optimization. Springer Verlag, New York.
HIGHAM, N. 2002. Accuracy and Stability of Numerical Algorithms. SIAM, Philadelphia, PA, Second edition.
LANCZOS, C. 1950. An iteration method for the solution of the eigenvalue problem of linear differential and integral operators. Journal of Research of the National Bureau of Standards, 45: 255–282.
LUKSAN, L., MATONOHA, C. and VLCEK, J. 2010. Band preconditioners for the matrix free truncated Newton method. Technical Report n. 1079, Institute of Computer Science, Academy of Sciences of the Czech Republic.
MATLAB. 2011. Release 2011a, The MathWorks Inc.
MORALES, J. and NOCEDAL, J. 2000. Automatic preconditioning by limited memory quasi-Newton updating. SIAM Journal on Optimization, 10: 1079–1096.
MORALES, J. and NOCEDAL, J. 2001. Algorithm PREQN: Fortran 77 subroutines for preconditioning the conjugate gradient method. ACM Transactions on Mathematical Software, 27: 83–91.
NASH, S. 1985. Preconditioning of truncated-Newton methods. SIAM Journal on Scientific and Statistical Computing, 6: 599–616.
NASH, S. 2000. A survey of truncated-Newton methods. Journal of Computational and Applied Mathematics, 124: 45–59.
NOCEDAL, J. and WRIGHT, S. 2000. Numerical Optimization (Springer Series in Operations Research and Financial Engineering), Second edition. Springer, New York.
PAIGE, C. and SAUNDERS, M. 1975. Solution of sparse indefinite systems of linear equations. SIAM Journal on Numerical Analysis, 12: 617–629.
ROMA, M. 2005. A dynamic scaling based preconditioning for truncated Newton methods in large scale unconstrained optimization. Optimization Methods and Software, 20: 693–713.
SAAD, Y. 2003. Iterative Methods for Sparse Linear Systems. Second edition, SIAM, Philadelphia.
SIMONCINI, V. and SZYLD, D.B. 2007. Recent computational developments in Krylov subspace methods for linear systems. Numerical Linear Algebra with Applications, 14: 1–59.
STOER, J. 1983. Solution of large linear systems of equations by conjugate gradient type methods. In Mathematical Programming. The State of the Art, A. BACHEM, M. GRÖTSCHEL, and B. KORTE, eds., Berlin Heidelberg, Springer-Verlag, pp. 540–565.

Received 15 July 2011
Accepted 13 September 2011