CUBO A Mathematical Journal
Vol. 16, No. 03, (87–96). October 2014

Computing the resolvent of composite operators

Abdellatif Moudafi
L.S.I.S, Aix-Marseille Université, U.F.R Sciences, Domaine Universitaire de Saint-Jérôme, Avenue Escadrille Normandie-Niemen, 13397 MARSEILLE CEDEX 20, abdellatif.moudafi@lsis.org

ABSTRACT

Based on a very recent paper by Micchelli et al. [8], we present an algorithmic approach for computing the resolvent of composite operators: the composition of a monotone operator and a continuous linear mapping. The proposed algorithm can be used, for example, for solving problems arising in image processing and traffic equilibrium. Furthermore, our algorithm gives an alternative to the Dykstra-like method for evaluating the resolvent of the sum of two maximal monotone operators.

2010 AMS Mathematics Subject Classification: 49J53, 65K10, 49M37, 90C25.

Keywords and Phrases: maximal monotone operators, KM-algorithm, Yosida regularization, Douglas-Rachford algorithm, Dykstra-like method.

1 Introduction and preliminaries

To begin with, let us recall the following concepts, which are of common use in the context of convex and nonlinear analysis; see for example Takahashi [14] or [1]. Throughout, H is a real Hilbert space, 〈·, ·〉 denotes the associated scalar product and ‖·‖ stands for the corresponding norm.
An operator T with domain Dom(T) and range R(T) is said to be monotone if 〈u − v, x − y〉 ≥ 0 whenever u ∈ T(x), v ∈ T(y). It is said to be maximal monotone if, in addition, its graph, gph T := {(x, y) ∈ H × H : y ∈ T(x)}, is not properly contained in the graph of any other monotone operator. It is well known that for each x ∈ H and λ > 0 there is a unique z ∈ H such that

x ∈ (I + λT)z. (1.1)

The single-valued operator J^T_λ := (I + λT)^{-1} is called the resolvent of T of parameter λ. It is a nonexpansive mapping which is everywhere defined, and it is related to its Yosida approximate, namely T_λ(x) := (x − J^T_λ(x))/λ, by the relation

T_λ(x) ∈ T(J^T_λ(x)). (1.2)

The latter is Lipschitz continuous with constant 1/λ. Recall also that the inverse T^{-1} of T is the operator defined by x ∈ T^{-1}(y) if and only if y ∈ T(x).

Now, let A : H1 → H2 be a continuous linear operator with adjoint A*, and let T : H2 → H2 be a maximal monotone operator, H1 and H2 being two Hilbert spaces. It is easily checked that the composite operator A*TA is monotone. This kind of operator appears, for example, in partial differential equations in divergence form and in signal and image processing. Without further conditions, however, it may fail to be maximal monotone; for sufficient conditions see [12] or [13]. Now, let us state the two following key facts:

Fact 1: A*TA is maximal monotone, see for instance [12]-Corollary 4.4, if 0 ∈ ri(R(A) − Dom T), where ri stands for the relative interior of a set.

The Krasnoselski-Mann algorithm is a widely used method for solving fixed-point problems. Given a nonexpansive mapping Q, this algorithm generates from an arbitrary initial guess y0 ∈ C (a closed convex set) a sequence (yk) by the recursive formula

yk+1 = (1 − αk)yk + αkQ(yk), k ∈ IN,

where (αk) is a sequence in [0, 1].
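As an illustration, the KM recursion above is straightforward to implement once Q is available. The following Python sketch is not from the paper: the particular map Q (a projection onto a box), the constant relaxation parameter and the stopping rule are illustrative assumptions.

```python
import numpy as np

def krasnoselskii_mann(Q, y0, alpha=0.5, max_iter=1000, tol=1e-12):
    """KM iteration: y_{k+1} = (1 - alpha) y_k + alpha Q(y_k)."""
    y = np.asarray(y0, dtype=float)
    for _ in range(max_iter):
        y_next = (1 - alpha) * y + alpha * Q(y)
        if np.linalg.norm(y_next - y) <= tol:
            break
        y = y_next
    return y_next

# Example: Q = projection onto the box [0, 1]^2, a nonexpansive mapping
Q = lambda y: np.clip(y, 0.0, 1.0)
y_star = krasnoselskii_mann(Q, np.array([3.0, -2.0]))
# y_star is a fixed point of Q, here the point (1, 0)
```

Any fixed point of the nonexpansive map Q is a candidate limit; the constant choice αk = 1/2 satisfies the divergence condition of Fact 2 below.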
Fact 2: It is known that, for a nonexpansive mapping Q, the Krasnoselski-Mann algorithm converges weakly to a fixed point of Q provided that the sequence (αk) satisfies ∑_{k=0}^∞ αk(1 − αk) = +∞ and the underlying space is a Hilbert space; see for example [2].

Given x ∈ H1, we will focus our attention in this paper on the following problem:

Compute y := J^{A*TA}_1(x). (1.3)

Problem (1.3) was studied in the case where T = ∂φ because it arises in many applications in signal processing and image denoising; see for example [8], [9] and references therein. More precisely, if φ : H2 → IR ∪ {+∞} is a convex and lower semicontinuous function and A : H1 → H2 is a continuous linear mapping, then the composition φ ∘ A is also convex and lower semicontinuous. Furthermore, by the chain rule of convex analysis, A*∂φA ⊂ ∂(φ ∘ A), where equality holds whenever the constraint qualification 0 ∈ ri(R(A) − Dom φ) is satisfied, see for instance [12]. (Here Dom φ = {x ∈ H2 : φ(x) < +∞}, and ∂φ(x) denotes the subdifferential of φ at x, equal as usual to the set {u ∈ H2 : φ(y) ≥ φ(x) + 〈u, y − x〉 for all y ∈ H2}.) In this context (1.3) reduces to the problem of evaluating the proximity operator

prox_{φ∘A}x := argmin_u {φ ∘ A(u) + (1/2)‖u − x‖²}, (1.4)

of φ ∘ A at x. This arises, for instance, in studying the L1/TV image denoising model, which involves the non-differentiable total-variation regularization term; this non-differentiability causes algorithmic difficulties for its numerical treatment.
To overcome these difficulties, the authors of [8] formulate the total variation as the composition of a convex function (the l1-norm or the l2-norm) with a first-order difference operator, and then express the solution of the model in terms of the proximity operator of the composition, by identifying the solution as a fixed point of a nonlinear mapping expressed in terms of the proximity operator of the l1-norm or the l2-norm, each of which is explicitly given. Our aim here is to extend their analysis to evaluate the resolvent of the main operator A*TA.

To begin with, let us recall that M. Fukushima [6] proved that if AA* is an isomorphism, then the operator A*TA is maximal monotone, whereas, in the finite-dimensional setting, S. Robinson [13] observed that this condition can be avoided provided that R(A) ∩ ri(Dom T) ≠ ∅. Moreover,

J^{A*TA}_λ(x) = x − λA*u with u = (T^{-1} + λAA*)^{-1}(Ax).

However, the above formula is difficult to evaluate in practice since it involves T^{-1}. To overcome this difficulty, M. Fukushima [6] proposed an alternative computation of y = J^{A*TA}_λ(x): given x ∈ H1,

(i) find the unique solution z of 0 ∈ (1/λ)(AA*)^{-1}(z − Ax) + Tz;
(ii) compute u by u = (1/λ)(AA*)^{-1}(Ax − z);
(iii) compute y = x − λA*u.

Let us remark that the potential difficulty in implementing the above algorithm lies in the fact that it involves the inverse of the operator AA*, whose evaluation is in general expensive. In the sequel we follow another approach, developed in [8] in a convex optimization context.

2 Fixed-point approach

Under the assumption that we can explicitly compute the resolvent of T, and by assuming that A*TA is a maximal monotone operator, our aim is to develop an algorithm for evaluating the resolvent of A*TA. Remember that, by (1.1), for each x ∈ H1 there is a unique z := J^{A*TA}_1 x ∈ H1 such that x ∈ (I + A*TA)z. Our main goal is to provide a constructive method to compute it.
From (1.2), we clearly have

J^{A*TA}_1(x) ∈ x − A*T(A(J^{A*TA}_1 x)). (2.1)

This, combined with the fact that y ∈ A*TA(x) ⇔ x = J^{A*TA}_1(x + y), enables us to establish a relationship between the resolvent of A*TA and that of T. To that end, for a fixed x ∈ H1 we define the affine transformation

Fy := Ax + (I − λAA*)y, y ∈ H2,

and the operator

Q := (I − J^T_{1/λ}) ∘ F. (2.2)

Theorem 2.1. If T is a maximal monotone operator, A a continuous linear operator and λ > 0, then J^{A*TA}_1 x = x − λA*y if and only if y is a fixed point of Q.

Proof. According to (2.1), we can write J^{A*TA}_1 x = x − λA*y, where y ∈ (1/λ)T(A J^{A*TA}_1 x). Thus y ∈ (1/λ)T(A(x − λA*y)). Using, for instance, the fact that y ∈ μT(x) ⇔ x = J^T_μ(x + y), we deduce

Ax − λAA*y = J^T_{1/λ}(Ax + (I − λAA*)y),

that is, y is a fixed point of Q. Conversely, if y is a fixed point of Q, then

Ax − λAA*y = J^T_{1/λ}(Ax + (I − λAA*)y).

The definition of the resolvent yields λy ∈ T(A(x − λA*y)), thus λA*y ∈ A*TA(x − λA*y), and by the above equivalence we obtain J^{A*TA}_1 x = x − λA*y. This completes the proof.

Hence, it suffices to find a fixed point y of Q; the resolvent of A*TA at x is then equal to x − λA*y. Now, by assuming ‖I − λAA*‖ ≤ 1 and having in mind that I − J^T_{1/λ} is nonexpansive, we easily derive:

Lemma 2.2. If T is a maximal monotone operator, A a continuous linear mapping and λ > 0 such that ‖I − λAA*‖ ≤ 1, then the operator Q is nonexpansive.

To find a fixed point y∞ of the operator Q, we can use, for example, the KM-algorithm, namely

yk+1 = αkyk + (1 − αk)Q(yk), (2.3)

and then set J^{A*TA}_1 x = x − λA*y∞. Using Fact 2, we derive the following result.

Corollary 2.3. Under the assumptions of Lemma 2.2, for any αk ∈ [0, 1] satisfying ∑_k αk(1 − αk) = +∞, the sequence (yk) generated by (2.3) weakly converges to a fixed point of Q.
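To make the construction concrete, here is a small numerical sketch of Theorem 2.1 and Corollary 2.3 in Python. The choice T = I is an illustrative stand-in (its resolvent J^T_μ(z) = z/(1 + μ) is explicit), so that A*TA = A^T A and the iterative result can be checked against the direct solve (I + A^T A)^{-1}x; the random matrix, the step λ, the relaxation αk = 1/2 and the iteration count are all assumptions for the demo, not prescriptions from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))   # A : H1 = R^4 -> H2 = R^3
x = rng.standard_normal(4)

# Illustrative choice T = I (maximal monotone): J^T_mu(z) = z / (1 + mu),
# and A*TA = A^T A, whose resolvent is (I + A^T A)^{-1}.
lam = 1.0 / np.linalg.norm(A @ A.T, 2)      # ensures ||I - lam A A*|| <= 1
JT = lambda z, mu: z / (1.0 + mu)           # resolvent of T = I

def Q(y):
    Fy = A @ x + y - lam * (A @ (A.T @ y))  # F y = Ax + (I - lam A A*) y
    return Fy - JT(Fy, 1.0 / lam)           # Q = (I - J^T_{1/lam}) o F

y = np.zeros(3)
for _ in range(2000):                        # KM iteration (2.3), alpha_k = 1/2
    y = 0.5 * y + 0.5 * Q(y)

resolvent = x - lam * (A.T @ y)              # J^{A*TA}_1 x = x - lam A* y
direct = np.linalg.solve(np.eye(4) + A.T @ A, x)
# `resolvent` and `direct` agree up to the iteration tolerance
```

The only ingredients the scheme actually needs are matrix-vector products with A, A* and one resolvent of T per iteration; no inverse of AA* is formed.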
3 Applications

3.1 Image denoising

L1/TV models can be cast into the following general form:

0 ∈ λS(x) + A*TAx, (3.1)

where S and T are two maximal monotone operators and A is a continuous linear mapping. For example, given a noisy image x contaminated by impulsive noise, we consider a denoised image of x as a minimizer of the following L1/TV model:

min_u {λ‖u − x‖1 + ‖u‖TV}, (3.2)

where ‖·‖1 represents the l1-norm, ‖·‖TV denotes the total variation, and λ is the regularization parameter balancing the fidelity term ‖u − x‖1 and the regularization term ‖·‖TV. By rewriting ‖u‖TV = (φ ∘ A)(u), with φ a convex lower-semicontinuous function and A a real matrix, and by using the chain rule, we obtain the following optimality condition of (3.2):

0 ∈ λ∂‖u − x‖1 + A^t ∂φ(Au),

which is nothing else than (3.1) with S = ∂‖· − x‖1 and T = ∂φ. It is well accepted that the l1-norm fidelity term can effectively suppress the effect of outliers that may contaminate a given image, and it is therefore particularly suitable for handling non-Gaussian additive noise. The L1/TV model (3.2) has many distinctive and desirable features. For the anisotropic total variation, φ is expressed with ‖·‖1, while for the isotropic total variation, φ is expressed with ‖·‖2. Thus, we can obtain their proximity mappings in closed form. Indeed, for x ∈ IR^m and λ > 0,

prox_{(1/λ)‖·‖1}(x) = (prox_{(1/λ)|·|}(x1), ..., prox_{(1/λ)|·|}(xm)) with prox_{(1/λ)|·|}(xi) = max(|xi| − 1/λ, 0) sign(xi),

while

prox_{(1/λ)‖·‖2}(x) = max(‖x‖2 − 1/λ, 0) x/‖x‖2.

Since in Section 2 we developed a method for computing the resolvent of the operator A*TA, to solve (3.1) we can make use of any splitting algorithm for finding a zero of the sum of two maximal monotone operators, for instance that of Passty, which consists in the Picard iteration for the composition of the resolvents of λS and A*TA. But it is well known that in that case we only have ergodic convergence.
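The two closed-form proximity mappings above translate directly into code. A minimal NumPy version (the threshold t plays the role of 1/λ; the test vectors are arbitrary examples):

```python
import numpy as np

def prox_l1(x, t):
    """Componentwise soft-thresholding: prox of t*||.||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def prox_l2(x, t):
    """Block soft-thresholding: prox of t*||.||_2."""
    nx = np.linalg.norm(x)
    return np.maximum(nx - t, 0.0) * (x / nx) if nx > 0 else x

p1 = prox_l1(np.array([2.0, -0.3, 1.0]), 0.5)   # shrinks each entry toward 0
p2 = prox_l2(np.array([3.0, 4.0]), 1.0)          # shrinks the norm from 5 to 4
```

The anisotropic total variation uses the first mapping on the discrete gradient, the isotropic one uses the second, applied blockwise.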
The algorithm of common use in this type of context is the so-called Douglas-Rachford (DR) algorithm, proposed in its initial form by Lions and Mercier [7], which here takes the following form:

xk = J^{A*TA}_γ(yk) (computed by an inner loop which uses the method introduced in Section 2);
yk+1 = yk + αk(J^S_{λγ}(2xk − yk) − xk). (3.3)

Thanks to the well-known convergence result for the DR-algorithm, we derive that the sequence (yk) converges weakly to a fixed point y∞ of (2J^{A*TA}_γ − I) ∘ (2J^S_{λγ} − I), and J^S_{λγ}y∞ solves (3.1), provided that ∑_k αk(2 − αk) = +∞ and αk < 2. Moreover, if dim H < +∞, then the sequence (xk) converges to a solution of (3.1).

3.2 Resolvent of a sum of two operators

Let x ∈ H, and let T1 and T2 be two maximal monotone operators from H to H. To compute the resolvent of the sum of T1 and T2 at x, Bauschke and Combettes [4] proposed the following Dykstra-like method: set

x0 = x, p0 = 0, q0 = 0,

and

yk = J^{T2}_1(xk + pk); pk+1 = xk + pk − yk,
xk+1 = J^{T1}_1(yk + qk); qk+1 = yk + qk − xk+1. (3.4)

They proved that if x ∈ R(Id + T1 + T2), then both sequences (xk) and (yk) converge strongly to J^{T1+T2}_1 x.

Our aim is to propose an alternative approach for computing the resolvent of a monotone operator which can be decomposed as the sum of two maximal monotone operators whose individual resolvents can be implemented easily. Both methods we present proceed by splitting in the sense that, at each iteration, they employ these resolvents separately.

First, note that if T1 and T2 are set-valued mappings from H to H, their pointwise sum can be expressed in the composite form A*TA by defining H1 = H, H2 = H × H, Ax = (x, x) and T(x1, x2) = T1(x1) × T2(x2). Indeed, then A*(y1, y2) = y1 + y2, and so A*TA(x) = T1(x) + T2(x).
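The Dykstra-like scheme (3.4) is easy to sketch numerically. In the Python example below, the choice of T1, T2 as subdifferentials of indicator functions (so that the resolvents are projections and J^{T1+T2}_1 becomes the projection onto the intersection of the two sets) and the iteration count are illustrative assumptions.

```python
import numpy as np

def dykstra_like(x, J1, J2, n_iter=100):
    """Dykstra-like method (3.4); J1, J2 are the resolvents J^{T1}_1, J^{T2}_1."""
    xk = np.asarray(x, dtype=float)
    p = np.zeros_like(xk)
    q = np.zeros_like(xk)
    for _ in range(n_iter):
        y = J2(xk + p)           # y_k  = J^{T2}_1(x_k + p_k)
        p = xk + p - y           # p_{k+1} = x_k + p_k - y_k
        xk = J1(y + q)           # x_{k+1} = J^{T1}_1(y_k + q_k)
        q = y + q - xk           # q_{k+1} = y_k + q_k - x_{k+1}
    return xk

# T1 = normal cone of the orthant, T2 = normal cone of the unit ball,
# so the resolvents are the corresponding projections:
P_orthant = lambda v: np.maximum(v, 0.0)
P_ball = lambda v: v / max(np.linalg.norm(v), 1.0)
z = dykstra_like(np.array([0.5, -2.0]), P_orthant, P_ball)
# z is the projection of (0.5, -2) onto (orthant ∩ ball), i.e. (0.5, 0)
```

Note that each pass uses the two resolvents separately, which is the splitting structure emphasized in the text.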
The product-space reformulation above allows us to give alternative algorithms for computing the resolvent of the sum of two maximal monotone operators, relying on the fixed-point approach developed in Section 2. It is easily seen that finding a fixed point y = (y1, y2) of the operator Q defined by (2.2) amounts to solving the following system:

y1 = (1/λ)(I − J^{T1}_1)(x − λy2);
y2 = (1/λ)(I − J^{T2}_1)(x − λy1). (3.5)

Note that the operator

Q̃(y1, y2) := ((1/λ)(I − J^{T1}_1)(x − λy2), (1/λ)(I − J^{T2}_1)(x − λy1))

is nonexpansive for all λ > 0. Thus, we can use the algorithm

(y1^{k+1}, y2^{k+1}) = αk(y1^k, y2^k) + (1 − αk)Q̃(y1^k, y2^k) (3.6)

to find a fixed point (y1∞, y2∞), from which we deduce the resolvent of the sum of T1 and T2. Indeed, we have the following result:

Proposition 3.1. Let T1, T2 be two maximal monotone operators. Then, for any αk ∈ [0, 1] satisfying ∑_k αk(1 − αk) = +∞, the sequence (y1^k, y2^k) generated by (3.6) weakly converges to a fixed point (y1∞, y2∞). Furthermore, we have

J^{T1+T2}_1(x) = x − λA*(y1∞, y2∞) = x − λ(y1∞ + y2∞).

We would like to emphasize that we can also use a Von Neumann-like alternating algorithm for solving system (3.5). Indeed, the latter is equivalent to

λy1 = (I − J^{T1}_1)(x − (I − J^{T2}_1)(x − λy1));
λy2 = (I − J^{T2}_1)(x − λy1). (3.7)

By defining Ã(y) := −A(−y) for a given operator A, a simple computation gives J^Ã_1(y) = −J^A_1(−y) and J^{A(·−x)}_1(y) = x + J^A_1(y − x). Hence, by setting A := T1^{-1}, B := T2^{-1}, v1 := λy1 and v2 := λy2, and according to the fact that I − J^A_1 = J^{A^{-1}}_1, we finally obtain

v1 = J^A_1 ∘ J^{B̃(·−x)}_1(v1);
v2 = J^B_1 ∘ J^{Ã(·−x)}_1(v2). (3.8)

This suggests to consider the following alternating resolvent method:

v1^0 ∈ H and ṽ1^k = J^{B̃(·−x)}_1(v1^k), v1^{k+1} = J^A_1(ṽ1^k);
v2^0 ∈ H and ṽ2^k = J^{Ã(·−x)}_1(v2^k), v2^{k+1} = J^B_1(ṽ2^k). (3.9)

From [3]-Theorem 3.3, we deduce:

Proposition 3.2.
Let T1, T2 be two maximal monotone operators. Then the sequence (v1^k, v2^k) generated by (3.9) weakly converges to a solution (v1∞, v2∞) of (3.8), provided that the latter exists. Moreover, the resolvent of the sum of T1 and T2 at x is then given by

J^{T1+T2}_1(x) = x − (v1∞ + v2∞) = x − λ(y1∞ + y2∞).

To conclude, it is worth mentioning that, in the case of convex optimization, the problem under consideration in this section amounts to evaluating the proximity operator of the sum of two proper, lower semicontinuous convex functions φ, ψ : H → IR ∪ {+∞}, namely:

given x, compute y := prox_{φ+ψ}(x). (3.10)

In this context, the resolvent operator is nothing else than the proximity operator, and according to the fact that I − prox_φ = prox_{φ*}, system (3.5) reduces to

y1 = (1/λ)(I − prox_φ)(x − λy2) = (1/λ)prox_{φ*}(x − λy2);
y2 = (1/λ)(I − prox_ψ)(x − λy1) = (1/λ)prox_{ψ*}(x − λy1), (3.11)

where φ*, ψ* stand for the Fenchel conjugates of the functions φ and ψ. Remember that the Fenchel conjugate of a given function f is defined as f*(x) = sup_u {〈x, u〉 − f(u)}.

Finally, we would like to emphasize that we can also consider as an application the traffic equilibrium problem considered in [6]. The advantage of our approach is that it does not require the inversion of the operator AA*. Roughly speaking, see [6], the traffic equilibrium problem can be written as (3.1) with λ = 1 and S = ∂δC, namely

0 ∈ A*TAx + ∂δC(x), (3.12)

where ∂δC is the subdifferential of the indicator function of a polyhedral convex set C, T is a cost function (not assumed to be single-valued as in [6], but we suppose that its resolvent can be computed efficiently) and A satisfies

AA* = νI for ν ∈ IN. (3.13)

The latter condition is also satisfied in image restoration problems by the so-called tight-frame operators.
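Under the condition AA* = νI, the method of Section 2 yields the closed form J^{A*TA}_1(x) = x − A*T_ν(Ax), where T_ν is the Yosida approximate of T. A minimal Python check of this formula; the 2×2 matrix with AA^T = 2I and the choice T = ∂‖·‖1 (whose resolvent J^T_ν is soft-thresholding) are illustrative assumptions.

```python
import numpy as np

# Illustrative operator with A A^T = nu * I (here nu = 2)
A = np.array([[1.0, 1.0],
              [1.0, -1.0]])
nu = 2.0

# T = subdifferential of the l1-norm: J^T_nu is soft-thresholding with
# threshold nu, and T_nu = (I - J^T_nu)/nu is its Yosida approximate.
soft = lambda z, t: np.sign(z) * np.maximum(np.abs(z) - t, 0.0)
yosida = lambda z: (z - soft(z, nu)) / nu

x = np.array([3.0, 1.0])
resolvent = x - A.T @ yosida(A @ x)   # J^{A*TA}_1(x) = x - A* T_nu(A x)
# here A*TA = subdifferential of u -> ||Au||_1, so this is prox of that map
```

The point of the closed form is that no iteration is needed at all: one application of A, one soft-thresholding and one application of A* suffice.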
It is well known that in this case the resolvent of ∂δC is exactly the projection onto C (algorithms for computing this projection which do not require any particular hypothesis on the input polyhedral set are available), and a simple calculation shows that the method developed in Section 2 gives

J^{A*TA}_1(x) = x − A*T_ν(Ax).

The latter formula is in fact valid for any positive real number ν; it generalizes a formula by Combettes and Pesquet [5], obtained in the case where T is the subdifferential of a proper lower-semicontinuous function, which was used in many applications in image restoration. More precisely, relying on a qualification condition and taking into account (1.4), we clearly have

prox_{φ∘A}x = x − A*(∂φ)_ν(Ax) = x − ν^{-1}A*(I − prox_{νφ})(Ax).

Now, an application of Passty's algorithm gives

xk+1 = PC(xk − (1/ν)A*(I − J^T_ν)Axk), (3.14)

which is nothing else than an extension of the celebrated CQ-algorithm of Byrne, see [10]. [10]-Theorem 3.1 assures:

Proposition 3.3. Let T be a given maximal monotone operator and C a nonempty closed convex set. The sequence (xk) generated by (3.14) converges to a solution of the traffic equilibrium problem (3.12) if ν > L, with L being the spectral radius of the operator A*A.

Applying the Douglas-Rachford algorithm instead yields

xk = yk − (1/ν)A*(I − J^T_ν)Ayk;
yk+1 = (1 − αk)yk + αkPC(yk − (2/ν)A*(I − J^T_ν)Ayk), (3.15)

which looks like a relaxed version of algorithm (3.14). Since we are in a finite-dimensional setting, we obtain:

Proposition 3.4. Let T be a given maximal monotone operator and C a nonempty closed convex set. Then, for any αk ∈ ]0, 2[ satisfying ∑_k αk(2 − αk) = +∞, the sequence (xk) generated by (3.15) converges to a solution of the equilibrium problem (3.12).

Received: April 2014. Accepted: August 2014.

References

[1] H. Attouch, A. Moudafi, H.
Riahi, Quantitative stability analysis for maximal monotone operators and semi-groups of contractions, Nonlinear Analysis: Theory, Methods & Applications 21 (1993), 697-723.

[2] H. H. Bauschke and P. L. Combettes, A weak-to-strong convergence principle for Fejér-monotone methods in Hilbert spaces, Mathematics of Operations Research 26, no. 2 (2001), 248-264.

[3] H. H. Bauschke, P. L. Combettes and S. Reich, The asymptotic behavior of the composition of two resolvents, Nonlinear Analysis: Theory, Methods & Applications 60 (2005), 283-301.

[4] H. H. Bauschke and P. L. Combettes, A Dykstra-like algorithm for two monotone operators, Pacific Journal of Optimization 4 (2008), 383-391.

[5] P. L. Combettes and J.-C. Pesquet, A Douglas-Rachford splitting approach to nonsmooth convex variational signal recovery, IEEE J. Selected Topics Signal Process. 1 (2007), 564-574.

[6] M. Fukushima, The primal Douglas-Rachford splitting algorithm for a class of monotone operators with application to the traffic equilibrium problem, Mathematical Programming 72 (1996), 1-15.

[7] P.-L. Lions and B. Mercier, Splitting algorithms for the sum of two nonlinear operators, SIAM J. Numer. Anal. 16 (1979), 964-979.

[8] Ch. A. Micchelli, L. Shen and Y. Xu, Proximity algorithms for image models: denoising, Inverse Problems 27 (2011), doi:10.1088/0266-5611/27/4/045009.

[9] Ch. A. Micchelli, L. Shen, Y. Xu and X. Zeng, Proximity algorithms for the L1/TV image denoising model, Adv. Comput. Math. 38 (2013), 401-426.

[10] A. Moudafi and M. Théra, Finding a zero of the sum of two maximal monotone operators, Journal of Optimization Theory & Applications 94, no. 2 (1997), 425-448.

[11] A. Moudafi, Split monotone variational inclusions, Journal of Optimization Theory and Applications 150 (2011), 275-283.

[12] T. Pennanen, Dualization of generalized equations of maximal monotone type, SIAM J. Optim. 10 (2000), 809-835.

[13] S. M.
Robinson, Composition duality and maximal monotonicity, Math. Programming Ser. A 85 (1999), 1-13.

[14] W. Takahashi, Nonlinear Functional Analysis, Yokohama Publishers, Yokohama, 2000.