J. Nig. Soc. Phys. Sci. 4 (2022) 146–156
Journal of the Nigerian Society of Physical Sciences

Modified Gradient Flow Method for Solving a One-Dimensional Optimal Control Problem Governed by a Linear Equality Constraint

Olusegun Olotu (a), Charles Aladesaye (b), Kazeem Adebowale Dawodu (a,*)
(a) Department of Mathematical Sciences, The Federal University of Technology, Akure, Nigeria
(b) Department of Mathematics, School of Science, College of Education, Ikere-Ekiti, Ekiti State, Nigeria
*Corresponding author, tel. no: +2348032541380; email address: dawodukazeem@futa.edu.ng (Kazeem Adebowale Dawodu)

Abstract
This study presents a computational technique for solving linearly constrained optimal control problems using the Gradient Flow Method. The proposed method, called the Modified Gradient Flow Method (MGFM), is based on the continuous gradient flow reformulation of the constrained optimization problem together with a three-level implicit time discretization scheme. The three-level splitting parameters of the discretization of the gradient flow equations are chosen so that they sum to one (θ1 + θ2 + θ3 = 1). The linear and quadratic convergence of the scheme were analyzed: the scheme is first order when each parameter lies in the interval [0, 1] and second order when the third parameter equals one. Numerical experiments were carried out, and the results show that the approach is very effective for handling this class of constrained optimal control problems. It also compared favorably with the analytical solutions and performed better than existing schemes in terms of convergence and accuracy.

DOI: 10.46481/jnsps.2022.589
Keywords: Optimal control, gradient flow, three-level splitting parameters, discretization scheme, linear and quadratic convergence

Article History:
Received: 15 January 2022
Received in revised form: 23 February 2022
Accepted for publication: 23 February 2022
Published: 28 February 2022

© 2022 Journal of the Nigerian Society of Physical Sciences. All rights reserved.
Communicated by: J. Ndam

1. Introduction

Many authors in the literature have attempted the class of linearly constrained optimal control problems using the classical analytical techniques of the Hamilton-Jacobi-Bellman (HJB) equation or the "Pontryagin Maximum Principle" of Pontryagin et al. [14]; this approach, however, may prove difficult in some cases, for instance when non-differentiable delay terms are included. The numerical approach has also been fully handled by several authors, such as Betts [4], Olotu and Dawodu [11, 12], Adekunle [1] and Akeremale [2], using direct numerical methods. This approach deploys the "first discretize, then optimize" technique, which is known for its amenability to well-structured discretization schemes and optimization solvers once the control problem has been given an unconstrained formulation via any of the penalty function methods or methods of multipliers. The direct method of nonlinear programming formulation tends to eradicate the problem of ill-conditioning associated with control problems with delay dynamical systems. However, this paper seeks a better approach to the formulation of the control operators in the optimization of the discretized optimal control problem.
The gradient flow method was first applied to nonlinear programming problems by Evtushenko [5] and later improved by Evtushenko and Zhadan [6]; the application to unconstrained problems is due to Behrman [3], and the unified approach to nonlinear optimization problems is due to Wang et al. [15]. The unified gradient flow algorithm exhibits linear convergence for θ ∈ [0, 1) and quadratic convergence at θ = 1. As an improvement over [15], a Modified Gradient Flow Method (MGFM) is proposed here, using the gradient formula for the Lagrange-multiplier formulation. The development of this method requires a continuous gradient flow approach based on a three-level implicit time discretization scheme with splitting parameters θ1, θ2 and θ3. The convergence of the solution of the modified discretized gradient flow equation to a local minimum of the original problem, either linearly or quadratically depending on the choice of θ, is also demonstrated. The technique can, moreover, be adapted to handle inequality-constrained problems by introducing slack variables. Test problems were solved and their numerical results presented in comparison with existing methods, to demonstrate the effectiveness of the method for constrained nonlinear programming problems with either continuous variables or a mixture of continuous and discrete variables.

The structure of this paper is as follows: Section 3 presents the gradient flow approach for unconstrained optimization and the modified gradient flow algorithm for the constrained problem stated in Section 2, using the three-level splitting parameters, together with the convergence analysis. The algorithm is tested on numerical examples and compared with the Conjugate Gradient Method (CGM) in Section 4.

2. Statement of problem

The general form of the one-dimensional optimal control problem is given as

Minimize J(x, u) = ∫_{t0}^{T} F(t, x(t), u(t)) dt,   (1)

subject to

ẋ(t) = f(t, x(t), u(t)), t ∈ [t0, T],   (2)
x(t0) = x0,   (3)

where x ∈ C1([t0, T], R), u ∈ L2([t0, T], R), F : [t0, T] × R × R → R and f : [t0, T] × R × R → R. Thus x(t) is the state variable, which describes the state of the system at any point in time, and u(t) is the control variable of the optimization problem. For the linear-quadratic problem, the objective integrand is F(t, x(t), u(t)) = p x²(t) + q u²(t), while the linear differential constraint is f(t, x(t), u(t)) = a x(t) + b u(t), where p, q, a, b ∈ R and p, q > 0.

3. Research Methodology

This paper considers both the theoretical and the numerical analysis of the problem in developing the algorithm for solving eqns. (1) to (3). The discrete forms of the quadratic objective functional and of the linear differential constraint are constructed using the trapezoidal rule and Euler's method, respectively. Recurrence relations for both the objective functional and the constraint are developed to help construct their respective matrix operators. Furthermore, the symmetry and positive-definiteness properties of the matrix operators are analyzed to guarantee their consistency and invertibility in the developed MGFM.

3.1. Preamble

Let f : X → R be a functional for which a minimizer is sought.
The gradient flow method, with an initial point x(0) ∈ X, seeks a minimizer of f through the curve x(t) defined by the differential equation

dx(t)/dt = −∇f(x(t)), x(0) = x0,   (4)

where ∇f is the gradient of f and the solution x(t) is called the integral curve; at each instant it proceeds in the direction of steepest descent of f. In an unconstrained optimization problem the equilibrium points of the gradient flow are the zeros of ∇f, which are exactly the minimizers of f. The gradient flow solves the problem min f(x), x(0) = x0, in the sense that along every trajectory x(t) of the gradient flow f(x(t)) → p*, where p* is the minimum value of f. For clarity, the derivative x′(t) is discretized using the forward Euler step, which yields

x_{k+1} := x_k − h∇f(x_k),   (5)

or the backward discretization scheme, which yields

x_{k+1} := x_k − h∇f(x_{k+1}).   (6)

The backward Euler method applied to the numerical integration of the gradient flow differential equation is also known as the proximal minimization method, expressed as x_{k+1} := Prox_{h,f}(x_k). The proximal gradient algorithm can likewise be interpreted as the gradient flow

dx(t)/dt = −∇f(x(t)) − ∇g(x(t)),   (7)

where g is differentiable. Using the forward-backward integration of the gradient flow, eqn. (7) can be expressed as

x_{k+1} = x_k − h∇f(x_k) − h∇g(x_{k+1}),
x_{k+1} := (I + h∇g)^{-1}(I − h∇f) x_k,

where the forward-backward splitting above numerically integrates the gradient flow differential equation. A time-stepping discretization combining eqns. (5) and (6) is expressed as

x_{k+1} := x_k − h [ (1 − θ)∇f(x_k) + θ∇f(x_{k+1}) ],   (8)

where θ ∈ [0, 1] is a parameter. When θ = 0, this discretization is the explicit (forward) Euler scheme; when θ = 1, it is the implicit backward Euler scheme.

3.2. Discretization of the Objective function

For the discretization of the objective function in eqn. (1), the trapezoidal rule is used:

∫_{t0}^{T} f(x) dx ≈ (h/2) Σ_{i=0}^{N−1} [ f(x_i) + f(x_{i+1}) ].   (9)

Applying eqn. (9) to eqn. (1) yields

J(x, u) ≈ (h/2) p Σ_{i=0}^{N−1} (x_i² + x_{i+1}²) + (h/2) q Σ_{i=0}^{N−1} (u_i² + u_{i+1}²),   (10)

where, for t_i = t0 + ih, x(t_i) ≈ x(t0 + ih) = x_i and u(t_i) ≈ u(t0 + ih) = u_i for i = 0, 1, ..., N, with N = (T − t0)/h. Re-arranging eqn. (10) into matrix form yields

Minimize J(Z) = (1/2) Z^T M Z + C,   (11)

where Z = (x_1, x_2, ..., x_N, u_0, u_1, ..., u_N)^T ∈ R^{2N+1}, C = (1/2) h p x_0² ∈ R, and M ∈ R^{(2N+1)×(2N+1)} is a diagonal matrix operator with entries m_{ij} given by

m_{ij} = 2hp for 1 ≤ i ≤ N − 1, j = i;
m_{ij} = hp  for i = N, j = i;
m_{ij} = hq  for i = N + 1 and i = 2N + 1, j = i;
m_{ij} = 2hq for N + 2 ≤ i ≤ 2N, j = i;
m_{ij} = 0 elsewhere.   (12)

3.3. Discretization of the Constraint function

The differential constraint ẋ(t) = f(t, x(t), u(t)) = a x(t) + b u(t) is discretized by substituting it into Euler's scheme x_{i+1} = x_i + h ẋ, with t_i = t0 + ih, to arrive at

x_{i+1} = c x_i + d u_i,   (13)

where c = 1 + ha, d = hb and i = 0, 1, 2, ..., N − 1. Expansion and re-arrangement of eqn. (13) into matrix form yields GZ = k, expressed as

g(Z) = GZ − k = 0,   (14)

where Z is of dimension (2N + 1) × 1 and G is of dimension N × (2N + 1):

G = [  1    0   ···   0   −d    0   ···   0    0 ]
    [ −c    1   ···   0    0   −d   ···   0    0 ]
    [  ⋮    ⋱    ⋱    ⋮    ⋮         ⋱         ⋮ ]
    [  0   ···  −c    1    0   ···   0   −d    0 ],        k = (c x_0, 0, ..., 0)^T,

with the entries defined as follows: g_{ij} = 1 for 1 ≤ i ≤ N and j = i; g_{ij} = −d for 1 ≤ i ≤ N and j = N + i; g_{ij} = −c for 2 ≤ i ≤ N and j = i − 1; and g_{ij} = 0 otherwise.
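To make the assembly of the discrete operators concrete, the following minimal Python/NumPy sketch (not part of the original paper; the function name build_discretized_lq and its argument list are illustrative choices) builds M, C, G and k of eqns. (11)-(14) for the linear-quadratic problem of Section 2.

import numpy as np

def build_discretized_lq(p, q, a, b, x0, t0, T, N):
    # Assemble M, C, G, k of eqns. (11)-(14) for the LQ problem of Section 2,
    # with Z = (x_1, ..., x_N, u_0, ..., u_N) of dimension 2N + 1.
    h = (T - t0) / N
    c, d = 1.0 + h * a, h * b

    m = np.zeros(2 * N + 1)           # diagonal of M, eqn. (12)
    m[0:N - 1] = 2.0 * h * p          # x_1, ..., x_{N-1}
    m[N - 1] = h * p                  # x_N
    m[N] = h * q                      # u_0
    m[N + 1:2 * N] = 2.0 * h * q      # u_1, ..., u_{N-1}
    m[2 * N] = h * q                  # u_N
    M = np.diag(m)
    C = 0.5 * h * p * x0 ** 2         # constant term contributed by the fixed x_0

    G = np.zeros((N, 2 * N + 1))      # constraint operator of eqn. (14)
    k = np.zeros(N)
    for i in range(N):                # row i encodes x_{i+1} = c x_i + d u_i
        G[i, i] = 1.0
        if i > 0:
            G[i, i - 1] = -c
        G[i, N + i] = -d
    k[0] = c * x0
    return M, C, G, k, h

For instance, build_discretized_lq(1.0, 1.0, 2.0, 5.0, 1.0, 0.0, 1.0, 10) would reproduce the operators used for Example 1 of Section 4 with h = 0.1.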
Hence, eqns. (11) and (14) give the quadratic representation of the constrained minimization problem as

Minimize J(Z) = (1/2) Z^T M Z + C, subject to GZ − k = 0.   (15)

3.4. Lagrangian formulation of the Constrained problem

The unconstrained form of the optimal control problem can be obtained by applying the Lagrangian function to eqn. (15):

L(Z, λ) = (1/2) Z^T M Z + C + λ^T (GZ − k),   (16)
L(Z, λ) = J(Z) + λ^T g(Z),   (17)

where λ = (λ_1, λ_2, ..., λ_N)^T ∈ R^N is the Lagrange multiplier. The pair (Z*, λ*) is said to be a stationary pair of eqn. (17) if the following first-order necessary optimality conditions hold:

L_Z(Z*, λ*) = J_Z(Z*) + g_Z^T(Z*) λ* = 0,   (18)
L_λ(Z*, λ*) = g(Z*) = 0,   (19)

where

L_Z(Z, λ) = ( ∂L(Z,λ)/∂x_1, ..., ∂L(Z,λ)/∂u_N )^T,
L_λ(Z, λ) = ( ∂L(Z,λ)/∂λ_1, ..., ∂L(Z,λ)/∂λ_N )^T,
J_Z(Z) = ( ∂J(Z)/∂x_1, ..., ∂J(Z)/∂u_N )^T

denote the gradient vectors of L and J, with dimensions (2N + 1) × 1, N × 1 and (2N + 1) × 1, respectively. Similarly, the Jacobian of g, of dimension N × (2N + 1), is

g_Z(Z) = [ ∂g_i(Z)/∂Z_j ], i = 1, ..., N, whose i-th row contains ∂g_i(Z)/∂x_1, ..., ∂g_i(Z)/∂u_N.   (20)

In order to fulfil the above optimality conditions, the continuous gradient flow reformulation with a given initial condition Z_0 is derived as follows:

dZ/dt = −L_Z(Z(t), λ(Z(t))), Z(0) = Z_0,   (21)
dZ/dt = ϕ(Z).   (22)

Substituting eqn. (18) into (21) yields

dZ/dt = −[ J_Z(Z(t)) + g_Z^T(Z(t)) λ0(Z(t)) ], Z(0) = Z_0,   (23)

where λ0(Z) is chosen to satisfy

L_Z(Z, λ0(Z)) = τ g_Z^T(Z) g(Z)   (24)

in the least-squares sense, with τ > 0 a parameter. Substituting eqn. (18) into (24) yields

J_Z(Z) + g_Z^T(Z) λ0(Z) = τ g_Z^T(Z) g(Z).   (25)

Obtaining the least-squares solution for λ0(Z) in eqn. (25) yields

g_Z^T(Z) λ0(Z) = τ g_Z^T(Z) g(Z) − J_Z(Z),   (26)
λ0(Z) = [ g_Z^T(Z) ]† { τ g_Z^T(Z) g(Z) − J_Z(Z) },   (27)

where the superscript '†' denotes the pseudo-inverse of g_Z^T(Z), as in Layton [9].

Theorem 3.1. The constraint function g(Z) of the discretized optimal control problem satisfies the linear independence constraint qualification (LICQ) at any feasible point Z*. The LICQ states that the gradients of the active inequality constraints and of the equality constraints are linearly independent at Z*.

Proof: LICQ holds if the constraint gradients g_{1Z}(Z), g_{2Z}(Z), ..., g_{NZ}(Z) are linearly independent at any feasible point Z*, i.e.

α_1 g_{1Z}(Z) + α_2 g_{2Z}(Z) + ... + α_N g_{NZ}(Z) = 0   (28)

implies α_i = 0 for all i. Hence the dimension of the row space of G must be N. Applying the row-reduction process to G of Section 3.3, we obtain Ḡ* = [ I | V ], written as

Ḡ* = [ 1   0   ···  0   −d           0      ···    0    0 ]
     [ 0   1   ···  0   −cd         −d      ···    0    0 ]
     [ ⋮        ⋱   ⋮    ⋮                   ⋱           ⋮ ]
     [ 0   0   ···  1   −c^{N−1}d   ···    −cd    −d    0 ],   (29)

where I ∈ R^{N×N} and V ∈ R^{N×(N+1)}. The entries ḡ*_{ij} of Ḡ* are

ḡ*_{ij} = 1 for 1 ≤ i ≤ N, j = i;
ḡ*_{i, N+i−k} = −c^k d for k = 0, 1, ..., i − 1 and 1 ≤ i ≤ N;
ḡ*_{ij} = 0 elsewhere.   (30)

Thus the rank of Ḡ* equals the rank of G, namely N, so α_i = 0 are the only parameters satisfying eqn. (28); the gradients g_{1Z}(Z), ..., g_{NZ}(Z) are therefore linearly independent at any feasible point Z*, and g(Z) satisfies the LICQ.
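As a small numerical illustration of eqn. (27) and of Theorem 3.1 (again an assumed sketch rather than the authors' code), the multiplier λ0(Z) and the LICQ rank condition can be evaluated with NumPy, using the facts that J_Z(Z) = MZ and g_Z(Z) = G for the quadratic problem (15):

import numpy as np

def licq_holds(G, tol=1e-10):
    # Theorem 3.1: the LICQ requires G to have full row rank N.
    return np.linalg.matrix_rank(G, tol=tol) == G.shape[0]

def multiplier_lambda0(M, G, k, Z, tau):
    # Least-squares multiplier of eqn. (27):
    #   lambda0(Z) = (g_Z^T)^+ [ tau * g_Z^T g(Z) - J_Z(Z) ],
    # using J_Z(Z) = M Z and g_Z(Z) = G for the quadratic problem (15).
    JZ = M @ Z                     # gradient of the discretized objective
    g = G @ Z - k                  # equality-constraint residual g(Z)
    return np.linalg.pinv(G.T) @ (tau * (G.T @ g) - JZ)

When licq_holds(G) is true, the pseudo-inverse satisfies (G^T)† = (G G^T)^{-1} G, which is exactly the identity exploited in the derivation that follows.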
From Theorem 3.1 we have, for any feasible point Z*, that Ḡ* ∈ R^{N×(2N+1)} gives rise to the N × N non-singular (invertible) Gram matrix

G* = g_Z(Z) g_Z^T(Z).   (31)

Multiplying both sides of eqn. (25) by g_Z(Z) gives

g_Z(Z) [ J_Z(Z) + g_Z^T(Z) λ0(Z) ] = τ g_Z(Z) g_Z^T(Z) g(Z).   (32)

Therefore, with G*(Z) = g_Z(Z) g_Z^T(Z), eqn. (32) becomes g_Z(Z) J_Z(Z) + G*(Z) λ0(Z) = τ G* g(Z), which simplifies to

G* λ0(Z) = τ G* g(Z) − g_Z(Z) J_Z(Z).   (33)

Pre-multiplying eqn. (33) by G*^{-1} yields

G*^{-1} G* λ0(Z) = τ G*^{-1} G* g(Z) − G*^{-1} g_Z(Z) J_Z(Z),   (34)

and solving for λ0(Z) gives

λ0(Z) = τ g(Z) − G*^{-1} g_Z(Z) J_Z(Z),   (35)

so that eqns. (27) and (35) are equivalent. Indeed, eqn. (27) can be expressed as

λ0(Z) = [ g_Z^T(Z) ]† [ τ g_Z^T(Z) g(Z) − J_Z(Z) ] = τ g(Z) − G*^{-1} g_Z(Z) J_Z(Z),

where G*^{-1} = [ g_Z(Z) g_Z^T(Z) ]^{-1}; hence the pseudo-inverse of g_Z^T(Z) is [ g_Z^T(Z) ]† = G*^{-1} g_Z(Z). Eqn. (35) holds when Z is sufficiently close to an equilibrium point Z*. This choice of λ0(Z) has the advantage of reducing the complexity of evaluating the gradient of λ0 with respect to Z. Substituting λ0(Z) of eqn. (35) into eqn. (23) yields

Ż = ϕ(Z) = −( J_Z(Z) + g_Z^T(Z) [ τ g(Z) − G*^{-1}(Z) g_Z(Z) J_Z(Z) ] ).   (36)

To examine the convergence of the gradient flow formulation expressed in eqns. (23) and (24), we consider the theorem below.

Theorem 3.2. Suppose that (Z*, λ*) is a stationary point such that
1. the LICQ holds at Z*;
2. for any Z̄ ≠ 0 satisfying g_Z(Z*) Z̄ = 0, with Z̄ = Z − Z*, one has Z̄^T L_ZZ(Z*, λ*) Z̄ > 0.
Then the pair (Z(t), λ0(Z(t))) tends to (Z*, λ*) as t → ∞, provided the starting point Z_0 is sufficiently close to Z*.

Proof: We deploy the Poincaré-Lyapunov stability theorem, stated in Haddad et al. [8] and Hirsch et al. [10], as follows.

Theorem 3.3 (Lyapunov Stability). Let X* be an equilibrium point of X′ = J(X) and let L : Ω → R be a differentiable function defined on an open set Ω containing X*. Suppose further that
1. L(X*) = 0 and L(X) > 0 if X ≠ X*;
2. L̇ ≤ 0 in Ω \ {X*};
then X* is stable. Furthermore, if L also satisfies
3. L̇ < 0 in Ω \ {X*},
then X* is asymptotically stable.

It is therefore required to show that Z* is an asymptotically stable point of Ż = ϕ(Z), given that ϕ(Z) is continuously differentiable, by showing that the origin of the linearized system

ẏ = ϕ_Z(Z*) y,   (37)

where y := Z − Z*, is exponentially stable (i.e. all eigenvalues of ϕ_Z(Z*) have strictly negative real parts). Note that at Z* the pair (Z*, λ0(Z*)) satisfies eqns. (18) and (19), because the right-hand side of eqn. (24) is identically zero there; this implies λ* = λ0(Z*). Applying Taylor's expansion to the right-hand side of eqn. (36) about the point Z* gives the linearized equation

dZ/dt = ϕ(Z*) + ϕ_Z(Z*) y.   (38)

Comparing eqn. (38) with the right-hand side of eqn. (23) yields

dZ/dt = ϕ(Z*) + ϕ_Z(Z*) (Z − Z*).   (39)

Recall that

L(Z, λ) = J(Z) + λ^T g(Z),   (40)
L_Z(Z, λ) = J_Z(Z) + g_Z^T(Z) λ,   (41)
L_ZZ(Z, λ) = J_ZZ(Z) + g_ZZ^T(Z) λ;   (42)

then, from eqn. (38),

dZ/dt = ϕ(Z*) + ϕ_Z(Z*) y, where y = Z − Z*.   (43)

Since ϕ(Z) = −L_Z(Z, λ0(Z)), which corresponds to eqn. (41) with λ = λ0(Z), differentiating with respect to Z gives

ϕ_Z(Z*) = −[ J_ZZ(Z*) + g_ZZ^T(Z*) λ* + g_Z^T(Z*) λ0_Z(Z*) ] = −[ L_ZZ(Z*, λ*) + g_Z^T(Z*) λ0_Z(Z*) ].   (44)

Therefore eqn. (39) becomes

dZ/dt = −[ L_Z(Z*, λ*) + ( L_ZZ(Z*, λ*) + g_Z^T(Z*) λ0_Z(Z*) ) (Z − Z*) ],   (45)

where ϕ(Z*) = −L_Z(Z*, λ*) and ϕ_Z(Z*) = −[ L_ZZ(Z*, λ*) + g_Z^T(Z*) λ0_Z(Z*) ].
Considering the unknown coefficient λ0_Z(Z*) and setting L_Z(Z*, λ*) = 0 yields

dZ/dt = −[ L_ZZ(Z*, λ*) + g_Z^T(Z*) λ0_Z(Z*) ] (Z − Z*).   (46)

We now differentiate both sides of eqn. (32) with respect to Z and evaluate at Z*. Differentiating the left-hand side of eqn. (32) by the product rule gives

d/dZ { g_Z(Z) [ J_Z(Z) + g_Z^T(Z) λ0(Z) ] } = g_ZZ(Z) [ J_Z(Z) + g_Z^T(Z) λ0(Z) ] + g_Z(Z) [ J_ZZ(Z) + g_ZZ^T(Z) λ0(Z) + g_Z^T(Z) λ0_Z(Z) ].   (47)

At Z = Z* the bracket J_Z(Z*) + g_Z^T(Z*) λ0(Z*) = L_Z(Z*, λ*) vanishes by eqn. (18), so the left-hand side reduces to

g_Z(Z*) [ J_ZZ(Z*) + g_ZZ^T(Z*) λ* + g_Z^T(Z*) λ0_Z(Z*) ].   (48)

Differentiating the right-hand side of eqn. (32) gives

τ d/dZ [ g_Z(Z) g_Z^T(Z) g(Z) ] = τ { g_ZZ(Z) g_Z^T(Z) g(Z) + g_Z(Z) g_ZZ^T(Z) g(Z) + g_Z(Z) g_Z^T(Z) g_Z(Z) },   (49)

and, since g(Z*) = 0 by eqn. (19), the first two terms vanish at Z*, leaving

τ g_Z(Z*) g_Z^T(Z*) g_Z(Z*).   (50)

Equating eqns. (48) and (50), and using eqn. (42), i.e. L_ZZ(Z*, λ*) = J_ZZ(Z*) + g_ZZ^T(Z*) λ*, we obtain

g_Z(Z*) [ L_ZZ(Z*, λ*) + g_Z^T(Z*) λ0_Z(Z*) ] = τ g_Z(Z*) g_Z^T(Z*) g_Z(Z*),   (51)–(52)

which can be written as

g_Z(Z*) L_ZZ(Z*, λ*) + G(Z*) λ0_Z(Z*) = τ G(Z*) g_Z(Z*),   (53)

where G(Z*) = g_Z(Z*) g_Z^T(Z*). Making λ0_Z(Z*) the subject of eqn. (53) yields

λ0_Z(Z*) = −G^{-1}(Z*) g_Z(Z*) L_ZZ(Z*, λ*) + τ g_Z(Z*).   (54)

Recall that

y = Z − Z*,   (55)
dy/dt = dZ/dt − dZ*/dt,   (56)

and, since Z* is constant,

dy/dt = dZ/dt.   (57)

Substituting eqn. (54) into eqn. (46) and using eqn. (57) yields

dy/dt = −[ L_ZZ(Z*, λ*) + g_Z^T(Z*) ( −G^{-1}(Z*) g_Z(Z*) L_ZZ(Z*, λ*) + τ g_Z(Z*) ) ] y,   (58)
dy/dt = −[ L_ZZ(Z*, λ*) − g_Z^T(Z*) G^{-1}(Z*) g_Z(Z*) L_ZZ(Z*, λ*) + τ g_Z^T(Z*) g_Z(Z*) ] y,   (59)–(60)

that is,

dy/dt = −[ ( I − g_Z^T(Z*) G^{-1}(Z*) g_Z(Z*) ) L_ZZ(Z*, λ*) + τ g_Z^T(Z*) g_Z(Z*) ] y.   (61)

Let

S(Z) = I − g_Z^T(Z) G^{-1}(Z) g_Z(Z),   (62)
V(Z) = g_Z^T(Z) g_Z(Z),   (63)

where I is the (2N + 1) × (2N + 1) identity matrix. Therefore eqn. (61) becomes

dy/dt = −[ S(Z*) L_ZZ(Z*, λ*) + τ V(Z*) ] y.   (64)

The operators in eqns. (62) and (63) are mutually orthogonal: eqn. (62) is the operator that projects a vector onto the null space of g_Z(Z), since g_Z(Z) S(Z) = 0. Multiplying eqn. (64) from the left by g_Z(Z*) and using g_Z(Z*) S(Z*) = 0 yields

g_Z(Z*) dy/dt = −τ g_Z(Z*) V(Z*) y.   (65)

Writing w = g_Z(Z*) y, eqn. (65) is equivalent to

dw/dt = −τ Q w,   (66)

where Q = g_Z(Z*) g_Z^T(Z*). Hence, since τ > 0, it remains to show that Q is positive definite. Recall that g_Z(Z*) = G, where G, of dimension N × (2N + 1), is given in Section 3.3; therefore Q = G G^T, of dimension N × N, is given by

Q = [ 1 + d²   −c      0     ···    0  ]
    [  −c       α     −c      ⋱     ⋮  ]
    [   0      −c      α      ⋱     0  ]
    [   ⋮       ⋱      ⋱      ⋱    −c  ]
    [   0      ···     0     −c     α  ],   (67)

where α = 1 + c² + d².

Theorem 3.4 (Sylvester's Criterion). A real symmetric matrix is positive definite if and only if all of its leading principal minors are positive (Gilbert [7]).

Applying Theorem 3.4 to the matrix Q, we establish that Q is positive definite. Firstly, Q is real, since every entry q_{ij} of Q (in particular every diagonal entry in eqn. (67)) is real.
Secondly, Q = g_Z(Z*) g_Z^T(Z*) is symmetric, since Q^T = [ g_Z(Z*) g_Z^T(Z*) ]^T = g_Z(Z*) g_Z^T(Z*) = Q. It remains to establish that the leading principal minors Q_i, i = 1, 2, ..., N, of Q are all positive. The first few minors Q_1, Q_2, Q_3 and Q_4 are evaluated in the Appendix and are positive for all c, d > 0. We then assume, by mathematical induction, that the result holds for Q_k (i = k) and prove it for Q_{k+1} (i = k + 1): using a Cholesky factorization it is shown in the Appendix that Q_{k+1} is positive definite, so that all the leading principal minors of Q are positive. We therefore conclude that the matrix Q = G G^T is positive definite.

3.5. Construction of the Modified Gradient Flow Method (MGFM)

We now consider the discretization of the formulation in eqns. (23) and (24) using a three-level implicit time discretization scheme with splitting parameters θ1 and θ2, where θ3 = 1 − θ1 − θ2. Let 0 = t_0 < t_1 < t_2 < ... < t_k = t be a sequence of time points for t ≥ t_0, with h_k = t_{k+1} − t_k and δZ_{k+1} = Z_{k+2} − Z_k the difference between successive iterates. We discretize eqn. (23) by the following implicit time-stepping scheme:

(Z_{k+2} − Z_{k+1}) / h_{k+1} = −[ θ3 L_Z(Z_{k+2}, λ0_{k+2}) + θ2 L_Z(Z_{k+1}, λ0_{k+1}) + θ1 L_Z(Z_k, λ0_k) ],   (68)

which simplifies to

Z_{k+2} − Z_{k+1} = −h_{k+1} [ θ3 L_Z(Z_{k+2}, λ0_{k+2}) + θ2 L_Z(Z_{k+1}, λ0_{k+1}) + θ1 L_Z(Z_k, λ0_k) ].   (69)

Employing Taylor expansions about (Z_k, λ0_k) gives

Z_{k+1} = Z_k + h_k Z′_k + O(||h_k||²),   (70)
L_Z(Z_{k+2}, λ0_{k+2}) = L_Z(Z_k, λ0_k) + H(Z_k, λ0_k) δZ_{k+1} + O(||δZ_{k+1}||³),   (71)
L_Z(Z_{k+1}, λ0_{k+1}) = L_Z(Z_k, λ0_k) + H(Z_k, λ0_k) δZ_k + O(||δZ_k||³),   (72)

where δZ_k = Z_{k+1} − Z_k and H(Z_k, λ0_k) = L_ZZ(Z_k, λ0_k) + L_{Zλ}(Z_k, λ0_k) λ0_Z(Z_k) is the Hessian of L(Z, λ). Substituting eqns. (70), (71) and (72) into eqn. (69) yields

Z_{k+2} = Z_k + h_k Z′_k + O(||h_k||²) − h_{k+1} [ θ3 ( L_Z(Z_k, λ0_k) + H(Z_k, λ0_k) δZ_{k+1} + O(||δZ_{k+1}||³) ) + θ2 ( L_Z(Z_k, λ0_k) + H(Z_k, λ0_k) δZ_k + O(||δZ_k||²) ) + θ1 L_Z(Z_k, λ0_k) ].   (73)

Neglecting the higher-order terms in eqn. (73) and writing L_{Z_k} = L_Z(Z_k, λ0_k), H_{Z_k} = H(Z_k, λ0_k), the resulting equation for Z_{k+2} is

Z_{k+2} = Z_k + h_k Z′_k − h_{k+1} θ3 L_{Z_k} − h_{k+1} θ3 H_{Z_k} Z_{k+2} + h_{k+1} θ3 H_{Z_k} Z_k − h_{k+1} θ2 L_{Z_k} + h_{k+1} h_k θ2 H_{Z_k} L_{Z_k} − h_{k+1} θ1 L_{Z_k}.   (74)

For equal step sizes h_{k+1} = h_k, with θ1 = 0 and with Z_{k+1} approximated by Z_k in eqn. (70) (so that the displacement term h_k Z′_k is neglected), we obtain

Z_{k+2} = Z_k − h_k θ3 L_{Z_k} − h_k θ3 H_{Z_k} Z_{k+2} + h_k θ3 H_{Z_k} Z_k − h_k θ2 L_{Z_k} + h_k² θ2 H_{Z_k} L_{Z_k}.   (75)

Collecting like terms and factorizing eqn. (75) gives

(I + h_k θ3 H_{Z_k}) Z_{k+2} = (I + h_k θ3 H_{Z_k}) Z_k − (θ2 + θ3) h_k L_{Z_k} + h_k² θ2 H_{Z_k} L_{Z_k}
                            = (I + h_k θ3 H_{Z_k}) Z_k + ( −h_k I + h_k² θ2 H_{Z_k} ) L_{Z_k}.   (76)

Making Z_{k+2} the subject in eqn. (76) yields

Z_{k+2} = (I + h_k θ3 H_{Z_k})^{-1} [ (I + h_k θ3 H_{Z_k}) Z_k + ( −h_k I + h_k² θ2 H_{Z_k} ) L_{Z_k} ]
        = Z_k − h_k (I + h_k θ3 H_{Z_k})^{-1} [ I − h_k θ2 H_{Z_k} ] L_{Z_k},

for any θ2, θ3 ∈ [0, 1] with θ2 + θ3 = 1, which is the time-stepping scheme for convex functions. Also, from eqn. (76) we obtain

(I + h_k θ3 H_{Z_k}) (Z_{k+2} − Z_k) = −h_k (I − h_k θ2 H_{Z_k}) L_{Z_k}.   (77)

Given δZ_{k+1} = Z_{k+2} − Z_k and θ2 = 0, eqn. (77) becomes

(I + h_k θ3 H_{Z_k}) δZ_{k+1} = −h_k L_{Z_k}.   (78)

Thus, for any initial guess Z_0, eqn. (77) defines a sequence converging to the solution of the optimal control problem stated in eqns. (1) to (3). Hence the Modified Gradient Flow (MGF) algorithm for the discretized optimal control problem (OCP) is stated as follows.

MGF Algorithm for the Discretized OCP
Step 1: Input the operator G ∈ R^{N×(2N+1)}, the initial guess Z_0 ∈ R^{2N+1}, parameters θ2, θ3 ∈ [0, 1], the sequence of time step sizes {h_k} and a sufficiently small tolerance σ > 0. Set k = 0.
Step 2: Compute λ0_{Z_k} using eqn. (54).
Step 3: Solve for δZ_{k+1} in the system

(I + h_k θ3 H_{Z_k}) δZ_{k+1} = −h_k L_{Z_k}, where δZ_{k+1} = Z_{k+2} − Z_k,

using the CGM.
Step 4: Update Z_{k+2} = Z_k + δZ_{k+1}.
Step 5: Stop the process if ||L_{Z_k}|| ≤ σ; otherwise set k = k + 1 and return to Step 2.

3.6. Error and Convergence Analysis

The convergence analysis of the MGFM requires us to show the rate and type of convergence of the algorithm for varying values of the parameters θ1, θ2 and θ3. We show that the algorithm is linearly convergent for θ1, θ2, θ3 ∈ [0, 1] and quadratically convergent if θ1 = θ2 = 0 and θ3 = 1.

Theorem 3.5. Let {Z_k} be the sequence defined by eqn. (77) and let (Z*, λ*) be the solution of eqns. (1) and (2). If the initial guess Z_0 is sufficiently close to Z*, then
1. Z_k converges linearly to Z* if θ1, θ2, θ3 ∈ [0, 1] and h_k > 0 is sufficiently small;
2. Z_k converges quadratically to Z* if θ2 = 0, θ3 = 1 and h_k → ∞.

We prove the two cases separately.

Case 1: θ1, θ2, θ3 ∈ [0, 1]. From eqn. (77) we have

(I + h_k θ3 H_{Z_k}) (Z_{k+2} − Z_k) = −h_k (I − h_k θ2 H_{Z_k}) L_{Z_k};

hence

Z_{k+2} = Z_k − h_k [ (I − h_k θ2 H_{Z_k}) L_{Z_k} + θ3 H_{Z_k} (Z_{k+2} − Z_k) ].

Let (Z*, λ*) be the minimum point of eqns. (16) and (17) and set e_{k+1} = Z_{k+1} − Z*. Subtracting Z* from both sides of the above equation and noting that L_Z(Z*, λ*) = 0, we have

e_{k+1} = e_k − h_k [ (I − h_k θ2 H_{Z_k}) (L_{Z_k} − L_{Z*}) + θ3 H_{Z_k} (e_{k+1} − e_k) ].   (79)

Thus, by the Mean Value Theorem,

e_{k+1} = e_k − h_k [ (I − h_k θ2 H_{Z_k}) H_Z^{ζ_k} e_k + θ3 H_{Z_k} (e_{k+1} − e_k) ],   (80)

where ζ_k denotes a point on the line segment connecting Z_k and Z*. Taking norms on both sides, we have

||e_{k+1}|| ≤ γ(Z_k, λ0_k, ζ_k, θ1, θ2, θ3, h_k) ||e_k||,   (81)

where

γ(Z_k, λ0_k, ζ_k, θ1, θ2, θ3, h_k) = || I − h_k (I + h_k θ3 H_{Z_k})^{-1} (I − h_k θ2 H_{Z_k}) H_Z^{ζ_k} ||.   (82)

From eqn. (81) we see that if γ(Z_k, λ0_k, ζ_k, θ1, θ2, θ3, h_k) < 1, then e_k converges to zero linearly. To show this we need H_{Z_k} to be positive definite, which holds by the foregoing theorems. Consequently, when Z_k is sufficiently close to Z*, eqn. (82) shows that, for θ1, θ2, θ3 ∈ [0, 1] and h_k sufficiently small,

γ(Z_k, λ0_k, ζ_k, θ1, θ2, θ3, h_k) ≤ 1 − [ h_k (1 − h_k θ2 λ^k_min) λ^k_min ] / (1 + h_k θ3 λ^k_max) < 1,   (83)

where λ^k_min and λ^k_max denote the minimum and maximum eigenvalues of H_{Z_k}, respectively. Therefore eqn. (81) implies that lim_{k→∞} e_k = 0 linearly. This proves that Z_k converges linearly to Z* if θ1, θ2, θ3 ∈ [0, 1] and h_k > 0 is sufficiently small.

Case 2: θ3 = 1. From eqn. (77), when θ3 = 1 (so that θ2 = 0) and h_k = h, we have

(Z_{k+2} − Z_k)/h = −( L_{Z_k} + H_{Z_k} δZ_{k+1} ).

As h → ∞, this equation becomes

H_{Z_k} δZ_{k+1} + L_{Z_k} = 0.   (84)

Eqn. (84) coincides with the equation resulting from the application of Newton's method to L_Z = 0. Hence Z_k converges quadratically to Z* if Z_0 is sufficiently close to Z*.

Figure 1. State behaviors in various algorithms.
Figure 2. Control behaviors in various algorithms.

4. Examples and Numerical Results

The numerical examples and their respective results are presented below, based on the construction of the modified gradient flow method (MGFM) for discretized optimal control problems. The results show the robustness and efficiency of the scheme (MGFM) compared with existing schemes.

Example 1. Consider the optimal control problem presented in [1]:

Minimize I(x, u) = ∫_0^1 ( x²(t) + u²(t) ) dt,   (85)
subject to ẋ = 2x(t) + 5u(t), x(0) = 1.   (86)

The analytical solution is given as

( x, λ )^T = ( 1.0000, −0.5908 )^T e^{−5.3852 t},   (87)

and the control variable is u(t) = −1.4770 e^{−5.3852 t}.
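The following minimal Python/NumPy sketch (an illustrative assumption, not the authors' implementation; the values of τ, h_k, the zero initial guess and the tolerance are arbitrary choices) assembles the operators of Section 3 for this example and runs the MGFM iteration of eqn. (78) with θ2 = 0 and θ3 = 1; the printed objective value may be compared with Table 1 below.

import numpy as np

# Minimal sketch (assumed, not the authors' code): MGFM iteration of eqn. (78)
# applied to Example 1 (p = q = 1, a = 2, b = 5, x(0) = 1 on [0, 1]).
p, q, a, b, x0, t0, T, N = 1.0, 1.0, 2.0, 5.0, 1.0, 0.0, 1.0, 10
h = (T - t0) / N
c, d = 1.0 + h * a, h * b

# Assemble M, C, G, k of eqns. (11)-(14).
m = np.zeros(2 * N + 1)
m[:N - 1], m[N - 1] = 2 * h * p, h * p                     # x_1..x_{N-1}, x_N
m[N], m[N + 1:2 * N], m[2 * N] = h * q, 2 * h * q, h * q   # u_0, u_1..u_{N-1}, u_N
M = np.diag(m)
C = 0.5 * h * p * x0 ** 2
G = np.zeros((N, 2 * N + 1))
for i in range(N):                                 # row i: x_{i+1} = c x_i + d u_i
    G[i, i] = 1.0
    if i > 0:
        G[i, i - 1] = -c
    G[i, N + i] = -d
k = np.zeros(N)
k[0] = c * x0

tau, theta3, hk, sigma = 1.0, 1.0, 10.0, 1e-10     # assumed tuning values
Ginv = np.linalg.inv(G @ G.T)                      # inverse Gram matrix of eqn. (31)
S = np.eye(2 * N + 1) - G.T @ Ginv @ G             # projector S of eqn. (62)
H = S @ M + tau * (G.T @ G)                        # H = L_ZZ + G^T lambda0_Z, via eqn. (54)

Z = np.zeros(2 * N + 1)                            # initial guess Z_0
for it in range(500):
    lam0 = tau * (G @ Z - k) - Ginv @ (G @ (M @ Z))        # multiplier, eqn. (35)
    LZ = M @ Z + G.T @ lam0                                # gradient of the Lagrangian
    if np.linalg.norm(LZ) <= sigma:                        # Step 5 stopping test
        break
    A = np.eye(2 * N + 1) + hk * theta3 * H
    dZ = np.linalg.solve(A, -hk * LZ)                      # Step 3, eqn. (78)
    Z = Z + dZ                                             # Step 4 update

print("iterations:", it)
print("objective 0.5*Z'MZ + C:", 0.5 * Z @ M @ Z + C)      # compare with Table 1
print("constraint residual:", np.linalg.norm(G @ Z - k))

A direct linear solve replaces the CGM of Step 3 here purely for brevity of the sketch; the system matrix is the same.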
The solutions of eqns. (85) and (86) were obtained, using h = 0.1 and h_k = 20, by the modified gradient flow algorithm for discretized optimal control problems, and the results were compared with those of existing algorithms as shown in Table 1 below. The step size h = 0.1 was chosen for the purpose of illustration; smaller step sizes could be used for further refinement.

Table 1: Comparison of objective functional values of existing methods for Example 1
Algorithm            Objective functional value
Analytical Method    0.2953894
Adekunle [1]         0.3033885
Akeremale [2]        0.2955932
MGFM                 0.2955879

Results: From Table 1 above, it is observed that the objective functional value (0.2955879) of the developed MGFM agrees favourably with the value (0.2953894) obtained from the analytical method and is better than the values obtained by Adekunle [1] and Akeremale [2] from the Conjugate Gradient Method (CGM) and the Extended Conjugate Gradient Method (ECGM), respectively.

Table 2a: Errors in the state variable for Example 1, |x(t) − x̂(t)| (×10⁻³)
t     Adekunle [1]   Akeremale [2]   MGFM
0.0   0              0               0
0.1   3.2204         0.5528          0.5525
0.2   3.8300         0.3680          0.3677
0.3   3.4841         0.3414          0.3410
0.4   2.9424         0.4447          0.4442
0.5   2.5506         0.6968          0.6962
0.6   2.4892         1.1667          1.1650
0.7   2.9118         1.9867          1.9858
0.8   4.0417         3.4014          3.4005
0.9   6.2660         5.8292          5.8279
1.0   10.266         9.9981          9.9789

Table 2b: Errors in the control variable for Example 1, |u(t) − û(t)| (×10⁻⁴)
t     Adekunle [1]   Akeremale [2]   MGFM
0.0   38.796         38.419          38.4202
0.1   18.131         0.6668          0.6555
0.2   7.9765         0.2654          0.2562
0.3   3.1973         0.0149          0.0755
0.4   1.1303         0.1891          0.1949
0.5   4.3839         0.4248          0.4294
0.6   4.8076         0.7726          0.7761
0.7   1.0193         1.3417          1.3442
0.8   2.0619         2.3045          2.3062
0.9   3.8245         3.9497          3.9505
1.0   6.7705         6.4026          6.4027

Tables 2a and 2b above show the errors in the state and control variables for Example 1; the errors generated with the MGFM are minimal compared with the errors generated by Adekunle [1] and Akeremale [2]. Figures 1 and 2 clearly show the comparison of the behaviors of the state and control trajectories for the various algorithms, respectively. It is observed that the trajectories of both the state and the control variables overlap, owing to the closeness of the results.

Example 2. Consider the optimal control problem presented in [1]:

Minimize J(x, u) = ∫_0^1 ( x²(t) + u²(t) ) dt,   (88)
subject to ẋ = 1.705 x(t) + 3.021 u(t),   (89)
x(0) = 1.   (90)

Table 3: Comparison of objective functional values of existing methods for Example 2
Algorithm            Objective functional value
Analytical Method    0.5647
Adekunle [1]         0.5746
Akeremale [2]        0.5649
MGFM                 0.5648

Results: From Table 3 above, it is observed that the objective functional value (0.5648) of the MGFM agrees favourably with the value (0.5647) obtained from the analytical method and is better than the values obtained by Adekunle [1] and Akeremale [2] from the Conjugate Gradient Method (CGM) and the Extended Conjugate Gradient Method (ECGM), respectively.

Table 4a: Errors in the state variable for Example 2, |x(t) − x̂(t)| (×10⁻⁴)
t     Adekunle [1]   Akeremale [2]   MGFM
0.0   10.000         10.0000         1.0000E
0.1   3.2204         5.528           5.5253
0.2   2.0509         4.347           3.9394
0.3   3.3242         2.604           2.3946
0.4   3.2376         2.435           2.2688
0.5   2.9852         2.646           2.4997
0.6   2.6832         3.2433          3.0992
0.7   2.3921         4.2860          4.1303
0.8   2.1365         5.8933          5.7138
0.9   1.9144         8.2567          8.0424
1.0   1.7012         10.716          11.403

Table 4b: Errors in the control variable for Example 2, |u(t) − û(t)| (×10⁻⁴)
t     Adekunle [1]   Akeremale [2]   MGFM
0.0   0.2928         0.3404          0.2893
0.1   1.8337         0.5220          0.3179
0.2   1.0622         1.8094          2.4019
0.3   5.9872         1.1203          2.3334
0.4   3.3442         0.4000          0.4941
0.5   1.9971         0.7234          0.7965
0.6   1.5137         1.1255          1.1823
0.7   1.6411         1.6579          1.7019
0.8   2.2516         2.3872          2.4210
0.9   3.3101         3.4037          3.4311
1.0   2.9285         3.4045          2.8932

Results: Tables 4a and 4b above show the errors in the state and control variables for Example 2, where the errors generated by the MGFM are minimal compared with the errors generated for the same example by Adekunle [1] and Akeremale [2]. Figures 3 and 4 clearly show the comparison of the behaviors of the state and control trajectories for the various algorithms, respectively. It is observed that the trajectories of both the state and the control variables overlap, owing to the closeness of the results.

Figure 3. State behaviors in various algorithms.
Figure 4. Control behaviors in various algorithms.

5. Conclusion

This paper developed a new scheme for the solution of a quadratic optimal control problem with a linear equality constraint. The method is a gradient flow formulation that requires discretization techniques and a splitting-parameter approach for the gradient flow equation, whose solution converges to a local minimum of the original problem either linearly or quadratically. Two linear equality constrained optimal control problems were considered and their results compared with the analytical solutions and with two known existing schemes. It was observed that the new scheme compared favorably with the analytical solutions and performed better than the Conjugate Gradient Method used in [2] in terms of convergence and accuracy. This algorithm, owing to its robustness and efficiency, therefore provides a new platform for solving optimal control problems with or without box constraints, with equality or inequality constraints, and with delay or non-delay dynamical systems.

Acknowledgements

The authors are grateful to the editors and the anonymous reviewers for their constructive comments and suggestions.

References

[1] A. I. Adekunle, "Algorithm for a Class of Discretized Optimal Control Problems", M.Tech. Thesis, Federal University of Technology, Akure, Nigeria (2011) (Unpublished).
[2] O. C. Akeremale, "Optimization of Quadratic Constrained Optimal Control Problems using Augmented Lagrangian Method", M.Tech. Thesis, Federal University of Technology, Akure, Nigeria (2012) (Unpublished).
[3] W. Behrman, "An Efficient Gradient Flow Method for Unconstrained Optimization", PhD Thesis, Stanford University (1998) (Unpublished).
[4] J. T. Betts, "Practical Methods for Optimal Control Using Nonlinear Programming", SIAM, Philadelphia (2001).
[5] Y. Evtushenko, "Generalized Lagrange Multipliers Technique for Nonlinear Programming", JOTA 21 (1977) 121.
[6] Y. G. Evtushenko & V. G. Zhadan, "Stable Barrier Projection and Barrier Newton Methods in Nonlinear Programming", Optimization Methods and Software 3 (1994) 237.
[7] G. T. Gilbert, "Positive Definite Matrices and Sylvester's Criterion", The American Mathematical Monthly 98 (1991) 44.
[8] W. M. Haddad, S. G. Nersesov & V. S. Chellaboina, "Lyapunov Function Proof of Poincaré's Theorem", International Journal of Systems Science 35 (2004) 287.
[9] J. B. Layton, "Efficient Direct Computation of the Pseudo-inverse and its Gradient", International Journal for Numerical Methods in Engineering 40 (1997) 4211.
[10] M. W. Hirsch, S. Smale & R. L. Devaney, "Differential Equations, Dynamical Systems, and an Introduction to Chaos", Elsevier Academic Press, USA (2004) 194.
[11] O. Olotu & K. A. Dawodu, "Quasi-Newton Embedded Augmented Lagrangian Algorithm for Discretized Optimal Proportional Control Problems", Journal of Mathematical Theory and Modeling 3 (2013a) 67.
[12] O. Olotu & K. A. Dawodu, "On the Discretized Algorithm for Optimal Proportional Control Problems Constrained by Delay Differential Equations", Journal of Mathematical Theory and Modeling 3 (2013b) 157.
[13] O. Olotu & S. A. Olorunsola, "An Algorithm for a Discretized Constrained, Continuous Quadratic Control Problem", Journal of Applied Sciences 8 (2006) 6249.
[14] L. S. Pontryagin, V. G. Boltyanskii, R. V. Gamkrelidze & E. F. Mishchenko, "The Mathematical Theory of Optimal Processes", Interscience Publishers, London (1962).
[15] S. Wang, X. Q. Yang & K. L. Teo, "A Unified Gradient Flow Approach to Constrained Nonlinear Optimization Problems", Computational Optimization and Applications 25 (2003) 251.

APPENDIX

Let Q_i, i = 1, 2, 3, ..., N, denote the leading principal minors of the matrix Q = G G^T. We consider the determinants of the leading principal submatrices Q_1, Q_2, Q_3, ... as follows.

For i = 1, Q_1 = 1 + d² > 0 for all d.

For i = 2,

Q_2 = [ 1 + d²       −c       ]
      [  −c       1 + c² + d² ],   (91)

such that det(Q_2) = 1 + 2d² + c²d² + d⁴ > 0 for all c, d.

For i = 3,

Q_3 = [ 1 + d²       −c           0        ]
      [  −c       1 + c² + d²    −c        ]
      [   0          −c       1 + c² + d²  ],   (92)

such that det(Q_3) = 1 > 0 for all c; therefore |Q_3| is positive.

For i = 4,

Q_4 = [ 1 + c²   −c     0   −d  ]
      [  −c    1 + c²  −c    0  ]
      [   0      −c     1    0  ]
      [  −d       0     0   2d² ],   (93)

such that det(Q_4) = d² > 0 for all c, d; therefore |Q_4| is positive.

Hence, since the result holds for Q_i, i = 1, 2, 3, 4, we assume by mathematical induction that it holds for i = k, i.e. for Q_k, and prove that it holds for i = k + 1, i.e. for Q_{k+1}. Let σ² > 0 denote the (k + 1)-th diagonal entry of Q; each diagonal entry of Q, being either 1 + d² or 1 + c² + d², is positive. Thus

Q_{k+1} = [ Q_k          β_{k+1} ]
          [ β_{k+1}^T     σ²     ],   (94)

where

β_{k+1} = [ Q_{1,k+1}, Q_{2,k+1}, ..., Q_{k,k+1} ]^T.   (95)

By the Cholesky factorization, Q_{k+1} is positive definite if there exists a lower triangular matrix L with positive diagonal entries such that Q_{k+1} = L L^T, that is,

[ Q_k          β_{k+1} ]   [ H_k            0           ] [ H_k^T    L_{k+1,1}   ]
[ β_{k+1}^T     σ²     ] = [ L_{k+1,1}^T    L_{k+1,k+1} ] [ 0        L_{k+1,k+1} ],

where H_k is a lower triangular matrix with positive diagonal entries. Expanding the block product gives

[ Q_k          β_{k+1} ]   [ H_k H_k^T           H_k L_{k+1,1}                            ]
[ β_{k+1}^T     σ²     ] = [ L_{k+1,1}^T H_k^T    L_{k+1,1}^T L_{k+1,1} + L_{k+1,k+1}²    ],   (96)

so that Q_k = H_k H_k^T, β_{k+1} = H_k L_{k+1,1} and σ² = L_{k+1,1}^T L_{k+1,1} + L_{k+1,k+1}². Since the diagonal entries of H_k are positive, H_k is non-singular, and the linear system has the unique solution L_{k+1,1} = H_k^{-1} β_{k+1}; a positive value of L_{k+1,k+1} can then be obtained provided σ² − L_{k+1,1}^T L_{k+1,1} > 0. By the Schur complement,

det(Q_{k+1}) = det(Q_k) [ σ² − β_{k+1}^T Q_k^{-1} β_{k+1} ]
            = det(Q_k) [ σ² − L_{k+1,1}^T H_k^T (H_k H_k^T)^{-1} H_k L_{k+1,1} ]
            = det(Q_k) [ σ² − L_{k+1,1}^T L_{k+1,1} ].

Now Q_{k+1} is itself the Gram matrix of the first k + 1 rows of G, which are linearly independent by the LICQ, so det(Q_{k+1}) > 0; together with det(Q_k) > 0 from the induction hypothesis, this gives σ² − L_{k+1,1}^T L_{k+1,1} > 0. Hence

L_{k+1,k+1} = √( σ² − L_{k+1,1}^T L_{k+1,1} ) > 0.   (97)

Thus the positive diagonal entries of the Cholesky factor exist because G has full row rank (LICQ), and therefore Q_{k+1}, and consequently Q = G G^T, is positive definite.
Hence, the pair (Z(t), λ0(Z(t))) tends to (Z*, λ*) as t → ∞, provided the starting point Z_0 is sufficiently close to Z*, which completes the proof of Theorem 3.2.
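As a closing numerical illustration (assumed, not part of the paper), the positive-definiteness claim for Q = G G^T in eqn. (67) can be checked directly for sample values of c and d, both through the leading principal minors of Theorem 3.4 and through the Cholesky factorization used above:

import numpy as np

# Illustrative check (not from the paper) that Q of eqn. (67) is positive definite;
# c, d and N below are sample values.
def tridiagonal_Q(c, d, N):
    # Build the N x N matrix Q of eqn. (67): Q_11 = 1 + d^2, the remaining diagonal
    # entries equal 1 + c^2 + d^2, and the off-diagonal entries equal -c.
    alpha = 1.0 + c ** 2 + d ** 2
    Q = np.diag([1.0 + d ** 2] + [alpha] * (N - 1))
    Q += np.diag([-c] * (N - 1), k=1) + np.diag([-c] * (N - 1), k=-1)
    return Q

Q = tridiagonal_Q(c=1.2, d=0.5, N=10)           # e.g. c = 1 + ha, d = hb from Example 1

# Sylvester's criterion (Theorem 3.4): all leading principal minors positive.
minors = [np.linalg.det(Q[:i, :i]) for i in range(1, Q.shape[0] + 1)]
print("all leading principal minors positive:", all(mi > 0 for mi in minors))

# Equivalent check: the Cholesky factorization used in the Appendix exists.
L = np.linalg.cholesky(Q)                        # raises LinAlgError if Q is not SPD
print("Cholesky diagonal entries positive:", np.all(np.diag(L) > 0))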