INTERNATIONAL JOURNAL OF COMPUTERS COMMUNICATIONS & CONTROL
ISSN 1841-9836, 12(3), 323-329, June 2017.

Learning Speed Enhancement of Iterative Learning Control with Advanced Output Data Based on Parameter Estimation

G.-M. Jeong, S.-H. Ji

Gu-Min Jeong
School of Electrical Engineering, Kookmin University, Korea
gm1004@kookmin.ac.kr

Sang-Hoon Ji*
Robot R&BD Group, KITECH, Korea
*Corresponding author: robot91@kitech.re.kr

Abstract: Learning speed enhancement is one of the most important issues in learning control. If both the learning speed and the tracking performance can be improved, the applicability of learning control will be broadened. With this in mind, we propose a learning speed enhancement scheme for iterative learning control with advanced output data (ADILC) based on parameter estimation. We consider linear discrete-time non-minimum phase (NMP) systems whose model is unknown, except for the relative degree and the number of NMP zeros. In each iteration, estimates of the impulse response are obtained from the input-output relationship. The learning gain matrix is then calculated from these estimates, and with the new learning gain matrix the learning speed can be enhanced. Simulation results show that the learning speed is enhanced by the proposed method.

Keywords: iterative learning control, speed enhancement, parameter estimation, learning gain estimation.

1 Introduction

The tracking performance of a system that performs the same task repeatedly can be enhanced by iterative learning control (ILC) [7]-[9]. Among the various ILC schemes, iterative learning control with advanced output data (ADILC) [4], [5] was proposed for learning control of discrete-time non-minimum phase (NMP) systems. ADILC stabilizes the inverse mapping by using the output-to-input mapping directly with time-advanced output data. Its learning structure is simple, since it consists of an input update law that depends only on the relative degree and the number of NMP zeros.
On the other hand, because of its computational burden, learning speed enhancement is one of the most important issues in ILC. With this motivation, various approaches such as direct learning control (DLC) [9] have been proposed. In [5], an ADILC scheme based on the estimation of the impulse response was proposed for linear discrete-time NMP systems whose model is unknown, except for the relative degree and the number of NMP zeros. Instead of using an approximate model of the system, the first part of the impulse response is estimated and used for the ADILC. However, considering the computational cost of this method, a new scheme is needed to enhance the learning speed.

In this paper, we propose a new speed enhancement scheme for discrete-time NMP systems, extending the results in [5]. Using the estimates of the learning matrix, an estimate of the desired input is derived, and the learning speed can be significantly enhanced. An illustrative example demonstrates the applicability of the proposed method.

Copyright © 2006-2017 by CCC Publications

2 ADILC for discrete-time NMP systems

In this section, some preliminary results on ADILC from [4] are briefly summarized. Consider a linear time-invariant (LTI) system described by

x(i + 1) = Ax(i) + Bu(i)
y(i) = Cx(i)                                        (1)

where u ∈ R^1, x = [x_1, ..., x_n]^T ∈ R^n, and y ∈ R^1 are the input, the state, and the output of the system, respectively, and A, B, and C are matrices of appropriate dimensions. Let x^d(i), y^d(i), and u^d(i) denote the state, the output, and the input corresponding to the desired trajectory, respectively. The desired output y^d(i), i ∈ [σ, N + σ − 1], is given, and we write u_[i,j] := [u(i), ..., u(j)]^T and y_[i,j] := [y(i), ..., y(j)]^T. The transfer function of the system is

G(z) = (β_1 z^{n−1} + ··· + β_n) / (z^n + α_1 z^{n−1} + ··· + α_n).
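As a concrete illustration, the state-space recursion (1) can be simulated directly. The following sketch (Python with NumPy; the function name and the toy system values are hypothetical, chosen only for illustration) computes the output sequence produced by a given input sequence:

```python
import numpy as np

def simulate_lti(A, B, C, u, x0):
    """Simulate x(i+1) = A x(i) + B u(i), y(i) = C x(i) over the input sequence u."""
    A, B, C = np.asarray(A, float), np.asarray(B, float), np.asarray(C, float)
    x = np.asarray(x0, dtype=float)
    y = np.empty(len(u))
    for i, ui in enumerate(u):
        y[i] = C @ x            # y(i) = C x(i)
        x = A @ x + B * ui      # x(i+1) = A x(i) + B u(i)
    return y

# Toy 2-state example (hypothetical values): a double integrator-like chain
A = [[0.0, 1.0], [0.0, 0.0]]
B = [0.0, 1.0]
C = [1.0, 0.0]
y = simulate_lti(A, B, C, u=[1.0, 0.0, 0.0], x0=[0.0, 0.0])
```

Here the input pulse at i = 0 reaches the output only at i = 2, reflecting the delay through the state chain.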
Here, it is assumed that the number of NMP zeros, d_0, and the relative degree, σ, are known a priori (i.e., β_1 = ··· = β_{σ−1} = 0). In ADILC, the following input-output mapping is used to stabilize the inverse mapping:

y_[σ+d_0, N+σ+d_0−1] = Hx(0) + Ju_[0,N−1],          (2)

H = [(H_{d_0+1})^T, ..., (H_{N+d_0})^T]^T,

J = [ J_{d_0+1}    J_{d_0}      ···  0
      J_{d_0+2}    J_{d_0+1}    ···  0
      ...          ...          ...  ...
      J_{N+d_0}    J_{N+d_0−1}  ···  J_{d_0+1} ],

where H_l = CA^{σ+l−1} and J_l = CA^{σ+l−2}B. The time interval for the output of interest is [σ + d_0, N + σ + d_0 − 1] in (2), whereas it is [σ, N + σ − 1] for minimum phase systems (i.e., d_0 = 0).

For ADILC, we set the input horizon to [0, N + d_0 − 1] with u_[N, N+d_0−1] = 0 and the output horizon to [0, N + σ + d_0 − 1]. The desired trajectory y^d is given on [σ, N + σ − 1], and y^d_[N+σ, N+σ+d_0−1] is set to appropriate constants. Further, at every iteration, we set x^k(0) = x^d(0) and u^k(i) = u^d(i) = 0 for N ≤ i ≤ N + d_0 − 1.

To analyze the stability of the inverse mapping, the following assumptions are needed:

• (A1) The system is stable, controllable and observable.
• (A2) The matrix A is invertible.
• (A3) β_n ≠ 0 in G(z).
• (A4) The matrix J is nonsingular.

Under these assumptions, Lemma 2.1 shows that the inverse mapping (2) is stable owing to the time advancing of the output data, even though the system is NMP.

Lemma 2.1 (Stable inversion using time advancing). The inverse mapping from y^d_[σ+d_0, N+σ+d_0−1] to u^d_[0,N−1] is stable.

The input update law is derived from Lemma 2.1 as follows:

u^{k+1}_[0,N−1] = u^k_[0,N−1] + S^k e^k_[σ+d_0, N+σ+d_0−1],          (3)

where e^k_[l,m] = y^d_[l,m] − y^k_[l,m] and S^k ∈ R^{N×N} is the learning gain matrix. The next lemma shows that the input u^k_[0,N−1] converges to u^d_[0,N−1] as k → ∞ under the input update law (3). Note that the inverse mapping here is stable.

Lemma 2.2. Suppose the uncertain system (1) satisfies (A1)-(A4).
If the condition

‖I − S^k J‖ ≤ ρ < 1          (4)

holds, then the input u^k_[0,N−1] converges to u^d_[0,N−1] as k → ∞.

3 ADILC with the estimation of the impulse response

In this section, the impulse response estimation scheme in [5] is slightly modified for the learning speed enhancement detailed in the next section. After estimating the first p impulse responses, we select the first l ≤ p responses to obtain J̄, the estimate of J in (2). With J̄, which consists of the estimates of the first l impulse responses (J_1, ..., J_l), the ADILC scheme can be applied to unknown NMP systems.

Since x(0) = 0 is set for the learning scheme, (2) yields

y_[σ+d_0, N+σ+d_0−1] = Ju_[0,N−1].          (5)

By exchanging the roles of J and u_[0,N−1], (5) can be rewritten as

y_[σ+d_0, N+σ+d_0−1] = U_max J_[1, N+d_0+1].          (6)

Here,

U_max = [ u(d_0)          ···  u(0)     ···  0
          ...             ...  ...      ...  ...
          u(d_0 + N − 1)  ···  u(N − 1) ···  u(0) ],

J_[1, N+d_0+1] = [J_1, ..., J_{N+d_0+1}]^T.          (7)

To estimate the first p impulse responses, an approximation of J_[1,p] is made. As i becomes larger, the impulse response J_i approaches 0. By selecting a sufficiently large p and discarding the impulse responses from index p + 1 onward, the approximation becomes

y_[σ+d_0, N+σ+d_0−1] ≈ U_p J_[1,p].          (8)

Here,

U_p = [ u(d_0)          u(d_0 − 1)      ···  0
        u(d_0 + 1)      u(d_0)          ···  0
        ...             ...             ...  ...
        u(d_0 + N − 1)  u(d_0 + N − 2)  ···  u(d_0 + N − p) ],

J_[1,p] = [J_1, ..., J_p]^T.          (9)

Using the least squares method, we obtain J̄_[1,p], consisting of the estimates of J_[1,p], as

J̄_[1,p] = (U_p^T U_p)^{−1} U_p^T y_[σ+d_0, N+σ+d_0−1].          (10)

After estimating the impulse responses, we select the first l ≤ p impulse responses and obtain J̄, the estimate of J in (2). In [5], a learning control scheme was presented based on this impulse response estimation. At step 0, u^1_[0,N−1] is determined by setting S^0 to an appropriate matrix, e.g., αI, with u^0_[0,N−1] = 0.
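The least-squares step (10) can be sketched as follows (Python with NumPy; the function names and the synthetic test values are hypothetical). The regressor U_p is the Toeplitz matrix of (9); `np.linalg.lstsq` computes the least-squares solution via a pseudo-inverse rather than forming (U_p^T U_p)^{−1} explicitly:

```python
import numpy as np

def build_Up(u, N, d0, p):
    """Regressor U_p from (9): entry (i, j) is u(d0 + i - j), zero outside the horizon."""
    u = np.asarray(u, float)
    Up = np.zeros((N, p))
    for i in range(N):
        for j in range(p):
            t = d0 + i - j
            if 0 <= t < len(u):
                Up[i, j] = u[t]
    return Up

def estimate_impulse_response(u, y_adv, N, d0, p):
    """Least-squares estimate (10): Jbar_[1,p] from the input u and advanced output y."""
    Up = build_Up(u, N, d0, p)
    Jbar, *_ = np.linalg.lstsq(Up, np.asarray(y_adv, float), rcond=None)
    return Jbar

# Synthetic check (hypothetical numbers): generate the advanced output from a
# known impulse response, then recover it from the input-output data
u_test = np.array([1.0, 2.0, -1.0, 0.5, 3.0, -2.0, 1.5, 0.5])
J_true = np.array([1.0, 0.5, 0.25])
y_adv = build_Up(u_test, N=8, d0=1, p=3) @ J_true
J_est = estimate_impulse_response(u_test, y_adv, N=8, d0=1, p=3)
```

In the synthetic check the estimator recovers J_true exactly, because the regressor has full column rank and the output lies in its column space; with measured data, the estimate is the least-squares fit.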
Then, J̄^k_[1,N] can be estimated similarly to (10), and learning control can be performed using (3) with S^k = α(J̄^k)^{−1} for some α, 0 < α < 1.

4 Learning speed enhancement using the estimation of the impulse response

In this section, a new learning speed enhancement algorithm is presented using (10) and the learning scheme for unknown NMP systems. Since the estimates of the impulse responses can be used to estimate the desired input, the learning speed can be enhanced.

At k = 0, since u^0_[0,N−1] = 0, y^0_[σ+d_0, N+σ+d_0−1] is zero. Thus, u^1_[0,N−1] = S^0 y^d_[σ+d_0, N+σ+d_0−1], where S^0 is set to an appropriate matrix, e.g., αI. For k ≥ 1, we estimate the impulse response J̄^k_[1,p] using (10) from u^k_[0,N−1] and y^k_[σ+d_0, N+σ+d_0−1], and derive J̄^k from the estimates of the first l impulse responses. Then, for a suitable k, e.g., k = 1, we obtain the estimate of the desired input ū^d_[0,N−1] as

ū^d_[0,N−1] = (J̄^k)^{−1} y^d_[σ+d_0, N+σ+d_0−1].          (11)

If the estimation is successful and l is sufficiently large, ū^d_[0,N−1] will be considerably close to u^d_[0,N−1]. At iteration k + 1, e.g., k = 2, we set u^k_[0,N−1] = ū^d_[0,N−1] and S = α(J̄^k)^{−1}. If S satisfies the convergence condition, the desired input can be obtained with the proposed method. In this way, the learning speed is enhanced. The learning rule is summarized as follows:

The proposed learning algorithm

• Step 0: When k = 0,
  - Set S^0 to an appropriate matrix and obtain u^1_[0,N−1].
• Step 1: For the first iteration,
  - Obtain y^1_[σ+d_0, N+σ+d_0−1].
  - If ‖e^k_[σ+d_0, N+σ+d_0−1]‖ ≤ ε, then stop.
  - Else, derive J̄^k using (10).
  - Calculate the estimate of the desired input ū^d_[0,N−1] from (11) and set u^2_[0,N−1] = ū^d_[0,N−1].
• Step k: For the k-th iteration (k ≥ 2),
  - If ‖e^k_[σ+d_0, N+σ+d_0−1]‖ ≤ ε, then stop.
  - Set S = α(J̄^k)^{−1}.
  - Update the input using (3), increment k, and repeat Step k until termination.
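The steps above can be sketched end to end as follows (Python with NumPy). All names, horizon sizes, and the plant are hypothetical: for ease of testing, the toy plant is a stable first-order system with σ = 1 and d_0 = 0, whereas the paper targets NMP systems with d_0 ≥ 1. The estimate from (10) and the prediction from (11), applied with the desired output, are computed inline; a sketch under these assumptions, not a definitive implementation:

```python
import numpy as np

def adilc_speedup(plant, yd, N, d0, p, l, alpha=0.5, eps=1e-8, max_iter=20):
    """Sketch of the proposed algorithm: Step 0 uses S^0 = alpha*I; from k = 1 on,
    the impulse responses are estimated by least squares (10); at k = 1 the desired
    input is predicted via (11); afterwards the gain is S = alpha * Jbar^{-1}."""
    u = np.zeros(N)                     # u^0 = 0, so y^0 = 0 for zero initial state
    S = alpha * np.eye(N)               # Step 0: S^0 = alpha * I
    y = plant(u)
    for k in range(max_iter):
        e = yd - y
        if np.linalg.norm(e) <= eps:    # termination test
            break
        if k >= 1:
            # (10): least-squares estimate of the first p impulse responses
            Up = np.zeros((N, p))
            for i in range(N):
                for j in range(p):
                    t = d0 + i - j
                    if 0 <= t < N:
                        Up[i, j] = u[t]
            Jpar, *_ = np.linalg.lstsq(Up, y, rcond=None)
            # Toeplitz estimate of J from (2), built from the first l estimates
            Jbar = np.zeros((N, N))
            for i in range(N):
                for j in range(N):
                    if 0 <= d0 + i - j < l:
                        Jbar[i, j] = Jpar[d0 + i - j]
            if k == 1:
                u = np.linalg.solve(Jbar, yd)   # (11): estimated desired input
                y = plant(u)
                continue
            S = alpha * np.linalg.inv(Jbar)     # S = alpha * Jbar^{-1}
        u = u + S @ e                   # input update law (3)
        y = plant(u)
    return u, y

def toy_plant(u, a=0.5):
    """Stable first-order plant (sigma = 1, d0 = 0): returns y(1..N) with x(0) = 0."""
    x, ys = 0.0, []
    for ui in u:
        x = a * x + ui
        ys.append(x)
    return np.array(ys)

N = 8
yd = 1.0 + np.sin(0.5 * np.arange(N))   # hypothetical desired trajectory, yd(0) != 0
u_fin, y_fin = adilc_speedup(toy_plant, yd, N=N, d0=0, p=N, l=N)
```

Because the toy plant is linear and p = N, the estimate obtained at k = 1 is exact, so the input predicted by (11) already reproduces y^d at the next iteration, mirroring the speed-up reported in Section 5.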
Theorem 1. Suppose the NMP system (1) satisfies (A1)-(A4), the relative degree and the number of NMP zeros are known, and the system dynamics are otherwise unknown. Assume that the input is updated by the proposed learning algorithm. If condition (4) holds for all k ≥ 1, the input u^k_[0,N−1] converges to u^d_[0,N−1] as k → ∞.

Proof: This follows directly from Lemma 2.1 and Theorem 1 of [5]. □

(a) Outputs using the proposed method and y^5 in [5]
(b) Inputs using the proposed method and u^5 in [5]
Figure 1: Outputs and inputs for different values of k

5 Simulation results

Consider the NMP system of a positioning table in [10]:

G(z) = (0.0082z^4 + 0.031z^3) / ((z + 0.29)(z − 0.2)(z − 0.46)(z^2 − 1.7z + 0.73)).

This system has one NMP zero (z = −3.7805) and satisfies (A1)-(A4). The desired trajectory is given as

y^d(i) = { 0,                        i = 0, 1, 83, 84, 85, 86
         { −0.2 cos(0.05π(i − 2)),   2 ≤ i ≤ 82.          (12)

Here, we set N = 85, S^0 = 0.1I, and u(85) = u(86) = 0. The input update law is u^{k+1}_[0,84] = u^k_[0,84] + S^k e^k_[2,86]. From (10), we set p = 85. The impulse response is estimated using

J̄^k_[1,85] = ((U^k)^T U^k)^{−1} (U^k)^T y^k_[2,86].          (13)

In addition, J̄^1 is obtained using l = 30. Then, we set u^2 = ū^d to enhance the learning speed. In this case, the convergence condition is satisfied, with ‖I − S^k J^k‖ < 0.568 when S^k = 0.5(J̄^k)^{−1}.

Figures 1(a) and 1(b) show the outputs and inputs for different values of k, respectively. The root mean square (RMS) output error is 0.0012 for k = 2, which is smaller than the RMS error of 0.0036 at k = 10 reported in [5]. This example shows that the learning speed is significantly enhanced.

6 Conclusion

In this paper, we have proposed a new learning speed enhancement algorithm of ADILC for discrete-time NMP systems.
First, an estimation algorithm for the impulse responses based on the input-output mapping of ADILC has been presented. Next, a learning speed enhancement algorithm has been derived from the new learning gain, which is calculated from the estimates of the impulse response. Simulation results for an NMP system have demonstrated the learning speed enhancement of the proposed method. Making the learning control robust to disturbances within the proposed framework remains as future work.

Acknowledgment

This work was supported by the National Research Foundation of Korea (NRF) Grant funded by the Korean Government (MSIP) (NRF-2016R1A5A1012966), and also by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2015R1D1A1A01060917).

Bibliography

[1] Arimoto S., Kawamura S., Miyazaki F. (1984); Bettering operation of robots by learning, Journal of Robotic Systems, 1(2), 123-140, 1984.

[2] Bien Z., Xu J.-X. (1998); Iterative learning control: analysis, design, integration and applications, Kluwer Academic Publishers, 1998.

[3] Jang T.-J., Ahn H.-S., Choi C.-H. (1994); Iterative learning control for discrete-time nonlinear systems, International Journal of Systems Science, 25(7), 1179-1189, 1994.

[4] Jeong G.-M., Choi C.-H. (2002); Iterative learning control for linear discrete time nonminimum phase systems, Automatica, 38(2), 287-291, 2002.

[5] Jeong G.-M., Ji S.-H. (2013); Iterative learning control with advanced output data using an estimation of the impulse response, IEICE Transactions on Fundamentals, E96-A(6), 1488-1491, 2013.

[6] Ngo T., Wang Y., Mai T.L., Ge J., Nguyen M.H., Wei S.N. (2012); An adaptive iterative learning control for robot manipulator in task space, International Journal of Computers Communications & Control, 7(3), 518-529, 2012.
[7] Uchiyama M. (1978); Formulation of high-speed motion pattern of a mechanical arm by trial, Transactions of the Society of Instrument and Control Engineers (in Japanese), 14(6), 706-712, 1978.

[8] Xia C., Deong W., Shi T., Yan Y. (2016); Torque ripple minimization of PMSM using parameter optimization based iterative learning control, Journal of Electrical Engineering and Technology, 11(2), 709-718, 2016.

[9] Xu J.-X. (1997); Direct learning of control efforts for trajectories with different magnitude scales, Automatica, 33(12), 2191-2195, 1997.

[10] Yamada M., Riadh Z., Funahashi Y. (1999); Design of discrete-time repetitive control system for pole placement and application, IEEE/ASME Transactions on Mechatronics, 4(2), 110-118, 1999.