www.biomathforum.org/biomath/index.php/biomath ORIGINAL ARTICLE Identification of HIV Dynamic System in the Case of Incomplete Experimental Data P. Mathye∗, I. Fedotov∗ and M. Shatalov∗† ∗Department of Mathematics and Statistics Tshwane University of Technology, Pretoria, South Africa Email: mathyeph@gmail.com; FedotovI@tut.ac.za; ShatalovM@tut.ac.za †Manufacturing and Materials Council for Scientific and Industrial Research (CSIR), Pretoria, South Africa Received: 6 October 2015, accepted: 14 December 2015, published: 17 January 2016 Abstract—In this paper we apply an inverse method that estimates parameters of deterministic mathematical models to an HIV model. We consider the case where experimental data concerning the values of some variables is incomplete or unknown. The objective is to estimate the parameters and to restore the information concerning the behaviour of the incomplete data. The method is based on integrating both sides of equations of a dynamic sys- tem, and applying some minimization methods (for example least square method). Such an approach was first suggested in [7] and [8]. Analysis of the HIV model and a corresponding numerical example is presented. Keywords-Inverse problem, least square methods, parameter estimation, HIV model, incomplete data. I. INTRODUCTION Several HIV vaccine models have been devel- oped over the recent years (e.g., [1] and [4]). These models describe and predict the potential epidemi- ological impact of vaccination. For the models to give insight into the transmission dynamics of HIV, model parameters are of great significance. The parameters for these model are estimated based on HIV seroprevalence data. Most often the data on testing and treatment history is incomplete (missing). This barrier can be as a result of the stigma attached to and the discrimination against people living with HIV and AIDS. In studying the dynamics of this world pandemic, the availability of recorded valuable data is thus a challenge. The proposed method is based on eliminating the unknown (missing) state variables from the original system algebraically. The resultant sys- tem is then used to estimate the unknown model parameters. The discussion on the identification of dynamic systems with incomplete data and restoring the missing data is elementary and can be suitable for students in basic courses on dynamic systems identification. A four dimensional deterministic model for transmission dynamics of HIV in the presence of a preventive vaccine is considered as an example. The model is identified in the cases of incomplete data. It is assumed that the population sizes of individuals infected by the wild type strain and by both the wild and the vaccine strains is unknown. For this study, the data is artificially generated from the given parameters from Gumel’s paper [4]. We then ’forget’ about the these parameters and Citation: P. Mathye, I. Fedotov, M. Shatalov, Identification of HIV Dynamic System in The Case of Incomplete Experimental Data, Biomath 4 (2015), 1512141, http://dx.doi.org/10.11145/j.biomath.2015.12.141 Page 1 of 7 http://www.biomathforum.org/biomath/index.php/biomath http://dx.doi.org/10.11145/j.biomath.2015.12.141 P. Mathye et al., Identification of HIV Dynamic System in The Case of Incomplete ... part of the generated data. Applying the proposed method we estimate the parameters and then re- store the ’forgotten’ data. The performance of the proposed method with real life data remains to be investigated in the future. The remainder of the paper is structured as follows. Section II briefly outlines the formulation of the problem. Section III discusses the inverse method used in solving the problem. Section IV gives a description of the mathematical model that is used as an example. Sections V, VI and VII discusses the method of solution, parameter esti- mations and a numerical simulation respectively. Finally, concluding remarks are made in Section VIII. II. PROBLEM FORMULATION Consider a system of ordinary differential equa- tions of the form dx dt = f(t,x,y,θ), (1) dy dt = g(t,x,y,θ), (2) where (x,y) is a vector-function [0,T] 3 t → (x,y) ∈ Rm × Rn subject to experimental information concerning the values of x(tj) at points tj ∈ [0,T], (tj = 0, 1, . . . ,N) are known. Now, suppose the information concerning the values y(tj) is either incomplete or unknown. It may be, for example, some statistical data of the form: Table 1: Experimental data. t0 · · · tj · · · tN x0 = x(t0) · · · xj = x(tj) · · · xN = x(tN ) y0 = ? · · · yj = ? · · · yN = ? The parameter θ is l dimensional, that is θ ∈ A ⊂ Rl where A can coincide with Rl (no constrains between the entries of θ) and A can be subset of Rn (there are constraints). The purpose of this study is to identify the model parameters θ and to restore (recover) the information concerning the behaviour of the in- complete or unknown of y(t). III. INVERSE METHOD Solve (1) with respect to y to get y = h ( t,x, dx dt ,θ ) . (3) Substitute y by h ( t,x, dx dt ,θ ) in (2) to obtain d dt h ( t,x, dx dt ,θ ) = g ( t,x,h ( t,x, dx dt ,θ ) ,θ ) . (4) For this study consider the case where (4) is linear with respect to the coefficients ck(θ): d dt h0 ( t,x, dx dt ) = N∑ k=1 ck(θ)hk ( t,x, dx dt , d2x dt2 ) , (5) where ck : A 3 θ 7→ ck(θ) is a scalar function. The parameter identification for the model is based on the direct integration of the dynamic system with posterior application of a quadrature rule (for example, the adaptive trapezoidal rule). Some minimization methods (for example, least squares method (see for example [5])) is then applied to find the estimates. Integrating (5) twice with respect to t from t0 to ti, (i = 1, 2, . . . ,N) yields: Aα−h = 0 (6) where A = ∫ ti t0 (ti − τ)hk ( τ,x(τ), dx(τ) dτ , d2x(τ) dτ2 ) dτ, α = ck(θ), and h = ∫ ti t0 h0 ( τ,x(τ), dx(τ) dτ ) dτ. The values of the unknown parameters α in (6) can be determined using method of least squares. Suppose there are some constraints to be sat- isfied amongst the parameters α. The problem is thus to minimize the Lagrangian L(α,λ) = (Aα−h)> (Aα−h) + 2λ> (Cα−b) , (7) where λ is the Lagrange multiplier and Cα = b (8) Biomath 4 (2015), 1512141, http://dx.doi.org/10.11145/j.biomath.2015.12.141 Page 2 of 7 http://dx.doi.org/10.11145/j.biomath.2015.12.141 P. Mathye et al., Identification of HIV Dynamic System in The Case of Incomplete ... the constraints that must be satisfied. The necessary conditions for a constrained min- imum at α∗ is the existence of vector λ* such that ∇αL(α∗,λ∗) = (α∗)>A>A−h>A + (λ∗)>C = 0 (9) and ∇λL(α∗,λ∗) = Cα∗ −b = 0. (10) That is, we obtain the system:( A>A C> C 0 )( α∗ λ∗ ) = ( A b ) (11) where the matrix:( A>A C> C 0 ) (12) is a square matrix with a non-zero determinant. The solution to the linear system (11) is given by ( α∗ λ∗ ) = ( A>A C> C 0 )−1 ( A b ) . (13) With the the parameters α = ck(θ) known, the information concerning the behaviour of the incomplete or unknown state variable y(t) can now be restored from (3). IV. MATHEMATICAL MODEL The model monitors four populations namely: HIV susceptible (X), unvaccinated individuals in- fected by wild type (Yw), individuals uninfected by the wild type but infected by the vaccine strain (Yv) and individuals dually-infected with the vaccine and wild strain (Yvw). The total (sexual activity) population size is N = N(t) = X + Yv + Yw + Yvw. The model is a modified version of those studied by Blower and Gumel in [1] and [4] respectively. A. HIV susceptible (X) Individuals are recruited, by birth or immigra- tion, into this population at the rate p1 . This population is reduced by the natural cessation of sexual activity at a rate µ, infection with the vaccine strain at the rate αv, the wild strain and infected by the dual infected individuals at the rate αw. In this case it is assumed that the dual infected individuals can only transmit the wild strain. Thus, dX dt = p1 −µ1X −αv XYv N −αw X(Yw + Yvw) N (14) B. Individuals uninfected by the wild type but infected by the vaccine strain (Yv) This population increases through the new sus- ceptible being vaccinated at the rate p2 and by infection by the vaccine strain at the rate αv. It is decreased by infection with the wild strain and in- fected by the dual infected individuals at the rate γ. The parameter γ is of the form (1−ψ)cβw where ψ is the degree of protection that the vaccine pro- vides against infection with the wild-type strain, c the number of sexual partners and βw the rate of infection by the wild-type strain. The population is further decreased by natural cessation from sexual activity and by death induced by infection with the wild strain at a rate µ2. dYv dt = p2 + αv XYv N −γ Yv(Yw + Yvw) N −µ2Yv, (15) C. Unvaccinated individuals infected by wild type (Yw) This population increases through the suscep- tible being infected by the wild strain and by the dual infected individuals at the rate αw. The population is decreased by natural cessation from sexual activity and by death induced by infection with the wild strain at a rate µ3. dYw dt = αw X(Yw + Yvw) N −µ3Yw, (16) D. Individuals dually-infected with the vaccine and wild strain (Yvw) This population increases through infection with the wild strain and infected by the dual infected individuals at the rate γ. The population is de- creased by natural cessation from sexual activity Biomath 4 (2015), 1512141, http://dx.doi.org/10.11145/j.biomath.2015.12.141 Page 3 of 7 http://dx.doi.org/10.11145/j.biomath.2015.12.141 P. Mathye et al., Identification of HIV Dynamic System in The Case of Incomplete ... and by death induced by infection with the wild strain at a rate µ4. dYvw dt = γ Yv(Yw + Yvw) N −µ4Yvw. (17) The complete model is thus: dX dt = p1 −µ1X −αv XYvN −αw X(Yw+Yvw) N , dYv dt = p2 + αv XYv N −γYv(Yw+Yvw) N −µ2Yv, dYw dt = αw X(Yw + Yvw) N −µ3Yw, dYvw dt = γ Yv(Yw + Yvw) N −µ4Yvw,   (18) Adding all the equations of the system (18) gives dN dt = p1+p2−µ1X−µ2Yv−µ3Yw−µ4Yvw (19) All model parameters are nonnegative. V. METHOD OF SOLUTION Suppose that N = N(t),X = X(t),Yv = Yv(t) are known and that Yw and Yvw are unknown. From N = X + Yv + Yw + Yvw, (20) it is clear that the sum Yw + Yvw is also known. In fact, Yw + Yvw = N −X −Yv, (21) Let Z = Yw + Yvw. (22) The system with equations (14, 15, 16) and (17) then becomes dX dt = p1 −µ1X −αv XYv N −αw XZ N , (23) dYv dt = p2 + αv XYv N −γ YvZ N −µ2Yv, (24) dYw dt = αw XZ N −µ3Yw, (25) dYvw dt = γ YvZ N −µ4Yvw. (26) By using the inverse method the parameters in Equations (23) and (24) can be determined. That is, parameters p1,p2,µ1,µ2,αv,αw and γ can now be considered known. Adding Equations (25) and (26) and rearranging yields µ3Yw + µ4Yvw = αw XZ N + γ YvZ N − dZ dt . (27) Let M = αw XZ N + γ YvZ N − dZ dt , (28) Equation (27) then becomes µ3Yw + µ4Yvw = M. (29) From (22) and (29), form the system Yw + Yvw = Z, (30) µ3Yw + µ4Yvw = M. (31) Solving this system we obtain Yw = µ4Z −M ∆ (32) and Yvw = M −µ3Z ∆ (33) where ∆ = µ4 −µ3. (34) Differentiating Equation (32) yields, dYw dt = µ4 ∆ dZ dt − 1 ∆ dM dt (35) Equating (25) and (35) yields, µ4 ∆ dZ dt − 1 ∆ dM dt = αw XZ N −µ3Yw (36) Simplifying µ4 ∆ dZ dt − 1 ∆ dM dt = αw XZ N − µ3µ4 ∆ Z+ µ3 ∆ M (37) Rearranging gives, µ4 ∆ dZ dt − 1 ∆ dM dt = αw XZ N − µ3µ4 ∆ Z+ µ3 ∆ M (38) For convenience, a1 dZ dt + a2 dM dt + a3M + a4Z + F = 0 (39) Biomath 4 (2015), 1512141, http://dx.doi.org/10.11145/j.biomath.2015.12.141 Page 4 of 7 http://dx.doi.org/10.11145/j.biomath.2015.12.141 P. Mathye et al., Identification of HIV Dynamic System in The Case of Incomplete ... where a1 = µ4 ∆ ,a2 = − 1 ∆ ,a3 = − µ3 ∆ , a4 = µ3µ4 ∆ and F = −αw XZ N . The relationship between the parameters a1,a2,a3 and a4 is given by the following constrains a1 + a3 = 1, (40) and a1a3 −a2a4 = 0. (41) Suppose that the parameters a1,a2,a3 and a4 are obtained, then µ4 = − a1 a2 and µ3 = a1 − 1 a2 . Finally, with Z, M, µ3 and µ4 known, the unknown variables Yw and Yvw can be restored from (32) and (33). VI. PARAMETER ESTIMATION Recall (23), dX dt = p1 −µ1X −αv XYv N −αw XZ N , (42) Integrating both sides with respect to t from t0 to ti, gives ∆Xi = p1ti −µ1Ji −αvKi −αwPi, (43) where the integrals ∆Xi = ∫ ti t0 dX(τ) dt dτ, Ki = ∫ ti t0 X(τ)Yv(τ) N(τ) dτ, Ji = ∫ ti t0 X(τ)dτ and Pi = ∫ ti t0 X(τ)Z(τ) N(τ) dτ, are evaluated using the trapezoidal rules. The system (43) is of the form Aa = h, where A =   t1 −J1 −K −P1... ... ... ... tN −JN −K −PN   , a =   p1 µ1 αv αw   and h =   ∆X1... ∆XN   . Solving the regression problem minimize‖Aa − h‖2 (44) using least squares method we obtain the estimate ã. Thus the parameters p1,µ1,αv and αw are found. Now recall (24), dYv dt = p2 + αv XYv N −γ YvZ N −µ2Yv, (45) Similarily, integrating both sides of (45) yields ∆Yv = p2ti + αvKi −γQ1 −µ2Ii, (46) where ∆Yv = ∫ ti t0 dYv(τ) dt dτ, Ki = ∫ ti t0 X(τ)Yv(τ) N(τ) dτ, Qi = ∫ ti t0 X(τ)Z(τ) N(τ) dτ and Ii = ∫ ti t0 Y (τ)dτ. The system (46) is of the form Aa = h, where A =   t1 −Q −I1... ... ... tN −Q −IN   , a =   p2γ µ2   and h =   ∆Y1 −αvK1... ∆YN −αvKN   . Solving the regression problem minimize‖Aa − h‖2 (47) using least squares method we obtain the estimate . Thus the parameters γ,µ2 and p2 are found. Lastly recall (39), a1 dZ dt + a2 dM dt + a3M + a4Z + F = 0 (48) subject to the constraints a1 + a3 − 1 = 0 and a1a3 −a2a4 = 0. (49) Integrating both sides of (48) with respect to t from t0 to ti, gives a1∆Zi +a2∆Mi +a3Ui +a4Wi + ∆Fi = 0 (50) Biomath 4 (2015), 1512141, http://dx.doi.org/10.11145/j.biomath.2015.12.141 Page 5 of 7 http://dx.doi.org/10.11145/j.biomath.2015.12.141 P. Mathye et al., Identification of HIV Dynamic System in The Case of Incomplete ... where ∆Zi = ∫ ti t0 dZ(τ) dt τ, ∆Mi = ∫ ti t0 dM(τ) dt τ, Ui = ∫ ti t0 M(τ)dτ, Wi = ∫ ti t0 Z(τ)dτ and ∆Fi = −αw ∫ ti t0 X(τ)Yv(τ) N(τ) dτ. To find the unknown parameters a1,a2,a3 and a4, the Lagrangian L3 can be written with one constraint as L3(a) = 1 2 n∑ i=0 [ a1(Zi −Ui) + a2Mi + a4Wi+ (∆Fi + Ui) ]2 + λ(a1 −a21 −a2a4) (51) where λ1 is the Lagrange multipliers and a = (a1,a2,a4). To minimize the least-square func- tional L3, set ∂L3 ∂a1 = ∂L3 ∂a2 = ∂L3 ∂a4 = ∂L3 ∂λ = 0. (52) The resultant system of equations is then solved using the method proposed by Fedotov et al [3] for finding roots of transcendental algebraic equations. The parameter value a3 is finally obtained from a3 = 1 −a1. VII. NUMERICAL SIMULATION In order to illustrate the effectiveness of the method a numerical example is presented. The system (18) is solved numerically by Adams method using a mathematical software Mathcad. The following parameter values and initial condi- tions from Gumel [4] are used. p1 = 400, p2 = 1.6×103, αv = 2.5, αw = 2.25, γ = 0.9µ1 = 0.031, µ2 = 0.0331, µ3 = 0.0281, µ4 = 0.231, X 0 = 8 × 104, Y 0v = 2000, Y 0w = 8000, Y 0 vw = 8000 The solution vectors obtained are taken as experi- mental data. We then assume that the experimental data concerning the state variables Yw, Yvw and the model parameter values to be unknown. The method discussed in the sections above is then applied to estimate the model parameter values and restore the vectors Yw and Yvw. The results given in the table below, show a comparison of the actual parameters used, the estimated parameters and the percentage error given by ‖α− α̃‖/‖α‖× 100. Table 2: The parameter estimates and errors. Parameter Actual value Estimated value %Error α α̃ p1 400 400.142 0.035 p2 1.6 ×103 1.6 ×103 0.000 µ1 0.031 0.031 0.000 µ2 0.331 0.326 1.511 µ3 0.281 0.282 0.356 µ4 0.231 0.229 0.866 αv 2.5 2.495 0.200 αw 2.25 2.25 0.000 γ 0.9 0.905 0.556 From Table 2 it can be seen that the estimate parameters are close enough to the actual ones. The percentage relative errors for this estimates are mostly low than 1 %. Let Yw := ( U<3> ) i and ( Yvw := U <4> ) i be the solution vectors obtained from solving the system (18) and Y Ywi and Y Yvwi the estimated vectors. From Figures 1 and 2, it can be seen that the estimated vectors, Y Ywi and Y Yvwi are within acceptable limit of error. VIII. CONCLUSION In this paper a method to identify dynamic mathematical models with incomplete (missing) data was discussed. The method was applied to a four dimensional HIV vaccination model. The model parameters and the unknown (missing) data were restored. The proposed method gives a direct hint of what is necessary to measure in practice and what data can be analytically restored (found). Biomath 4 (2015), 1512141, http://dx.doi.org/10.11145/j.biomath.2015.12.141 Page 6 of 7 http://dx.doi.org/10.11145/j.biomath.2015.12.141 P. Mathye et al., Identification of HIV Dynamic System in The Case of Incomplete ... Fig. 1. Comparison of the estimated solutions, Y Y wi, with the “experimental data”, U<3>i , and the percentage relative error. REFERENCES [1] S. M. Blower, K. Koelle, D. Krischner & J. Mills, Live attenuated HIV vaccine: predicting the trade-off between efficacy and safety, Proc Nat Acad Sci, 98(6) : 3618 − 3623, 2001. [2] F. Ding, G. Liu & X. P. Liu, Parameter estimation with scarce measurements, Automatica, 47 : 1646 − 1655, 2001. [3] I. Fedotov, M. Shatalov & J. N. Mwambakana, Roots of transcedental algebraic equations: A method of bracket- ing roots and selecting initial estimations, Buffelsfontein TIME 2008 Peer-review Conference Procedings, 22−26 September, 2008. [4] A. B. Gumel, S. M. Moghadas & R. E. Mickens, Ef- fect of a preventive vaccine on the dynamics of HIV transmission, Communication in Nonlinear Science and Numerical Simulations, 9 : 649−659, 2004. Fig. 2. Comparison of the estimated solutions, Y Y vwi, with the “experimental data”, U<4>i , and the percentage relative error. [5] C. L. Lawson & R. J. Hanson, Solving Least Square Problems, New Jersey, America: Prentice–Hall, 1974. [6] M. A. Nowak& A. R. Mclean, A mathematical model of vaccination against HIV to prevent the development of AIDS, RGMIA, Victoria University, 10(1, 2) : 106−116, 2007. [7] M. Shatalov & I. Fedotov, On identification of dynamic systems parameters from experimental data, RGMIA, Victoria University, 10(1, 2) : 106−116, 2007. [8] M. Shatalov,I. Fedotov & S. V. Joubert, A novel method of interpolation and extrapolation of functions by a linear initial value problem, Buffelsfontein TIME 2008 Peer- review Conference Procedings, 22−26 September, 2008. Biomath 4 (2015), 1512141, http://dx.doi.org/10.11145/j.biomath.2015.12.141 Page 7 of 7 http://dx.doi.org/10.11145/j.biomath.2015.12.141 Introduction PROBLEM FORMULATION INVERSE METHOD Mathematical model HIV susceptible (X) Individuals uninfected by the wild type but infected by the vaccine strain (Yv) Unvaccinated individuals infected by wild type (Yw) Individuals dually-infected with the vaccine and wild strain (Yvw) Method of Solution Parameter Estimation Numerical simulation Conclusion References