Engineering, Technology & Applied Science Research, Vol. 8, No. 4, 2018, 3162-3167, www.etasr.com

Forecasting Parameter Estimates: A Modeling Approach Using Exponential and Linear Regression

W. M. A. W. Ahmad, School of Dental Sciences, Universiti Sains Malaysia, Malaysia, wmamir@usm.my
R. A. A. Rohim, Universiti Sains Malaysia, Malaysia, adawiyah5350@yahoo.com
N. H. Ismail, Universiti Sains Malaysia, Malaysia, noorhuda@usm.my

Abstract-This paper presents a method for calculating the parameter estimates of an exponential equation with a SAS algorithm. Its aim is to assess the efficiency of the resulting parameter estimates through their forecasting performance. The proposed method offers a useful technique for developing an exponential equation with better accuracy. The paper illustrates the method on a sample of data from an established study that characterizes the proliferative capacity of mesenchymal stem cells, and it provides the specific algorithm for the parameter estimates.

Keywords-exponential; SAS algorithm; parameter estimates

I. INTRODUCTION

Regression analysis is a statistical methodology that uses the relationship between two or more quantitative variables so that one variable can be predicted from the other, or others. This methodology is widely used in business and in the social, behavioral and biological sciences, including agriculture and fishery research [1]. For example, fish weight at harvest can be predicted from the relationship between fish weight and other growth-affecting factors such as water temperature, dissolved oxygen, and free carbon dioxide. There are other situations in a fishery where relationships among variables can be exploited through regression analysis [1]. Regression analysis serves three major purposes: (1) description, (2) control and (3) prediction.
We frequently use equations to summarize or describe data, and regression analysis is helpful in developing such equations. For example, we may collect a considerable amount of fish growth data together with data on a number of biotic and abiotic factors; a regression model would probably be a much more convenient and useful summary of those data than a table or a graph. Besides prediction, regression models may be used for control purposes. A cause-and-effect relationship may not be necessary if the equation is to be used only for prediction [2].

A functional relationship between two variables is expressed by a mathematical formula. If x denotes the independent variable and y the dependent variable, a functional relationship is of the form $y = f(x)$. Given a particular value of x, the function indicates the corresponding value of y. A statistical relation, unlike a functional one, is not perfect: in general, the observations do not fall directly on the curve of the relationship.

Depending on the nature of the relationship between x and y, regression approaches may be classified into two categories, linear and nonlinear regression models. Models that are linear in their parameters are known as linear models, whereas in nonlinear models the parameters enter nonlinearly. Linear models are generally satisfactory approximations for most regression applications. There are occasions, however, when an empirically indicated or theoretically justified nonlinear model is more appropriate [3].

A. Linear Regression

Linear regression is used to study the linear relationship between a dependent variable Y and one or more independent variables X. The dependent variable Y must be continuous, while the independent variables may be continuous, binary, or categorical. The initial judgment of a possible relationship between two continuous variables should always be made on the basis of a scatter plot (scatter graph).
This type of plot will show whether the relationship is linear or nonlinear. Performing a linear regression makes sense only if the relationship is linear; other methods must be employed to study nonlinear relationships [4]. A model with a single predictor variable is the most straightforward one. The model can be stated as follows:

$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i \qquad (1)$$

where $y_i$ is the value of the response variable in the ith trial, $\beta_0$ and $\beta_1$ are parameters, $x_i$ is a known constant, namely the ith value of the predictor variable, and $\varepsilon_i$ is a random error term with mean zero and variance $\sigma^2$; the covariance between any two error terms is zero [5].

B. History of the Exponential Function

The exponential is one of the most significant and widely occurring functions. In biology, it may describe the growth of bacteria or animal populations, the reduction of the number of bacteria in response to a sterilization process, the development of a tumor, or the absorption or elimination of a drug. Exponential growth cannot go on forever because of limitations of nutrients, etc. Knowledge of the exponential function makes it easier to understand birth and death rates, even when they are not constant. In physics, the exponential function describes the disintegration of radioactive nuclei, the emission of light by atoms, the absorption of light as it passes through matter, the change of voltage or current in some electrical circuits, the variation of temperature with time as a warm object cools, and the rate of some chemical reactions [1].

Although the exponential distribution provides a simple, elegant and closed-form solution to many problems, it does not offer a reasonable parametric fit for some practical applications where the underlying failure rates are nonconstant, presenting monotone shapes.
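A short worked equation may make the point about failure rates concrete. Assuming the usual parameterization with rate $\lambda > 0$ and lifetime $t \ge 0$ (symbols introduced here for illustration, not taken from the paper), the failure (hazard) rate of the exponential distribution is constant:

```latex
f(t) = \lambda e^{-\lambda t}, \qquad
S(t) = e^{-\lambda t}, \qquad
h(t) = \frac{f(t)}{S(t)} = \lambda .
```

Because $h(t)$ does not depend on $t$, the distribution cannot represent failure rates that increase or decrease over time, which is the limitation that modified exponential distributions are meant to address.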
Recently, to overcome such problems, new classes of models were introduced based on modifications of the exponential distribution. Authors in [6] proposed a generalized exponential distribution, which can accommodate data with increasing and decreasing failure rate functions. Authors in [7] introduced the exponential geometric distribution with decreasing failure rate, authors in [8] proposed a two-parameter distribution known as the exponential-Poisson distribution, which has a decreasing failure rate, and authors in [3] proposed another modification of the exponential distribution with a decreasing failure rate function. This model is derived in a complementary risk scenario [9], where the lifetime associated with a particular risk is not observable; rather, we observe only the maximum lifetime value among all risks.

C. Exponential Growth

Exponential growth is often used to model the growth of organism populations in a resource-rich environment. Here “resource-rich” means that there is an abundance of food and other resources necessary for the population to grow. For example, the initial growth of bacterial cells in a mouth is often modeled as exponential. The justification for this model is that the rate at which a population of organisms grows should be proportional to their number, assuming that the organisms reproduce at a constant rate. For example, if you double the size of a population, then this should precisely double the rate at which the population produces offspring, and should, therefore, double the rate at which the size of the population increases. This means that the population A of a given organism in a resource-rich environment should satisfy the differential equation

$$\frac{dA}{dx} = bA,$$

where x denotes time and b is some constant that depends on the rate of reproduction. Thus the population grows exponentially from its initial size $A_0$:

$$A = A_0 e^{bx}.$$

This model predicts that the population A will grow indefinitely, which cannot be true in any real situation.
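The step from the differential equation to the exponential solution can be made explicit by separation of variables. The sketch below writes the growth constant as $b$, as in the solution, and the initial population as $A_0 = A(0)$; the doubling time $x_d$ is an illustrative quantity added here, not one reported in the paper.

```latex
\frac{dA}{dx} = bA
\;\Longrightarrow\;
\int_{A_0}^{A} \frac{dA'}{A'} = \int_{0}^{x} b \, dx'
\;\Longrightarrow\;
\ln\frac{A}{A_0} = bx
\;\Longrightarrow\;
A = A_0 e^{bx},
\qquad
x_d = \frac{\ln 2}{b}.
```

Setting $A = 2A_0$ in the solution gives the doubling time $x_d$, so a larger $b$ corresponds to faster doubling. The relation $\ln A = \ln A_0 + bx$ in the third step is also what allows the exponential model to be fitted as a straight line on the log scale.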
Eventually, any population will run out of resources such as food or space to grow. However, the exponential model often gives fairly accurate results in cases where the short-term growth of a population is not inhibited by limited resources [10].

D. Interpreting $R^2$

$R^2$ is frequently defined as the proportion of the variance of the response that is predictable (or explained) from the regressor variables, that is, the variability explained by the model. A low value of $R^2$ can suggest that the assumptions of linear regression are not satisfied; plots and diagnostics can be used to substantiate this suspicion.

II. MATERIALS AND METHODS

We used data that characterize the proliferative capacity of mesenchymal stem cells. The data comprise two variables: the days of culture (X) and the population doubling level (lnY). First, we bootstrap the data in order to increase the sample size and to optimize the parameter estimates. Then, we estimate the parameters through exponential curve fitting and transform the nonlinear model into a linear form, which gives a linear equation. From this equation, we estimate the coefficient of the independent variable (x) and fit the data with robust regression weighted by Cauchy, robust regression weighted by Fair and robust regression weighted by Huber. Then a covariate-dependent variable is used to examine the differences in the suitability of the models.

A. The Algorithm of Exponential Calculation

The algorithm shown below describes how the data are entered in SAS and how the bootstrap is carried out.

• Data in SAS format. The name of the dataset is given as cell_growth.
The data consist of the variables x, y and ln y (lny):

    data cell_growth;
    input x y lny;
    cards;
    0.00 38.00 3.64
    5.00 39.31 3.67
    8.00 39.74 3.68
    10.00 40.98 3.71
    13.00 43.10 3.76
    17.00 45.78 3.82
    20.00 49.15 3.89
    22.00 49.90 3.91
    24.00 53.98 3.99
    28.00 57.46 4.05
    31.00 61.03 4.11
    34.00 63.80 4.16
    37.00 65.52 4.18
    40.00 68.54 4.23
    44.00 72.62 4.29
    47.00 75.42 4.32
    50.00 79.38 4.37
    53.00 83.31 4.42
    ;
    run;

• Adding the bootstrapping algorithm to the methodology. The cell_growth data were bootstrapped two times with resampling. The procedure is given in SAS syntax below. The new data set generated by the SAS procedure is named booted, and the data produced in the study are printed through the print procedure.

• We also add the syntax ods rtf file='abc.rtf' style=journal in order to get the output in Microsoft Word format.

    %MACRO bootstrap(data=_last_, booted=booted, boots=2, seed=1234);
    DATA &booted;
    /* draw a random observation number between 1 and n */
    pickobs = INT(RANUNI(&seed)*n)+1;
    SET &data POINT = pickobs NOBS = n;
    /* label the bootstrap replicate and count the rows written so far */
    REPLICATE=int(i/n)+1;
    i+1;
    IF i > n*&boots THEN STOP;
    RUN;
    %MEND bootstrap;
    ods rtf file='abc.rtf' style=journal;
    %bootstrap(data= cell_growth, boots=2);
    run;
    proc print data=booted;
    run;

B. PROC SQL

PROC SQL is the SAS procedure that implements SQL. We can use this procedure to modify, retrieve and report data in tables and views. SAS creates the output in the form of tables. The syntax below shows the calculation of the corrected sum of squares and the exponential parameter estimates. Through the syntax provided, we are able to measure the fitness of the parameter estimates through the R-Square and AIC values.

    Title "CORRECTED SUM OF SQUARES";
    proc sql;
    select css(y) into :CORRECTED_SUM_OF_SQUARESy from cell_growth;
    quit;
    Title "EXPONENTIAL PARAMETER ESTIMATES";
    ods graphics/imagename="ExponentialFit";
    proc nlin data= booted plots=fit;
    parameters A=1 b=0;
    model y = A * exp(b*x);
    ods output EstSummary=summExp;
    run;
    proc sql;
    /* CALCULATED lets a SELECT expression reuse the aliases n and SSE */
    select N.nValue1 as n,
    SSE.nValue1 as SSE,
    1 - calculated SSE/&CORRECTED_SUM_OF_SQUARESy as RSquare,
    calculated n * log(calculated SSE/calculated n) + 2*2 as AIC
    from summExp as SSE, summExp as N
    where N.Label1="Observations Used" and
    SSE.Label1="Objective";
    quit;

C. The Syntax of Regression Modeling Based on Four Types of Calculation

• Calculation of regression based on robust regression. The full syntax is given as follows.

    Title "REGRESSION";
    /* ROBUST REGRESSION */
    proc robustreg method=mm data=booted;
    model lny = x / diagnostics leverage;
    output out=robout r=resid sr=stdres;
    run;

• Calculation of regression based on robust (weighted) Huber. The full syntax is given as follows.

    /* ROBUST (WEIGHTED) HUBER */
    Title "ROBUST (WEIGHTED) HUBER";
    proc robustreg method=m(wf=huber(c=1.345)) data=booted;
    model lny = x / diagnostics leverage;
    output out=robout r=resid sr=stdres;
    run;

• Calculation of regression based on robust (weighted) Cauchy. The full syntax is given as follows.

    /* ROBUST (WEIGHTED) CAUCHY */
    Title "ROBUST (WEIGHTED) CAUCHY";
    proc robustreg method=m(wf=cauchy(c=2.385)) data=booted;
    model lny = x / diagnostics leverage;
    output out=robout r=resid sr=stdres;
    run;

• Calculation of regression based on robust (weighted) Fair. The full syntax is given as follows.

    /* ROBUST (WEIGHTED) FAIR */
    Title "ROBUST (WEIGHTED) FAIR";
    proc robustreg method=m(wf=fair(c=1.4)) data=booted;
    model lny = x / diagnostics leverage;
    output out=robout r=resid sr=stdres;
    run;
    ods rtf close;

The syntax “ods rtf close” gives the order to close the Microsoft Word file. This means that the output will be generated in the Microsoft Word format.

III. RESULTS

The results for the first model, which does not involve weighted procedures, are given in Tables I and II. Table I shows that the model predicts the dependent variable well. The p-value is less than 0.05, which indicates that, overall, the model statistically significantly predicts the outcome variable and is a good fit for the data. Figure 1 shows the fit plot for y.

A. Result for Exponential Fit

Fig. 1. The fit plot for y vs x

TABLE I. ANOVA
Source   DF   Sum of Squares   Mean Square   F Value   Approx Pr > F
Model    2    100489           50244         39450.4   <.0001
Error    34   43.3029          1.273
Total    36   100533
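The AIC expression in the PROC SQL step, n·log(SSE/n) + 2k with k = 2 estimated parameters (A and b), can be checked by hand from the error sum of squares and the total degrees of freedom reported in Table I. The data step below is a minimal sketch of that check, assuming that the 36 total degrees of freedom equal the number of observations used; the data set name aic_check and its variable names are hypothetical.

```sas
/* Sketch only: recompute the AIC of the exponential fit from Table I.   */
/* aic_check, n, sse, k and aic are illustrative names, not the paper's. */
data aic_check;
   n   = 36;        /* observations used (Total DF in Table I)           */
   sse = 43.3029;   /* error sum of squares from Table I                 */
   k   = 2;         /* number of estimated parameters (A and b)          */
   aic = n*log(sse/n) + 2*k;   /* same form as the PROC SQL expression   */
   put aic= 8.4;    /* writes the value to the SAS log                   */
run;
```

With these inputs the value is about 10.65, in line with the AIC reported for the exponential fit. Note that this form of the AIC drops additive constants, so it is only comparable across models fitted to the same response with the same n.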
TABLE II. PARAMETER ESTIMATES
Parameter   Estimate   Approx Std. Error   Approximate 95% Confidence Limits
A           35.9735    0.2564              35.4525   36.4945
b           0.0160     0.000219            0.0155    0.0164

From Table II, we can write the exponential model as

$$y = 35.9735\,e^{0.0160x} \qquad (2)$$

Model (2) can be transformed into a linear form by taking the natural logarithm of both sides. It can be written as follows:

$$\ln y = \ln(35.9735) + 0.0160x$$
$$\ln y = 3.5827 + 0.0160x$$

Table III gives the information on the exponential fit. The R-Square value indicates how much of the total variation in the dependent variable, or variability of the data, is explained by the regression model. In this case, 98.83% can be explained, which is very large. The AIC value is about 10.649, which is the smallest value among all proposed methods. A good model is the one that has the minimum AIC value.

TABLE III. EXPONENTIAL FIT
n    SSE         R-Square   AIC
36   43.302936   0.988339   10.64925

B. Result for Robust Regression

Figure 2 shows the fit plot for lny vs x. Previously we used the original data, while in this section we use the transformed data. Table IV shows the parameter estimates. We can write the model as:

$$\ln y = 3.5729 + 0.0162x \qquad (3)$$

The R-Square value indicates the proportion of the total variation in the dependent variable that is explained. In this case, 78.94% can be explained and the AIC value is 28.24. Detailed information is given in Table V.

Fig. 2. Fit plot for lny vs x

C. Result for Robust (Weighted) Huber

Figure 3 gives the plot of lny vs x for the Huber robust regression. From Table VI, we can write the robust regression weighted by Huber as:

$$\ln y = 3.5799 + 0.0160x \qquad (4)$$

For the robust regression weighted by Huber, the value of R-Square is 0.9577 and the AIC value is 24.3544.

Fig. 3. Fit plot for lny

TABLE IV. PARAMETER ESTIMATES
Parameter   DF   Estimate   Std. Error   95% Confidence Limits   Chi-Square   Pr > ChiSq
Intercept   1    3.5729     0.0087       3.554    3.590          168673       <.0001
x           1    0.0162     0.0003       0.016    0.017          2383.2       <.0001
Scale       0    0.0282

TABLE V. PARAMETER ESTIMATION
Goodness-of-Fit
Statistic   Value
R-Square    0.7894
AICR        28.2413
BICR        32.9392
Deviance    0.0205

D. Result for Robust (Weighted) Cauchy

In Figure 4, the result of the robust regression weighted by Cauchy is plotted. From Table VIII, we can write the robust regression weighted by Cauchy as follows:

$$\ln y = 3.5798 + 0.0160x \qquad (5)$$

The value of R-Square is 0.4797 and the AIC value is 377.98. A robust regression weighted by Cauchy therefore seems not to be a good procedure for forecasting.

Fig. 4. Fit plot for lny
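The weighted robust fits differ only in the weight function requested through wf=huber(c=1.345), wf=cauchy(c=2.385) and wf=fair(c=1.4). The data step below is a small sketch that evaluates the textbook forms of these three weight functions at a few standardized residuals; the formulas are the standard ones as I understand them and should be checked against the PROC ROBUSTREG documentation, and the data set name weight_demo and all variable names are hypothetical.

```sas
/* Sketch only: textbook Huber, Cauchy and Fair weights for a residual r. */
/* weight_demo and its variables are illustrative, not from the paper.    */
data weight_demo;
   do r = 0, 0.5, 1, 2, 4, 8;           /* example standardized residuals */
      a = abs(r);
      /* Huber: weight 1 up to c = 1.345, then c/|r| */
      if a <= 1.345 then w_huber = 1;
      else w_huber = 1.345 / a;
      /* Cauchy: 1 / (1 + (|r|/c)^2) with c = 2.385 */
      w_cauchy = 1 / (1 + (a / 2.385)**2);
      /* Fair: 1 / (1 + |r|/c) with c = 1.4 */
      w_fair = 1 / (1 + a / 1.4);
      output;
   end;
   drop a;
run;

proc print data=weight_demo noobs;
run;
```

All three functions give a weight of 1 at r = 0 and shrink toward 0 as the residual grows, so observations far from the fitted line contribute less to the estimates. Huber leaves small residuals at full weight, whereas Cauchy and Fair start downweighting immediately.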
TABLE VI. PARAMETER ESTIMATES FOR HUBER REGRESSION
Parameter   DF   Estimate   Std. Error   95% Confidence Limits   Chi-Square   Pr > ChiSq
Intercept   1    3.5799     0.0076       3.5651   3.594          222668       <.0001
x           1    0.0160     0.0003       0.0154   0.016          2807.09      <.0001
Scale       1    0.0335

TABLE VII. PARAMETER ESTIMATES FOR HUBER REGRESSION
Goodness-of-Fit
Statistic   Value
R-Square    0.9577
AICR        24.3544
BICR        29.2774
Deviance    0.0249

TABLE VIII. PARAMETER ESTIMATES FOR CAUCHY REGRESSION
Parameter   DF   Estimate   Std. Error   95% Confidence Limits   Chi-Square   Pr > ChiSq
Intercept   1    3.5798     0.0080       3.5642   3.595          201478       <0.0001
x           1    0.0160     0.0003       0.0154   0.016          2540.99      <0.0001
Scale       1    0.0336

TABLE IX. PARAMETER ESTIMATES FOR CAUCHY REGRESSION
Goodness-of-Fit
Statistic   Value
R-Square    0.4797
AICR        377.9854
BICR        383.0731
Deviance    0.4252

E. Result for Robust (Weighted) Fair

Below is the result of the Fair robust regression. From Table X, the Fair robust regression can be written as in (6). The value of R-Square is 0.9616 and the AIC value is 12.309.

$$\ln y = 3.5803 + 0.0160x \qquad (6)$$

Fig. 5. Fit plot for lny

TABLE X. PARAMETER ESTIMATES FOR FAIR REGRESSION
Parameter   DF   Estimate   Std. Error   95% Confidence Limits   Chi-Square   Pr > ChiSq
Intercept   1    3.5803     0.0086       3.563    3.597          171908       <0.0001
x           1    0.0160     0.0003       0.015    0.016          2162.9       <0.0001
Scale       1    0.0336

TABLE XI. PARAMETER ESTIMATES FOR FAIR REGRESSION
Goodness-of-Fit
Statistic   Value
R-Square    0.9616
AICR        12.3093
BICR        17.8695
Deviance    0.0121

F. Method Comparison

Table XII shows the comparison of the five different methods. We summarize the output of the fitted models with their parameter estimates.

TABLE XII. PARAMETER ESTIMATES COMPARISON
Method              Model                                R-Square   AIC
Exponential Fit     $y = 35.9735\,e^{0.0160x}$           0.988339   10.64925
                    or in a linear form
                    $\ln y = 3.5827 + 0.0160x$
Robust Regression   $\ln y = 3.5729 + 0.0162x$           0.7894     28.2413
Robust Huber        $\ln y = 3.5799 + 0.0160x$           0.9577     24.3544
Robust Cauchy       $\ln y = 3.5798 + 0.0160x$           0.4797     377.9854
Robust Fair         $\ln y = 3.5803 + 0.0160x$           0.9616     12.3093

IV. DISCUSSION AND CONCLUSION

The main objective of this research is to compare the parameter estimates of several proposed calculations and to find the calculation that best represents the data through modeling techniques. We have given an example with cell doubling data using PROC NLIN and PROC ROBUSTREG. In this paper, five different methods were used (Table XII): (i) exponential fit, (ii) robust regression, (iii) robust (weighted) Huber, (iv) robust (weighted) Cauchy and (v) robust (weighted) Fair. This paper provides only a preliminary overview of the different techniques that can be employed. From the results, we can see that the exponential fit shows a very good fitting result, followed by the robust regression weighted by Fair. Both methods produced the highest $R^2$ and the lowest AIC. This indicates that the exponential fit is the best method, followed by robust Fair regression. The other methods fit the model poorly, which may be due to some outliers in the output. The best way to handle outliers is to delete them from the data set and then rerun the analysis. To keep the efficiency and accuracy of the proposed model, it is necessary to have a good way of calculation with some improvements of the proposed strategy. The exponential fit reveals the findings more explicitly than the other proposed methods.

ACKNOWLEDGMENT

Authors would like to express their gratitude to Universiti Sains Malaysia for providing the research funding (Grant no. 1001/PPSG/8012278, School of Dental Sciences, Universiti Sains Malaysia).

REFERENCES

[1] N. R. Draper, H. Smith, Applied Regression Analysis, Wiley Eastern, 1998
[2] R. J. Tallarida, R. B. Murray, “Exponential growth and decay”, Manual of Pharmacologic Calculations, Springer, 1987
[3] R. Tahmasbi, S. Rezaei, “A two-parameter lifetime distribution with decreasing failure rate”, Computational Statistics & Data Analysis, Vol. 52, No. 8, pp. 3889-390, 2008
[4] A. Schneider, G. Hommel, M. Blettner, “Linear Regression Analysis”, Deutsches Arzteblatt International, Vol. 107, No. 44, pp. 776-82, 2010
[5] D. C. Montgomery, E. Peck, G. Vining, Introduction to Linear Regression Analysis, 3rd Edition, John Wiley and Sons, 2003
[6] R. Gupta, D. Kundu, “Generalized Exponential Distributions”, Australian and New Zealand Journal of Statistics, Vol. 41, No. 2, pp. 173-188, 1999
[7] K. Adamidis, S. Loukas, “A lifetime distribution with decreasing failure rate”, Statistics & Probability Letters, Vol. 39, No. 1, pp. 35-42, 1998
[8] D. Kus, “A new lifetime distribution”, Computational Statistics and Data Analysis, Vol. 11, No. 9, pp. 4497-4509, 2007
[9] F. Louzada-Neto, “Poly hazard regression models for lifetime data”, Biometrics, Vol. 55, No. 4, pp. 1281-1285, 1999
[10] J. U. Kreft, G. Booth, J. W. T. Wimpenny, “BacSim, a simulator for individual-based modeling of bacterial colony growth”, Microbiology, Vol. 144, No. 12, pp. 3275-3287, 1998