http://journals.cihanuniversity.edu.iq/index.php/cuesj
CUESJ 2022, 6 (2): 133-140

Research Article

The Use of Tobit and Logistic Regression Models to Study Factors that Affect Blood Pressure in Cardiac Patients

Bekhal S. Sedeeq, Banaz W. Y. Meran
Department of Statistics, College of Administration and Economics, Salahaddin University-Erbil, Kurdistan Region - F.R. Iraq

ABSTRACT

This research studies the factors that affect blood pressure in cardiac patients using the Tobit and logistic regression models. The data were collected from 500 patients with heart disease at the hospital heart center in Erbil. The two levels of blood pressure, low and high, were taken from the patients as the dependent variable, together with six independent variables (gender, age, urea, cholesterol, creatinine, and weight). The high and low readings of each patient were combined into a single value with the mean arterial pressure (MAP) equation. The threshold value is 93.33, which is the MAP of a reading of 120/80 mmHg and represents a normal level of human blood pressure. The aim of this research is to explain the main concepts and processes of Tobit regression analysis (censored and truncated) and logistic regression analysis, which are used to identify the independent variables that have the greatest effect on the response variable (here, blood pressure), and to compare the outcomes of the two models using AIC and BIC in order to determine which model best fits our data. The data analysis shows that the logistic regression model fits our data better than the Tobit regression model. The analysis was carried out using statistical packages in R, MATLAB, and the Statistical Package for the Social Sciences (SPSS) V.26.
Keywords: Akaike information criterion, Bayesian information criterion, censoring, logistic regression model, Tobit regression model, truncation

INTRODUCTION

The health sector is one of the most vital sectors. Through health institutions, it provides health services to all members of society in order to protect and improve society and achieve the well-being of its members. One of the components of building a healthy society is warding off disease, and at the forefront of these diseases is cardiovascular disease, which remains one of the major challenges for medicine. Cardiovascular disease, the leading cause of death worldwide, is greatly exacerbated by high blood pressure. Around 54% of strokes and 47% of coronary heart disease cases worldwide occur as a result of high blood pressure. High blood pressure is a common medical issue that becomes more prevalent as people become older.[1]

Several statistical studies of individual data include observations in which the dependent variable is equal to 0 for a number of observations in the dataset. Such data are known as censored or truncated data.[2] Since James Tobin's work (1958), Tobit regression has received a large and diverse amount of theoretical interest. Its use in practical applications has been developed in fields such as economics, biology, finance, and medicine. Tobit regression can be seen as a linear regression model in which the data on the dependent variable are only incompletely observed.[3]

Logistic regression is a statistical method for examining the relationship between a dependent variable measured at the nominal level and one or more independent variables, which may be of any measurement level.[4] According to Dayton,[5] logistic regression, a statistical modeling method for categorical data, is the best method when the dependent variable is binary.
It follows the same general logic as ordinary least squares regression.

This study aims at:
1. Utilizing both the Tobit and the logistic regression models.
2. Identifying which independent variables have the greatest effect on the dependent variable (blood pressure) using both models.
3. Comparing the results of the two models, using the Akaike information criterion (AIC) and the Bayesian information criterion (BIC), to determine which one best fits our data.

Corresponding Author: Banaz W. Y. Meran, Department of Statistics, College of Administration and Economics, Salahaddin University-Erbil, Kurdistan Region - F.R. Iraq. E-mail: banaz.yaqoob@su.edu.krd
Received: July 18, 2022
Accepted: August 12, 2022
Published: November 20, 2022
DOI: 10.24086/cuesj.v6n2y2022.pp133-140
Copyright © 2022 Banaz W.Y Meran, Bikhal S. Sedeeq. This is an open-access article distributed under the Creative Commons Attribution License (CC BY-NC-ND 4.0).
Cihan University-Erbil Scientific Journal (CUESJ)

LITERATURE REVIEWS

Shirafkan et al.[6] assessed the potential of Tobit regression as a method to study the time to onset of cytomegalovirus in renal transplant recipients. The results of this study revealed that the age of the patients influenced the time to onset of the disease: the older the patient, the shorter the time to the start of infection.

Karim et al.[7] compared the Tobit model and the traditional regression model on data taken from patients with kidney disease. Using several statistical measures, they concluded that the Tobit model is preferable to the traditional regression model for this type of data.

Taleb et al.[8] used a binary logistic regression model for people with heart disease, estimating the coefficients of the model by the least squares method.
The aim of the investigation was to compare the actual causes of death with the estimated causes of death. The binary logistic regression model indicated that smoking was the leading cause of death.

Rambeli et al.[9] used the binary logistic model in a study whose aim was to identify the factors that influence a teacher's decision to remain in the profession. The results showed that income was a key factor in teachers remaining committed to the profession.

METHODOLOGY

This research contains three parts. The first part includes the introduction, methodology, aim of the study, and review of the literature. The second part covers the basic concepts of the two models, and the last part contains the data application, the implementation of the two models, and the interpretation of the results.

Tobit Regression Model

Regression analysis is one of the most important ways to assess the relationship between a dependent variable and explanatory variables. Sometimes, however, the dependent variable is constrained at a threshold point. In this situation, the traditional regression model gives biased estimates, and the Tobit regression model is the better option. Logistic regression is also used in this study. Tobit regression analysis is the statistical method used to examine the relationship between a limited dependent variable and explanatory variables of any type.
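The bias mentioned above can be illustrated with a small simulation. The following is a minimal sketch, not part of the original analysis: the data are synthetic, the latent model (slope 1, standard normal errors, censoring at 0) and all function names are assumptions chosen for illustration only.

```python
import random


def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate(n=5000, true_slope=1.0, seed=0):
    """Fit OLS to a latent response and to its left-censored version."""
    rng = random.Random(seed)
    x = [rng.uniform(-2.0, 2.0) for _ in range(n)]
    # Hypothetical latent response y* = true_slope * x + e, with e ~ N(0, 1)
    latent = [true_slope * a + rng.gauss(0.0, 1.0) for a in x]
    # Left-censoring at 0: values at or below 0 are recorded as 0
    censored = [max(v, 0.0) for v in latent]
    return ols_slope(x, latent), ols_slope(x, censored)


slope_latent, slope_censored = simulate()
print(f"OLS slope on latent y*:  {slope_latent:.3f}")
print(f"OLS slope on censored y: {slope_censored:.3f}")
```

In this synthetic setting the slope fitted to the censored response is noticeably smaller than the slope fitted to the latent response, which is the attenuation that motivates a censored regression model.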
A limited dependent variable is one whose range of possible values is constrained in some significant way.[10] Limited dependent variable models cover (a) censoring, in which some data are lost but other data remain available for certain individuals in a data set, and (b) truncation, in which some individuals are systematically excluded from observation.[11]

Tobin used a regression model of household expenditures that explicitly took into account the fact that the dependent variable cannot be negative. Tobin called his approach the model of limited dependent variables; the name "Tobit" was coined by Goldberger (1964) because of its resemblance to probit models. These models are frequently referred to as truncated or censored models. A model is called truncated if observations outside a specified range are lost entirely, and censored if at least the explanatory variables are still observed.[12]

The structural Tobit model is

$$y_i^* = X_i\beta + e_i,$$

where $e_i \sim N(0, \sigma^2)$ and $y^*$ is a latent variable. The observed variable $y$ is defined by

$$y_i = \begin{cases} y_i^* & \text{if } y_i^* > \tau \\ \tau_y & \text{if } y_i^* \le \tau \end{cases}$$

In the standard Tobit model we assume $\tau = 0$, so the data are censored at 0:[13]

$$y_i = \begin{cases} y_i^* & \text{if } y_i^* > 0 \\ 0 & \text{if } y_i^* \le 0 \end{cases}$$

The likelihood function for censored data is

$$L = \prod_{i=1}^{N} \left[\frac{1}{\sigma}\,\phi\!\left(\frac{y_i - \mu}{\sigma}\right)\right]^{d_i} \left[1 - \Phi\!\left(\frac{\mu - \tau}{\sigma}\right)\right]^{1 - d_i},$$

where $d_i = 1$ for an uncensored observation and $d_i = 0$ for a censored one. In the conventional Tobit model, one sets $\tau = 0$ and parameterizes $\mu$ as $X_i\beta$. This gives the likelihood function of the Tobit model:

$$L = \prod_{i=1}^{N} \left[\frac{1}{\sigma}\,\phi\!\left(\frac{y_i - X_i\beta}{\sigma}\right)\right]^{d_i} \left[1 - \Phi\!\left(\frac{X_i\beta}{\sigma}\right)\right]^{1 - d_i}$$

The log-likelihood function of the Tobit model is

$$\ln L = \sum_{i=1}^{N} \left\{ d_i \left[-\ln\sigma + \ln\phi\!\left(\frac{y_i - X_i\beta}{\sigma}\right)\right] + (1 - d_i)\,\ln\!\left[1 - \Phi\!\left(\frac{X_i\beta}{\sigma}\right)\right] \right\}$$

The overall log-likelihood is split into two components.
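The log-likelihood of the standard Tobit model ($\tau = 0$) can be evaluated directly from its two components. The sketch below is an illustration under stated assumptions rather than the routine used in the study: the function name, the list-based data layout, and the convention that an observed value of 0 marks a censored observation are all hypothetical choices.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal; its cdf plays the role of Phi


def tobit_log_likelihood(beta, sigma, X, y):
    """Log-likelihood of the standard Tobit model, left-censored at 0.

    beta  : list of coefficients, one per column of X
    sigma : standard deviation of the error term (must be positive)
    X     : list of rows; include a column of 1.0 for the constant
    y     : observed responses; y_i == 0 is treated as censored (d_i = 0)
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    total = 0.0
    for row, yi in zip(X, y):
        mu = sum(b * x for b, x in zip(beta, row))  # X_i * beta
        if yi > 0:
            # Uncensored part: -ln(sigma) + ln(phi((y_i - X_i beta) / sigma))
            z = (yi - mu) / sigma
            total += -math.log(sigma) - 0.5 * z * z - 0.5 * math.log(2.0 * math.pi)
        else:
            # Censored part: ln(1 - Phi(X_i beta / sigma)) = ln(Phi(-X_i beta / sigma))
            total += math.log(_STD_NORMAL.cdf(-mu / sigma))
    return total


# Toy data (hypothetical): a constant plus one regressor, one censored case
X_toy = [[1.0, 0.5], [1.0, 1.5], [1.0, -1.0]]
y_toy = [0.8, 2.1, 0.0]
print(tobit_log_likelihood([0.2, 1.0], 1.0, X_toy, y_toy))
```

Maximizing this function over beta and sigma with a numerical optimizer gives the Tobit estimates; the censored term is written with $\Phi(-X_i\beta/\sigma)$, which equals $1 - \Phi(X_i\beta/\sigma)$ but is numerically more stable.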
The first component corresponds to a traditional regression for the uncensored observations, while the second component represents the probability that an observation is censored.[14]

Standard Tobit model

Censored regression has been widely used as the standard Tobit model since Tobin (1958) first presented it to evaluate the relationship between household expenditure and household income:

$$y_i^* = x_i'\beta + u_i, \quad i = 1, 2, \ldots, n$$

$$y_i = \begin{cases} y_i^* & \text{if } y_i^* > 0 \\ 0 & \text{if } y_i^* \le 0 \end{cases}$$

where the $u_i$ are independent and identically distributed (iid) draws from $N(0, \sigma^2)$. $y_i$ and $x_i$ are observed in the sample, but $y_i^*$ is unobserved if $y_i^* \le 0$. The likelihood function for this model is:[12]

$$L = \prod_{0} \left[1 - \Phi\!\left(\frac{x_i'\beta}{\sigma}\right)\right] \prod_{1} \frac{1}{\sigma}\,\phi\!\left(\frac{y_i - x_i'\beta}{\sigma}\right)$$

where the first product runs over the censored observations and the second over the uncensored observations.

Censoring

A regression model is said to be censored when the recorded data on the dependent variable (the response) are cut off outside a certain range, with multiple observations at the endpoints of that range. When the data are censored, variation in the observed dependent variable understates the effect of the regressors on the actual dependent variable. As a result, coefficient estimates from standard ordinary least squares regression on censored data are often biased toward zero.[15]

Truncation

Truncated data arise when an observation is not reported because its value falls below or above a specific level. These cases are known as left and right truncation, respectively. The truncation effect can also occur when only a small portion of a larger population is represented in the sample data. In addition, the response variable in the model is truncated if observations are not possible when it takes values inside a particular range.
Consequently, when the dependent variable is within that range, neither the dependent nor the independent variables are observed.[2]

Logistic Regression Model

What distinguishes a logistic regression model from a linear regression model is that the outcome variable in logistic regression is binary (dichotomous). This difference between logistic and linear regression is reflected both in the choice of a parametric model and in its assumptions, whereas the methods used in a logistic regression analysis follow the same fundamental principles as linear regression.[16,17]

The logistic regression model models the response variable indirectly, through the probabilities associated with the values of the dependent variable y. We use P(X) to represent the probability of a response when y = 1, and 1 − P(X) to represent the probability of a response when y = 0. These probabilities are written as follows:

$$P(X) = P(Y = 1 \mid X_1, X_2, \ldots, X_k)$$

$$1 - P(X) = P(Y = 0 \mid X_1, X_2, \ldots, X_k)$$

The logistic regression equation and the straight-line regression equation ($Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k$) are related by the formula below, in which the logistic regression equation is rewritten as:

$$\text{Logit}[P(X)] = \log_e\!\left[\frac{P(X)}{1 - P(X)}\right] = \alpha + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k$$

The logistic regression model can be calculated using the formula below.[17,18]

$$P(X) = \frac{e^{\alpha + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k}}{1 + e^{\alpha + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k}}$$

Estimate the Parameters of Logistic Regression Model

The approach of maximum likelihood estimation will be used.
The likelihood function is given as:[19-21]

$$L(\alpha, \beta) = \prod_{i=1}^{n} P(X)^{y_i}\,[1 - P(X)]^{1 - y_i}$$

We use the log-likelihood for estimation:

$$\log L(\alpha, \beta) = \sum_{i=1}^{n} y_i \log P(X) + \left(n - \sum_{i=1}^{n} y_i\right) \log\left(1 - P(X)\right)$$

$$U(P) = \frac{\partial \log L(\alpha, \beta)}{\partial P(X)} = \frac{\sum_{i=1}^{n} y_i}{P(X)} - \frac{n - \sum_{i=1}^{n} y_i}{1 - P(X)}$$

Cox and Snell R2 statistic

In the logistic regression model, the coefficient of determination R2 used to assess the fit of regression models is replaced by the Cox and Snell R2 and Nagelkerke R2 statistics. The Cox and Snell R2 is calculated as:

$$R^2_{\text{Cox and Snell}} = 1 - \left(\frac{L_0}{L_1}\right)^{2/n}$$

L0: Maximum likelihood of the model with the constant only.
L1: Maximum likelihood of the model with the independent variables.
n: Sample size.[22]

The Nagelkerke R2 is calculated as:

$$R^2_{\text{Nagelkerke}} = \frac{R^2_{\text{Cox and Snell}}}{1 - L_0^{2/n}}$$

Table 1: Sample of the 500 patients

| ID | Y: BP | X1: Gender | X2: Age | X3: Urea | X4: Cholesterol | X5: Creatinine | X6: Weight |
|---|---|---|---|---|---|---|---|
| 1 | 88 (0) | 1 | 70 | 68 | 127 | 1.26 | 89 |
| 2 | 76 (0) | 2 | 40 | 41 | 117 | 0.97 | 96 |
| 3 | 99.33 | 1 | 44 | 34 | 201 | 0.59 | 74 |
| 4 | 85 (0) | 2 | 43 | 33 | 221 | 0.67 | 107 |
| 5 | 117 | 1 | 55 | 54 | 226 | 1.12 | 88 |
| 6 | 92 (0) | 1 | 67 | 31 | 173 | 0.72 | 76 |
| 7 | 96.67 | 1 | 47 | 43 | 165 | 0.78 | 65 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 497 | 107.67 | 1 | 54 | 31 | 160 | 0.87 | 85 |
| 498 | 93.67 | 2 | 63 | 16 | 109 | 0.6 | 69 |
| 499 | 106.67 | 1 | 58 | 25 | 100 | 0.73 | 67 |
| 500 | 118.33 | 1 | 69 | 38 | 134 | 1.11 | 65 |

BP: Blood pressure

The Hosmer-Lemeshow test

The Hosmer and Lemeshow test is a widely used test for assessing a model's goodness of fit. It accepts any number of explanatory variables, which can be continuous or categorical.
It is used to test the hypotheses:
H0: The model is adequate for the data
H1: The model is not adequate for the data
The model is considered adequate if the P-value of the Hosmer and Lemeshow statistic is greater than 0.05.[22]

$$\chi^2_{HL} = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{N_i\,\pi_i\,(1 - \pi_i)}$$

Here the sum runs over the groups of observations, $O_i$ and $E_i$ are the observed and expected numbers of events in group $i$, $N_i$ is the group size, and $\pi_i$ is the predicted probability for the group.

Wald statistic

The Wald test is used to determine whether the coefficient of each independent variable in the logistic regression is significantly different from zero.[23] The Wald statistic is calculated according to the following formula:

$$W = \frac{\beta^2}{[S.E.(\beta)]^2}$$

The classification table

Classification tables are one way to check how well the model matches the data. The method tabulates the number of cases that have or do not have the trait of interest against whether they were classified correctly or incorrectly. The idea is that a model that is compatible with the data classifies more cases correctly.[24]

The AIC

The AIC is a model selection criterion for assessing actual data that has played a key role in solving problems in a wide range of fields. The model with the lowest AIC is chosen as the best model.[25]

$$AIC = -2\log L + 2K$$

The BIC

In statistical model selection, the BIC is one of the best-known and most commonly used tools. The BIC is calculated for each of the models, and the model with the lowest BIC value is selected as the best model.[26]

$$BIC = -2\log L + K\log N$$

DATA ANALYSIS

In this study, the Tobit and logistic regression models were applied to a sample of 500 patients with heart disease at the hospital heart center in Erbil, with two levels of blood pressure, high and low, recorded for each patient. Blood pressure is taken as the response variable, and the independent variables are gender, urea, age, cholesterol, creatinine, and weight.
The high and low blood pressure readings were combined with the mean arterial pressure (MAP) equation, and the threshold point was determined to be 93.33. To select the best model for our data, two statistical measures (AIC and BIC) were used.

Note: All assumptions and tests related to the Tobit and logistic regression models were checked before the data analyses in this study were started.

Data Description for Tobit Regression Analysis

In this study, the data were gathered from 500 patients with heart disease. The two levels of blood pressure, high and low, were taken from the patients as the dependent variable, and gender, age, urea, cholesterol, creatinine, and weight were taken as the independent variables (Table 1). The researcher combined the high and low readings with the MAP equation, and the threshold point was determined to be 93.33. Accordingly, the dependent variable is Y = Y* whenever Y* > 93.33. In line with the theoretical presentation above, the threshold Y = 93.33 is treated as the censoring point y = 0, and the model is written as:

$$Y = \begin{cases} Y^* & \text{if } Y^* > 93.33 \\ 0 & \text{if } Y^* \le 93.33 \end{cases}$$

The explanatory variables in this study are as follows:
X1: Gender (male and female)
X2: Age, measured in years
X3: Urea (mg/dL)
X4: Cholesterol (mg/dL)
X5: Creatinine (mg/dL)
X6: Weight, measured in kg

The variables are summarized in Table 2, which lists the maximum, minimum, mean, and standard deviation of the dependent variable and of each independent variable.
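The construction of the dependent variable described in this section can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the standard MAP formula, diastolic + (systolic − diastolic)/3, and assumes that the "high" and "low" readings are the systolic and diastolic pressures; the function names are hypothetical. The sample values are the BP entries of the first seven patients in Table 1.

```python
THRESHOLD = 93.33  # MAP of a 120/80 mmHg reading, rounded as in the text


def mean_arterial_pressure(systolic, diastolic):
    """Standard MAP formula (assumed here): diastolic plus one third of the pulse pressure."""
    return diastolic + (systolic - diastolic) / 3.0


def censor(map_value, threshold=THRESHOLD):
    """Tobit coding: keep values above the threshold, record the rest as 0."""
    return map_value if map_value > threshold else 0.0


def binary_code(map_value, threshold=THRESHOLD):
    """Logistic coding: 1 above the threshold, 0 otherwise."""
    return 1 if map_value > threshold else 0


# BP values reported for patients 1 to 7 in Table 1
table1_bp = [88, 76, 99.33, 85, 117, 92, 96.67]
censored = [censor(v) for v in table1_bp]
binary = [binary_code(v) for v in table1_bp]

print(round(mean_arterial_pressure(120, 80), 2))
print(censored)
print(binary)
```

With this coding, patients 1, 2, 4, and 6 are recorded as 0, which agrees with the "(0)" markers next to their BP values in Table 1.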
Application of Tobit regression model analysis (Censored and Truncated)

The researcher first checked all the assumptions that the data must satisfy before analysis and set a unified standard for the data in this study. No problems were found, so the data can be used for the analysis.

Censored regression model

We start with the censored regression model:

Censored(formula = Y ~ X, left = 0, right = Inf, data = data1)

Total n = 500 observations, of which 132 are left-censored (Y ≤ 93.33, recorded as Y = 0) and 368 are uncensored.

Table 3 presents the results of the censored regression model. The coefficients of the independent variables gender, age, cholesterol, creatinine, and weight are positive, indicating a positive relationship with the dependent variable (blood pressure), while the coefficient of urea is negative, indicating a negative relationship with blood pressure. According to the findings in Table 3, the variables age, cholesterol, and creatinine all significantly affect blood pressure. The summary of fit for the censored regression model is log-likelihood = −2125.549, AIC = 4265.1, BIC = 4294.6. The best model is the one with the lowest values of AIC and BIC.

Truncated regression model

In this case, the number of observations falls to 368 because of the truncation. Table 4 shows the results from the truncated regression model.
The coefficients of the independent variables gender, age, cholesterol, urea, and weight are positive, indicating a positive relationship with the dependent variable (blood pressure), whereas the coefficient of creatinine is negative, indicating a negative relationship with blood pressure. Table 4 shows that only the urea variable has a significant effect on blood pressure.

Logistic Regression Analyses

In this part, we use binary logistic regression. The dependent variable is blood pressure, obtained by combining the high and low readings with the MAP equation, and the threshold point was determined to be 93.33. A patient whose blood pressure is >93.33 is considered to be infected and takes the value 1, while a patient whose blood pressure is ≤93.33 is considered to be uninfected and takes the value 0. The other variables are gender, age, cholesterol, urea, creatinine, and weight.

Y: The dependent variable has a binary response code:

$$Y = \begin{cases} 1 & \text{Infected} \\ 0 & \text{Uninfected} \end{cases}$$

The independent variables are the same as those listed above.

Application of binary logistic regression analysis

We start with the classification table at step zero, in which the model contains no independent variables (only the constant). Table 5 represents this baseline model, which is a model without our explanatory variables. The overall percentage of correct classifications was 73.6%, which reflects the overall explanatory strength of the baseline model. The initial −2 log likelihood is 577.2.

Omnibus test of logistic model coefficients

The omnibus tests of model coefficients use Chi-square tests to show whether there is a significant difference between the null model and the current model.
Table 2: Descriptive statistics of variables

| Variables | Minimum | Maximum | Mean | SD |
|---|---|---|---|---|
| BP | 70.33 | 141.00 | 101.23 | 10.97 |
| Gender | 1 | 2 | 1.33 | 0.47 |
| Age | 17 | 86 | 57.73 | 11.24 |
| Urea | 11 | 198 | 40.11 | 20.53 |
| Cholesterol | 63 | 326 | 167.36 | 45.02 |
| Creatinine | 0.11 | 10.70 | 1.58 | 1.195 |
| Weight | 48 | 135 | 78.83 | 13.47 |

BP: Blood pressure, SD: Standard deviation

Table 3: Censored regression model

| Coefficients | Estimate | SE | t | Pr (>t) |
|---|---|---|---|---|
| Constant | −127.94562 | 25.77314 | −4.964 | 6.89e-07*** |
| Gender | 3.57944 | 5.59007 | 0.640 | 0.522 |
| Age | 1.91136 | 0.24742 | 7.725 | 1.12e-14*** |
| Cholesterol | 0.25688 | 0.05688 | 4.516 | 6.30e-06*** |
| Urea | −0.11037 | 0.12724 | −0.867 | 0.386 |
| Creatinine | 12.21152 | 2.09200 | 5.837 | 5.31e-09*** |
| Weight | 0.31290 | 0.19800 | 1.580 | 0.114 |

SE: Standard error; *** means P ≤ 0.001

Table 4: Truncated regression model

| Coefficients | Estimate | SE | t | Pr (>t) |
|---|---|---|---|---|
| Constant | 98.7995674 | 4.7143913 | 20.9570 | <2e-16*** |
| Gender | 0.6703624 | 0.9215182 | 0.7275 | 0.46695 |
| Age | 0.0645897 | 0.0448466 | 1.4402 | 0.14980 |
| Cholesterol | 0.0015114 | 0.0094141 | 0.1605 | 0.87245 |
| Urea | 0.0520044 | 0.0202811 | 2.5642 | 0.01034* |
| Creatinine | −0.2198352 | 0.3154896 | −0.6968 | 0.48592 |
| Weight | 0.0040456 | 0.0340886 | 0.1187 | 0.90553 |

SE: Standard error; * means P ≤ 0.05 and *** means P ≤ 0.001

Table 5: Classification table for the model with the constant only (step 0)

| Observed (Y) | Predicted: Uninfected | Predicted: Infected | Percentage correct |
|---|---|---|---|
| Uninfected | 0 | 132 | 0.0 |
| Infected | 0 | 368 | 100.0 |
| Overall percentage | | | 73.6 |

Table 6 shows that the current model fits significantly better than the null model. The omnibus test of model coefficients shows a significant reduction in the −2 log likelihood, from 577.2 for the null model to 426.572 for the current model. The Chi-square values for step, block, and model are all the same, and P < 0.05 shows that the model's accuracy increases when the explanatory variables are included. The Chi-square is significant (χ2 = 150.628, df = 6, P < 0.05).
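The omnibus Chi-square and the pseudo R2 values can be reproduced from the two reported −2 log likelihood values alone. The sketch below is an illustration under stated assumptions, not the SPSS routine: the function names are hypothetical, the Cox and Snell and Nagelkerke formulas from the methodology are rewritten in terms of −2 log L (so that $(L_0/L_1)^{2/n} = e^{-(D_0 - D_1)/n}$ with $D = -2\log L$), and the Chi-square tail probability uses a closed form that holds only for even degrees of freedom.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable (even df only)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k  # (x/2)^k / k!
        total += term
    return math.exp(-half) * total


def cox_snell_r2(dev_null, dev_model, n):
    """1 - (L0/L1)^(2/n), written with D = -2 log L."""
    return 1.0 - math.exp(-(dev_null - dev_model) / n)


def nagelkerke_r2(dev_null, dev_model, n):
    """Cox and Snell R2 divided by its maximum, 1 - L0^(2/n)."""
    return cox_snell_r2(dev_null, dev_model, n) / (1.0 - math.exp(-dev_null / n))


DEV_NULL, DEV_MODEL, N, DF = 577.2, 426.572, 500, 6  # values reported in the text

chi_square = DEV_NULL - DEV_MODEL
p_value = chi2_sf_even_df(chi_square, DF)
print(f"chi-square = {chi_square:.3f}, p = {p_value:.3g}")
print(f"Cox and Snell R2 = {cox_snell_r2(DEV_NULL, DEV_MODEL, N):.3f}")
print(f"Nagelkerke R2    = {nagelkerke_r2(DEV_NULL, DEV_MODEL, N):.3f}")
```

The difference between the two −2 log likelihood values equals the reported Chi-square of 150.628, and the two R2 values agree with those reported in Table 8 to three decimals.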
As a result, the new model is preferred over the null model.

Hosmer and Lemeshow test

This is a goodness of fit test for logistic regression that checks whether the model fits the data. Table 7 shows that P = 0.226, which is larger than the 5% level of significance. We may therefore conclude that the model is adequate for the data.

Cox and Snell R2, Nagelkerke's R2

The Cox and Snell R2 and Nagelkerke's R2 values are used to assess the model's fit to the data. The values given in Table 8 show the proportion of variation in the dependent variable that is explained by the independent variables. The Nagelkerke R2 is a modified form of the Cox and Snell R2 coefficient. According to the results in Table 8, the independent variables explain 26% of the variance in the dependent variable according to the Cox and Snell R2, and 38% according to the Nagelkerke R2. The −2 log likelihood statistic for the whole model is 426.572 in the model summary.

Table 9 corresponds to Table 5, but it is based on the model that contains our explanatory variables. The overall percentage of correct classifications was 79.2%, which reflects the model's overall explanatory strength. With the constant term and the predictors included, 91.6% of the infected cases were correctly classified. The table shows how many cases were correctly predicted (59 cases were observed to be uninfected and were correctly predicted to be uninfected; 337 cases were observed to be infected and were correctly predicted to be infected) and how many were not (73 cases were observed to be uninfected but were predicted to be infected; 31 cases were observed to be infected but were predicted to be uninfected).

Variables in the equation logistic regression model

The variables in the equation table is the most essential of all the outputs.
This table must be studied carefully since it contains the answers to our questions about the relationships between the variables.

Table 10 shows the values of the Wald test for the parameters of the model. The variables age, urea, cholesterol, and creatinine are the significant variables in the research: comparing each P-value with the level of significance (0.05), the effect of a variable on the patient's condition is significant when P < 0.05. The variables gender and weight are not significant in the study, since their P-values are > 0.05.

Based on Table 10, the logistic regression coefficients can be used to build a predictive equation of the form:

$$Y = \log\!\left[\frac{p(x)}{1 - p(x)}\right] = \alpha + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k$$

We can write the logistic regression equation with just the significant variables:

Logistic equation (Model): Y = −7.689 + 0.080 Age − 0.016 Urea + 0.013 Cholesterol + 1.451 Creatinine

The factors affecting blood pressure in cardiac patients can be ranked based on the value of the odds ratio. In Table 10, Exp(β) = $e^{\beta}$ represents the ratio by which the odds of the event of interest change for a one-unit change in the predictor. The value of Exp(β) for the variable gender indicates that when gender changes from the value 0 (female) to the value 1 (male), the odds of high blood pressure in patients with heart disease increase, because the value of Exp(β) is >1. This suggests that the blood pressure of males is higher than that of females, although the effect is not statistically significant (P = 0.682).
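The odds ratios in Table 10 follow from the coefficients as Exp(β), and the logistic equation above can be turned into a predicted probability. The following is a minimal sketch using only the values reported in Table 10; the function names are hypothetical, the 95% interval uses the usual Wald form exp(β ± 1.96 SE), and small differences from the printed table are expected because the coefficients are rounded to three decimals.

```python
import math

# (beta, standard error) as reported in Table 10
TABLE10 = {
    "Gender": (0.108, 0.264),
    "Age": (0.080, 0.012),
    "Urea": (-0.016, 0.007),
    "Cholesterol": (0.013, 0.003),
    "Creatinine": (1.451, 0.276),
    "Weight": (0.011, 0.009),
}
INTERCEPT = -7.689


def odds_ratio(beta):
    """Exp(beta): multiplicative change in the odds per one-unit increase."""
    return math.exp(beta)


def wald_ci(beta, se, z=1.96):
    """Approximate 95% confidence interval for Exp(beta)."""
    return math.exp(beta - z * se), math.exp(beta + z * se)


def predicted_probability(age, urea, cholesterol, creatinine):
    """Probability of Y = 1 from the equation with the significant variables only."""
    eta = (INTERCEPT
           + TABLE10["Age"][0] * age
           + TABLE10["Urea"][0] * urea
           + TABLE10["Cholesterol"][0] * cholesterol
           + TABLE10["Creatinine"][0] * creatinine)
    return 1.0 / (1.0 + math.exp(-eta))


for name, (beta, se) in TABLE10.items():
    low, high = wald_ci(beta, se)
    print(f"{name:12s} OR = {odds_ratio(beta):.3f}  95% CI = ({low:.3f}, {high:.3f})")

# Patient 5 in Table 1: age 55, urea 54, cholesterol 226, creatinine 1.12
p_patient5 = predicted_probability(age=55, urea=54, cholesterol=226, creatinine=1.12)
print(f"Predicted probability for patient 5: {p_patient5:.3f}")
```

For patient 5 the predicted probability is about 0.60, above the usual 0.5 cut-off, which is consistent with that patient's observed BP of 117 being above the threshold.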
The odds ratio for the variable age is >1, Exp(β) = 1.083. This means that each additional year of age is associated with an increase in the odds of high blood pressure in cardiac patients by a factor of 1.083.

The odds ratio for the variable urea is <1. This means that each additional unit of urea is associated with a decrease in the odds of high blood pressure in cardiac patients by a factor of 0.984.

The odds ratio for the variable cholesterol is >1. This means that each additional unit of cholesterol is associated with an increase in the odds of high blood pressure in cardiac patients by a factor of 1.013.

The odds ratio for the variable creatinine is >1. This means that each additional unit of creatinine is associated with an increase in the odds of high blood pressure in cardiac patients by a factor of 4.267.

The odds ratio for the variable weight is >1. This means that each additional unit of weight is associated with an increase in the odds of high blood pressure in cardiac patients by a factor of 1.011.

Table 6: Omnibus test

| Step 1 | χ2 | df | Significance |
|---|---|---|---|
| Step | 150.628 | 6 | 0.000 |
| Block | 150.628 | 6 | 0.000 |
| Model | 150.628 | 6 | 0.000 |

Table 7: Hosmer and Lemeshow test

| Step | χ2 | df | Significance |
|---|---|---|---|
| 1 | 10.595 | 8 | 0.226 |

Table 8: Summary of logistic regression

| Step | −2 Log likelihood | Cox and Snell R2 | Nagelkerke R2 |
|---|---|---|---|
| 1 | 426.572 | 0.260 | 0.380 |

Table 9: Classification table (step 1)

| Observed (Y) | Predicted: Uninfected | Predicted: Infected | Percentage correct |
|---|---|---|---|
| Uninfected | 59 | 73 | 44.7 |
| Infected | 31 | 337 | 91.6 |
| Overall percentage | | | 79.2 |

DISCUSSION

There are different techniques for comparing two or more models; the AIC and BIC criteria are two of the most widely used.
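As a worked check of the AIC and BIC formulas from the methodology, the sketch below recomputes both criteria from the −2 log likelihood values reported in the text. It is an illustration under stated assumptions: the function names are hypothetical, and K = 7 estimated parameters (six coefficients plus the constant) is an assumption, chosen because it reproduces the reported censored-model values.

```python
import math


def aic(neg2_loglik, k):
    """AIC = -2 log L + 2K."""
    return neg2_loglik + 2 * k


def bic(neg2_loglik, k, n):
    """BIC = -2 log L + K log N (natural logarithm)."""
    return neg2_loglik + k * math.log(n)


K = 7    # assumed: six coefficients plus the constant
N = 500  # sample size for the censored and logistic models

censored_neg2ll = -2 * -2125.549  # from the reported log-likelihood
logistic_step0 = 577.2            # -2 log likelihood of the constant-only model
logistic_step1 = 426.572          # -2 log likelihood of the fitted model

for label, value in [("Censored", censored_neg2ll),
                     ("Logistic, step 0", logistic_step0),
                     ("Logistic, step 1", logistic_step1)]:
    print(f"{label:18s} AIC = {aic(value, K):.1f}  BIC = {bic(value, K, N):.1f}")
```

Under these assumptions the censored-model figures (4265.1 and 4294.6) are reproduced exactly. The logistic figures reported in Table 11 (591.2 and 620.7) coincide with the step 0 value of 577.2, whereas the step 1 value of 426.572 would give about 440.6 and 470.1 under the same formulas.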
Table 11 shows a comparison between three regression models censored, truncated, and logistic, for choosing the most fit model to our data of blood pressure in cardiac patients, the AIC and BIC values with the least values are chosen. The results display that the logistic regression model is better and more suitable rather than truncated regression and censored regression for our data, because it’s AIC equal to 591.2 and BIC = 620.7 are the lowest values contrast with Tobit models (censored and truncated). CONCLUSION It has been concluded the following: 1. In the censored, regression model, the explanatory, variables (age, cholesterol, and urea) significantly impacted on blood pressure. 2. The results, from truncated regression model show, that only the urea variable has the effect on blood pressure. 3. According, to the results classification table, logistic model is, correctly classifying the consequences for 79.2% of the cases, compared to 73.6% in the null, model. 4. According to the Hosmer-Lemeshow test, our data fit the logistic regression based on a χ2 = 10.595 and P-value greater, than a significant level. 5. Wald’s test showed that the variables of age, creatinine, cholesterol, and urea, respectively, contributed significantly to the prediction, depending on the P-value (0.000 < 0.005). The variables that do not have a significant, effect are weight and gender. 6. It was concluded, that the logistic regression model for the sample under study or for our data is better, than the censored regression model and the truncated regression model after comparing, their AIC and BIC values. REFERENCES 1. C. Y. Wu, H. Y. Hu, Y. J. Chou, N. Huang, Y. Chou and C. Li. High blood pressure and all-cause and cardiovascular disease mortalities in community-dwelling older adults. Medicine(Baltimore), vol. 94, no. 47, p. e2160, 2015. 2. W. H. Greene. Censored Data and Truncated Distributions, SSRN Electron. J, 2005. 3. M. H. Odah, B. K. Mohammed and A. S. M. Bager. 
Tobit regression model to determine the dividend yield in Iraq. LUMEN Proceedings, vol. 3, pp. 347-354, 2018. 4. J. S. Cramer. The Origins of Logistic Regression: Tinbergen Institute Discussion Papers, 2002. 5. C. M. Dayton. Logistic regression analysis. Stat, vol. 474, p. 574, 1992. 6. H. Shirafkan, J. Yazdani-Charati, S. A. Mozaffarpur, S. Khafri, R. Akbari and A. A. Pasha. Application of tobit model in time until Cytomegalovirus infection in kidney transplant recipients. Acta Medica, vol. 32, p. 1237, 2016. 7. R. M. H. Karim and S. M. Salh. Using tobit model for studying factors affecting blood pressure in patients with renal failure. UHD Journal of Science and Technology, vol. 4, no. 2, pp. 1-9, 2020. 8. H. R. Talib and S. A. Mazloum. The use of binary logistic Table 10: Variables in the equation Step 1 β SE Wald df Significant Exp(β) 95% CI for Exp(β) Lower Upper Gender 0.108 0.264 0.168 1 0.682 1.114 0.664 1.868 Age 0.080 0.012 41.674 1 0.000*** 1.083 1.057 1.110 Urea −0.016 0.007 4.692 1 0.030* 0.984 0.970 0.998 Cholesterol 0.013 0.003 21.265 1 0.000*** 1.013 1.007 1.019 Creatinine 1.451 0.276 27.595 1 0.000*** 4.267 2.483 7.333 Weight 0.011 0.009 1.327 1 0.249 1.011 0.992 1.030 Constant −7.689 1.294 35.304 1 0.000 0.000 CI: Confidence interval, SE: Standard error, * mean P ≤ 0.05 and *** mean P ≤ 0.001 Table 11: Comparison Tobit regression model (censored) Truncated regression model Logistic regression model AIC=4265.1 AIC=2574.6 AIC=591.2 BIC=4294.6 BIC=2602.0 BIC=620.7 AIC: Akaike information criterion, BIC: Bayesian information criterion Meran and Sedeeq: Tobit and logistic regression models 140 http://journals.cihanuniversity.edu.iq/index.php/cuesj CUESJ 2022, 6 (2): 133-140 regression method to analyze the factors affecting heart disease deaths: An applied study on a sample of patients in Dhi Qar Governorate. Journal of Al-Rafidain University College College for Sciences, 2020, no. 46, 2020. 9. N. Rambeli, E. Hashim, F. C. Leh, N. S. Hudin, M. F. 
Ramli, M. C. Mustafa, et al. Decision to leave or remain in the career as early childhood educator: A binary logistic regression model. Review of International Geographical Education Online, vol. 11, no. 5, pp. 450-456, 2021.
10. G. S. Maddala. Limited-Dependent and Qualitative Variables in Econometrics. New York, NY: Cambridge University Press, 1983.
11. N. M. Ahmed. Limited dependent variable modelling (truncated and censored regression models) with application. The Scientific Journal of Cihan University Sulaimanyia, vol. 2, no. 2, pp. 82-96, 2018.
12. T. Amemiya. Tobit models: A survey. Journal of Econometrics, vol. 24, no. 1-2, pp. 3-61, 1984.
13. J. S. Long. Regression Models for Categorical and Limited Dependent Variables. Thousand Oaks, CA: Sage Publications, 1997.
14. A. Flaih, J. Guardiola, H. Elsalloukh and C. Akmyradov. Statistical inference on the ESEP tobit regression model. Journal of Statistics Applications and Probability Letters, vol. 6, no. 1, pp. 1-9, 2019.
15. K. Y. Chay and J. L. Powell. Semiparametric censored regression models. Journal of Economic Perspectives, vol. 15, no. 4, pp. 29-42, 2001.
16. D. Hosmer Jr., S. Lemeshow and R. Sturdivant. Applied Logistic Regression. vol. 398. New York: John Wiley and Sons, 2013.
17. A. M. Khudhur and D. H. Kadir. An application of logistic regression modeling to predict risk factors for bypass graft diagnosis in Erbil. Cihan University-Erbil Scientific Journal, vol. 6, no. 1, pp. 57-63, 2022.
18. N. M. M. Abd Elsalam. Binary logistic regression to identify the risk factors of eye glaucoma. International Journal of Sciences Basic and Applied Research, vol. 23, no. 1, pp. 366-376, 2015.
19. N. S. K. Barznj. Using logistic regression analysis and linear discriminant analysis to identify the risk factors of diabetes. Zanco Journal of Humanity Sciences, vol. 22, no. 6, pp. 248-268, 2018.
20. S. Menard. Applied Logistic Regression Analysis. vol. 106. New York: Sage, 2002.
21. N. H. Mahmood, R. O. Yahya and S. J. Aziz. Apply binary logistic regression model to recognize the risk factors of diabetes through measuring glycated hemoglobin levels. Cihan University-Erbil Scientific Journal, vol. 6, no. 1, pp. 7-11, 2022.
22. V. Bewick, L. Cheek and J. Ball. Statistics review 14: Logistic regression. Critical Care, vol. 9, no. 1, pp. 112-118, 2005.
23. D. Hosmer and S. Lemeshow. Applied Logistic Regression. New York: John Wiley and Sons, 2000.
24. I. R. Soderstrom and D. W. Leitner. The effects of base rate, selection ratio, sample size, and reliability of predictors on predictive efficiency indices associated with logistic regression models, 1997.
25. S. Konishi and G. Kitagawa. Information Criteria and Statistical Modeling. Berlin: Springer Science and Business Media, 2008.
26. K. I. Mawlood. Using logistic regression and cox regression models to studying the most prognostic factors for leukemia patients. Qalaai Zanist Scientific Journal, vol. 4, no. 3, pp. 705-724, 2019.
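
As a supplementary check, the Exp(β) column of Table 10 and the model choice in Table 11 can be re-derived from the reported numbers. The Python sketch below is illustrative only and is not the authors' R, MATLAB, or SPSS code. The names `table10`, `table11`, and `odds_ratio_with_ci` are hypothetical, and the 95% interval is assumed to take the usual Wald form exp(β ± 1.96·SE). Because the inputs are the rounded values printed in the tables, recomputed bounds may differ from the published ones in the last digit.

```python
import math

# (beta, standard error) pairs as printed in Table 10 (rounded to 3 decimals).
table10 = {
    "Gender": (0.108, 0.264),
    "Age": (0.080, 0.012),
    "Urea": (-0.016, 0.007),
    "Cholesterol": (0.013, 0.003),
    "Creatinine": (1.451, 0.276),
    "Weight": (0.011, 0.009),
}


def odds_ratio_with_ci(beta, se, z=1.96):
    """Return (exp(beta), lower, upper) for an assumed Wald-type 95% interval."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


odds = {name: odds_ratio_with_ci(b, se) for name, (b, se) in table10.items()}

# AIC and BIC values as printed in Table 11.
table11 = {
    "censored": {"AIC": 4265.1, "BIC": 4294.6},
    "truncated": {"AIC": 2574.6, "BIC": 2602.0},
    "logistic": {"AIC": 591.2, "BIC": 620.7},
}

# Selection rule used in the text: the model with the smallest criterion value.
best_by_aic = min(table11, key=lambda m: table11[m]["AIC"])
best_by_bic = min(table11, key=lambda m: table11[m]["BIC"])

for name, (ratio, lower, upper) in odds.items():
    print(f"{name:12s} Exp(beta)={ratio:.3f}  95% CI=({lower:.3f}, {upper:.3f})")
print("Lowest AIC:", best_by_aic, "| lowest BIC:", best_by_bic)
```

An Exp(β) above 1, as for creatinine, means that a one-unit increase in the predictor multiplies the odds of the outcome coded 1 by that factor, while a value below 1, as for urea, reduces those odds.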