Operational Research in Engineering Sciences: Theory and Applications
First online
ISSN: 2620-1607, eISSN: 2620-1747
DOI: https://doi.org/10.31181/oresta171122136u

DEEP LEARNING BASED CIRRHOSIS DETECTION

Anıl Utku *
Department of Computer Engineering, Munzur University, Tunceli, Turkey
* Corresponding author. anilutku@munzur.edu.tr (A. Utku)

Received: 26 August 2022
Accepted: 11 October 2022
First online: 17 November 2022
Research paper

Abstract: Cirrhosis is a liver disease caused by long-term liver damage. Scar tissue caused by cirrhosis prevents the liver from working properly. Worldwide, 130-150 million people are infected with the hepatitis C virus, and 350-500 thousand deaths and 3-4 million new cases are reported every year due to liver disease. By 2030, compensated cirrhosis due to the hepatitis C virus is predicted to increase by 40 percent, decompensated cirrhosis by 60 percent, and liver-related deaths by 70 percent. Although cirrhosis is difficult to diagnose in its early stages, early diagnosis is a very important step for its treatment. Blood tests, imaging tests, and biopsy methods are used to detect cirrhosis. Because of the costs of these tests and the inability to get the test results immediately, the treatment of patients cannot be started immediately. In this study, an MLP-based deep learning model has been developed for the prediction of cirrhosis. The developed model has been compared with DT, kNN, LR, NB, RF, and SVM. Experimental studies using accuracy, precision, recall, and F1-score showed that the developed model was more successful than the compared models. In cirrhosis detection from blood tests, the developed model achieved 80.48% accuracy, 85.71% precision, 85.71% recall, and 85.71% F1-score.
The developed model can be used in real-world applications to alleviate the workload of healthcare professionals and to develop early diagnosis systems.

Keywords: cirrhosis detection, deep learning, machine learning, MLP.

1. Introduction

Cirrhosis, also called chronic liver disease, causes severe damage to the liver (Arroyo et al., 2016). Various diseases can damage the liver to different degrees. As a result, the structure of the liver deteriorates and it can no longer perform its normal functions (Garcia‐Martinez et al., 2013). This is the beginning of the cirrhosis process. As the number of functioning liver cells decreases, the liver begins to harden and shrink. Blood flow to the hardened tissue becomes difficult, and new vascular pathways form because the blood cannot reach the tissue. All these events worsen the clinical picture of cirrhosis by damaging the liver further (Mozos, 2015). Eventually, the liver can no longer function and liver failure occurs.

Cirrhosis is a long-lasting and progressive disease (Vranjkovic et al., 2019). In the early stages, the findings may be very mild. As the damage to the liver increases, the symptoms worsen (Younossi and Henry, 2015). The most common symptoms in the early period are loss of appetite, weight loss, nausea, weakness, and fatigue. These findings worsen over time. In this process, water accumulation in the body, edema in the legs, swelling in the abdomen, muscle wasting, rapid bruising of the skin, a tendency to bleed, excessive itching, and jaundice are observed (Pinto et al., 2015).

The liver is the body's factory. All the food taken in is used by the liver to make useful and necessary products for the body. One of them, albumin, keeps fluids in the blood vessels.
When liver functions are impaired, albumin synthesis is also affected (Van Zutphen et al., 2016). When the albumin level decreases, fluids cannot be kept in the vascular bed and leak into the tissues (Guerci et al., 2019). As a result, edema occurs in the legs. Likewise, fluid accumulates in the abdominal cavity. In these patients, bruises may appear on the skin, and the tendency to bleed increases even with the slightest impact (Amitrano et al., 2002). The reason is that the damaged liver cannot produce enough of the substances necessary for coagulation. Also as a result of liver failure, some substances accumulate in the blood, and severe itching and encephalopathy may occur. Long-term use of alcohol, viral hepatitis B/C, diabetes, obesity, obstruction and inflammation of the biliary tract, chronic heart failure, a history of liver disease, and unprotected sex can cause cirrhosis (Ginès et al., 2021).

Blood tests, imaging tests such as MRI or ultrasound, and biopsy methods are used to detect cirrhosis (Acharya et al., 2015). The costs of these tests and the inability to get the test results immediately motivate the use of artificial intelligence technologies in cirrhosis detection. Artificial intelligence methods are used successfully in many applications in the field of medicine, and systems for the diagnosis of diseases can be developed with them. In the literature, there is no study on the detection of cirrhosis using artificial intelligence methods.

In this section, studies in the literature that use artificial intelligence methods for medical diagnosis are examined. Goceri (Goceri, 2019) presented a deep learning-based comparative analysis for the detection of skin diseases. U-Net, InceptionV3, InceptionResNetV2, VGGNet and ResNet models were compared.
Experimental results showed that ResNet was the most successful model with 80% accuracy, while U-Net was the least successful model with 74% accuracy.

Che Azemin et al. (Che Azemin et al., 2020) presented a ResNet-101-based model for detecting COVID-19 cases using chest radiography images. The developed model had 0.82 AUROC, 77.3% sensitivity, 71.8% specificity, and 71.9% accuracy.

Jain et al. (Jain et al., 2021) developed a deep learning-based model for detecting COVID-19 from medical images. In the study, ResNeXt, Inception V3 and Xception models were compared. Experimental results showed that the Xception model was more successful than the compared models with 97.97% accuracy.

Ismael and Şengür (Ismael and Şengür, 2021) proposed a deep learning-based model for detecting COVID-19 from X-ray data. ResNet18, ResNet50, ResNet101, VGG16 and VGG19 were used for feature extraction. Support Vector Machine (SVM) classifiers with different kernel functions were used to classify the features. A dataset consisting of 380 X-ray images was used. Experimental results showed that ResNet50 features with a linear-kernel SVM outperformed the other models with 94.7% accuracy.

Mostafa et al. (Mostafa et al., 2021) presented a comparative analysis of ANN, RF and SVM for hepatitis C detection from blood test values. The dataset consists of the blood test results of 615 patients. Experimental results showed that RF, ANN and SVM had 98.14%, 88.89% and 96.75% classification accuracy, respectively.

Allugunti (Allugunti, 2022) aimed to classify patients as cancerous or non-cancerous using thermographic images. In the study, the results of the Convolutional Neural Network (CNN), Random Forest (RF) and SVM classification models were compared. Experimental results showed that CNN had 99.65% accuracy, SVM had 89.84% accuracy, and RF had 90.55% accuracy.

Terlapu et al. (Terlapu et al., 2022) presented a comparative analysis of the Probabilistic Neural Network (PNN), SVM, k Nearest Neighbours (kNN), RF, Decision Tree (DT), and Naïve Bayes (NB) for hepatitis C detection. Experimental results showed that PNN outperformed the machine learning models with 99.6% accuracy. RF was the most successful of the machine learning models with 97.5% accuracy.

Almadhoun and Abu-Naser (Almadhoun and Abu-Naser, 2022) presented a deep learning-based comparative analysis for brain tumor detection. Using a dataset of 10,000 images, the proposed model and the Inception, VGG16, MobileNet, and ResNet models were compared. Experimental results showed that VGG16 outperformed the compared models with 99.86% accuracy.

Luetkens et al. (Luetkens et al., 2022) presented a deep learning-based comparative analysis for cirrhosis detection from liver MR images. The dataset consists of data from 465 patients. Experimental studies using the ResNet50 and DenseNet121 models showed that ResNet50 had 0.823 precision.

Pasyar et al. (Pasyar et al., 2021) developed a hybrid deep learning model for detecting liver diseases such as hepatitis and cirrhosis from ultrasound images. Transfer learning was applied to the ResNet and AlexNet architectures. The voting method was used to weight the results obtained by each model. Experimental results showed that the hybrid ResNet50 model had a classification accuracy of over 86%.

Cirrhosis does not usually cause symptoms in the early stages. However, as the disease progresses and the damage to the liver increases, the symptoms become more numerous and more severe. Liver diseases cause other diseases in the body and can pose great dangers to it. For this reason, diagnosis and treatment of liver diseases such as cirrhosis at an early stage are vital for the human body. Blood tests are mainly used to diagnose liver diseases.
However, cirrhosis is still diagnosed with traditional methods. These methods have limitations in terms of cost, time, and accuracy compared to artificial intelligence methods. Especially in hospitals where these tests are carried out intensively, the human factor limits the speed and quality of decision making. The motivation of this study is to offer a solution to the limitations of the traditional methods used in cirrhosis detection. In this study, a deep learning model was developed to detect cirrhosis patients from blood test values. The aim was to provide a pre-diagnosis system for the detection of cirrhosis and thereby ease the workload of healthcare professionals. The main contributions of this study to the literature can be summarized as follows:

- This is the first study in the literature on cirrhosis detection using artificial intelligence methods.
- In this study, a Multilayer Perceptron (MLP) based deep learning model was developed for cirrhosis detection.
- The developed model was compared with DT, kNN, Logistic Regression (LR), NB, RF and SVM using accuracy, precision, recall and F1-score.
- Experimental results showed that the developed model was more successful than the compared models with 80.48% classification accuracy.

2. Deep Learning Based Cirrhosis Detection

Many diseases cause death as a result of late diagnosis. For this reason, experts recommend screening tests at regular intervals. Despite this recommendation, many people do not care about health screenings and do not go to the doctor unless they have symptoms. Artificial intelligence supports healthcare professionals in many ways, including the effective use of human resources and rapid diagnosis and treatment. Artificial intelligence is used in the diagnosis and treatment of diseases, in determining the appropriate tools for treatment, and in medical decision support systems.
In this study, the aim was to determine whether people had cirrhosis according to their demographic characteristics and blood test results. For this purpose, an MLP-based deep learning model was developed. The developed model was compared with DT, kNN, LR, NB, RF and SVM using accuracy, precision, recall, and F1-score.

2.1. Dataset

The dataset used in this study consists of patients' demographic information and blood test values. The dataset is publicly available via https://www.kaggle.com/datasets/fedesoriano/cirrhosis-prediction-dataset. It contains blood test data from a total of 418 primary biliary cirrhosis (PBC) patients and has 19 attributes. 'ID' is a unique sequence number of patients. 'N_Days' represents the number of days from the patient's registration date to the date of the transplant, the date of the patient's death, or July 1986, whichever comes first. 'Status' refers to the patient's status as C (censored), CL (censored due to liver transplant), or D (death). 'Drug' refers to the type of drug, either D-penicillamine or placebo. 'Age' refers to the patient's age in days. 'Sex' denotes gender as M for males and F for females. 'Ascites' refers to the presence of ascites as N for no and Y for yes. 'Hepatomegaly' refers to the presence of hepatomegaly as N for no and Y for yes. 'Spiders' refers to the presence of spiders as N for no and Y for yes. 'Edema' refers to the presence of edema as N (no edema and no diuretic therapy for edema), S (existing edema without diuretics or edema resolving with diuretics), or Y (edema despite diuretic therapy). 'Bilirubin' refers to the amount of serum bilirubin in mg/dl. 'Cholesterol' refers to the amount of cholesterol in mg/dl. 'Albumin' refers to the amount of albumin in g/dl. 'Copper' refers to the amount of copper in the urine in µg/day. 'Alk_Phos' refers to the amount of alkaline phosphatase in U/liter. 'SGOT' refers to the SGOT value in U/ml. 'Triglycerides' refers to the triglycerides value in mg/dl.
'Platelets' refers to the number of platelets per cubic ml/1000. 'Prothrombin' refers to the prothrombin time in seconds. 'Stage' refers to the stage of the disease as 1, 2, 3, or 4.

In the dataset, missing values (NA) of the features were filled with the mean value of the relevant column. The categorical features were replaced with numeric values. The values of the 'Sex' attribute were changed to 0 for 'M' and 1 for 'F'. The values of the 'Ascites' attribute were changed to 0 for 'N' and 1 for 'Y'. The values of the 'Drug' attribute were changed to 0 for 'D-penicillamine' and 1 for 'Placebo'. The values of the 'Hepatomegaly' attribute were changed to 0 for 'N' and 1 for 'Y'. The values of the 'Spiders' attribute were changed to 0 for 'N' and 1 for 'Y'. The values of the 'Edema' attribute were changed to 0 for 'N', 1 for 'Y', and -1 for 'S'. The values of the 'Status' attribute were changed to 0 for 'C', 1 for 'CL', and -1 for 'D'.

The distribution of the patients according to the disease stages is shown in Fig. 1.

Figure 1. The distribution of the patients according to the disease stages

The heatmap showing the relationships of the features in the dataset is shown in Fig. 2. The heatmap was used for feature selection.

Figure 2. The heatmap used for feature selection

The distributions of the features are shown in Fig. 3.

Figure 3. The distributions of the features

The attributes with categorical values in the dataset are shown in Fig. 4.

Figure 4. The attributes with categorical values in the dataset

Relationships between features other than blood values are shown in Fig. 5.

Figure 5. Relationships between features other than blood values

The density graphs of the features obtained from the blood tests and of the age feature, according to their classes, are shown in Fig. 6.

Figure 6. The density graphs of the features obtained from the blood tests and the age feature

2.2. Developed Model

An MLP has a structure in which many neurons with non-linear activation functions are hierarchically connected. It consists of an input layer, one or more intermediate (hidden) layers, and an output layer. The input layer receives the input values to be processed. Inputs are forwarded through the network using the weights between the input layer and the hidden layers. In the hidden layers, activation functions such as ReLU, sigmoid, and tanh are used. The values processed in each hidden layer are passed on through the network via these activation functions. This process is repeated once for each hidden layer in the network. The output layer performs tasks such as regression or classification, and its activation function is chosen according to the type of problem. For example, sigmoid is used for binary classification and softmax is used for multi-class classification problems. The neurons in the MLP are trained using the backpropagation algorithm.

The developed MLP-based model takes the blood test data of the patients as input and predicts whether the patient has cirrhosis. The architecture of the developed model is shown in Fig. 7.

Figure 7. The architecture of the developed MLP-based model

The developed model consists of an input layer to which the demographic data and blood test data of the patients are presented. The model has 2 hidden layers that perform the computation. Hyperparameter analysis was carried out using GridSearchCV to determine the number of neurons in the hidden layers and the number of epochs.
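As a rough illustration of the layer structure described in this section (an input layer, two hidden layers with ReLU, and a single sigmoid output unit for binary classification), the forward pass of such a network can be sketched in plain Python. This is only a sketch under stated assumptions: the number of input features, the hidden layer widths, the random weight initialisation, and all function names are hypothetical and are not taken from the study. The study selected the layer sizes with GridSearchCV and trained the weights with backpropagation, and neither step is reproduced here.

```python
import math
import random


def relu(x):
    """Rectified linear unit, used here in the hidden layers."""
    return max(0.0, x)


def sigmoid(x):
    """Numerically stable logistic function for the output unit."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W x + b)."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def build_mlp(n_features, hidden_sizes, seed=0):
    """Create untrained layers as (weights, biases) pairs.

    The sizes and the uniform initialisation are illustrative assumptions.
    """
    rng = random.Random(seed)
    sizes = [n_features, *hidden_sizes, 1]  # single output unit
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def predict_proba(layers, features):
    """Forward pass: ReLU in every hidden layer, sigmoid at the output."""
    activations = features
    for weights, biases in layers[:-1]:
        activations = dense(activations, weights, biases, relu)
    weights, biases = layers[-1]
    return dense(activations, weights, biases, sigmoid)[0]


# Hypothetical sizes: 17 input features and hidden layers of 16 and 8 neurons.
model = build_mlp(n_features=17, hidden_sizes=(16, 8))
probability = predict_proba(model, [0.1] * 17)
predicted_class = 1 if probability >= 0.5 else 0  # 1 = cirrhosis
```

Because the weights are random and untrained, the output here has no diagnostic meaning; the sketch only shows how an input vector flows through two hidden layers to a single probability that is then thresholded into a class label.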
The ReLU activation function is used in the input layer. In the hidden layers, the ReLU activation function is also used so that the stacked layers can perform non-linear computations. Since the task is binary classification, the sigmoid activation function is used in the output layer.

2.3. Evaluation Metrics

Classification algorithms aim to predict categorical values with two or more classes. Accuracy, precision, recall, and F1-score are used to measure the performance of classification algorithms. These metrics are calculated from the confusion matrix, which is used to interpret the results of classification models and to evaluate the relationship between actual and predicted values. The confusion matrix is shown in Table 1.

Table 1. Confusion matrix

                                 Actual values
                                 Positive (1)   Negative (0)
Predicted values  Positive (1)   TP             FP
                  Negative (0)   FN             TN

TP is the number of patients who actually have cirrhosis and whom the classifier also predicts as cirrhosis. TN is the number of patients who actually do not have cirrhosis and whom the classifier also predicts as non-cirrhosis. FP is the number of patients whom the classifier predicts as cirrhosis but who do not have cirrhosis. FN is the number of patients whom the classifier predicts as non-cirrhosis but who have cirrhosis.

Accuracy is calculated as the ratio of the number of samples classified correctly by the model to the total number of samples, as seen in Eq. (1).

$$\text{Accuracy} = \frac{TP + TN}{TP + FP + FN + TN} \tag{1}$$

Precision refers to how many of the positively predicted values are actually positive. Precision is calculated using Eq. (2).

$$\text{Precision} = \frac{TP}{TP + FP} \tag{2}$$

Recall is a metric that shows how many of the samples that should be predicted as positive are correctly predicted. Recall is calculated using Eq. (3).

$$\text{Recall} = \frac{TP}{TP + FN} \tag{3}$$

The F1-score is calculated from the precision and recall values, as seen in Eq. (4).

$$\text{F1-score} = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \tag{4}$$

3. Experimental Results

In this study, an MLP-based deep learning model was developed to detect cirrhosis patients. The experimental results of the developed model were compared extensively with DT, kNN, LR, NB, RF and SVM. The accuracy, precision, recall, and F1-score obtained for each model were compared. Parameter analysis was conducted using GridSearchCV to determine the parameters of the models. Cross-validation was used to reduce overfitting and to increase the quality of the models, with the k value chosen as 10. All models were run on 10 randomly generated splits of the dataset using cross-validation, and the results obtained were averaged.

The confusion matrix and experimental results for DT are shown in Table 2 and Table 3.

Table 2. Confusion matrix for DT

                                      Actual values
                                      Cirrhosis (1)   Non-cirrhosis (0)
Predicted values  Cirrhosis (1)       20              6
                  Non-cirrhosis (0)   8               7

As seen in Table 2, TP is 20, FP is 6, FN is 8, and TN is 7. Experimental results showed that DT correctly detected 20 of 28 cirrhosis patients and 7 of 13 non-cirrhosis patients. Of the 41 patients evaluated, 27 were correctly classified and 14 were misclassified.

Table 3. Accuracy, precision, recall and F1-score values for DT

Accuracy   Precision   Recall   F1-score
0.6585     0.7692      0.7142   0.7406

As seen in Table 3, the accuracy of DT is 0.6585, the precision is 0.7692, the recall is 0.7142, and the F1-score is 0.7406.

The confusion matrix and experimental results for kNN are shown in Table 4 and Table 5.

Table 4. Confusion matrix for kNN

                                      Actual values
                                      Cirrhosis (1)   Non-cirrhosis (0)
Predicted values  Cirrhosis (1)       18              9
                  Non-cirrhosis (0)   10              4

As seen in Table 4, TP is 18, FP is 9, FN is 10, and TN is 4. Experimental results showed that kNN correctly detected 18 of 28 cirrhosis patients and 4 of 13 non-cirrhosis patients.
Of the 41 patients evaluated, 22 were correctly classified and 19 were misclassified.

Table 5. Accuracy, precision, recall and F1-score values for kNN

Accuracy   Precision   Recall   F1-score
0.5365     0.6666      0.6428   0.6544

As seen in Table 5, the accuracy of kNN is 0.5365, the precision is 0.6666, the recall is 0.6428, and the F1-score is 0.6544.

The confusion matrix and experimental results for LR are shown in Table 6 and Table 7.

Table 6. Confusion matrix for LR

                                      Actual values
                                      Cirrhosis (1)   Non-cirrhosis (0)
Predicted values  Cirrhosis (1)       20              7
                  Non-cirrhosis (0)   8               6

As seen in Table 6, TP is 20, FP is 7, FN is 8, and TN is 6. Experimental results showed that LR correctly detected 20 of 28 cirrhosis patients and 6 of 13 non-cirrhosis patients. Of the 41 patients evaluated, 26 were correctly classified and 15 were misclassified.

Table 7. Accuracy, precision, recall and F1-score values for LR

Accuracy   Precision   Recall   F1-score
0.6341     0.7407      0.7142   0.7272

As seen in Table 7, the accuracy of LR is 0.6341, the precision is 0.7407, the recall is 0.7142, and the F1-score is 0.7272.

The confusion matrix and experimental results for NB are shown in Table 8 and Table 9.

Table 8. Confusion matrix for NB

                                      Actual values
                                      Cirrhosis (1)   Non-cirrhosis (0)
Predicted values  Cirrhosis (1)       15              11
                  Non-cirrhosis (0)   13              2

As seen in Table 8, TP is 15, FP is 11, FN is 13, and TN is 2. Experimental results showed that NB correctly detected 15 of 28 cirrhosis patients and 2 of 13 non-cirrhosis patients. Of the 41 patients evaluated, 17 were correctly classified and 24 were misclassified.

Table 9. Accuracy, precision, recall and F1-score values for NB

Accuracy   Precision   Recall   F1-score
0.4146     0.5769      0.5357   0.5555

As seen in Table 9, the accuracy of NB is 0.4146, the precision is 0.5769, the recall is 0.5357, and the F1-score is 0.5555.
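The metric values reported so far follow directly from the confusion matrix counts via Eqs. (1)-(4). As a quick cross-check, the short Python sketch below recomputes them from the counts in Tables 2, 4, 6, and 8. The class and variable names are illustrative and are not part of the original study, and the sketch assumes that no denominator is zero. Small differences in the fourth decimal place are possible, since the tables appear to truncate values rather than round them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of a binary confusion matrix (positive class = cirrhosis)."""
    tp: int
    fp: int
    fn: int
    tn: int

    @property
    def accuracy(self) -> float:
        # Eq. (1): correctly classified samples over all samples.
        return (self.tp + self.tn) / (self.tp + self.fp + self.fn + self.tn)

    @property
    def precision(self) -> float:
        # Eq. (2): share of predicted positives that are truly positive.
        return self.tp / (self.tp + self.fp)

    @property
    def recall(self) -> float:
        # Eq. (3): share of actual positives that were detected.
        return self.tp / (self.tp + self.fn)

    @property
    def f1_score(self) -> float:
        # Eq. (4): harmonic mean of precision and recall.
        p, r = self.precision, self.recall
        return 2 * p * r / (p + r)


# Counts copied from Tables 2, 4, 6 and 8 of the paper.
REPORTED = {
    "DT": ConfusionCounts(tp=20, fp=6, fn=8, tn=7),
    "kNN": ConfusionCounts(tp=18, fp=9, fn=10, tn=4),
    "LR": ConfusionCounts(tp=20, fp=7, fn=8, tn=6),
    "NB": ConfusionCounts(tp=15, fp=11, fn=13, tn=2),
}

for name, counts in REPORTED.items():
    print(
        f"{name:>3}: accuracy={counts.accuracy:.4f} "
        f"precision={counts.precision:.4f} "
        f"recall={counts.recall:.4f} f1={counts.f1_score:.4f}"
    )
```

The same class can be reused for the remaining models by adding their counts to the dictionary.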
The confusion matrix and experimental results for RF are shown in Table 10 and Table 11.

Table 10. Confusion matrix for RF

                                      Actual values
                                      Cirrhosis (1)   Non-cirrhosis (0)
Predicted values  Cirrhosis (1)       22              5
                  Non-cirrhosis (0)   6               8

As seen in Table 10, TP is 22, FP is 5, FN is 6, and TN is 8. Experimental results showed that RF correctly detected 22 of 28 cirrhosis patients and 8 of 13 non-cirrhosis patients. Of the 41 patients evaluated, 30 were correctly classified and 11 were misclassified.

Table 11. Accuracy, precision, recall and F1-score values for RF

Accuracy   Precision   Recall   F1-score
0.7317     0.8148      0.7857   0.7999

As seen in Table 11, the accuracy of RF is 0.7317, the precision is 0.8148, the recall is 0.7857, and the F1-score is 0.7999.

The confusion matrix and experimental results for SVM are shown in Table 12 and Table 13.

Table 12. Confusion matrix for SVM

                                      Actual values
                                      Cirrhosis (1)   Non-cirrhosis (0)
Predicted values  Cirrhosis (1)       22              5
                  Non-cirrhosis (0)   6               8

As seen in Table 12, TP is 22, FP is 5, FN is 6, and TN is 8. Experimental results showed that SVM correctly detected 22 of 28 cirrhosis patients and 8 of 13 non-cirrhosis patients. Of the 41 patients evaluated, 30 were correctly classified and 11 were misclassified.

Table 13. Accuracy, precision, recall and F1-score values for SVM

Accuracy   Precision   Recall   F1-score
0.7317     0.8148      0.7857   0.7999

As seen in Table 13, the accuracy of SVM is 0.7317, the precision is 0.8148, the recall is 0.7857, and the F1-score is 0.7999.

The confusion matrix and experimental results for the developed model are shown in Table 14 and Table 15.

Table 14. Confusion matrix for developed model

                                      Actual values
                                      Cirrhosis (1)   Non-cirrhosis (0)
Predicted values  Cirrhosis (1)       24              4
                  Non-cirrhosis (0)   4               9

As seen in Table 14, TP is 24, FP is 4, FN is 4, and TN is 9.
Experimental results showed that the developed model correctly detected 24 of 28 cirrhosis patients and 9 of 13 non-cirrhosis patients. Of the 41 patients evaluated, 33 were correctly classified and 8 were misclassified.

Table 15. Accuracy, precision, recall and F1-score values for developed model

Accuracy   Precision   Recall   F1-score
0.8048     0.8571      0.8571   0.8571

As seen in Table 15, the accuracy of the developed model is 0.8048, the precision is 0.8571, the recall is 0.8571, and the F1-score is 0.8571.

Comparative experimental results according to the accuracy, precision, recall and F1-score values for DT, kNN, LR, NB, RF, SVM and the developed model are shown in Table 16 and Fig. 8.

Table 16. Comparative experimental results

Model             Accuracy   Precision   Recall   F1-score
DT                0.6585     0.7692      0.7142   0.7406
kNN               0.5365     0.6666      0.6428   0.6544
LR                0.6341     0.7407      0.7142   0.7272
NB                0.4146     0.5769      0.5357   0.5555
RF                0.7317     0.8148      0.7857   0.7999
SVM               0.7317     0.8148      0.7857   0.7999
Developed model   0.8048     0.8571      0.8571   0.8571

Figure 8. Comparative experimental results

As seen in Table 16 and Fig. 8, the developed model showed better classification performance in detecting cirrhosis patients than the other models. After the developed model, the most successful models were RF and SVM, which obtained identical scores, followed by DT, LR, kNN and NB. Experimental results showed that the developed model had a classification accuracy of over 80% and an F1-score close to 86%. The obtained results showed that the developed model can be successfully applied in cirrhosis detection and can be used in early diagnosis systems.

The accuracy/loss graphs of the developed model during training and validation are shown in Fig. 9.

Figure 9. The accuracy/loss graphs of the developed model during the training and validation

Experimental results showed that the developed model was more successful than the other models according to the accuracy, precision, recall and F1-score metrics. The accuracy value is important because it shows the proportion of correctly classified patients. The precision value is important because it shows how many of the patients predicted to have cirrhosis actually have cirrhosis. The recall value is important because it shows how many of the patients who should have been predicted as cirrhosis were correctly predicted. The F1-score is the harmonic mean of the precision and recall values, so successful models are expected to have a higher F1-score.

ROC and precision-recall curve graphs of the developed model are shown in Fig. 10.a and Fig. 10.b.

Figure 10. ROC and precision-recall curve graphs of the developed model

The ROC curve is a very important performance measure for classification problems. ROC is a probability curve. AUC is the area under the curve and represents the degree of separability between the classes. In the ROC curve, the False Positive Rate (FPR) is on the X axis and the True Positive Rate (TPR) is on the Y axis. As the area under the curve increases, the discrimination performance between classes increases. TPR is the recall value; in other words, it is the detection rate of cirrhosis patients. FPR is the rate at which non-cirrhosis patients are erroneously predicted as cirrhosis.

4. Conclusions

Cirrhosis is an advanced chronic liver disease. Different levels of damage occur in the liver due to different diseases. For these reasons, the cirrhosis process begins with the development of structural changes in the liver. As a result, the number of functional liver cells decreases, and the liver hardens and shrinks. The resistance to the blood that has to pass through the liver increases.
When the blood cannot flow through the liver, the intravascular pressure increases in the vessels that supply it. The blood, which cannot reach the liver due to the increased pressure, looks for other ways to reach it and creates new vascular pathways. As a result, liver functions gradually deteriorate and signs of liver failure occur.

Artificial intelligence technologies are used in medical application areas such as disease diagnosis, surgery, drug development, analysis of radiological images and lesions, and personalized therapy. In this study, an MLP-based deep learning model was developed for cirrhosis detection. The developed model was compared with DT, kNN, LR, NB, RF, and SVM. Experimental studies using accuracy, precision, recall, and F1-score showed that the developed model was more successful than the compared models. Experimental results showed that the developed model has 80.48% accuracy, 85.71% precision, 85.71% recall and 85.71% F1-score.

The fact that RF is more successful than DT can be explained by the bagging technique of RF. RF creates multiple decision trees and combines the results of these trees using the voting method. The fact that RF is more successful than kNN can be explained by RF's use of random samples selected from the dataset. kNN only tries to assign the closest instances to the same class, so samples that fall outside this local similarity are not classified correctly by kNN.

The fact that RF is more successful than NB can be interpreted as NB's inability to represent complex behavior due to its small model size. On the other hand, the size of an RF model is very large, and RF can adapt to the dynamic structure and change of the data. RF and SVM obtained identical scores in the experiments (Table 16), and their relative behavior can be interpreted in terms of the numerical and categorical features in the dataset. RF works with a mixture of numerical and categorical features.
RF is advantageous when features are of various scales. SVM maximizes the margin between points of different classes and relies on the distances between points. In a classification problem, RF gives the probability of belonging to a class, while SVM gives the points closest to the boundary between classes. Because categorical and numerical features coexist in the data, RF performed as well as SVM.

The fact that the developed model is more successful than the other models can be attributed to the large number of inputs presented to the network. The more inputs are presented to the network, the better it learns and the better its predictions become. Classical machine learning methods, by contrast, generally require less input data.

Studies in the literature have generally focused on the detection of hepatitis C and liver diseases. Disease detection is performed from blood test values or from ultrasound or MR images. To the best of our knowledge, there is no study in the literature on cirrhosis detection using this dataset. Therefore, the experimental results could not be compared with the studies in the literature.

The size of the dataset is one of the limitations of this study. The dataset consists of data from 418 PBC patients and 19 attributes. Taking a wider range of measurements in blood tests and increasing the number of patients could enable the developed model to be trained more successfully. In addition, classification accuracy could be increased by using more advanced deep learning models, such as hybrid models.

References

Acharya U. R., Faust O., Molinari F., Sree S. V., Junnarkar S. P., Sudarshan V. (2015). Ultrasound-based tissue characterization and classification of fatty liver disease: A screening and diagnostic paradigm. Knowledge-Based Systems, 75, 66-77. https://doi.org/10.1016/j.knosys.2014.11.021

Aksu D., Üstebay S., Aydin M. A., Atmaca T. (2018). Intrusion detection with comparative analysis of supervised learning techniques and fisher score feature selection algorithm. In International symposium on computer and information sciences, 141-149. https://doi.org/10.1007/978-3-030-00840-6_16

Allugunti V. R. (2022). Breast cancer detection based on thermographic images using machine learning and deep learning algorithms. International Journal of Engineering in Computer Science, 4(1), 49-56.

Almadhoun H. R., Abu-Naser S. S. (2022). Detection of Brain Tumor Using Deep Learning. International Journal of Academic Engineering Research (IJAER), 6(3).

Amitrano L., Guardascione M. A., Brancaccio V., Balzano A. (2002). Coagulation disorders in liver disease. In Seminars in liver disease, 22(1), 83-96. https://doi.org/10.1055/s-2002-23205

Arroyo V., Moreau R., Kamath P. S., Jalan R., Ginès P., Nevens F., Schnabl B. (2016). Acute-on-chronic liver failure in cirrhosis. Nature reviews Disease primers, 2(1), 1-18. https://doi.org/10.3390/jcm10194406

Che Azemin M. Z., Hassan R., Mohd Tamrin M. I., Md Ali M. A. (2020). COVID-19 deep learning prediction model using publicly available radiologist-adjudicated chest X-ray images as training data: preliminary findings. International Journal of Biomedical Imaging. https://doi.org/10.1155/2020/8828855

Garcia‐Martinez R., Caraceni P., Bernardi M., Gines P., Arroyo V., Jalan R. (2013). Albumin: pathophysiologic basis of its role in the treatment of cirrhosis and its complications. Hepatology, 58(5), 1836-1846. https://doi.org/10.1002/hep.26338

Ginès P., Krag A., Abraldes J. G., Solà E., Fabrellas N., Kamath P. S. (2021). Liver cirrhosis. The Lancet, 398(10308), 1359-1376. https://doi.org/10.1016/S0140-6736(21)01374-X

Goceri E. (2019). Skin disease diagnosis from photographs using deep learning. In ECCOMAS thematic conference on computational vision and medical image processing, 239-246. https://doi.org/10.1007/978-3-030-32040-9_25

Guerci P., Ergin B., Uz Z., Ince Y., Westphal M., Heger M., Ince C. (2019). Glycocalyx degradation is independent of vascular barrier permeability increase in nontraumatic hemorrhagic shock in rats. Anesthesia & Analgesia, 129(2), 598-607. https://doi.org/10.1213/ane.0000000000003918

Ismael A. M., Şengür A. (2021). Deep learning approaches for COVID-19 detection based on chest X-ray images. Expert Systems with Applications, 164, 114054. https://doi.org/10.1016/j.eswa.2020.114054

Jadhav S. D., Channe H. P. (2016). Comparative study of K-NN, naive Bayes and decision tree classification techniques. International Journal of Science and Research (IJSR), 5(1), 1842-1845. https://doi.org/10.21275/v5i1.nov153131

Jain R., Gupta M., Taneja S., Hemanth D. J. (2021). Deep learning based detection and analysis of COVID-19 on chest X-ray images. Applied Intelligence, 51(3), 1690-1700.

Luetkens J. A., Nowak S., Mesropyan N., Block W., Praktiknjo M., Chang J., Attenberger U. (2022). Deep learning supports the differentiation of alcoholic and other-than-alcoholic cirrhosis based on MRI. Scientific reports, 12(1), 1-8. https://doi.org/10.1038/s41598-022-12410-2

Mostafa F., Hasan E., Williamson M., Khan H. (2021). Statistical Machine Learning Approaches to Liver Disease Prediction. Livers, 1(4), 294-312. https://doi.org/10.3390/livers1040023

Mozos I. (2015). Arrhythmia risk in liver cirrhosis. World journal of hepatology, 7(4), 662. https://doi.org/10.1007/s10489-020-01902-1

Pasyar P., Mahmoudi T., Kouzehkanan S. Z. M., Ahmadian A., Arabalibeik H., Soltanian N., Radmard A. R. (2021). Hybrid classification of diffuse liver diseases in ultrasound images using deep convolutional neural networks. Informatics in Medicine Unlocked, 22. https://doi.org/10.1016/j.imu.2020.100496

Pinto R. B., Schneider A. C. R., da Silveira, T. R. (2015). Cirrhosis in children and adolescents: An overview. World journal of hepatology, 7(3), 392.
https://doi.org/10.4254/wjh.v7.i3.392 https://doi.org/10.1016/S0140-6736(21)01374-X https://doi.org/10.1016/S0140-6736(21)01374-X https://doi.org/10.1007/978-3-030-32040-9_25 https://doi.org/10.1213/ane.0000000000003918 https://doi.org/10.1016/j.eswa.2020.114054 https://doi.org/10.21275/v5i1.nov153131 https://doi.org/10.1038/s41598-022-12410-2 https://doi.org/10.3390/livers1040023 https://doi.org/10.1007/s10489-020-01902-1 https://doi.org/10.1016/j.imu.2020.100496 https://doi.org/10.4254/wjh.v7.i3.392 Utku/Oper. Res. Eng. Sci. Theor. Appl. First online Terlapu P. V., Gedela S. B., Gangu V. K., Pemula R. (2022). Intelligent diagnosis system of hepatitis C virus: A probabilistic neural network based approach. International Journal of Imaging Systems and Technology. https://doi.org/10.1002/ima.22746 Van Zutphen T., Ciapaite J., Bloks V. W., Ackereley C., Gerding A., Jurdzinski A., Bandsma R. H. (2016). Malnutrition-associated liver steatosis and ATP depletion is caused by peroxisomal and mitochondrial dysfunction. Journal of hepatology, 65(6), 1198-1208. https://doi.org/10.1016/j.jhep.2016.05.046 Vranjkovic A., Deonarine F., Kaka S., Angel J. B., Cooper C. L., Crawley A. M. (2019). Direct-acting antiviral treatment of HCV infection does not resolve the dysfunction of circulating CD8+ T-cells in advanced liver disease. Frontiers in immunology, 10, 1926. https://doi.org/10.3389/fimmu.2019.01926 Younossi Z., Henry L. (2015). Systematic review: patient‐reported outcomes in chronic hepatitis C‐the impact of liver disease and new treatment regimens. Alimentary pharmacology & therapeutics, 41(6), 497-520. https://doi.org/10.1111/apt.13090 © 2022 by the authors. Submitted for possible open access publication under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). 
https://doi.org/10.1002/ima.22746 https://doi.org/10.1016/j.jhep.2016.05.046 https://doi.org/10.3389/fimmu.2019.01926 https://doi.org/10.1111/apt.13090 DEEP LEARNING BASED CIRRHOSIS DETECTION Anıl Utku * 1. Introduction 2. Deep Learning Based Cirrhosis Detection 2.1. Dataset 2.2. Developed Model 2.3. Evaluation Metrics 3. Experimental Results 4. Conclusions