P-ISSN : 2715-2448 | E-ISSN : 2715-7199 

Vol.4 No.1 January 2023 

Buana Information Technology and Computer Sciences (BIT and CS) 

 
24 | Vol.4 No.1, January 2023 

 
Cardiovascular Disease Prediction Using Machine Learning 

 ShivamPandey 

Undergraduate Student of BE CSE 

AIML, Chandigarh University, Punjab 

Email: shivampandey3819@gmail.com 

 
 ‹β› 
 

Abstract—Because of technology developments, the ECG yields 

improved outcomes in the realm of biomedical science and 

research. The Electrocardiogram reveals basic the heart's 

electrical activity. Early detection of aberrant heart disorders is 

crucial for diagnosing cardiac problems and averting sudden 

cardiac deaths. Measurements on an electrocardiogram (ECG) 

among people with comparable cardiac issues are essentially 

equal. Analyzing the Electrocardiogram characteristics can 

help predict abnormalities. Medical professionals presently base 

the preponderance of their Electrocardiogram diagnosis on 

their unique particular areas of expertise, which places a 

substantial load on their shoulders and reduces their 

performance. The use of technology that automatically analyses 

ECGs as hospital personnel performs their duties will be 

advantageous. A suitable algorithm must be able to categories 

Input signal with uncertain awesome feature on just how much 

they approximate Input signal having known characteristics in 

order to speed up the identification of heart illnesses. A 

possibility of identifying a tachycardia is raised if this predictor 

can reliably recognize connections, and this technique may be 

helpful in lab settings. To accurately diagnose myocardial 

illness, a powerful machine learning technique should be used. 

Through using recommended method, the effectiveness of 

cardiovascular disease identification using ECG dataset was 

evaluated. The reliability, sensitivities, and validity obtained 

using the Svm algorithm were 99.314%, 97.60%, and 97.60% 

respectively. 

Keywords— Machine Learning, Heart disease, Cardiovascular  
, dataset, Engineering 

  
I. INTRODUCTION 

Though electrophysiological (ECG) readings are records of 
the bioelectrical activity of the cardiovascular system, when 
cardiovascular disease manifests, most or all of the 
indications diverge beyond their steady levels [1]. ECG 
measurements from individuals who have comparable heart 
problems are quite comparable. Although the parameters of 
an Electrocardiogram are unavailable, it is possible to 
assumed that the signal has the same arrhythmia if its 
structural pattern mimics that of an Electrocardiogram with a 
particular irregularity. The ECG signal pattern can 
occasionally be analyzed to recognize heart problems. One of 
the most difficult and significant health challenges in the real 
world is coronary heart diagnosis. Such condition has an 
influence on the blood vessel functioning, which might 
weaken the person's body. In accordance with the WHO, 

around 18 million people die from cardiovascular disease 
each and every year. Because heart diseases are becoming 
increasingly common, individuals are more likely to avoid 
fatal situations. These are used to assess a patient's level of 
cardiovascular health [2]. 

Wearable sensors can be used to detect diseases like 
cardiovascular disease, but they are susceptible to failure 
because of signal abnormalities. The validity of the data and 
the test results may be impacted by this problem. In order to 
anticipate and evaluate a variety of cardiovascular disorders, 
data mining and hybrid models have been suggested as 
prospective remedies. With textual information, different risk 
characteristics are extracted using a data mining approach. In 
ml algorithms, there still are essentially different phases. The 
very first permits for the selection of a trait's subset or 
importance, whilst the second forecasts the development of 
cardiovascular disease. Inserting unnecessary characteristics 
may generate disturbance and misunderstanding whenever 
creating a base classifier. Further, managing individuals 
might make categorization less accurate. 

It's indeed normal practice to discriminate among- 
characteristics and classifications using uncertain combina- 
tions processes. Variables might increase the mean square 
error and lower your accurateness. An ECG is a non-invasive 
diagnostic tool used to track the physiological activities of the 
cardiac. It is capable of detecting a variety of circulatory 
disturbances, including those brought on by myocardial 
injury, cardiac arrhythmia, and acute coronary syndrome 
(SCA). It is difficult to regularly examine wearable Ecg 
monitoring because of their quick advancement and 
widespread affordability. Machine learning algorithms 
methods have been extensively applied in a variety of fields, 
such as video processing, computational linguistics, and 
automatic speech recognition. Their order to transfer 
extracted features but without help of user experts has been 
one of many main advantages. Alternatively, operations were 
carried out by algorithms using its capacity for data-driven 
learning. 

Timely screening of irregular heartbeat circumstances is 
crucial for the management of unexpected cardiac death and 
some other acute ailments carried on by myocardial 
infarction. A number of studies conducted on examining 
Electrocardiogram information and identifying problems 
therein. 


25 | Vol.4 No.1, January 2023 

 
To find these anomalies inside this lab, each person wants to 
just have continued Electrocardiogram measurement. This 
procedure takes a lot of time and energy. Automating the 
Automatic data treatment based on computer software is a 
quicker tool for diagnosing heart problems. 

 An innovative computerized segmentation method is 
discussed in this work that can classify comparable ECG 
signals among distinct categories and predict the likelihood 
of heart disease in each classification. 

 
II. LITERATURE REVIEW  

Machine Learning in Field of Medical 

Machine learning (ML), a branch of ai technology, offers a 
selection on novel techniques and methodologies for 
developing statistics interpretive and forecasting predictions. 
Doctors make a disease diagnosis based on their training, 
knowledge, observational studies, and expertise. Machine 
learning might prove to be extremely useful in helping people 
acquire more about and comprehend healthcare. 

Utilizing machine learning (ML) systems for precise diseases 
diagnosis and prevention in accordance with clinical 
indications and feelings, healthcare experts have correctly 
diagnosed sufferers [9],[10],[11],[12]. Utilizing electrocardio 
graphic information, clinical characteristics, and Intelligent 
systems, detect, categories, evaluate the seriousness of, and 
prognostic adverse reactions in Cardiovascular. 

A unique video-based computational intelligence method 
called Echo Net-Dynamic was successfully produced by [5]. 
In comparison to human analysts, our algorithm can evaluate 
echocardiography footage to estimate cardiovascular system 
[3]. 

In biomedical sciences, the physician will be able to diagnose 
the patient plus figure out the best course of treatment with 
the help of the pulse rate, Electrocardiogram, and Blood 
oxygen data that have been acquired. Actual analyzing 
techniques and Internet of things technology can assist warn 
sufferers concerning impending cardiovascular catastrophes 
[4],[13],[14],[15]. 

 
Figure 1. ML method for cardiovascular disease diagnosis. 

Several steps needed to build a model of machine learning for 
prediction and diagnosis are shown in Figure 1. The first 
phase is gathering pertinent clinical evidence. After 
becoming sanitized, the data is split into two sets: training and 
assessment. SVM, LR, K-NN, and other computer vision 
(ML) methods are used to build the model using training 

examples. The performance and reliability of the model are 
evaluated using the testing data. The last step could be to 
either choose a totally opposite model or enhance the 
efficiency of the current model that includes additional 
characteristics. 

III. METHOD 

A. Supervised Machine Learning 

This was used to also create a forecasting model, which 
forecasts the upcoming based on the historical data. This 
teaching algorithm employs intake of classification model to 
complete the task on time. Forecasting and classifications 
tasks belong to the category of supervised methods. Using 
past precipitation data, for example, to forecast rainfall 
(Regression task). By using photographs of salmon, the with 
tag "fish," the algorithm is expected to identify squid pictures 
and finish the multiclass classification [5]. 

B. Algorithm Used 

Decision Tree: Although it could be used to handle machine 
learning problems, the supervised machine learning method 
termed as a tree structure was mostly usually utilized to 
resolve detection problem. This technique basically divides 
the entire set-in smaller chunks while constructing a tree 
visualization of the data, where every other node in the tree 
standing in for a classification model and indeed the interior 
nodes represent judgment nodes, and reflect the attributes. 
This characteristic at every cluster that separates the 
classification model most effectively is selected by the 
procedure. 

 
Figure 2. Showing Working of Decision Tree 

As example, this clustering algorithm in Fig.2 predicts 
whether such a person will probably purchase a computer. 
The tree is generated using training data, or tuples with 
known class labels. 

When a individual is a learner, bifurcation is based upon their 
age and credit history. The attribute values for a certain new 
tuple were contrasted with the tree structure. A route that 
shows the class predictions for the combination is constructed 
first from base to a binary tree [6]. 

C. The Fundamental Algorithm Steps 

The step is performed sequentially and highest. Every one of 
the training datasets are situated right at the bottom of the 
tree. Categorical values are recommended for attribute 
values. Before being employed in the model, continuous 
values are discretized. 


26 | Vol.4 No.1, January 2023 

 
Figure 3. Showing Barplot of The Dataset 

Divisions of training sample are iteratively constructed based 
on the given parameters. A quantitative calculation is used to 
choose the diverging properties (such as Information Benefit, 
Gain Ratio, or Gini Index). The splitting cycle is continued 
until every occurrence of a specific node is a member from 
same category. Probably there are no other input 
characteristics or there are not enough observations left to 
provide an accurate split. Evaluate the strategy with data, then 
determine whether it is accurate.  

D. Support Vector Machine 

Typically used only for classifications, SVMs are indeed a 
popular category of supervised algorithms for machine 
learning. The SVM classifier converts characteristic data into 
coordinates in an n-dimensional area. The information is then 
classified by a higher dimensional space that the program 
finds. The classifier represents a maximum margin. The 
fundamental idea behind SVM is to repeatedly find a greatest 
margin class label (MMH) that accurately categories the 
collection only with fewest errors [7]. 
 
E. How Does SVM Algorithm Work 

Promote efficient hyperplane that separates to divide the 
categories. Picture just on left showing various black, blue, 
and orange optimal hyperplane. Even though the black in this 
scenario adequately distinguishes the 2 classes, the blue and 
orange exhibit greater classifying errors. Hyperplane: A 
hyper - plane is a judgement layer that makes distinctions 
between such a collection of elements with varied class 
affiliations. 
Margin is the separation here between two on the closest 
classification points. An angle between both the line and the 
nearby points or testing set is used to determine value. A 
massive class difference is seen to be beneficial; a reduced 
class difference is considered to be harmful. 

The descriptor with the highest separation as from closest 
point should be chosen. 

IV. RESULTS AND DISCUSSION 

In The purpose of our research would have been to concurrently 
mitigate underfit and overfit defects in some other good design. We 

found and observed the model didn't result either in the overfitting 
or underfitting. It is expected that the system loss in training data 
will be less than that in testing data. Another benefit is that if we 
understand these crucial ideas and are aware of how effective, we 
can handle even the most stressful circumstances. In comparison to 
test data, prototype loss should have been lower in training 
instances. 

There were two divisions used to rate the accuracy. Our system 
improved as the quantity of training photos and parameter settings 
increased, resulting in. 

V. CONCLUSION  

Throughout the healthcare profession, cardiovascular disease 

diagnosis is difficult and crucial. The detection of 

cardiovascular problems through to the examination of 

unprocessed medical data will aid inside the lengthy 

safeguarding of human life. If indeed the illness is identified 

in its beginning phases and protective actions were 

implemented as quickly as feasible, the number of deaths 

could be managed. This aids with in earliest diagnosis of 

cardiovascular problems. Therefore, in study, the SVM 

classifier is used to gather data and provide a strategy for 

predicting cardiovascular disease with a reliability of 

97.60%. To concentrate the researches on actual data rather 

than conceptual techniques and computations, A further 

development of something like the research is highly 

necessary. 

REFERENCES 

[1]   P. McSharry, G. Clifford, L. Tarassenko, Method for 

generating an artificial RRtachogram of a typical 

healthy human over 24-hours, Comput. Cardiol. 

29(2002) 225–228. 

[2]    S. Jayalalitha, D. Susan, Shalini Kumari and B. Archana, 

“K-nearest Neighbour Method of Analysing the ECG 

        Signal (To find out the Different Disorders Related to 

Heart)”, Journal of Applied Sciences, 14: 1628-1632 

[3]    Romiti S, Vinciguerra M, Saade W, Anso Cortajarena I, 

Greco E. Artificial Intelligence (AI) and Cardiovascular 

Diseases: An Unexpected Alliance. Cardiol Res Pract. 

2020 Jun 27;2020:4972346.  

[4] Mamun, M.M.R.K. Significance of Features from 

Biomedical Signals in Heart Health Monitoring. 

BioMed 2022, 2, 391-408. 

[5]  Energy Fuels 2022, 36, 13, 6626–6658 Publication   

Date:June 13, 2022 

[6]     Rokach, Lior & Maimon, Oded. (2005). Decision Trees. 

10.1007/0-387-25465-X 

[7]     Han, J., and M. Kamber. 2011. Data Mining: Concepts 

and Techniques. 3rd ed. Burlington: Morgan 

Kaufmann. 

[8]    Joachims, T. 1998. Making large-scale SVM learning 

practical. Adv. Kernel Methods - Support Vector Learn, 

MIT Press. 

[9] A. M. Shah et al., “Echocardiographic features of 

patients with heart failure and preserved left ventricular 

ejection fraction,” J. Am. Coll. Cardiol., vol. 74, no. 23, 

pp. 2858–2873, 2019. 

[10] S. Horiuchi and J. P. Kneller, “What can be learned 

from a future supernova neutrino detection?,” J. Phys. 


27 | Vol.4 No.1, January 2023 

 
G Nucl. Part. Phys., vol. 45, no. 4, p. 43002, 2018. 

[11] M. A. Lancaster and M. Huch, “Disease modelling in 

human organoids,” Dis. Model. Mech., vol. 12, no. 7, p. 

dmm039347, 2019. 

[12] S. J. Al’Aref et al., “Clinical applications of machine 

learning in cardiovascular disease and its relevance to 

cardiac imaging,” Eur. Heart J., vol. 40, no. 24, pp. 

1975–1986, 2019. 

[13] J. Yu, W. Ouyang, M. L. K. Chua, and C. Xie, “SARS-

CoV-2 transmission in patients with cancer at a tertiary 

care hospital in Wuhan, China,” JAMA Oncol., vol. 6, 

no. 7, pp. 1108–1110, 2020. 

[14] J. Stehlik et al., “Continuous wearable monitoring 

analytics predict heart failure hospitalization: the LINK-

HF multicenter study,” Circ. Hear. Fail., vol. 13, no. 3, 

p. e006513, 2020. 

[15] A. A. Kulkarni, V. E. Vijaykumar, S. K. Natarajan, S. 

Sengupta, and V. S. Sabbisetti, “Sustained inhibition of 

cMET-VEGFR2 signaling using liposome-mediated 

delivery increases efficacy and reduces toxicity in 

kidney cancer,” Nanomedicine Nanotechnology, Biol. 

Med., vol. 12, no. 7, pp. 1853–1861, 2016.