*Corresponding Author
P-ISSN: 2087-1244  E-ISSN: 2476-907X
ComTech: Computer, Mathematics and Engineering Applications, 13(1), June 2022, 11−21
DOI: 10.21512/comtech.v13i1.7307

The Application of C4.5 Algorithm for Selecting Scholarship Recipients

Fristi Riandari¹* and Sarjon Defit²
¹Computer Engineering, STMIK Pelita Nusantara
Jln. Iskandar Muda, Sumatera Utara 20152, Indonesia
²Faculty of Computer Science, Universitas Putra Indonesia YPTK Padang
Jln. Raya Lubuk Begalung, Sumatera Barat 25145, Indonesia
¹fristy.rianda@ymail.com; ²sarjond@yahoo.co.uk

Received: 26th April 2021/ Revised: 3rd July 2021/ Accepted: 5th July 2021

How to Cite: Riandari, F., & Defit, S. (2022). The Application of C4.5 Algorithm for Selecting Scholarship Recipients. ComTech: Computer, Mathematics and Engineering Applications, 13(1), 11−21. https://doi.org/10.21512/comtech.v13i1.7307

Abstract - The scholarship program is one of the promotional techniques used by many universities, and awarding scholarships to the right recipients is certainly an attraction for many people. STMIK Pelita Nusantara is one of the universities that organizes a scholarship program. In the current difficult economic conditions, the scholarship program is the target of many prospective students who want to continue their education in higher education. However, the absence of tools to process large amounts of data makes determining scholarship recipients less effective and time-consuming. This situation is reflected in the fact that some students are still unable to maintain the scholarships they receive. In the research, a classification model was proposed using the C4.5 algorithm approach by utilizing past data to facilitate decision making for the scholarship program. The classification process produced a decision tree that could be used as a decision-making tool. Scholarships were awarded based on several criteria: academic potential, vocational potential, parents' income, number of dependents, and employment status. Based on the data processing results of students who applied for scholarships in 2020 with the predetermined criteria, the root attributes were obtained: node 1 for academic potential, node 1.1 for vocational potential, and node 1.2 for parents' income. The resulting decision tree model is expected to help to make decisions quickly and on target.

Keywords: C4.5 algorithm, scholarship program, data mining, decision tree, data classification

I. INTRODUCTION

Current technological developments have been used as tools in various fields, such as education, to enable teachers to process student data quickly, precisely, and accurately. Hence, the purpose of a job can be achieved effectively and efficiently (Hidayad, Defit, & Sumijan, 2020). Data processing that involves such technology certainly becomes easier when it is paired with the right data processing model. One of the ways to process large amounts of data is by data mining techniques. Data mining is an implementation model applied to look for patterns based on previous data to extract knowledge from large amounts of data (Guntur, Santony, & Yuhandri, 2018). The purpose of data mining is usually predictive (Dardzinska & Zdrodowska, 2020). According to Daryl Pregibon, data mining is a mixture of statistics, artificial intelligence, and database research which is still developing (Sulastri & Gufroni, 2017).
Various techniques are available in data mining for knowledge extraction, including prediction, description, classification, estimation, association, and clustering (Ariawan, 2019; Afrianto, Suseno, & Warsito, 2020). According to Azmi and Dahria (2013), data mining is an iterative process that requires human interaction to find a new pattern or model that can be generalized for the future and is useful for carrying out an action. Data mining covers the concepts of capturing and storing data, converting raw data into information, and converting information into knowledge (Condrobimo, Sano, & Nindito, 2016). Data mining is also called Knowledge Discovery in Database (KDD) or pattern recognition (Hidayad et al., 2020). Data mining includes collecting and using historical sources and data to find regularities, patterns, or relationships in large datasets (Santoso, Hariyadi, & Prayitno, 2016). The process includes understanding the application field, selecting target data from the raw data contained in the database, and preprocessing and cleaning the data (Virgo, Defit, & Yunus, 2020). The main goal of KDD is to extract high-level knowledge from low-level information (Putra & Defit, 2019). The KDD process generally consists of the following steps: data selection, data transformation, exploration such as the extraction of knowledge from the data, and interpretation of results (Dardzinska & Zdrodowska, 2020). Data mining has also been implemented to predict students' study periods. The test results show that the error rate in predicting students' study period is only 5% (Haryati, Sudarsono, & Suryana, 2015).

Next, classification is finding a model or function that differentiates concepts or data classes. It aims to predict the class of objects whose class labels are unknown based on the analysis of training data (data objects with known classes) (Afrianto et al., 2020). It is also the most commonly applied data extraction technique to predict categorical attribute values (discrete or nominal). It uses a set of previously classified examples to develop a model to classify the entire population of records with a decision tree or neural network-based classification algorithm. The process involves two stages, learning and classification. At the learning stage, the classification algorithm analyzes the training data. At the classification stage, test data are used to estimate the accuracy of the classification rules. The rules can be applied to new data tuples if the accuracy is acceptable. The classifier training algorithm uses pre-classified examples to determine the set of parameters required for proper discrimination and encodes these parameters into a model called a classifier (Bedregal-Alpaca, Cornejo-Aparicio, Zárate-Valderrama, & Yanque-Churo, 2020).

Moreover, a decision tree is a data mining method used for classification. Decision tree classification is a simple classification technique that is widely used. Previous researchers have developed various decision tree algorithms over several periods by improving their performance and ability to handle various data types. Examples are the Chi-squared Automatic Interaction Detector (CHAID), Classification and Regression Tree (CART), Iterative Dichotomiser 3 (ID3), C4.5 algorithm, C5.0 algorithm, Hunt's algorithm, and Ordinal Class Classifier (OCC) (Effendy & Purbandini, 2018). In the research, the C4.5 algorithm is used.
The C4.5 algorithm is a classification technique that uses entropy and information gain to select the splitting attributes of a decision tree (Florence & Savithri, 2013). The C4.5 algorithm constructs a decision tree from training data in the form of cases or records (tuples) in a database (Riandari & Simangunsong, 2019). There is a study comparing the C4.5 algorithm and the CART algorithm in classifying student grades. It explains that the C4.5 algorithm has a higher accuracy value of 85.61%, while the CART algorithm has 84.95% (Rahmayuni, 2014). Moreover, the C4.5 algorithm generates a decision tree from input in the form of classified samples. The C4.5 algorithm is applied to maintain a good level of accuracy even for datasets containing large amounts of data (Fiandra, Defit, & Yuhandri, 2017).

Scholarships are one of the leading programs offered by many universities. In the current difficult economic conditions, scholarship programs are the target of many prospective students who want to pursue higher education. However, there are still some difficulties in determining eligible prospective students because of the many applicants and the variables assessed in the selection. Besides that, there are no tools that support the selection, so it takes a long time, and the possibility of inaccurate selection results is quite high. Based on the previous explanation, data mining can be used to extract student data based on the characteristics of the selection results for scholarship recipients. The classification algorithm used is a decision tree with the C4.5 algorithm approach. The classification result is a decision tree that can be used as a tool in making decisions in the scholarship admission process. In this way, it is expected to help to make decisions quickly and on target.

II. METHODS

In tree formation with the C4.5 algorithm, there are several stages. First, training data are prepared. Training data are usually taken from historical data that have occurred previously and are grouped into certain classes. Second, the root of the tree is determined. The root is taken from the selected attribute by calculating the gain value of each attribute; the attribute with the highest value becomes the first root (Dhika & Destiawati, 2015). Analysis with the C4.5 algorithm is carried out after the problem to be analyzed has been identified and the existing data have been processed. So, the C4.5 algorithm design is carried out after all existing data are processed, and all required data are complete. Data processing is carried out in accordance with the KDD stages (Rahmayuni, 2014).

First, there is selection. The object of the research is students who apply for scholarships in 2020. The research is carried out at STMIK Pelita Nusantara Medan. The data are collected through observation and interviews with the implementers. The data obtained are qualitative, containing information on each variable determined by the college in awarding scholarships, namely academic potential, vocational potential, parents' income, number of dependents, and employment status. The number of new students who apply for the scholarship that year is 150 people.

Second, there is preprocessing or cleaning.
After the data from the selection results are obtained (the data of prospective scholarship recipients in 2020, amounting to 150 records), the data proceed to the cleaning stage to remove inconsistent or noisy data and records with identical values. In other words, this stage discards the data of prospective scholarship recipients whose scores are the same as those of other potential recipients in every criterion. Distinct patterns are searched for at this stage; if the same pattern is found more than once, only one representative pattern is kept, and the rest are removed. The final result of the cleaning stage is 16 different patterns obtained from the 150 participants in the scholarship acceptance process.

Third, there is transformation. The preprocessed qualitative data are grouped and transformed into an appropriate assessment form to be processed in data mining. In the research, the data are converted into quantitative form. This conversion makes the data easier to handle during testing.

Fourth, in data mining, the data of the 16 students are processed with the C4.5 algorithm for data classification. It is carried out by building a decision tree to identify the conditions for objectively awarding scholarships by looking at the value of each attribute of the new scholarship applicants (academic potential, vocational potential, parents' income, number of dependents, and employment status). The attribute with the highest gain value among the existing attributes is chosen as the root. Equation (1) is used to calculate the gain. It shows S as the case set, A as an attribute, n as the number of partitions of attribute A, |Si| as the number of cases in the i-th partition, and |S| as the number of cases in S.

Gain(S, A) = Entropy(S) − Σ (i = 1..n) (|Si| / |S|) × Entropy(Si)    (1)

Meanwhile, Equation (2) calculates the entropy value used in the Entropy(Total) formula. It shows n as the number of classes in S and pi as the proportion of Si to S.

Entropy(S) = Σ (i = 1..n) −pi × log2(pi)    (2)

In making the decision tree, the number of cases, the number of cases with the decision "Accepted" (S1), the number of cases with the decision "Rejected" (S2), and the cases divided based on the attributes of academic potential, vocational potential, parents' income, number of dependents, and employment status must be counted. Then, the gain is calculated for each attribute. In making a decision tree, there are several stages. The attribute to be used as the root is determined by calculating the information gain of each attribute; the attribute with the highest gain among the existing attributes is selected as the root. An entropy value is needed to determine the highest gain.

Fifth, the purpose of interpretation or evaluation is to objectively obtain the results of the decision analysis of students who receive scholarships. It is based on the attributes of academic potential, vocational potential, parents' income, number of dependents, and employment status. The data are analyzed, and the method is implemented to get the desired results.
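To make Equations (1) and (2) concrete, a minimal Python sketch of the two formulas is given below. It is illustrative only and not part of the original study; the function names entropy and information_gain, and the representation of records as dictionaries, are assumptions introduced for the example.

    import math
    from collections import Counter

    def entropy(labels):
        # Entropy(S) = sum over classes of -p_i * log2(p_i), as in Equation (2).
        total = len(labels)
        return -sum((count / total) * math.log2(count / total)
                    for count in Counter(labels).values())

    def information_gain(records, attribute, labels):
        # Gain(S, A) = Entropy(S) - sum(|S_i|/|S| * Entropy(S_i)), as in Equation (1).
        total = len(records)
        remainder = 0.0
        for value in set(r[attribute] for r in records):
            subset = [lab for r, lab in zip(records, labels) if r[attribute] == value]
            remainder += (len(subset) / total) * entropy(subset)
        return entropy(labels) - remainder

The helper that yields the highest information_gain over the candidate attributes corresponds to the attribute chosen as the root at each node in the procedure described above.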
Table 1 Data in the Test

No | Academic Potential | Vocational Potential | Parents' Income | Number of Dependents | Employment Status | Decision
1  | Good   | Enough | Low  | Low  | High | Accepted
2  | Enough | Good   | High | High | Low  | Accepted
3  | Enough | Enough | High | Low  | Low  | Rejected
4  | Enough | Enough | Low  | High | Low  | Accepted
5  | Enough | Good   | High | Low  | High | Accepted
6  | Good   | Low    | High | High | High | Accepted
7  | Enough | Low    | Low  | Low  | Low  | Rejected
8  | Enough | Good   | High | Low  | Low  | Accepted
9  | Low    | Enough | Low  | High | Low  | Rejected
10 | Low    | Good   | High | Low  | High | Rejected
11 | Enough | Low    | High | Low  | Low  | Rejected
12 | Low    | Good   | Low  | Low  | High | Rejected
13 | Enough | Enough | Low  | Low  | High | Accepted
14 | Low    | Good   | Low  | High | Low  | Rejected
15 | Enough | Good   | Low  | High | High | Accepted
16 | Enough | Enough | High | High | Low  | Rejected

III. RESULTS AND DISCUSSIONS

Based on the test data in Table 1, the attribute to be used as the root is determined by calculating the information gain of each attribute; the attribute with the highest gain becomes the root. To determine the highest gain value, the entropy value is needed. To find the entropy of each case, the number of cases for each sub-criterion value is counted. The sub-criteria used in finding the entropy are coded as shown in Table 2.

Table 2 Description of Sub-Criteria

Description | Code
Good   | G
High   | H
Enough | E
Low    | L

The entropy value for each attribute is calculated with Equation (2). Entropy(Total) is calculated over all decisions: "Accepted" (S1) covers 8 cases, and "Rejected" (S2) covers 8 cases, so the total number of cases is 16.

Entropy(Total) = (−8/16 × log2(8/16)) + (−8/16 × log2(8/16)) = 1    (3)

As seen in Table 1, the value Good appears in 2 cases of the academic potential attribute; both of them are accepted, and none are rejected. With this total of 2 cases, Equations (4), (5), and (6) calculate the entropy for each value of academic potential. The same procedure is applied to the other attributes.

Entropy(Academic Potential = Good) = (−2/2 × log2(2/2)) + (−0/2 × log2(0/2)) = 0    (4)
Entropy(Academic Potential = Enough) = (−6/10 × log2(6/10)) + (−4/10 × log2(4/10)) = 0.97095    (5)
Entropy(Academic Potential = Low) = (−0/4 × log2(0/4)) + (−4/4 × log2(4/4)) = 0    (6)

Attributes of vocational potential:

Entropy(Vocational Potential = Good) = (−4/7 × log2(4/7)) + (−3/7 × log2(3/7)) = 0.98523    (7)
Entropy(Vocational Potential = Enough) = (−3/6 × log2(3/6)) + (−3/6 × log2(3/6)) = 1    (8)
Entropy(Vocational Potential = Low) = (−1/3 × log2(1/3)) + (−2/3 × log2(2/3)) = 0.9183    (9)

Attributes of parents' income:

Entropy(Parents' Income = Low) = (−4/8 × log2(4/8)) + (−4/8 × log2(4/8)) = 1    (10)
Entropy(Parents' Income = High) = (−4/8 × log2(4/8)) + (−4/8 × log2(4/8)) = 1    (11)

Attributes of number of dependents:

Entropy(Number of Dependents = High) = (−4/7 × log2(4/7)) + (−3/7 × log2(3/7)) = 0.98523    (12)
Entropy(Number of Dependents = Low) = (−4/9 × log2(4/9)) + (−5/9 × log2(5/9)) = 0.99108    (13)

Attributes of employment status:

Entropy(Employment Status = Low) = (−3/9 × log2(3/9)) + (−6/9 × log2(6/9)) = 0.9183    (14)
Entropy(Employment Status = High) = (−5/7 × log2(5/7)) + (−2/7 × log2(2/7)) = 0.86312    (15)

In searching for the gain value of each attribute, Equation (1) is applied as follows.

Gain(Total, Academic Potential) = 1 − (2/16 × 0 + 10/16 × 0.97095 + 4/16 × 0) = 0.39316    (16)
Gain(Total, Vocational Potential) = 1 − (7/16 × 0.98523 + 6/16 × 1 + 3/16 × 0.9183) = 0.02178    (17)
Gain(Total, Parents' Income) = 1 − (8/16 × 1 + 8/16 × 1) = 0    (18)
Gain(Total, Number of Dependents) = 1 − (7/16 × 0.98523 + 9/16 × 0.99108) = 0.01148    (19)
Gain(Total, Employment Status) = 1 − (9/16 × 0.9183 + 7/16 × 0.86312) = 0.10584    (20)

When all the entropy and gain values for each attribute have been calculated, the calculation results are recorded in Table 3. The calculations in Table 3 show that the attribute with the highest gain value is academic potential, with a gain value of 0.39316. So, this attribute is used as the root of the tree. The branch with the value Low can directly be labeled "Rejected" and the branch with the value Good "Accepted". However, the branch with the value Enough still has to be recalculated. Node 1 of the decision tree can be seen in Figure 1.

Table 3 Node 1: Calculation Results to Determine the Main Root

Node | Attribute / Value | Cases (S) | Accepted (S1) | Rejected (S2) | Entropy | Gain
1 | Total | 16 | 8 | 8 | 1 |
  | Academic Potential | | | | | 0.39316
  |   Good   | 2  | 2 | 0 | 0 |
  |   Enough | 10 | 6 | 4 | 0.97095 |
  |   Low    | 4  | 0 | 4 | 0 |
  | Vocational Potential | | | | | 0.02178
  |   Good   | 7 | 4 | 3 | 0.98523 |
  |   Enough | 6 | 3 | 3 | 1 |
  |   Low    | 3 | 1 | 2 | 0.9183 |
  | Parents' Income | | | | | 0
  |   Low  | 8 | 4 | 4 | 1 |
  |   High | 8 | 4 | 4 | 1 |
  | Number of Dependents | | | | | 0.01148
  |   High | 7 | 4 | 3 | 0.98523 |
  |   Low  | 9 | 4 | 5 | 0.99108 |
  | Employment Status | | | | | 0.10584
  |   Low  | 9 | 3 | 6 | 0.9183 |
  |   High | 7 | 5 | 2 | 0.86312 |

Figure 1 Decision Tree of Node 1
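As a cross-check on Table 3, the following Python sketch transcribes the 16 records of Table 1 and recomputes the Node 1 entropy and gains with the entropy and information_gain helpers sketched in the Methods section. It is an illustrative check, not the authors' implementation; the shorthand attribute keys in ATTRS are assumptions made here.

    # Columns: academic, vocational, income, dependents, employment, decision.
    ATTRS = ["academic", "vocational", "income", "dependents", "employment"]

    rows = [
        ("Good",   "Enough", "Low",  "Low",  "High", "Accepted"),
        ("Enough", "Good",   "High", "High", "Low",  "Accepted"),
        ("Enough", "Enough", "High", "Low",  "Low",  "Rejected"),
        ("Enough", "Enough", "Low",  "High", "Low",  "Accepted"),
        ("Enough", "Good",   "High", "Low",  "High", "Accepted"),
        ("Good",   "Low",    "High", "High", "High", "Accepted"),
        ("Enough", "Low",    "Low",  "Low",  "Low",  "Rejected"),
        ("Enough", "Good",   "High", "Low",  "Low",  "Accepted"),
        ("Low",    "Enough", "Low",  "High", "Low",  "Rejected"),
        ("Low",    "Good",   "High", "Low",  "High", "Rejected"),
        ("Enough", "Low",    "High", "Low",  "Low",  "Rejected"),
        ("Low",    "Good",   "Low",  "Low",  "High", "Rejected"),
        ("Enough", "Enough", "Low",  "Low",  "High", "Accepted"),
        ("Low",    "Good",   "Low",  "High", "Low",  "Rejected"),
        ("Enough", "Good",   "Low",  "High", "High", "Accepted"),
        ("Enough", "Enough", "High", "High", "Low",  "Rejected"),
    ]

    records = [dict(zip(ATTRS, r[:5])) for r in rows]
    labels = [r[5] for r in rows]

    print("Entropy(Total) =", round(entropy(labels), 5))   # 1.0
    for attr in ATTRS:
        print(attr, round(information_gain(records, attr, labels), 5))
    # Expected values (Table 3): academic 0.39316, vocational 0.02178,
    # income 0.0, dependents 0.01148, employment 0.10584.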
Furthermore, Node 1.1 is calculated to determine the next root. It is done in the same way, by calculating the entropy values of the remaining attributes, namely vocational potential, parents' income, number of dependents, and employment status, within the cases whose academic potential is Enough. After the entropy values are calculated, the gain of each attribute is measured. Entropy(Node 1.1), that is, Entropy(Academic Potential = Enough), and the entropies of the remaining attributes are calculated with the following equations.

Entropy(Node 1.1) = (−6/10 × log2(6/10)) + (−4/10 × log2(4/10)) = 0.97095    (21)

Attributes of vocational potential:

Entropy(Node 1.1, Vocational Potential = Good) = (−4/4 × log2(4/4)) + (−0/4 × log2(0/4)) = 0    (22)
Entropy(Node 1.1, Vocational Potential = Enough) = (−2/4 × log2(2/4)) + (−2/4 × log2(2/4)) = 1    (23)
Entropy(Node 1.1, Vocational Potential = Low) = (−0/2 × log2(0/2)) + (−2/2 × log2(2/2)) = 0    (24)

Attributes of parents' income:

Entropy(Node 1.1, Parents' Income = Low) = (−3/4 × log2(3/4)) + (−1/4 × log2(1/4)) = 0.81128    (25)
Entropy(Node 1.1, Parents' Income = High) = (−3/6 × log2(3/6)) + (−3/6 × log2(3/6)) = 1    (26)

Attributes of number of dependents:

Entropy(Node 1.1, Number of Dependents = High) = (−3/4 × log2(3/4)) + (−1/4 × log2(1/4)) = 0.81128    (27)
Entropy(Node 1.1, Number of Dependents = Low) = (−3/6 × log2(3/6)) + (−3/6 × log2(3/6)) = 1    (28)

Attributes of employment status:

Entropy(Node 1.1, Employment Status = Low) = (−3/7 × log2(3/7)) + (−4/7 × log2(4/7)) = 0.98523    (29)
Entropy(Node 1.1, Employment Status = High) = (−3/3 × log2(3/3)) + (−0/3 × log2(0/3)) = 0    (30)

The gain values of the remaining attributes are then calculated:

Gain(Node 1.1, Vocational Potential) = 0.97095 − (4/10 × 0 + 4/10 × 1 + 2/10 × 0) = 0.57095    (31)
Gain(Node 1.1, Parents' Income) = 0.97095 − (4/10 × 0.81128 + 6/10 × 1) = 0.04644    (32)
Gain(Node 1.1, Number of Dependents) = 0.97095 − (4/10 × 0.81128 + 6/10 × 1) = 0.04644    (33)
Gain(Node 1.1, Employment Status) = 0.97095 − (7/10 × 0.98523 + 3/10 × 0) = 0.28129    (34)

When all entropy and gain values have been calculated, the calculation results are put in Table 4. It can be seen that the highest gain attribute is vocational potential, with a value of 0.57095. Thus, vocational potential becomes the next root node, so the decision tree in Figure 2 is formed.

Table 4 Node 1.1: Calculation Results to Determine the Branch from the Main Root

Node | Attribute / Value | Cases (S) | Accepted (S1) | Rejected (S2) | Entropy | Gain
1.1 | Academic Potential: Enough | 10 | 6 | 4 | 0.97095 |
    | Vocational Potential | | | | | 0.57095
    |   Good   | 4 | 4 | 0 | 0 |
    |   Enough | 4 | 2 | 2 | 1 |
    |   Low    | 2 | 0 | 2 | 0 |
    | Parents' Income | | | | | 0.04644
    |   Low  | 4 | 3 | 1 | 0.81128 |
    |   High | 6 | 3 | 3 | 1 |
    | Number of Dependents | | | | | 0.04644
    |   High | 4 | 3 | 1 | 0.81128 |
    |   Low  | 6 | 3 | 3 | 1 |
    | Employment Status | | | | | 0.28129
    |   Low  | 7 | 3 | 4 | 0.98523 |
    |   High | 3 | 3 | 0 | 0 |

Figure 2 Decision Tree of Node 1.1

Next, the research calculates Node 1.2 as the next root. The entropy and gain values are calculated in the same way, using the remaining attributes of parents' income, number of dependents, and employment status within the cases whose academic potential is Enough and whose vocational potential is Enough. After the entropy values are calculated, the gain of each attribute is measured. Entropy(Node 1.2), that is, Entropy(Vocational Potential = Enough), and the entropies of the remaining attributes are calculated with the following equations.

Entropy(Node 1.2) = (−2/4 × log2(2/4)) + (−2/4 × log2(2/4)) = 1    (35)

Attributes of parents' income:

Entropy(Node 1.2, Parents' Income = Low) = (−2/2 × log2(2/2)) + (−0/2 × log2(0/2)) = 0    (36)
Entropy(Node 1.2, Parents' Income = High) = (−0/2 × log2(0/2)) + (−2/2 × log2(2/2)) = 0    (37)

Attributes of number of dependents:

Entropy(Node 1.2, Number of Dependents = High) = (−1/2 × log2(1/2)) + (−1/2 × log2(1/2)) = 1    (38)
Entropy(Node 1.2, Number of Dependents = Low) = (−1/2 × log2(1/2)) + (−1/2 × log2(1/2)) = 1    (39)

Attributes of employment status:

Entropy(Node 1.2, Employment Status = Low) = (−1/3 × log2(1/3)) + (−2/3 × log2(2/3)) = 0.9183    (40)
Entropy(Node 1.2, Employment Status = High) = (−1/1 × log2(1/1)) + (−0/1 × log2(0/1)) = 0    (41)

The gain values of the remaining attributes are then calculated:

Gain(Node 1.2, Parents' Income) = 1 − (2/4 × 0 + 2/4 × 0) = 1    (42)
Gain(Node 1.2, Number of Dependents) = 1 − (2/4 × 1 + 2/4 × 1) = 0    (43)
Gain(Node 1.2, Employment Status) = 1 − (3/4 × 0.9183 + 1/4 × 0) = 0.31128    (44)

After the entropy and gain values are calculated, the results of these calculations are put in Table 5. It can be seen that the highest gain attribute is parents' income, with a value of 1. So, it becomes the next root node. The branch with low parents' income leads to the decision "Accepted", and the branch with high parents' income leads to "Rejected". The decision tree formed can be seen in Figure 3.

Table 5 Node 1.2: Calculation Results to Determine the Next Branch

Node | Attribute / Value | Cases (S) | Accepted (S1) | Rejected (S2) | Entropy | Gain
1.2 | Vocational Potential: Enough | 4 | 2 | 2 | 1 |
    | Parents' Income | | | | | 1
    |   Low  | 2 | 2 | 0 | 0 |
    |   High | 2 | 0 | 2 | 0 |
    | Number of Dependents | | | | | 0
    |   High | 2 | 1 | 1 | 1 |
    |   Low  | 2 | 1 | 1 | 1 |
    | Employment Status | | | | | 0.31128
    |   Low  | 3 | 1 | 2 | 0.9183 |
    |   High | 1 | 1 | 0 | 0 |

Figure 3 Decision Tree of Node 1.2
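The per-node procedure above can also be expressed as a short recursive sketch. The code below is a simplified illustration, not the authors' implementation; it reuses entropy, information_gain, records, labels, and ATTRS from the earlier sketches and, like the worked calculation in this section, splits on plain information gain, whereas full C4.5 uses the gain ratio and adds pruning.

    from collections import Counter

    def build_tree(records, labels, attributes):
        # Stop when the node is pure or no attributes remain; return the majority label.
        if len(set(labels)) == 1 or not attributes:
            return Counter(labels).most_common(1)[0][0]
        # Pick the attribute with the highest information gain as this node's root.
        best = max(attributes, key=lambda a: information_gain(records, a, labels))
        node = {best: {}}
        remaining = [a for a in attributes if a != best]
        for value in set(r[best] for r in records):
            idx = [i for i, r in enumerate(records) if r[best] == value]
            node[best][value] = build_tree([records[i] for i in idx],
                                           [labels[i] for i in idx],
                                           remaining)
        return node

    tree = build_tree(records, labels, ATTRS)
    # On the Table 1 data, this places academic potential at the root, vocational
    # potential under its "Enough" branch, and parents' income one level deeper,
    # matching the tree in Figure 3.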
The rules obtained based on the decision tree formed are as follows:

IF Academic Potential = Good THEN Decision = Accepted;
IF Academic Potential = Enough AND Vocational Potential = Good THEN Decision = Accepted;
IF Academic Potential = Enough AND Vocational Potential = Enough AND Parents' Income = Low THEN Decision = Accepted;
IF Academic Potential = Enough AND Vocational Potential = Enough AND Parents' Income = High THEN Decision = Rejected;
IF Academic Potential = Enough AND Vocational Potential = Low THEN Decision = Rejected; and
IF Academic Potential = Low THEN Decision = Rejected.

After the rules are obtained from the C4.5 algorithm classification process, further testing is carried out using one of the data mining applications, Rapid Miner. A decision tree and rules are obtained from the test results with the Rapid Miner application, which can be seen in Figure 4. The branches are formed in a graph from the same dataset as Table 1. The tree produced by the application test has the same shape as the one obtained from the manual calculation with the C4.5 algorithm in Figure 3. Then, in Figure 5, the rules are presented in a textual description.

Figure 4 Testing with Rapid Miner

Figure 5 Rules Formed from Rapid Miner Test Results

IV. CONCLUSIONS

From the discussion results, it is concluded that the decision tree with the C4.5 algorithm can be used to classify the attributes used in analyzing prospective scholarship recipients. It can be a tool in making decisions about scholarship recipients and shorten the decision-making time. It can analyze which prospective students are entitled to a scholarship using the most influential attributes, namely academic potential, vocational potential, and parents' income. There are three influential variables out of the five variables used in selecting prospective scholarship recipients. The use of these three variables is based on the branches formed in Figure 3, which are translated into rules. It can shorten the time needed for the selection because the assessment rules are known in advance. Hence, the assessment can start from the criterion with the main priority or the highest root and only be carried out on the influencing criteria. Then, the results obtained can be more efficient and on target.

It is hoped that future studies add variables related to socioeconomic status, such as the parents' occupation, electricity bills, and homeownership status, to expand the research results. Hence, the scholarship can be awarded to the right person regarding both academic and socioeconomic status. At the same time, increasing the number of variables will allow the algorithm to work with larger datasets. In addition, future research can use other approaches in classifying scholarship patterns to compare the performance of each algorithm used, so universities can choose the decision-making tools that best suit their needs.

REFERENCES
Afrianto, E., Suseno, J. E., & Warsito, B. (2020). Decision tree method with C4.5 algorithm for students classification who is entitled to receive Indonesian Smart Card (KIP). In IOP Conference Series: Materials Science and Engineering. IOP Publishing. https://doi.org/10.1088/1757-899X/879/1/012072

Ariawan, P. A. (2019). Optimasi pengelompokan data pada metode K-means dengan analisis outlier. Jurnal Nasional Teknologi & Sistem Informasi, 5(2), 88–95. https://doi.org/10.25077/teknosi.v5i2.2019.88-95

Azmi, Z., & Dahria, M. (2013). Decision tree berbasis algoritma untuk pengambilan keputusan. Jurnal SAINTIKOM, 12(3), 157–164.

Bedregal-Alpaca, N., Cornejo-Aparicio, V., Zárate-Valderrama, J., & Yanque-Churo, P. (2020). Classification models for determining types of academic risk and predicting dropout in university students. International Journal of Advanced Computer Science and Applications (IJACSA), 11(1), 266–272. https://doi.org/10.14569/ijacsa.2020.0110133

Condrobimo, A. R., Sano, A. V. D., & Nindito, H. (2016). The application of K-means algorithm for LQ45 index on Indonesia Stock Exchange. ComTech: Computer, Mathematics and Engineering Applications, 7(2), 151–159. https://doi.org/10.21512/comtech.v7i2.2256

Dardzinska, A., & Zdrodowska, M. (2020). Classification algorithms in the material science and engineering data mining techniques. In IOP Conference Series: Materials Science and Engineering. IOP Publishing. https://doi.org/10.1088/1757-899X/770/1/012096

Dhika, H., & Destiawati, F. (2015). Application of data mining algorithm to recipient of motorcycle installment. ComTech: Computer, Mathematics and Engineering Applications, 6(4), 569–579. https://doi.org/10.21512/comtech.v6i4.2192

Effendy, F., & Purbandini. (2018). Klasifikasi rumah tangga miskin menggunakan ordinal class classifier. Jurnal Nasional Teknologi & Sistem Informasi, 4(1), 30–36. https://doi.org/10.25077/teknosi.v4i1.2018.30-36

Fiandra, Y. A., Defit, S., & Yuhandri. (2017). Penerapan algoritma C4.5 untuk klasifikasi data rekam medis berdasarkan International Classification Diseases (ICD-10). Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi), 1(2), 82–89. https://doi.org/10.29207/resti.v1i2.48

Florence, A. M., & Savithri, R. (2013). Talent knowledge acquisition using C4.5 classification algorithm. International Journal of Emerging Technologies in Computational and Applied Sciences (IJETCAS), 4(4), 406–410.

Guntur, M., Santony, J., & Yuhandri. (2018). Prediksi harga emas dengan menggunakan metode Naïve Bayes dalam investasi untuk meminimalisasi resiko. Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi), 2(1), 354–360. https://doi.org/10.29207/resti.v2i1.276

Haryati, S., Sudarsono, A., & Suryana, E. (2015). Implementasi data mining untuk memprediksi masa studi mahasiswa menggunakan algoritma C4.5 (Studi kasus: Universitas Dehasen Bengkulu). Jurnal Media Infotama, 11(2), 130–138.

Hidayad, A., Defit, S., & Sumijan, S. (2020). Penerapan algoritma K-means clustering untuk melihat hubungan kegiatan Tahfiz dengan hasil belajar (Studi kasus Madrasah Aliyah Negeri 1 Bukiktinggi). Jurnal Sistim Informasi dan Teknologi, 2(2), 41–47. https://doi.org/10.37034/jsisfotek.v2i2.34

Putra, R. A., & Defit, S. (2019). Data mining menggunakan rough set dalam menganalisa modal upah produksi pada industri seragam sekolah. Jurnal Sistim Informasi dan Teknologi, 1(4), 72–78. https://doi.org/10.35134/jsisfotek.v1i4.18
Rahmayuni, I. (2014). Perbandingan performansi algoritma C4.5 dan Cart dalam klasifiksi data nilai mahasiswa Prodi Teknik Komputer Politeknik Negeri Padang. Teknoif, 2(1), 40–46.

Riandari, F., & Simangunsong, A. (2019). Penerapan algoritma C4.5 untuk mengukur tingkat kepuasan mahasiswa. CV. Rudang Mayang.

Santoso, H., Hariyadi, I. P., & Prayitno. (2016). Data mining analisa pola pembelian produk dengan menggunakan metode algoritma Apriori. Semnasteknomedia Online, 4(1), 19–24.

Sulastri, H., & Gufroni, A. I. (2017). Penerapan data mining dalam pengelompokan penderita thalassaemia. Jurnal Nasional Teknologi & Sistem Informasi, 3(2), 299–305. https://doi.org/10.25077/teknosi.v3i2.2017.299-305

Virgo, I., Defit, S., & Yunus, Y. (2020). Klasterisasi tingkat kehadiran dosen menggunakan algoritma K-means clustering (Studi kasus Institut Agama Islam Batusangkar). Jurnal Sistim Informasi dan Teknologi, 2(1), 23–28. https://doi.org/10.37034/jsisfotek.v2i1.22