APPLICATION OF DIGITAL CELLULAR RADIO FOR MOBILE LOCATION ESTIMATION


IIUM Engineering Journal, Vol. 23, No. 1, 2022 Ibrahim et al. 
https://doi.org/10.31436/iiumej.v23i1.1789 

BOTNET DETECTION USING INDEPENDENT 

COMPONENT ANALYSIS   

WAN NUR HIDAYAH IBRAHIM1,2, MOHD SYAHID ANUAR5,

ALI SELAMAT1,3,4* AND ONDREJ KREJCAR4 

1
School of Computing, Faculty of Engineering, Universiti Teknologi Malaysia, Media and Game 

Innovation Centre of Excellence (MaGICX), Universiti Teknologi Malaysia, 

81310 Johor Baharu, Johor, Malaysia 
2
Jabatan Teknologi Maklumat dan Komunikasi (JTMK), Politeknik Sultan Idris Shah, 

45200 Sg Lang, Selangor, Malaysia 
3
Malaysia Japan International Institute of Technology (MJIIT), Universiti Teknologi Malaysia, 

Jalan Sultan Yahya Petra, 54100 Kuala Lumpur, Malaysia 
4
Center for Basic and Applied Research, Faculty of Informatics and Management, University of 

Hradec Kralove, Rokitanskeho 62, 500 03 Hradec Kralove, Czech Republic 
5
Razak Faculty of Technology and Informatics, Universiti Teknologi Malaysia, Jalan Sultan Yahya 

Petra, 54100 Kuala Lumpur, Malaysia 

*Corresponding author: aselamat@utm.my 

 (Received: 23rd January 2021; Accepted: 9th August 2021; Published on-line: 4thJanuary 2022) 

ABSTRACT:  Botnet is a significant cyber threat that continues to evolve. Botmasters 
continue to improve the security framework strategy for botnets to go undetected. Newer 

botnet source code runs attack detection every second, and each attack demonstrates the 

difficulty and robustness of monitoring the botnet. In the conventional network botnet 

detection model that uses signature-analysis, the patterns of a botnet concealment strategy 

such as encryption & polymorphic and the shift in structure from centralized to decentralized 

peer-to-peer structure, generate challenges. Behavior analysis seems to be a promising 

approach for solving these problems because it does not rely on analyzing the network traffic 

payload. Other than that, to predict novel types of botnet, a detection model should be 

developed. This study focuses on using flow-based behavior analysis to detect novel botnets, 

necessary due to the difficulties of detecting existing patterns in a botnet that continues to 

modify the signature in concealment strategy. This study also recommends introducing 

Independent Component Analysis (ICA) and data pre-processing standardization to increase 

data quality before classification. With and without ICA implementation, we compared the 

percentage of significant features. Through the experiment, we found that the results produced 

from ICA show significant improvements.  The highest F-score was 83% for Neris bot. The 

average F-score for a novel botnet sample was 74%. Through the feature importance test, the 

feature importance increased from 22% to 27%, and the training model false positive rate also 

decreased from 1.8% to 1.7%.  

ABSTRAK: Botnet merupakan ancaman siber yang sentiasa berevolusi. Pemilik bot sentiasa 

memperbaharui strategi keselamatan bagi botnet agar tidak dapat dikesan. Setiap saat, kod-

kod sumber baru botnet telah dikesan dan setiap serangan dilihat menunjukkan tahap 

kesukaran dan ketahanan dalam mengesan bot. Model pengesanan rangkaian botnet 

konvensional telah menggunakan analisis berdasarkan tanda pengenalan bagi mengatasi 

halangan besar dalam mengesan corak botnet tersembunyi seperti teknik penyulitan dan 

teknik polimorfik. Masalah ini lebih bertumpu pada perubahan struktur berpusat kepada 

struktur bukan berpusat seperti rangkaian rakan ke rakan (P2P). Analisis tingkah laku ini 

95


IIUM Engineering Journal, Vol. 23, No. 1, 2022 Ibrahim et al. 
https://doi.org/10.31436/iiumej.v23i1.1789 

 
seperti sesuai bagi menyelesaikan masalah-masalah tersebut kerana ianya tidak bergantung 

kepada analisis rangkaian beban muatan trafik. Selain itu, bagi menjangka botnet baru, model 

pengesanan harus dibangunkan. Kajian ini bertumpu kepada penggunaan analisa tingkah-laku 

berdasarkan aliran bagi mengesan botnet baru yang sukar dikesan pada corak pengenalan 

botnet sedia-ada yang sentiasa berubah dan menggunakan strategi tersembunyi. Kajian ini 

juga mencadangkan penggunakan Analisis Komponen Bebas (ICA) dan pra-pemprosesan 

data yang standard bagi meningkatkan kualiti data sebelum pengelasan. Peratusan ciri-ciri 

penting telah dibandingkan dengan dan tanpa menggunakan ICA. Dapatan kajian melalui 

eksperimen menunjukkan dengan penggunaan ICA, keputusan adalah jauh lebih baik. Skor F 

tertinggi ialah 83% bagi bot Neris. Purata skor F bagi sampel botnet baru adalah 74%. Melalui 

ujian kepentingan ciri, kepentingan ciri meningkat dari 22% kepada 27%, dan kadar positif 

model latihan palsu juga berkurangan dari 1.8% kepada 1.7%. 

KEYWORDS: botnet detection; flow-based; machine learning; independent component analysis; 

traffic analysis 

1. INTRODUCTION  

A botnet is a collection of computers infected by malicious software (malware) that a 

botmaster manages. All Internet-of-Things (IoTs) devices, such as closed-circuit television 

cameras (CCTV), web cameras, computers, and mobile devices, can be infected devices. The 

vulnerabilities and the computing resources of these infected devices are exploited where they 

operate remotely as servants following the instructions given by their botmaster. The main aim 

of assigning a botnet is to launch an assault on the victim. However, the number of bots depends 

on the frequency of the attacks. Therefore, the most significant factor contributing to the 

frequency of the attacks is the number of bots in botnet environments [1], [2]. 

They can execute major attacks on victims, such as DDOS or email spam, because of the 

large number of bots, rendering victims unable to function for hours or days. For example, in 

the Mirai incident of 2016, the vast and unlimited number of bots produced a massive impact 

assault [3]. In the Mirai incident of 2016, the attacks were identified from 600,000 Internet-of-

things (IoT) devices [4]. At that moment, Mirai attacks were noteworthy since the bots used 

Internet-of-things (IoT) devices, not just computers or laptops. Consider the result if 600,000 

devices concurrently sent a ping to a specific website, leading to that website being 

overwhelmed, inaccessible, and its services slowed down. 

The botnet detection model has become a hot topic among researchers due to the history 

of botnet attacks and their impact on the industry. The arms race never ends between the 

botmasters and the researchers trying to beat each one. Every group continues to develop its 

abilities, and this can be seen through the botnet revolution. Botnet evolves or mutates every 

day after the source code has been released to the public [5]. It can be seen in the Mirai botnet 

and the Mirai version. Two months after the release of the Mirai source code to the public, the 

bots multiplied with variant complexity, from 213,000 to 493,000, twice. They show the 

statistics of various botnet attacks on Securelist websites, and 39.35 percent of new botnet 

found in 2018 is based on the Kaspersky Lab Botnet Monitoring project compared to 2017 

botnet attacks [6]. Subsection 1.1 briefly clarifies why the botnet relies heavily on the Rallying 

stage or C&C stage Command & Control server. The importance of preventing the server from 

being identified by the security system also explains the botnet structure's revolution. The 

botnet framework revolution switches from centralized to a decentralized. Centralized botnet, 

such as IRC and HTTP, via the primary server, call the Command & Control server. 

Decentralized botnet, such as Peer-to-peer (P2P), are more advanced since the bots themselves 

can act as servers. P2P is designed to hide the C&C server, as stated in [7], [8]. 

96


IIUM Engineering Journal, Vol. 23, No. 1, 2022 Ibrahim et al. 
https://doi.org/10.31436/iiumej.v23i1.1789 

 
The botnet's strength lies in its capacity to elude security systems and carry out large-scale 

attacks thanks to various tactics such as packet data concealment and encrypted packet data 

[9]. A botnet can hide from the protection system and imitate the regular traffic flow where 

normal traffic is usually more random [10], it then waits for stage and imbalanced class 

distribution. The P2P technique is also a part of the concealment strategy to mask the C&C 

server [7], [8]. 

The botnet is now becoming a profitable business, according to [8], where the botmaster 

provides the service for any cyber-attacks. However, the current capability leading the business 

of these services must monitor the bots, advise of subsequent attacks, lengthen the duration of 

the attacks, and avoid monitoring its identity. 

1.1  Botnet Life-Cycle and Structure 

It is necessary to understand the life cycle of the botnet when designing a behavior-based 

analysis of the botnet detection model since the choice of related features of this model 

depends on it. Other than that, identification must also be carried out before the attacks occur 

and it is too late [11]. As below, the botnet life-cycle can be divided into four main stages: - 

• Phase I: Injection (I) 

• Phase II: Command & Control (C&C) 

• Phase III: Attack  

• Phase IV: Release  

The first stage is injection or replication. This stage can be achieved in several ways on the 

network, such as exchanging folders, visiting malicious websites, or adding emails. The bot 

herder increases the number of bots at this level. The Command & Control stage or 'Rallying' 

stage is the second stage. The infected devices already behave like bots in this process. The 

bots keep updating the devices' status, and if necessary, the bot herder submits a new source 

code [12]. The revised source code and the vulnerability report are designed to ensure the bots 

are undetected and robust [13]. The third stage is the attack phase, where all bots are targeted 

at attacking specific victims. The botmaster gives an attack launch order, and the bots 

simultaneously launch the attack based on the command. The Release Period is the last stage. 

The release stage is where the botmaster removes fingerprints, substitutes new systems for 

identified bots, and does not leave a digital footprint behind. Often the botmaster distributes 

the source code to hinder government investigations. During this process, learning from the 

previous attack, the functionality of the bot system is also enhanced [14]. 

We concentrate on botnet activity in the process of Command & Control or Rallying for 

our study. The infected system continues to attempt to connect to the C&C server to send 

reports of the infected devices. The system also receives updated source code to keep hiding 

from protection [8]. Based on a Kaspersky Lab study [15], monitoring DDOS attacks is the 

correct time to detect botnet to intercept the command from the Command & Control. 

1.2 Motivation and Contribution 

Our inspiration is the potential and consequences of a botnet (a botnet attack), to 

discover its method before the attack is launched. However, our critical general incentive to 

build a model that can predict a novel botnet is due to the continuous evolution of the botnet. 

The technological motivation for using behavior-based patterns and flow-based functionality 

is due to the shortcomings in detecting the new forms of botnets in the signature-based 

detection model. Other than that, we were motivated by the research from [16] that combined 

Principal Component Analysis (PCA) for clustering with k-means.  

97


IIUM Engineering Journal, Vol. 23, No. 1, 2022 Ibrahim et al. 
https://doi.org/10.31436/iiumej.v23i1.1789 

 
Fig. 1: The botnet structure and botnet component. 

In statistics, principal component analysis is a technique used to describe a data set in 

terms of new uncorrelated variables ("components"). The components are ordered by the 

amount of original variance they describe, so the technology helps reduce the dimensionality 

of a data set. In comparison, Independent Component Analysis (ICA) is a machine learning 

technique used to distinguish independent sources from a mixed input. Unlike principal 

component analysis, which focuses on maximizing data point variance, independent 

component analysis emphasizes independence or independent components. Since we are 

using aggregation for pre-processing data, the vital information might be lost, resulting in 

decreased performance, so we agreed to use ICA. The explanation about ICA is in subsection 

0. The significant contributions to this analysis are: - 

• This method can detect network packets even in concealment strategies such as 
obfuscation, code encryption, oligomorphic strategy, polymorphic strategy, and 

metamorphic. 

• This method used the CTU-13 botnet benchmark dataset that consists of centralized and 
decentralized structures. It proves that our framework can detect both structures. 

• The evaluation of this framework used the different types of botnets, proving that our 
framework can detect novel botnets.  

• Our result shows the average 74% f-score that tests on five types of novel botnets. 

• The performance of the framework is compared with other researchers that used the 
same source of data. 

2.   RELATED WORKS 

The latest developments in the concealment technique of packet data in network traffic 

make the signature-based or content-based inefficient in detecting new forms of botnets. For 

example, Singh et al. [17] suggested that the signature time-to-time was revamped by the botnet 

and significantly modified. These changes in behavior caused signature-based analysis output 

to drop on the new release botnet because signature-based analysis relied heavily on the bot's 

98


IIUM Engineering Journal, Vol. 23, No. 1, 2022 Ibrahim et al. 
https://doi.org/10.31436/iiumej.v23i1.1789 

 
signature. In addition, many concealing tactics are used to mask packet data content in network 

traffic, including obfuscation, code encryption, oligomorphic approach, polymorphic strategy, 

and metamorphic strategy [18]. 

Patsakis et al. [8] raised many concerns about DNS queries that have been used to conceal 

the botnet on the encrypted channel. Although AsSadhan et al. [17], claimed that packet data 

contents should be shielded to safeguard the identity of the private information of the individual 

or user, where only the header of the packets can be released to the public. This author also 

concentrated on analyzing traffic, exchanging packets, and providing a framework for 

lightweight security. But their work is considered DNS and is only for the DGA botnet. A 

model was also developed by this author using the actions of the botnet when interacting with 

others, but the time interval used for this study is 31-49 minutes. 

Two common approaches, 1) payload-based and 2) traffic-based can be classified into 

machine learning models to detect network operations. The payload-based approach trains 

models based on characteristics derived from the payload/data portion of the packets 

transmitted over the network, as the name implies. The disadvantages of such models are the 

resource-intensive challenge (where features for each packet need to be evaluated), privacy 

problems, and encrypted information where features cannot be extracted [18,19]. By analyzing 

the communication packet headers or Netflow information, the traffic-based approach aims to 

mitigate some of the model's drawbacks. While privacy remains an issue with such an approach 

(such as individual IP addresses in features), this can be mitigated by aggregating time window 

records. 

2.1 Behavior-based and Flow-based Features 

What is behavior-based? What are the differences between behavior-based and signature-

based? Behavior-based and signature-based are in contrast to each other.  In computing, all 

objects have attributes that can be used to develop a custom signature. Signature-based analysis 

refers to detecting attacks by searching for specific patterns, like byte sequences in network 

traffic or known malicious instruction sequences used by malware [22]. This terminology is 

derived from anti-virus software, which refers to these detected patterns as signatures. 

Although behavior-based analysis is an analysis that does not directly analyze the data like 

signature-based, there are some advantages of behavior-based analysis compared to signature-

based analysis. For example, it is more secure or effective in detecting new and novel forms of 

malware threat. In addition, it can detect a single instance of malware that targets a person or 

organization. It can also identify what the malware does when files are opened in a specific 

environment and obtains comprehensive malware information. However, according to Resende 

and Drummond [21], most research defines behavior-based analysis with anomaly-based 

detection, but anomaly detection can also be done using signature-based analysis. So, it means 

that anomaly-based cannot be defined as behavior-based analysis in malware detection. 

The definition of behavior-based Resende and Drummond [21] is the most accurate to our 

definition. Resende and Drummond [21] define behavior-based analysis in Network Intrusion 

Detection Systems as detection techniques that are not evaluated or referred directly to the 

source, destination, and payload of packets. It is an analysis that assesses the behavior of an 

object. Behavior-based detection can be performed by using API call logs [24], network flow 

(NetFlow) [10], and is also a hybrid between API call and Netflow [25]. A flow is a collection 

of packets that come from the same source and destination. Flow-based botnet detection 

techniques employ statistics of all packet headers in a flow (flow record). Because the flow-

99


IIUM Engineering Journal, Vol. 23, No. 1, 2022 Ibrahim et al. 
https://doi.org/10.31436/iiumej.v23i1.1789 

 
based approach only catches packet header information, it can reduce the computational 

complexity [24-26] and be processed very quickly.  

Flow-based features are the characteristics chosen to illustrate the network flow pattern or 

connection to distinguish either the usual network or a botnet network. Flow-based 

characteristics also relate to packet data information, such as total packets per second, bytes 

per packet, total packet bytes, and the number of packets [29]. The description of flow-based 

functionality is shown in Table 1 and consists of the features, the time window, and the data 

tools derived from published work. Thus, when designing our characteristics and the time 

window, Table 1 became our crucial guide. 

The feature selection process was interpreted in the same concept as used in aggregation. 

As mentioned in Gezer et al. [40], one of the challenges in machine learning is feature selection 

because feature selection needs a good understanding of the domain knowledge. Not all 

features in the data are relevant to building the model (Resende and Drummond [21]). On the other 

hand, aggregation is a process of transforming the data based on a specific theory, and it enables 

the extraction of sufficient data for analysis. Based on [30], aggregation is a part of the data 

mining technique in machine learning for efficient knowledge discovery about network flows. 

2.2  Independent Component Analysis 

Independent Component Analysis (ICA) is a source separation technique in signal 

processing. According to [31], in their survey, ICA and PCA are among the popular methods 

used to select essential network features. PCA extracts and reduces the dimension of features, 

while ICA separates the noise to enhance and maximize each feature's data pattern [32]. The 

authors in [33] claim that principal component analysis (PCA) is a technique for reducing 

features by identifying the relevant feature set. The implementations of ICA and PCA 

clustering algorithm in feature selection has been reported [16]. It is a semi-supervised model 

where the author combined unsupervised and supervised techniques.  

In ICA, the mutual connection between features is minimized by maximizing the non-

Gaussianity. Research from Palmieri et al. [34] is the most similar to our approach. The author 

used ICA in Network Anomaly Detection from the University of Naples, Italy’s network 

traffic. On the other hand, we try to find the implementation of ICA in detecting botnet, and 

we only found an article from Mao et al. [35] where this author used ICA in detecting 

spamming botnet. 

We can summaries that behavior-based analysis that used the flow-based features can 

solve the issues of concealment botnet, but it produces high false alarm (false positive rate). 

High false alarm in machine learning occurs due to the unclear separation between classes that 

also come from the unclear pattern produce by the data. Although some attempts have been 

made to address this issue, it still puts limitations on ICA implementation.   

3.  PROPOSED FRAMEWORK AND PRE-EVALUATION RESULT 

In order to reach the objective, this study proposed a new framework as shown in Fig. .  

The proposed framework starts with selecting the network traffic dataset and pre-processing 

the dataset. This study highlighted the pre-processing phases where the data is provided to 

produce a high-performance during classification. 

100


IIUM Engineering Journal, Vol. 23, No. 1, 2022 Ibrahim et al. 
https://doi.org/10.31436/iiumej.v23i1.1789 

 
Table 1: Summary of behavior-based analysis in previous works 

Author Features Time Window 

(seconds) 

Data resources 

Ehsan and Hamid [36] 
Group duration time, Number of Receive Packets, Number of Send packets, 

Distance from the previous group 

Group duration 

time 

Kelihos, CVUT Malware 

Capture Facility Project 

Stevanovic and Pedersen 

[37]  

* Basic conversation features: Port number, layer 7 protocol, duration (last pkt 

- first pkts), the total number of packets, total number of bytes, mean of the 

number of bytes per packet & Std of the number of bytes per packet 

* Time-based-based features: number of packets & bytes per second, mean & 

std of packets inter-arrival time 

* Bidirectional features: a ratio of number of packets & bytes in and out, a ratio 

of inter-arrival times in and out 

* TCP specific features: such as percentage TCP SYN packets, percentages of 

TCP SYN-ACK packets, percentages of TCP ACK & percentages of 

TCP ACK PUSH packets 

300 s 

Combination of the dataset: 

- 

• ISCX, ISOT 

• Contagio 

• Honeyjar 

• Malware Capture 

Facility Project (MCFP) 

Fernandez Maimo et al. 

[38]  

* Number of flows, number of incoming flows, number of outgoing flows; 

* % of incoming and outgoing flows over the total 

* % of symmetric and asymmetric incoming over the total  

* Sum, maximum, minimum, mean, and variance of IP, packets per incoming 

outgoing, and totals flows 

20 - 30 s 

 
CTU-13 

Debashi and Vickers [39]  

Array 1:(Src Addr), Dst Port, Packet Count 

Algorithm 2: Dst Addr, Dst Port, Packet count.  

Array 2: Dst Addr, Src addr list, Src Addr count, 

Dst Port List, Dst Port Count 

60 s ISOT 

Gezer et al. [40]  

* Total forward & backward volume 

* Max forward & backward packet length  

* Min, Max, Mean before idle and before active. 

* Max backward inter-arrival time 

600 s capture own network data 

 Garg et al. [41]  
Send Syn, Recv ACK, Recv Rst, Send pkts, Recv pkts, ICMP unr, Send len, 

Recv Len 
120 s Combination of P2P botnet 

 
101


IIUM Engineering Journal, Vol. 23, No. 1, 2022 Ibrahim et al. 
https://doi.org/10.31436/iiumej.v23i1.1789 

 
3.1  Input: Data Source and Data Distribution 

For this study, the following vital data are extracted from the botnet benchmark dataset 

CTU-13 [42] from the website of the Stratosphere Research Laboratory.  This dataset 

consists of 13 files with several types of botnets in different protocols and different 

structures. Since this framework aims to detect novel botnet, this dataset is divided into two 

sets, training and testing, for building the model and evaluating the dataset, as shown in 

Table 2. The model is evaluated with data from the evaluating dataset separated from data 

for building the model. The separation of this data ensures that the model is derived and 

tested using a different set of data, as explained in Step 2: Dividing Dataset in Section 3.2. 

Table 2: The distribution of bot in training and evaluating dataset 

Data File No Duration (hrs) Bot Name No of Bots Training 

Dataset 

Evaluating 

Dataset 

1 6.15 Neris 1  √ 

2 4.12 1  √ 

3 66.85 Rbot 

 
Virut 

1 √  

4 4.12 1 √  

5 11.63 1 √  

6 2.18 Menti 1  √ 

7 0.38 Sogou 1 √  

8 19.5 Murlo 1  √ 

9 5.18 Neris 10  √ 

10 4.75 Rbot 10 √  

11 0.26 3 √  

12 1.21 Nsis.ay 3 √  

13 16.36 Virut 1 √  

Table 2 summarizes the distribution of data based on the Data File No. In the third 

column are the names of bots in the dataset. The explanation of the bot name, bot category, 

and structure are given in Table 3. It is essential to have both structures (centralized and 

decentralized) in this research. As a result, our data source selection appears reasonable in 

terms of independent structure and bot reliability. Columns 5 and 6 in Table 2 show the 

separation of training and evaluating data for the novel bot. 

Table 3: The description of the bot based on the name 

 Bot Name Bot Category Structure 

1 Neris IRC Centralized 
2 Rbot IRC 

3 Virut HTTP 

4 Menti IRC 

5 Soguo HTTP 

6 Murlo IRC 

7 Nsis.ay P2P Decentralized 

3.2  Data Pre-processing 

Pre-processing data is the phase in which the data is prepared before being incorporated 

into the algorithm to construct the prediction model. Since we used behavior-based analysis, 

the information needed to go through several steps. A behavior-based analysis is not a 

straightforward extraction process but rather a tool for analyzing the raw data. There are 

several vital components or measures that we have grouped into the pre-processing data 

102


IIUM Engineering Journal, Vol. 23, No. 1, 2022 Ibrahim et al. 
https://doi.org/10.31436/iiumej.v23i1.1789 

module, such as Labeling, Cleaning, Dividing Dataset, Feature Selection, Aggregation, and 

Data Quality Process Implementation. 

Fig. 2: The proposed framework for pre-processing flow-based features. 

3.2.1 Step 1: Labeling and Cleaning 

The first step was re-labeling the dataset. Although the CTU-13 dataset is supervised, 

the dataset contains labels but those labels are in string/text, not in numbers. There are 74 

types of descriptive labels in CTU13, as we have shown in Appendix A, but basically, the 

label is based on 3 types of labels: ‘Normal’, ‘Botnet’, and ‘Background’. Due to that, we 

re-labeled the CTU-13 as stated in (1) below. Once the labeling was completed, we 

removed the uncertain data in Label = 2 for the cleaning process. 

𝐿𝑎𝑏𝑒𝑙 =  {

0            𝑖𝑓 𝑡ℎ𝑒 𝑙𝑎𝑏𝑒𝑙 𝑠𝑡𝑎𝑡𝑒 𝑖𝑠 "𝑁𝑜𝑟𝑚𝑎𝑙"
1                𝑖𝑓 𝑡ℎ𝑒 𝑙𝑎𝑏𝑒𝑙 𝑠𝑡𝑎𝑡𝑒 𝑖𝑠 "botnet"

 2     𝑖𝑓 𝑡ℎ𝑒 𝑙𝑎𝑏𝑒𝑙 𝑠𝑡𝑎𝑡𝑒 𝑖𝑠 "𝑏𝑎𝑐𝑘𝑔𝑟𝑜𝑢𝑛𝑑"
 (1) 

103


IIUM Engineering Journal, Vol. 23, No. 1, 2022 Ibrahim et al. 
https://doi.org/10.31436/iiumej.v23i1.1789 

 
3.2.2 Step 2: Dividing Dataset 

After the cleaning process, the data was split into two main data sets: Creating Model 

Data and Analysing Data. This separation aimed to ensure that the construction model 

evaluation is performed on a novel botnet. The model was based on Constructing Model 

Data, divided into 70-30 ratios of training and testing data.  

 
Fig. 1: Process of dividing dataset for novel botnet. 

3.2.3 Step 3: Features Selection and Aggregation 

Once the dataset division was completed, the dataset was ready for the following 

process: selecting the features and aggregating them in a specific time interval. We aimed 

to build the fastest detection, so for this research we chose a short time interval (1 sec) for 

aggregation.  

Feature selection is a process of selecting specific variables/features/attributes in the 

data. The purpose of this process was to reduce the complexity and processing time. We 

chose the features based on the theory of communication. This theory is between botmaster 

and its bots during the C&C stage in the botnet life-cycle for this research. As mentioned in 

the bot life-cycle in subsection 1.1: Botnet Life-Cycle & Structure, during the C&C phase, 

bots and botmaster keep communicating. This communication pattern is different from the 

regular communication pattern, where a typical communication pattern is usually more 

random. In contrast, a bot’s communication pattern is more uniform with the same amount 

of transferring data to multiple destinations. The features that we used for this research are 

shown in Table 4. 

Table 4: Data type for features in the dataset 

No Feature Data Types Calculation 

1 Destination Address Categorical data n(x) 

2 Destination port Categorical data n(x) 

3 Packet data Continuous data Min, Max, Median, std 

deviation, n(x) 

4 Time Categorical data ∆t 

The data are aggregated or grouped in two parameters: time interval (t) and source address 

(Sip), as shown in Eq. (2).  We used the aggregation technique to calculate the occurrence 

number of the communication within the time interval that represented the bot’s behavior 

in a particular given time. 

104


IIUM Engineering Journal, Vol. 23, No. 1, 2022 Ibrahim et al. 
https://doi.org/10.31436/iiumej.v23i1.1789 

⌈𝑆𝑖𝑝, 𝑡⌉ =  {𝑋0 ,   𝑋1 , … … . , 𝑋𝑛} (2) 

Since the data in the NetFlow is in continuous type and categorical data type, the aggregation 

of these two types of data is different, as shown in Eq. (3). If the data was continuous, we 

implemented the statistical technique such as minimum, maximum, median, standard 

deviation, and specific number, n(x). But if the data type was categorical data, we 

implemented only the total distinct number, n(x), where the total distinct number (n(x)) that 

define as the frequency of unique elements in the set can be described as shown in Eq. (4). 

𝑓 (𝑥) =  {
𝑀𝑖𝑛, 𝑀𝑎𝑥, 𝑀𝑒𝑑𝑖𝑎𝑛, 𝑠𝑡𝑑 𝑑𝑒𝑣𝑖𝑎𝑡𝑖𝑜𝑛, 𝑛(𝑥)   𝑖𝑓 𝑑𝑎𝑡𝑎 𝑖𝑠 𝑐𝑜𝑛𝑡𝑖𝑛𝑢𝑜𝑢𝑠 𝑛𝑢𝑚𝑏𝑒𝑟 

𝑇𝑜𝑡𝑎𝑙 𝐷𝑖𝑠𝑡𝑖𝑛𝑡 𝑁𝑢𝑚𝑏𝑒𝑟, 𝑛(𝑥)        𝑖𝑓 𝑑𝑎𝑡𝑎 𝑖𝑠 𝑐𝑎𝑡𝑒𝑔𝑜𝑟𝑖𝑐𝑎𝑙 𝑛𝑢𝑚𝑏𝑒𝑟
(3) 

𝑛(𝑥) =  {𝑋𝑖  , 𝑋𝑗 , … … , 𝑋𝑛|𝑋𝑖  ≠  𝑋𝑗 , 𝑖 ≠ 𝑗, 𝑖 ≥ 0, 𝑗 = 1, … … , 𝑛 } (4) 

While for time, this data is used, whereas the aggregation or rounding process is shown 

in Eq. (1). Time is also used in calculating the Different Time (∆t) between the last time, tn 

and the start time, t1 in 1 second (time interval) duration. The equation for Different Time 

(∆t) is shown in Eqs. (5) and (6). 

𝑓(𝑡) =  {𝑡1, 𝑡2, … … … , 𝑡𝑛} (5) 

∆𝑡 = 𝑡𝑛 −  𝑡1  (6) 

3.2.4 Step 4: Data Quality Process 

This process is the point where the most interesting things occur. This process improves 

the quality of the data and features, improving the performance of detecting novel botnets. 

Therefore, we label it as Data Quality Process to be represented as the objective of this 

combination process. This process utilized a two-step approach of standardization and 

Independent Component Analysis (ICA). Specifically, we used this theory because the 

classifier we chose was related to a distance-based classifier. 

A) Standardization

Standardization is a re-scaling process for the distribution of the dataset to obtain the

mean of the data equal to 0, and the standard deviation equal to 1. In other words, 

standardization is a process of centering the data. Standardizing a data set for a wide range 

of machine learning estimators is a common need. However, it could be harmful if the 

individual features do not look more or less like standard normally distributed data (e.g., 

Gaussian with 0 mean and unit variance). Therefore, test x is calculated as the standard value 

of: 

𝑧 =  
(𝑥 −  𝜇)

𝑠⁄ (7) 

where µ is the mean of the training samples or zero if, with mean= False, s is the 

standard deviation of the training samples or with std= False, respectively. 

 Centering and scaling occurred on each feature independently by computing the 

relevant statistics on the samples in the training set. Mean and standard deviation were then 

stored in a transform to be used on later data. 

B) Independent Component Analysis

For this research, we used FastICA from sklearn. decomposition package in python, as

shown in Algorithm 1 below. As the name FastICA implies, it is the short version of ICA. 

FastICA rotates the data until the data looks non-Gaussian in every axis. By making the 

105


IIUM Engineering Journal, Vol. 23, No. 1, 2022 Ibrahim et al. 
https://doi.org/10.31436/iiumej.v23i1.1789 

 
mean equal to zero and normalization the variance in all directions, the algorithm can rotate 

the data in any direction. The process of normalizing the variance is called the whitening 

process. As shown in Algorithm 1, the python code to implement ICA is through the 

whitening process. The whitening process is the decorrelation to ensure that all features are 

treated equally before the Algorithm of ICA run. 

After the centering process (mean equal to zero) and the whitening process 

(normalization of variance), the data ran the ICA algorithm. The main goal of ICA is to find 

the unmixing vector of W, where W is the inverse of A, X is the input data, and A is the 

mixing signal. The equation is shown in Eqs. (8) to (10): 

𝑋 = 𝐴 𝑆   (8) 

𝑆 = 𝐴−1𝑋  (9) 

𝑆 = 𝑊 𝑋  (10) 

C)   Features Importance Ranking 

We evaluated our feature selection through Features Importance Ranking calculated 

using Extra Tree Classifier, as shown in Algorithm 2. Extra Trees Classifier for features 

importance in Scikit.learn module is based on impurity-based importance where it calculated 

the importance of training data without reflecting the prediction ability. 

From the feature’s importance ranking in Fig. 2, we can see the percentage of the highest 

contribute features. For example, the 1st feature increased from 22.75% to 27.74% and the 

lowest contributing feature, the 9th feature, increased from 1.8% to 4.31%. Since no feature 

had a 0% contribution, the removal process was not executed. 

 
Fig. 2: The comparison of features ranking with and without ICA. 

 
b) feature ranking with ICA 
a) original features 

106


IIUM Engineering Journal, Vol. 23, No. 1, 2022 Ibrahim et al. 
https://doi.org/10.31436/iiumej.v23i1.1789 

 
Fig. 3: Percentage differences in the highest and lowest ranking features. 

3.3  Building Model using Classification 

Once the data completed pre-processing, we moved to the classification process. The 

classifier that we chose is K-Nearest Neighbor (K-NN). K - Nearest Neighbor (KNN) is one 

of the most straightforward classification machines that stores all available cases and 

classifies new cases based on a similarity measure [2,41]. The idea behind KNN is that if a 

sample belongs to a specific class in the space of several similar samples (k), the sample is 

also in the category. Thus, techniques based on Nearest-Neighbor classify samples based on 

the similarity of the population. KNN falls into the algorithm family of supervised learning. 

Informally, this means that a labeled data set consisting of training observations (x, y) is 

provided, and the relationship between x and y wants to be captured. More formally, our 

objective was to learn a function h:X→Y to predict the corresponding y output confidently 

with an unseen observation x. First, we needed to determine the k-value of the number of 

groups (cluster) to use K-NN. For this research, we used Elbow Method to determine the k 

value. 

3.3.1 Determine k-value (Elbow Method) 

Elbow method ran the k-NN algorithm several times and calculated the WSS error for 

different values of k. To find the optimal value of k, we used the elbow method that derives 

from the Within-cluster Sum of Squared (WSS). The Elbow method is a heuristic approach 

in determining the number of clusters for k-means or k-NN. The equation of WSS is 

described by Eq. (11): 

𝑊𝑆𝑆 =  ∑ ∑ (𝑋𝑖 − 𝜇𝑘 )
2  𝑋𝑖∈ 𝐶𝑘

𝑘
𝑘=1   (11) 

where; 

Ck = cluster of k 

µk = the mean value of the data that point to the cluster 

Xi = an observation to the Ck 

The optimal value of k is at the elbow curve, or the distortion point that starts decreasing 

linearly, as described in Fig. 6. Although the value of k = 2 looks like there is a curve/ 

distortion from k= 2 until the k = 4, the decrease is still significant and not linear. Due to 

that, for this research, the k-value was k = 4. 

107


IIUM Engineering Journal, Vol. 23, No. 1, 2022 Ibrahim et al. 
https://doi.org/10.31436/iiumej.v23i1.1789 

 
           Fig. 4: The Elbow method. 

3.3.2 Building the Model 

Once we have determined the k-value for K-NN, we started training and building the 

machine learning model. The Building Model Data was divided into training data and testing 

data with a 70:30 ratio, as explained in Section 3.2. During the building model process, once 

the model was built, the prediction of the Evaluation Data was ready to start. We tested it 

file-by-file to evaluate how well the model could predict a particular bot. 

4.   EVALUATION 

We evaluated the performance of our techniques based on the Confusion Matrix in 

terms of accuracy, precision, recall, f-score, false-negative rate (FNR), and false-positive 

rate (FPR). A confusion matrix is the most widely used method to evaluate a machine-

learning model's performance. The distribution of the results can be seen clearly by creating 

a confusion matrix from the model. The confusion matrix consisted of a two-dimensional 

table with the class "actual" and "cluster/projection" in a single-dimension structure and 

evaluated only two (2) classes. The other dimension was rated as "Botnet" positive and 

"Human" negative. Thus, the cases were classified into four fractions: False Positive (FP), 

False Negative (FN), True Positive (TP), and True Negative (TN), as shown in Table 5. 

Table 5: The Confusion Matrix for this article 

   Prediction (Cluster) 

   Normal Botnet 

Actual (Label) 

 Normal TN FP 

Botnet FN TP 

When the data is in the state “True”, either TP or TN, it shows that the classifier 

predicted it in the correct class. In the “False” state, there was an incorrect prediction class. 

For example, when the data was in a False Negative state, it means that the classifier Falsely 

predicted as Negative (Normal) where the data was positive (botnet), while the Positive and 

Negative indicate Botnet or Normal class. 

108


IIUM Engineering Journal, Vol. 23, No. 1, 2022 Ibrahim et al. 
https://doi.org/10.31436/iiumej.v23i1.1789 

 
Table 6: The explanation of Confusion Matrix's fraction used in this article 

Fraction Module 2 

True Positive (TP) TP is counted when the model CORRECTly predicted the 

botnet traffic/IP 

True Negative (TN) TN is counted when the model CORRECTly predicted the 

normal (Negative) traffic/IP 

False Positive (FP) FP is counted when the model INCORRECTly predicted the 

normal (Negative) traffic as the BOTNET (Positive) traffic/IP 

False Negative (FN) FN is counted when the model INCORRECTly predicted the 

botnet (Positive) as the normal (NEGATIVE) traffic/IP 

The Confusion Matrix can generate several performance evaluation parameters, but for 

this research, we focused on accuracy, precision, recall, f-score, false-negative rate (FNR), 

and false-positive rate (FPR). These parameters were chosen to make a comparison with 

other researcher’s results that used the same dataset. The overall performance was from the 

Accuracy, but we preferred to compare the overall performance using the f-score. 

4.1  Performance Parameter 

4.1.1 Accuracy 

Accuracy is often used to measure the overall performance of the machine learning 

classifier because it is a parameter that measures how often the algorithm correctly classifies 

a data point. Accuracy is the number of correctly predicted data points from all data points 

where it can be described in the Eq. (12) below:- 

𝐴𝑐𝑐𝑢𝑟𝑎𝑐𝑦 =  
∑ 𝑡𝑜𝑡𝑎𝑙 𝑐𝑜𝑟𝑟𝑒𝑐𝑡 𝑝𝑟𝑒𝑑𝑖𝑐𝑡𝑖𝑜𝑛

∑ 𝑑𝑎𝑡𝑎
     =

𝑇𝑃+𝑇𝑁

𝑇𝑃+𝑇𝑁+𝐹𝑁+𝐹𝑃
 (12) 

4.1.2 Precision 

The ‘Precision’ parameter is the count of data classified as a botnet (positive) that are 

genuinely botnet. Precision also can be described as in equation (13):- 

𝑃𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛 =  
𝑃𝑜𝑠𝑖𝑡𝑖𝑣𝑒 𝑝𝑟𝑒𝑑𝑖𝑐𝑡𝑖𝑜𝑛 𝑡ℎ𝑎𝑡 𝑡𝑟𝑢𝑙𝑦 𝑝𝑜𝑠𝑖𝑡𝑖𝑣𝑒

∑ 𝑇𝑜𝑡𝑎𝑙 𝑝𝑜𝑠𝑖𝑡𝑖𝑣𝑒 𝑝𝑟𝑒𝑑𝑖𝑐𝑡𝑖𝑜𝑛
              =  

𝑇𝑃

𝑇𝑃+𝐹𝑃
    (13) 

4.1.3 Recall 

A recall is also known as Sensitivity, where it is the fraction of actual positives that 

are identified correctly. Recall also can be described as the ability of a model to find the 

relevant cases.  

𝑅𝑒𝑐𝑎𝑙𝑙 (𝑇𝑃𝑅) =              
𝑃𝑜𝑠𝑖𝑡𝑖𝑣𝑒 𝑝𝑟𝑒𝑑𝑖𝑐𝑡𝑖𝑜𝑛 𝑡ℎ𝑎𝑡 𝑡𝑟𝑢𝑙𝑦 𝑝𝑜𝑠𝑖𝑡𝑖𝑣𝑒

𝐴𝑙𝑙 𝑏𝑜𝑡𝑛𝑒𝑡 𝑑𝑎𝑡𝑎
=  

𝑇𝑃

𝑇𝑃+𝐹𝑁
    (14) 

4.1.4 F-score 

F1 score is the harmonic combination of recall and precision. F1 score is the equal 

weight. 

𝐹𝑠𝑐𝑜𝑟𝑒 = 2 ∗
𝑃𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛∗𝑅𝑒𝑐𝑎𝑙𝑙

𝑃𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛+𝑅𝑒𝑐𝑎𝑙𝑙
   (15) 

 
109


IIUM Engineering Journal, Vol. 23, No. 1, 2022 Ibrahim et al. 
https://doi.org/10.31436/iiumej.v23i1.1789 

 
5.   RESULTS AND DISCUSSION 

The evaluation for the framework was conducted in 2 parts; one was when building 

the model and the other one was the prediction of novel data using the Evaluating Data. 

5.1  Building Model 

This section summarizes the findings during the Model Building. The performance of 

the flow-based features and K-NN framework was compared to the performance of the 

same framework but with additional ICA during pre-processing. Both performance 

evaluations are shown in Table 7.  

     Table 7: The performance comparison of the training model with and without ICA 

  Accuracy Precision Recall F1-score FNR FPR 

without ICA 0.9689 0.9831 0.958 0.9704 0.0419 0.01875 

with ICA 0.969 0.9846 0.9568 0.9705 0.0431 0.0169 

This result was evaluated on the testing data split from the Building Model Data 

illustrated in Fig. 3. Table 7 lists the parameters used in the evaluation. From the results in 

Table 7, if we compare the Accuracy and F-score, there are tiny increments, but the False 

Positive Rate (FPR) decreased from 1.88% to 1.69%. 

5.2  Prediction Novel Bots 

CTU-13, the benchmark botnet dataset, is also used by several researchers to evaluate 

their prediction model’s performance. So, we compared the result from another article that 

used the same data source and the same evaluation parameter. There were 5 files in 

Evaluating Data that were separated before the building model process. The data is 

described in Table 8. 

Table 8: The data number and the botnet type in Evaluating Data 

Data Number  Bot Name 

data 1  
Neris 

data 2 

data 6  Menti 

data 8  Murlo 

data 9  Neris 

Table 9: Performance differences between models with and without ICA against novel botnet 

K-NN Model  Data Number Accuracy F1-score FPR 

without ICA 

 data 1 0.6708 0.6895 0.2187 

data 2 0.7583 0.784 0.0679 

data 6 0.7032 0.6041 0.1826 

data 8 0.7658 0.2871 0.1547 

data 9 0.5067 0.6436 0.1891 

with ICA 

 data 1 0.7912 0.8298 0.278 

data 2 0.6734 0.6958 0.1262 

data 6 0.7767 0.7673 0.2995 

data 8 0.8005 0.5859 0.2244 

data 9 0.7134 0.8222 0.3309 

110


IIUM Engineering Journal, Vol. 23, No. 1, 2022 Ibrahim et al. 
https://doi.org/10.31436/iiumej.v23i1.1789 

Based on Table 9, the model with ICA showed a better F score than the Model Without 

ICA. But, the model without ICA showed the lowest FPR compared to the other model. 

Thus, the yellow box in Table 9 indicates the best value, either the highest F score or the 

lowest FPR. Since we were focused on the F-score for comparison, we extracted it to Table 

10. 

   Table 10: The F score for both models (with and without ICA) 

data 1 data 2 data 6 data 8 data 9 

without ICA 0.6895 0.784 0.6041 0.2871 0.6436 

with ICA 0.8298 0.6958 0.7673 0.5859 0.8222 

Table 10 shows that 4 files from 5 novel botnet files had the highest F-score using the 

K-NN model with ICA. Only one file, File No 2, showed the opposite result. Due to that,

we agree that the K-NN Model with ICA performed better than the K-NN Model without

ICA. The result for the K-NN Model with ICA was compared to other researcher’s results.

The results were directly compared with previously reported findings on a novel botnet

prediction model that used the same data sources.  Table 11 summarizes the comparison

result and lists the parameters used in every previous article.

Table 11: Performance model comparison between previous researchers 

Parameter 
Data No/ 

researcher 

Garcia, S., Zunino, 

A., & Campo, M. 

(2014)[42] 

Garcia, S. 

(2015).

[44] 

Kozik, 

R. (2018)

[45]

Wang, J. & 

Paschalidis, 

I.C.

(2017)[46] 

Fernandez 

Maimo et 

al.  [38] 

wnh 

ibrahim 

et al. 

Recall 1 0.4 1 0.077 0.91 0.91 0.8357 

2 0.3 0.88 0.046 0.55 0.95 0.5687 

6 0 0 0.12 0.95 1 0.8836 

8 0.1 0 0.044 0.76 0.26 0.9424 

9 0.1 0.38 0.08 0.38 0.99 0.7171 

Precision 1 0.5 0.87 0.86 0.73 0.68 0.824 

2 0.6 0.96 0.8 0.65 0.88 0.8961 

6 0.4 -1 0.69 0.7 0.92 0.6781 

8 0.2 0 0.6 0.4 0.47 0.4251 

9 0.4 0.72 0.73 0.95 0.89 0.9635 

F1 Score 1 0.48 0.93 0.14 0.81 0.77 0.8298 

2 0.41 0.92 0.088 0.59 0.92 0.6958 

6 0.04 0 0.21 0.8 0.96 0.7673 

8 0.14 -1 0.082 0.53 0.33 0.5859 

9 0.25 0.5 0.14 0.54 0.94 0.8222 

Based on Table 11, in the F-Score parameter, our technique defeated other results for 

Data number 8 (red font). However, three out of five novel botnet files hade the highest f-

score from Fernandez Maimo et al. [31].  To measure the overall performance, we calculated 

the average for each parameter. The average of each data (Data from File 1,2,6,8,9) and 

parameter (Precision, Recall, and F Score) from Table 11 are illustrated in Fig. 7. These 

plots show that our method proposed here outperformed the other approaches except 

Fernandez Maimo et al. [31].  

111


IIUM Engineering Journal, Vol. 23, No. 1, 2022 Ibrahim et al. 
https://doi.org/10.31436/iiumej.v23i1.1789 

 
Since the Fernandez Maimo et al. [31] techniques outperformed the overall evaluation, we 

compared the different approaches they used. For example, Fernandez Maimo et al. [31] 

also used flow-based features, but they considered the features from dual-direction, 

incoming, and outgoing traffic. Their approach resembled our technique in that both 

methods focused on concealment network traffic and used statistical analysis to aggregate 

the data. 

 
Fig. 5: The performance comparison among other researchers on botnet classification. 

6.   CONCLUSION  

This paper proposes the framework for novel botnet detection that implements data 

standardization and Independent Component Analysis (ICA) during flow-based features 

pre-processing data. The strength of our framework is that we used flow-based features that 

have the benefits of detecting traffic from the concealment network. Other than that, the 

complexity and processing time can also be minimized by flow-based features compared to 

content-based ones. Also, for aggregation, for the quick detection method, we used the 

shortest time interval. Our approach can be applied to botnet concealment and new/novel 

forms of a botnet. 

The use of data standardization and Independent Component Analysis (ICA) improves 

key rating attributes and classification outcomes. However, the overall result is still not the 

best relative to other previous approaches. Nevertheless, it generated an improved result 

using Data Standardization and Independent Component Analysis (ICA). We can also 

assume that behavior analysis caused some noise to the pattern.   

Future directions are connected to enhancing the collection of functionalities. Further 

developments are expected to lead to a deeper understanding of the nature of the selection 

of functions.  

ACKNOWLEDGEMENT 

The authors wish to thank Universiti Teknologi Malaysia (UTM) for its support under 

Research University Grant Vot-20H04, Malaysia Research University Network (MRUN) 

Vot 4L876, and the Fundamental Research Grant Scheme (FRGS) Vot 

(FRGS/1/2018/ICT04/UTM/01/1) supported by the Ministry of Higher Education Malaysia. 

The work is partially supported by the SPEV project (ID: 2102-2021), Faculty of 

Informatics and Management, University of Hradec Kralove. We are also grateful for the 

112


IIUM Engineering Journal, Vol. 23, No. 1, 2022 Ibrahim et al. 
https://doi.org/10.31436/iiumej.v23i1.1789 

 
support of Ph.D. students Michal Dobrovolny and Sebastien Mambou in consultations 

regarding application aspects from Hradec Kralove University, Czech Republic. The APC 

was funded by the SPEV project 2102/2021, Faculty of Informatics and Management, 

University of Hradec Kralove. 

REFERENCES  

[1] Ibrahim, W.N.H., Selamat, A., Anuar, S., & Krejcar, O. (2019). Clustering botnet behavior 
using K-means with uncertain data, Frontiers in Artificial Intelligence and Applications vol. 

318. pp.244–257. 

[2] Liang, X. & Znati, T. (2019). On the performance of intelligent techniques for intensive and 
stealthy DDos detection, Computer Networks vol. 164. p.106906. 

[3] Gross, G. (2016). Detecting and destroying botnets, Network Security vol. 2016, no. 3. pp.7–
10. 

[4] WHITE OPS. (2018). Retrieved June 1, 2020, https://www.whiteops.com/blog/9-of-the-most-
notable-botnets. 

[5] Kolias, C., Kambourakis, G., Stavrou, A., & Voas, J. (2017). DDoS in the IoT: Mirai and other 
botnets, Computer vol. 50, no. 7. pp.80–84. 

[6] Eremin, A. (2019). Retrieved June 1, 2020, https://securelist.com/bots-and-botnets-in-
2018/90091/. 

[7] Khan, R.U., Zhang, X., Kumar, R., Sharif, A., Golilarz, N.A., & Alazab, M. (2019). An 
adaptive multi-layer botnet detection technique using machine learning classifiers, Applied 

Sciences (Switzerland) vol. 9, no. 11. p.2375. 

[8] Patsakis, C., Casino, F., & Katos, V. (2020). Encrypted and covert DNS queries for botnets: 
Challenges and countermeasures, Computers & Security vol. 88. p.101614. 

[9] Bezerra, V.H., da Costa, V.G.T., Barbon Junior, S., Miani, R.S., & Zarpelão, B.B. (2019). 
IoTDS: A one-class classification approach to detect botnets in internet of things devices, 

Sensors (Switzerland) vol. 19, no. 14. p.3188. 

[10] Wang, Y.-H., Li, Z.-N., Xu, J.-W., Yu, P., Chen, T., & Ma, X.-X. (2020). Predicted 
Robustness as {QoS} for Deep Neural Network Models, Journal of Computer Science and 

Technology vol. 35, no. 5. pp.999–1015. 

[11] Prasad, K.M., Reddy, A.R.M., & Rao, K.V. (2020). BARTD: Bio-inspired anomaly based real 
time detection of under rated App-DDoS attack on web, Journal of King Saud University - 

Computer and Information Sciences vol. 32, no. 1. pp.73–87. 

[12] Su, S.C., Chen, Y.R., Tsai, S.C., & Lin, Y.B. (2018). Detecting P2P Botnet in Software 
Defined Networks, Security and Communication Networks vol. 2018. pp.1–13. 

[13] Mahmoud, M., Nir, M., & Matrawy, A. (2015). A Survey on botnet architectures, detection 
and defences, International Journal of Network Security vol. 17, no. 3. pp.272–289. 

[14] Mathur, L., Raheja, M., & Ahlawat, P. (2018). Botnet Detection via mining of network traffic 
flow, Procedia Computer Science vol. 132. pp.1668–1677. 

[15] Kupreev, O., Badovskaya, E., & Gutnikov, A. (2019). Retrieved June 1, 2020, 
https://securelist.com/ddos-report-q3-2019/94958/. 

[16] Aamir, M. & Zaidi, S.M.A. (2019). Clustering based semi-supervised machine learning for 
DDoS attack classification, Journal of King Saud University - Computer and Information 

Sciences. 

[17] Singh, M., Singh, M., & Kaur, S. (2019a). Detecting bot-infected machines using DNS 
fingerprinting, Digital Investigation vol. 28. pp.14–33. 

[18] Bazrafshan, Z., Hashemi, H., Fard, S.M.H., & Hamzeh, A. (2013). A survey on heuristic 
malware detection techniques, IKT 2013 - 2013 5th Conference on Information and 

Knowledge Technology no. May. pp.113–120. 

[19] AsSadhan, B., Bashaiwth, A., Al-Muhtadi, J., & Alshebeili, S. (2018). Analysis of P2P, IRC 
and HTTP traffic for botnets detection, Peer-to-Peer Networking and Applications vol. 11, no. 

5. pp.848–861. 

113

https://www.whiteops.com/blog/9-of-the-most-notable-botnets
https://www.whiteops.com/blog/9-of-the-most-notable-botnets
https://securelist.com/bots-and-botnets-in-2018/90091/
https://securelist.com/bots-and-botnets-in-2018/90091/
https://securelist.com/ddos-report-q3-2019/94958/


IIUM Engineering Journal, Vol. 23, No. 1, 2022 Ibrahim et al. 
https://doi.org/10.31436/iiumej.v23i1.1789 

 
[20] Alauthaman, M., Aslam, N., Zhang, L., Alasem, R., & Hossain, M.A. (2018). A P2P Botnet 
detection scheme based on decision tree and adaptive multilayer neural networks, Neural 

Computing and Applications vol. 29, no. 11. pp.991–1004. 

[21] Santana, D., Suthaharan, S., & Mohanty, S. (2018). What we learn from learning - 
Understanding capabilities and limitations of machine learning in botnet attacks. 

[22] Rauf, M.A.A.A., Asraf, S.M.H., & Idrus, S.Z.S. (2020). Malware Behaviour Analysis and 
Classification via Windows DLL and System Call, Journal of Physics: Conference Series vol. 

1529, no. 2. 

[23] Resende, P.A.A. & Drummond, A.C. (2018). A survey of random forest based methods for 
intrusion detection systems, ACM Computing Surveys vol. 51, no. 3. pp.1–36. 

[24] Tobiyama, S., Yamaguchi, Y., Shimada, H., Ikuse, T., & Yagi, T. (2016). Malware Detection 
with Deep Neural Network Using Process Behavior, Proceedings - International Computer 

Software and Applications Conference vol. 2. pp.577–582. 

[25] Muhtadi, A.F. & Almaarif, A. (2020). Analysis of Malware Impact on Network Traffic using 
Behavior-based Detection Technique, International Journal of Advances in Data and 

Information Systems vol. 1, no. 1. pp.17–25. 

[26] Apruzzese, G. & Colajanni, M. (November 2018). Evading botnet detectors based on flows 
and random forest with adversarial samples. Paper presented at NCA 2018 - 2018 IEEE 17th 

International Symposium on Network Computing and Applications. 

[27] Beigi, E.B., Jazi, H.H., Stakhanova, N., & Ghorbani, A.A. (2014). Towards effective feature 
selection in machine learning-based botnet detection approaches, 2014 IEEE Conference on 

Communications and Network Security, CNS 2014 pp.247–255. 

[28] Cabeza, L.F., Solé, C., Castell, A., Oró, E., & Gil, A. (2016). Unsupervised Network Intrusion 
Detection Systems for Zero-Day Fast- Spreading Attacks and Botnets Payam, International 

Journal of Digital Content Technology and its Applications(JDCTA) vol. 10, no. 2. 

[29] Singh, M., Singh, M., & Kaur, S. (2019b). Detecting bot-infected machines using DNS 
fingerprinting, Digital Investigation vol. 28. pp.14–33. 

[30] Malik, R. & Alankar, B. (2019). Botnet and Botnet Detection Techniques, International 
Journal of Computer Applications vol. 178, no. 17. pp.8–11. 

[31] Koroniotis, N., Moustafa, N., Sitnikova, E., & Turnbull, B. (2019). Towards the development 
of realistic botnet dataset in the Internet of Things for network forensic analytics: Bot-IoT 

dataset, Future Generation Computer Systems vol. 100. pp.779–796. 

[32] Huda, S., Abawajy, J., Al-Rubaie, B., Pan, L., & Hassan, M.M. (2019). Automatic extraction 
and integration of behavioural indicators of malware for protection of cyber–physical 

networks, Future Generation Computer Systems vol. 101. pp.1247–1258. 

[33] Bou-Harb, E., Debbabi, M., & Assi, C. (2014). On fingerprinting probing activities, 
Computers and Security vol. 43, no. January 2012. pp.35–48. 

[34] Palmieri, F., Fiore, U., & Castiglione, A. (2014). A distributed approach to network anomaly 
detection based on independent component analysis, Concurrency and Computation: Practice 

and Experience vol. 26, no. 5. pp.1113–1129. 

[35] Mao, C.-H., Lin, C.-C., Pan, J.-Y. (Tim), Chang, K.-C., Faloutsos, C., & Lee, H.-M. ( 2012). 
EigenBot: Foiling spamming botnets with matrix algebra. Paper presented at Proceedings of 

the ACM SIGKDD Workshop on Intelligence and Security Informatics - ISI-KDD ’12, New 

York, New York, USA. 

[36] Ehsan, K. & Hamid,  reza shahriari. (2018). BotRevealer: Behavioral Detection of Botnets 
based on Botnet Life-cycle, International Journal of Information Security vol. 10, no. 1. pp.55–

61. 

[37] Stevanovic, M. & Pedersen, J.M. (2015). On the use of machine learning for identifying botnet 
network traffic, Journal of Cyber Security and Mobility vol. 4, nos. 2–3. pp.1–32. 

[38] Fernandez Maimo, L., Perales Gomez, A.L., Garcia Clemente, F.J., Gil Perez, M., & Martinez 
Perez, G. (2018). A Self-Adaptive Deep Learning-Based System for Anomaly Detection in 

5G Networks, IEEE Access vol. 6. pp.7700–7712. 

[39] Debashi, M. & Vickers, P. (2018). Sonification of Network Traffic for Detecting and Learning 
about Botnet Behavior, IEEE Access vol. 6. pp.33826–33839. 

114


IIUM Engineering Journal, Vol. 23, No. 1, 2022 Ibrahim et al. 
https://doi.org/10.31436/iiumej.v23i1.1789 

[40] Gezer, A., Warner, G., Wilson, C., & Shrestha, P. (2019). A flow-based approach for Trickbot

banking trojan detection, Computers and Security vol. 84. pp.179–192.

[41] Garg, S., Peddoju, S.K., & Sarje, A.K. (2016). Scalable P2P bot detection system based on

network data stream, Peer-to-Peer Networking and Applications vol. 9, no. 6. pp.1209–1225.

[42] Garcia, S., Zunino, A., & Campo, M. (2014). "Identifying, Modeling and Detecting Botnet

Behaviors in the Network".

[43] Han, W., Xue, J., Wang, Y., Liu, Z., & Kong, Z. (2019). MalInsight: A systematic profiling

based malware detection framework, Journal of Network and Computer Applications vol. 125,

no. October 2018. pp.236–250.

[44] Garcia, S. (2015). Modelling the Network Behaviour of Malware To Block Malicious

Patterns . The Stratosphere Project : a Behavioural Ips, Virus Bulletin no. September. pp.1–8.

[45] Kozik, R. (2018). Distributing extreme learning machines with Apache Spark for NetFlow-

based malware activity detection, Pattern Recognition Letters vol. 101. pp.14–20.

[46] Wang, J. & Paschalidis, I.C. (2017). Botnet Detection Based on Anomaly and Community

Detection, IEEE Transactions on Control of Network Systems vol. 4, no. 2. pp.392–404.

115