FACTA UNIVERSITATIS Series: Electronics and Energetics Vol. 35, No 2, June 2022, pp. 269-282
https://doi.org/10.2298/FUEE2202269P
© 2022 by University of Niš, Serbia | Creative Commons License: CC BY-NC-ND
Original scientific paper

WK-FNN DESIGN FOR DETECTION OF ANOMALIES IN THE COMPUTER NETWORK TRAFFIC

Danijela Protić1, Miomir Stanković2, Vladimir Antić3
1Center for Applied Mathematics and Electronics, Belgrade, Serbia
2Mathematical Institute of SASA, Belgrade, Serbia
3Center for Applied Mathematics and Electronics, Belgrade, Serbia

Received October 11, 2021; received in revised form December 6, 2021
Corresponding author: Danijela Protić, Center for Applied Mathematics and Electronics, Belgrade, Serbia. E-mail: adanijela@ptt.rs

Abstract. Anomaly-based intrusion detection systems identify abnormal computer network traffic based on deviations from a derived statistical model that describes normal network behavior. The basic problem with anomaly detection is deciding what is considered normal. Supervised machine learning can be viewed as binary classification, since the models are trained and tested on a data set containing a binary label to detect anomalies. The weighted k-Nearest Neighbor and the Feedforward Neural Network are high-precision classifiers for decision-making. However, their decisions sometimes differ. In this paper, we present a WK-FNN hybrid model for the detection of such opposing decisions. It is shown that the results can be improved with a bitwise xor operation. The sum of the binary "ones" is used to decide whether additional alerts are activated or not.

Key words: WK-FNN, anomaly detection, weighted k-nearest neighbor, feedforward neural network

1. INTRODUCTION

Due to the enormous increase in computer applications over the last few decades, the need for protection of computer networks has multiplied [1]. Intrusion detection systems (IDSs) are the main defense of the network infrastructure, used to detect attacks or to indicate anomalies in the behavior of the computer network. Signature-based or misuse IDSs proactively detect the presence of known malicious content. The most practical method to detect the signature of malicious content is to measure the similarity between a detected pattern of current network activity and the already known patterns of various types of malicious attacks [2]. Anomaly detection is performed by detecting changes in system behavior or usage patterns [3]. The identification of anomalies in the network is essential to diagnose attacks or failures that seriously affect the performance and security of the computer network [4, 5]. The goal of an anomaly-based IDS is to proactively detect any activity or event on a host computer or network that shows a deviation from normal network behavior [2]. In order to provide a suitable solution for the detection of anomalies in the computer network, the concept of normality is fundamental. The idea of normality is usually introduced through a formal model that expresses the relationships between the variables involved in the dynamics of the system, so that an event is recognized as abnormal when its degree of deviation from the profile or the behavior of the system, specified by the normality model, is high enough [6]. In the last few decades, machine learning has started to play an important role in anomaly detection [6, 7, 8].
In supervised machine learning, anomaly detection can be treated as a form of binary classification, since the data sets for training and testing the models contain binary labels: one for normal observations and one for abnormal observations. It should be noted that, in anomaly detection, the data set can be quite unbalanced. Therefore, it is important to apply data transformation algorithms prior to supervised learning. In this article we propose a three-step algorithm that removes all irrelevant features from the Kyoto 2006+ dataset and normalizes the instances so that no single feature can dominate the others. After pre-processing is completed, nine features remain to train two binary classifiers, namely the weighted k-Nearest Neighbor (wk-NN) and the Feedforward Neural Network (FNN). The classifiers show high precision in decision making but, in some cases, their decisions differ. The proposed WK-FNN hybrid model recognizes these opposing decisions based on a bitwise exclusive or (xor) operation between the outputs of the classifiers. The binary sum of the opposing decisions is used as the basis for additional warnings. Two alerts are combined: a trigger alert reacts to the opposing decisions, and a threshold-based alert allows users to prioritize alerts that are rated as critical.

2. LITERATURE REVIEW

Since the nature of the features and the number of instances determine the applicability of anomaly detection techniques, the analysis of high-dimensional data sets has become a challenge for researchers [9, 10]. In the last few decades, researchers have investigated intrusion detection systems for various purposes and on different datasets. In [11] and [12] the authors compare the DARPA98, KDD Cup '99, NSL-KDD, Kyoto 2006+ and CAIDA datasets. In addition, the authors in [13] compare signature-based and anomaly-based classification and examine the ISCX2012, CIC-IDS-2017 and CSE-CIC-2018 datasets in the context of feature selection and attack types. In [14] the authors describe the functionality of the ADFA-LD and ADFA-WD datasets and compare them with the DARPA98, KDD Cup '99, NSL-KDD and CIC-IDS-2017 datasets. The datasets are either simulated or captured from real computer network traffic, and differ in size, number of features, purpose, types of attacks, etc. The main characteristics of the above datasets are summarized in Table 1.
Table 1 Description of the datasets

Dataset | Type of the attacks | Features | Kind of traffic | Description
ADFA-LD and ADFA-WD | Hydra-FTP, Hydra-SSH, Adduser, Java-Meterpreter, Meterpreter, Webshell | 26 | From the host for normal activities, with user behavior ranging from web browsing to LaTeX document preparation | Created from the evaluation of the system-call-based HIDS; Linux and Unix OS (LD) and Windows (WD)
AWID | Attacks on 802.11 (authentication request, probe request, injection, ARP flooding) | 156 features extracted from each packet | Emulated (small network, 11 clients) | WLAN traffic in packet-based format; 37 million packets captured in one hour
CAIDA | DDoS | Network traffic traces | Real (collected on high-speed monitors) | Collected on a commercial backbone link from 2008 to 2019; does not contain a diversity of attacks
CIC-IDS-2017 | Botnets, cross-site scripting, DoS, DDoS, GoldenEye, Hulk, RUDY, Slowhttptest, Slowloris | More than 80 | Emulated (small network) | Captured over a period of 5 days; contains network traffic in packet-based and bidirectional flow-based format
CSE-CIC-2018 | Brute force, Heartbleed, Botnet, DoS, DDoS, Web attacks, infiltration from inside the network | More than 80 | Emulated (simulated scenarios) | 10 days of network traffic and log files; 50 machines on the attacker side and 420 PCs and 30 servers in the victim organization
DARPA98 | DoS, privilege escalation (R2L and U2R), probing | 41 | Emulated (small network) | 7 weeks of network traffic in packet-based format and audit logs
ISCX 2012 | Scenarios: infiltrating the network from the inside, HTTP DoS, DDoS using an IRC botnet, SSH brute force attack | 20 | Emulated (small network) | 7 days of packet-based network traffic observed
KDD Cup '99 | DoS, privilege escalation (R2L and U2R), probing | 42 | Emulated (small network) | Derived from the DARPA98 dataset; five weeks of network traffic in packet-based format
Kyoto 2006+ | Attacks against honeypots (DoS, exploits, malware, port scans, shellcode) | 24 | Real (honeypots and regular servers) | 3 years of real packet-based network traffic; packets converted into sessions
NSL-KDD | DoS, privilege escalation (R2L and U2R), probing | 42 | Emulated (small network) | Derived from the KDD Cup '99 dataset; does not contain redundant records in the training set or duplicates in the test set

As shown in Table 1, all datasets except the Kyoto 2006+ dataset consist of either simulated network data or actual network traffic that is mainly used for signature detection. The Kyoto 2006+ dataset is also the only one intended for anomaly-based IDS modelling. For these reasons, this study uses the Kyoto 2006+ dataset as the basis for binary classification experiments with machine learning (ML) models.

Machine learning is effective in eliminating redundant and irrelevant data, increasing learning accuracy and improving the comprehensibility of the results [15]. Feature selection has a direct influence on the efficiency of the results and offers a way to reduce computation time, improve accuracy, and enable a better understanding of the classification models or the data. In the case of anomaly detection, the labels assigned to the data instances are usually binary values [16]. Machine learning models can be very effective in learning normal or abnormal patterns from training data and in detecting anomalies in computer networks [17]. The Kyoto 2006+ dataset was captured and created from actual network traffic in order to classify network traffic as normal or abnormal. Since the purpose of this work is to present a hybrid classifier for improved anomaly detection in binary classification, this dataset is used in the experiments. The Kyoto 2006+ dataset is unbalanced, i.e. the amounts of normal and abnormal data differ considerably. In [18], the authors present a series of tests carried out to assess the effectiveness of ML techniques in detecting anomalies and present the algorithms that gave the best results. In [19] the authors carried out experiments with 10 daily records from the Kyoto 2006+ dataset and showed that accuracy decreases only slightly when the number of features is reduced from 17 to 9 and the instances are scaled to the range from -1 to 1. In supervised machine learning, the wk-NN has shown the highest accuracy among a variety of machine learning models. In [20] the author proposes a method that can detect large-scale attacks in real time with weighted k-NN classifiers.
The key factor in developing an anomaly-based intrusion detection system is the selection of significant features for decision-making. Good feature selection, which chooses meaningful and as few features as possible, plays a key role in a successful anomaly-based IDS. In [21] the authors propose a new learning algorithm for pseudo-neighbor elimination and anomaly detection based on the wk-NN model in order to minimize the effects of distant neighbors. In [22] the authors examine the applicability of the feedforward architecture of neural networks for traffic prediction and compare the performance of different back-propagation algorithms. The prediction is made for various random aggregates of traffic flows. The performance analysis showed the effectiveness of the proposed method given an adequate choice of the learning algorithm. In [23] the authors approach an IDS using a 2-layered feedforward neural network. In the training phase, the early-stopping strategy is used to overcome the problem of overfitting in neural networks. The proposed system is assessed against the DARPA dataset. The selected connections from the DARPA dataset are pre-processed and the feature range is converted into [-1, 1]. These modifications particularly affect the final detection results. In [24] the authors propose an IDS model that uses the feedforward neural network and back-propagation algorithms along with various optimization techniques to minimize the overall computational overhead while maintaining a high level of performance. The experimental results on the benchmark NSL-KDD dataset show that in some cases the accuracy of the proposed IDS model is better than that of other IDS models. Because of its high performance and low computational requirements, the proposed model is a suitable candidate for real-time implementation. In [25] the authors showed the accuracy of two FNN classifiers and the short processing time achieved when deciding on anomalies in the behavior of complex computer networks. In [26] the authors used a PC-generated offline data set to assess the performance of two neural network-based techniques. In this data set, each data point corresponds to a normal or an anomaly class. It is assumed that the anomaly data is the intruder data, obtained by disabling some PC controllers, audio drivers, graphics drivers, etc. The authors took 15 randomly selected features from a log file containing 20,000 records and showed that the FNN classifiers are approximately 98% accurate.

Hybrid models for anomaly detection are also the topic of various research. In the scenario given in [27], the authors propose a hybrid online-offline system in which the offline model, based on the Radius Nearest Neighbor, maintains the general properties of the network traffic, while the online model, based on the support vector machine, continuously learns; the two work together to detect anomalies. The method is evaluated using the NSL-KDD 2009 dataset. This model achieved an accuracy of ~95% with known anomalies. It should be noted that the NSL-KDD dataset is a simulation of the computer network traffic of a middle-size American military base [11]. In order to improve the detection performance and to reduce the susceptibility to frequent attacks, a two-stage hybrid method based on binary classification and the k-NN technique is proposed in [28].
First, binary classifiers and an aggregation module are used to efficiently identify the exact classes of network connections. Afterwards, the connections whose classes remain uncertain are further classified by the k-NN algorithm. The second step builds on the results of the first step and is a useful addition to it. By combining the two steps, the proposed method achieves reliable results on the NSL-KDD data set [11].

Network alerts are a critical aspect of network performance monitoring because they are designed to provide information technology (IT) administrators with quick insight into network problems. Therefore, network alerting should be an important consideration for those choosing their network alerting tools. In [29] the author provides information on the four main types of network alerts. Real-time alerts periodically or continuously scan all areas of the network for network behavior problems. The time between each network pass is an important consideration, as it determines how quickly network problems are identified. Intelligent alerts provide details about the problem, when and where it occurred, and which areas of the network are affected. Flexible delivery alerts are network monitoring notification tools that can be configured for scheduled and hourly alerts to ensure alerts are received at the right time. Critical and tiered alerts are tools for minimizing the number of network notifications. Network monitoring alerts, also known as threshold alerts, are tools that support critical and tiered alerts, so that the user can prioritize alerts that are critical or violate a preconfigured network configuration. Systems with tiered alerting assign problems to one of several categories, and alerts are processed according to the importance of the category. In [30] the authors confirm that alert ranking classifies alerts according to their dangerousness. The alarm tactic requires that the functionality responsible for the alarm classification should not be computationally expensive, otherwise the advantage of the quick response obtained by a prioritized reaction to dangerous alerts is negated. In [31] the authors explain that not all classification algorithms are equally accurate. Therefore, it is important to carefully select the criteria that can accurately classify the alerts based on the specific security needs of an organization. In [32] the authors describe the efficiency of the basic methods for rule-based alert classification and explain that engineers usually concentrate primarily on critical alerts, but not on errors and warnings. They claim that engineers should investigate more alerts. At the same time, they find that a lot of time is wasted investigating non-serious warnings (low precision), while many serious alerts are still missed. In [33] the authors divide alerts into low- and high-level alerts and point out that high-level alert management is a potential task that helps the administrator to analyze alerts correctly and to allocate time and effort.

3. DATA COLLECTION

The Kyoto 2006+ dataset is a publicly available and widely used dataset in network-based intrusion detection research.
The dataset includes more than three years of actual traffic data collected from honeypots (Solaris 8 for Intel, Windows XP (no patch, SP2, fully patched), Nepenthes, and others), darknet sensors, and other systems (a mail server used to collect various types of mail, a web crawler developed by NTT Information Sharing Platform Laboratories, and a Windows XP machine used to evaluate malware activities) deployed on five different computer networks inside and outside Kyoto University [34]. The Kyoto 2006+ dataset was developed by deploying honeypots in the network, but its documentation does not describe the attack types in detail [13]. In addition, the Bro IDS has been used to convert packet-based traffic into a format called sessions. Bro is a signature- and behavior-based analysis framework that provides detailed data on the hypertext transfer protocol (HTTP), the domain name system (DNS), the secure shell (SSH) communication protocol, and strange network behavior [35]. Thanks to its analysis engine, it is suitable for high-performance network monitoring, protocol analysis, and real-time application-layer status information. The Bro event engine is responsible for receiving the internet protocol (IP) packets and converting them into events forwarded to the policy script interpreter, which then produces an output [36].

During the observation period (from 2006 to 2009) more than 50 million sessions with normal traffic, 43 million sessions with known attacks and 425 thousand sessions with unknown attacks were recorded. Each session includes 24 features, 14 of which are statistical features derived from the KDD Cup '99 dataset, along with 10 additional flow-based features (IP addresses, ports, and duration) [11, 37]. The feature Label indicates the presence of attacks [38]. In the original data set there were three labels: 1 for normal sessions, -1 for known attacks, and -2 for unknown attacks. However, since unknown attacks are very rare in the dataset (~0.7%), we assigned the same label (-1) to known and unknown attacks, which leads to binary classification [39].

The main problem associated with the Kyoto 2006+ dataset is its size. In this study, this problem is addressed with a pre-processing algorithm that removes all irrelevant features (categorical features, statistical features regarding the connection duration, and features intended for further analyses) and normalizes the instances of the relevant features with a hyperbolic tangent function to the range [-1, 1]. After the pre-processing is completed, features 5-13 remain for the evaluation of the models, and the feature Label identifies the session as normal or abnormal [19, 40]. Table 2 describes the features used in the experiments. In this research, the notation of the instances is as follows: the number of instances in a daily record is referred to as the total number of instances, the number of instances labelled with 1 is referred to as the number of normal instances, while the number of instances labelled with -1 denotes the number of anomalous instances.
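As an illustration of this pre-processing step, the following minimal Python sketch keeps only the nine features of Table 2 (below) and squashes their instances into [-1, 1] with the hyperbolic tangent. Since the paper does not give the exact normalization formula, tanh is applied directly to the raw feature values here as one simple reading of the step, and the column and file names are hypothetical placeholders.

```python
import numpy as np
import pandas as pd

# Hypothetical column names, following Table 2 (features 5-13 of the Kyoto 2006+ sessions).
RELEVANT = ["Count", "Same_srv_rate", "Serror_rate", "Srv_error_rate",
            "Dst_host_count", "Dst_host_srv_count", "Dst_host_same_src_port_rate",
            "Dst_host_serror_rate", "Dst_host_srv_serror_rate"]

def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Keep the nine relevant features, tanh-normalize them, and binarize the label."""
    out = np.tanh(df[RELEVANT].astype(float))          # every instance ends up in [-1, 1]
    out["Label"] = np.where(df["Label"] == 1, 1, -1)    # merge known (-1) and unknown (-2) attacks
    return out

# Hypothetical usage, one daily record per file:
# daily = preprocess(pd.read_csv("kyoto_daily_record.csv"))
```

The label mapping mirrors the binary classification described above: 1 for normal sessions and -1 for any attack.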
Table 2 Description of the features from the Kyoto 2006+ dataset

Feature | Description
Count | The number of connections whose source IP address and destination IP address are the same as those of the current connection in the past two seconds.
Same_srv_rate | % of connections to the same service in the Count feature.
Serror_rate | % of connections that have 'SYN' errors in the Count feature.
Srv_error_rate | % of connections that have 'SYN' errors in the Srv_count feature (% of connections whose service type is the same as that of the current connection in the past two seconds).
Dst_host_count | Among the past 100 connections whose destination IP address is the same as that of the current connection, the number of connections whose source IP address is also the same as that of the current connection.
Dst_host_srv_count | Among the past 100 connections whose destination IP address is the same as that of the current connection, the number of connections whose service type is also the same as that of the current connection.
Dst_host_same_src_port_rate | % of connections whose source port is the same as that of the current connection in the Dst_host_count feature.
Dst_host_serror_rate | % of connections that have 'SYN' errors in the Dst_host_count feature.
Dst_host_srv_serror_rate | % of connections that have 'SYN' errors in the Dst_host_srv_count feature.
Label | Indicates whether the session was an attack or not; '1' means normal, '-1' means a known attack was observed in the session, and '-2' means an unknown attack was observed in the session.

4. WK-FNN MODEL

A classification model generally maps the input data to a specific target and determines which label to assign to new, unlabeled data. In binary classification, a classifier assigns the input data to one of two classes. The WK-FNN hybrid model is based on two binary classifiers. The wk-NN classifier is a lazy learner that stores the training data and labels and waits for the test data; instead of building a general model, it classifies by storing instances of the training data. The FNN, an eager learner, creates a classification model from the training data set before it receives data for prediction.

The basic idea of the wk-NN is to extend the k-Nearest Neighbor (k-NN) algorithm, which stores all instances corresponding to the training data in an n-dimensional space. Predictions for a new instance x are made by searching the entire training set for the k closest neighbors and summarizing the output variable for these cases. The classification is based on a simple majority vote of these points. The wk-NN extends the k-NN such that instances of the training set that are particularly close to the new instance have more weight in the decision than those that are more distant. The main idea is to make a distant neighbor less influential in the majority vote than a close one, by giving more weight to the nearest points and less to the more distant ones [41, 42]. To do this, the distances

$d_w(\mathbf{x}, \mathbf{y}) = \sqrt{\sum_{i=1}^{p}(x_i - y_i)^2}$

are converted into weights. The simplest conversion function is the inverse of the distance. The closest k points are weighted with the weights $w = \frac{1}{d_w(\mathbf{x}, \mathbf{y})^2}$, so the weight decreases with increasing distance.

The FNN consists of a series of layers with highly connected neurons in each layer, with the final layer producing the outputs that relate the inputs to the desired output, so that

$y_i(\mathbf{w}, \mathbf{W}) = F_i\left(\sum_{j=1}^{q} W_{ij} f_j\left(\sum_{l=1}^{m} w_{jl} x_l + w_{j0}\right) + W_{i0}\right)$   (1)

where $f_j$ and $F_i$ denote the hidden and output layer transfer functions, m is the number of inputs $x_l$, q is the number of hidden-layer nodes, $\mathbf{w}$ and $\mathbf{W}$ are weight matrices, and $w_{j0}$ and $W_{i0}$ are biases [43]. The FNN is trained through an iterative process that modifies the weights so that the given inputs map to the appropriate response. In this way, the inputs are classified according to the target classes.
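For illustration, the distance-to-weight conversion and the weighted vote can be sketched in a few lines of Python. This is not the MATLAB Classification Learner model used later in the experiments; the function name, the zero-distance guard, and the tie handling are our own assumptions, and labels are assumed to be coded as 1 (normal) and -1 (anomaly) as in the paper.

```python
import numpy as np

def wknn_predict(X_train, y_train, x_new, k=10):
    """Distance-weighted k-NN vote for a single query point, with w = 1/d^2."""
    d = np.sqrt(((X_train - x_new) ** 2).sum(axis=1))   # Euclidean distances d_w(x, y)
    nearest = np.argsort(d)[:k]                         # indices of the k closest neighbors
    w = 1.0 / np.maximum(d[nearest] ** 2, 1e-12)        # inverse squared distance (guard against d = 0)
    score = np.dot(w, y_train[nearest])                 # weighted vote over the neighbor labels
    return 1 if score >= 0 else -1                      # sign of the weighted sum decides the class
```

With labels in {1, -1}, the sign of the weighted sum reproduces the weighted majority vote; vectorization over many query points is omitted for brevity.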
In general, FNNs have a large number of parameters, which can lead to estimation problems related to convergence to a correct set of parameter values [44]. For this reason, the weights are updated according to the Levenberg-Marquardt (LM) algorithm [45, 46].

The design of the WK-FNN model is based on the wk-NN and FNN binary classifiers, which work in parallel and decide on the anomaly in the behavior of the computer network. The basic idea is to train the wk-NN and the FNN with the same training set and to evaluate the high-precision classifiers (Fig. 1). Subsequently, the classification of the unknown network traffic is carried out by both classifiers. The decisions about the anomaly are transmitted to the xor block, where the result of the opposing decisions is calculated. Finally, the percentage of the opposing decisions triggers an alert.

Fig. 1 Classifiers' training

The WK-FNN model is a three-layer structure. The first layer classifies the network traffic with both the wk-NN and the FNN. A bit-by-bit xor operation is carried out in the second module. The third part of the WK-FNN marks the opposing decisions (Fig. 2).

Fig. 2 WK-FNN model

Through the classification, both classifiers decide about the unknown network traffic, and the outputs of each of the classifiers (the decision about normal network behavior or an anomaly) are then passed on to the XOR block, where the 'exclusive or' bitwise operation is performed. The different/opposing decisions are recognized by performing the xor logical operation on the classification results, which is logically true (1) if exactly one of the outputs is non-zero; otherwise the result is logically false (0). The sum of the differing decisions in the decision block is then calculated as

$sum_{out} = \sum_{k=1}^{length(dataset)} xor(out1_k, out2_k)$   (2)

where $out1_k$ and $out2_k$ represent the k-th results of the classification, and $out_k = xor(out1_k, out2_k)$. The result is then passed on to the decision-making engine.

The opposing decisions indicated by the bit-by-bit xor operation can generate different types of alerts, depending on the organizational structure and information security requirements such as confidentiality, integrity and data availability. The alerts can be sent to the network administrator or to another IDS. It should be noted that the additional anomaly alerts are separate from the regular IT alerts. Therefore, it is necessary to define an anomaly alert promotion rule in order to generate an IT alert based on the anomaly alerts. The promotion rule of the WK-FNN model is based on the ratio between the number of opposing decisions and the total number of decisions of the classifiers, expressed as a percentage. The decision is presented based on a linear threshold scale. The basic idea behind the decision is that the number of contradicting decisions is low if the two classifiers are truly highly accurate; otherwise, the results are not reliable. Instead of making additional decisions about what is normal or abnormal, the decision block points out the difference in the classifiers' decisions. If both choose 'normal' or both choose 'anomaly', their decisions do not differ; otherwise, their decisions differ and the result of the xor operation is binary one. As 'normal' traffic refers to binary 1 (Label equals 1) and 'anomaly' refers to binary 0 (Label equals -1), the number of opposing decisions equals the sum of all xor results, because identical decisions yield zero after the xor operation. The number of opposing decisions $sum_{out}$ is given in Eq. (2). Divided by the total number of decisions, given as $length(dataset)$, the resulting value is the percentage of the opposing decisions.
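As a minimal illustration of Eq. (2) and the resulting percentage, the following Python sketch (an assumption, not the authors' implementation) takes the two classifiers' decision vectors, coded as 1 for normal and -1 for anomaly, and returns the sum and percentage of opposing decisions.

```python
import numpy as np

def opposing_decisions(out1, out2):
    """Return (sum_out, percentage) of opposing decisions, per Eq. (2)."""
    b1 = (np.asarray(out1) == 1).astype(int)   # map decisions {1, -1} to bits {1, 0}
    b2 = (np.asarray(out2) == 1).astype(int)
    xor = np.bitwise_xor(b1, b2)               # 1 exactly where the two decisions differ
    sum_out = int(xor.sum())                   # Eq. (2): sum of the binary "ones"
    pct = 100.0 * sum_out / len(xor)           # percentage of opposing decisions
    return sum_out, pct

# Example: 5 sessions, one disagreement -> sum_out = 1, pct = 20.0
# opposing_decisions([1, 1, -1, 1, -1], [1, -1, -1, 1, -1])
```

The percentage returned here is the quantity that the decision-making engine compares against the linear threshold scale described next.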
For this experiment, we have divided the priority levels of the alert scale into five categories: negligible/insignificant alerts (whitelisted), potential threats (no direct influence on the network traffic or the network structure), warnings (provide information about the risks), silent alarms (critical, with ticketing) and high priority alarms (signal an attack). The scale is linear and divided into five groups, each covering a two-percentage-point range, from 0 to 10%. It should be noted that the scale can be chosen differently, depending on the security needs of the organization.

5. EXPERIMENTAL RESULTS

The experiments are carried out on three days of computer network traffic recorded at the Kyoto University computer network in February 2007. All models are selected based on processing time and memory usage and are simulated in the MATLAB Classification Learner platform for Windows 64-bit OS installed on an Intel Core i7 processor with a 2.7 GHz CPU and 16 GB of RAM. The wk-NN model approximates an instance by the weighted sum of its 10 nearest neighbors, with the weights calculated from the inverse distances. The FNN with one hidden layer, nine inputs, nine nodes in the hidden layer and one output node is trained with the LM algorithm. In order for the LM algorithm to work correctly, the hyperbolic tangent activation function is used for each node, since it is differentiable, centered around 0, and its output range is [-1, 1]. The weights are initialized to small random numbers, because the optimization begins as a gradient descent (GD) algorithm, which speeds up the convergence of the LM algorithm and minimizes wrong approximations [20]. The WK-FNN model design is tested as follows:
▪ Each of the three daily sets, consisting of 57278, 57279 and 58317 instances, is divided into two subsets: Set1, with 75% of the instances, is used to train and test the classifiers, while Set2, with 25% of the instances, is used for the WK-FNN tests;
▪ Set1 is then divided into two groups of instances: 70% are used to train the classifiers and the remaining instances are used to test both models;
▪ Set2 is used for testing the opposing decisions of the classifiers – the results are passed on to the xor module;
▪ The sum of all contradicting/opposing decisions is sent to the alarm detector.
In summary, 52.5% and 22.5% of the instances of each daily record are used to train and test the classifiers, respectively, and 25% of the instances are used to verify the WK-FNN model.
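For illustration only, the nested split and the two classifiers could be set up as follows, with scikit-learn standing in for the MATLAB Classification Learner actually used in the experiments. This is an assumed sketch: scikit-learn offers no Levenberg-Marquardt solver, so lbfgs is used purely as a stand-in, and the function names are ours.

```python
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier

def inv_sq(dist):
    """Inverse squared distance, w = 1/d^2, with a small guard against division by zero."""
    return 1.0 / (dist ** 2 + 1e-12)

def split_and_train(X, y, seed=0):
    """75%/25% split into Set1/Set2, then 70%/30% of Set1 for training/testing."""
    X_set1, X_set2, y_set1, y_set2 = train_test_split(X, y, test_size=0.25, random_state=seed)
    X_tr, X_te, y_tr, y_te = train_test_split(X_set1, y_set1, test_size=0.30, random_state=seed)

    wknn = KNeighborsClassifier(n_neighbors=10, weights=inv_sq)     # 10 neighbors, 1/d^2 weighting
    # No LM solver in scikit-learn: 'lbfgs' is a stand-in for the paper's training algorithm.
    fnn = MLPClassifier(hidden_layer_sizes=(9,), activation="tanh",
                        solver="lbfgs", max_iter=1000, random_state=seed)
    wknn.fit(X_tr, y_tr)
    fnn.fit(X_tr, y_tr)
    return (wknn, fnn), (X_te, y_te), (X_set2, y_set2)              # classifiers, test set, WK-FNN set
```

The nested 0.25/0.30 splits reproduce the 52.5%/22.5%/25% breakdown of each daily record described above.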
The performance of the classifiers is calculated in terms of accuracy (ACC). ACC represents the ratio of the number of correctly classified instances to the total number of instances, given as follows [47]

$ACC = \frac{TP + TN}{TP + TN + FP + FN}$   (3)

A true positive (TP) result indicates that an anomaly has been correctly identified as an "anomaly". A true negative (TN) means that the IDS has correctly classified normal behavior as "normal". A false positive (FP) indicates a misclassification of normal network behavior as an "anomaly". A false negative (FN) indicates abnormal network behavior that has been mistakenly assigned to the "normal" class. The accuracy results for the classifiers and the number of opposing decisions recognized by the WK-FNN model are shown in Table 3. The opposing decisions are calculated for the 25% of the instances from each daily record.

Table 3 Accuracy of the classifiers and the number of opposing decisions

Instances | Accuracy FNN [%] | Accuracy wk-NN [%] | Opposing decisions [%] | Opposing decisions [instances]
57278 | 99.3 | 99.5 | 8.08 | 1157
57279 | 99.3 | 99.3 | 3.18 | 456
58317 | 99.0 | 99.1 | 0.67 | 98

In Table 3, Opposing decisions [instances] = $sum_{out}$ (every binary 1 triggers an alert), and the ratio of the number of opposing decisions to the total number of decisions (the anomaly score) is given as Opposing decisions [%] = $\frac{sum_{out}}{length(dataset)} \cdot 100\%$. It can be seen that there is no relationship between the number of instances in a daily set and the opposing decisions. The anomaly score ranges from 0 to 10% and is used as the threshold value for the additional alert.

There are some concerns about the priorities and the percentages associated with the conflicting decisions. A higher percentage of differing decisions indicates greater uncertainty in the classification and incomplete knowledge of the event, which is not related only to the decision of the classifiers [48]. In general, uncertainty in decisions can arise from the following sources: (1) data errors (uncertainties about past events), (2) forecast errors (uncertainties about future events) and (3) model (residual) errors (differences between what is observed and what the model shows). The WK-FNN supports resolving the uncertainty by calculating the percentage of the opposing decisions of the classifiers, but it cannot determine the probability of a certain event occurring. It is designed to provide a warning for the conflicting decisions of the classifiers. The decision makers, knowing all the possible versions of the resolved issues, then use this auto-generated alert to resolve information security related issues in their organization. The alert scale presented in this paper was chosen to indicate a low probability of serious effects on network security for a small percentage of opposing decisions and a high probability of an attack on the computer network for a high percentage of opposing decisions. It should be noted that there are other options for selecting the decision criteria, the threshold value ranges, and the alert colors, which can be modified depending on additional protection requirements and the sensitivity of the information to potential threats. Multi-criteria decision-making (MCDM) is presented in [49], where the authors examine changes in the measurement scale and the formulation of criteria. In [50] the authors propose evaluation metrics to measure the effectiveness of collaborative decisions based on the likelihood of trust in collaborative decision-making processes. In [51] the author proposes a prioritization of alerts, which can be achieved by integrating several methods. In the experiments presented in this paper, the percentages of the opposing decisions are divided into five alert groups (see Table 4).

Table 4 Linear threshold scale and ranges of the opposing decisions

Threshold range [%] | Alert grouping and colouring | Opposing decisions [%]
0-1.99 | Negligible alert (Black) | 0.67
2-3.99 | Potential threat (Blue) | 3.18
4-5.99 | Warning (Green) | -
6-7.99 | Silent alarm (Orange) | -
≥ 8.00 | High priority alarm (Red) | 8.08
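The promotion rule that sorts an anomaly score into one of the five groups of Table 4 amounts to a simple threshold lookup; the sketch below assumes the linear boundaries listed in the table and prints the grouping of the three daily records from Table 3.

```python
def alert_group(opposing_pct: float) -> str:
    """Map the percentage of opposing decisions to an alert group (boundaries from Table 4)."""
    if opposing_pct < 2.0:
        return "Negligible alert (Black)"
    if opposing_pct < 4.0:
        return "Potential threat (Blue)"
    if opposing_pct < 6.0:
        return "Warning (Green)"
    if opposing_pct < 8.0:
        return "Silent alarm (Orange)"
    return "High priority alarm (Red)"

for score in (8.08, 3.18, 0.67):       # anomaly scores of the three daily records (Table 3)
    print(score, "->", alert_group(score))
# 8.08 -> High priority alarm (Red)
# 3.18 -> Potential threat (Blue)
# 0.67 -> Negligible alert (Black)
```

This mapping matches the alert groups reported for the three daily records in the conclusion.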
Although simple manual rules cannot always adequately capture the complex and interacting factors that influence the priority of the alerts, the manual rules proposed here are used for the classification into the five priority levels. In general, the levels can be divided into three main categories: (1) errors/failures (negligible alert, potential threat), (2) warnings and (3) critical level (silent alarm, high-priority alarm) [32]. A negligible alert means that the alert is whitelisted (the lowest probability of serious effects on network security and the smallest percentage of opposing decisions). The administrator can exclude certain activities which generate alerts, based on user analytics. For the purposes of this research, the percentage for the negligible alert is set to less than 2%. A potential threat means that the alert may result from network disturbances and has no negative impact on business. The warning displays the known alert sources, acts as an information aggregator, provides information about the risks, and generates a hazard message. The silent alarm signals a high-level discrepancy in the decisions of the classifiers and causes a ticket to be issued (the proof of authentication or authorization must be verified). A high priority alarm signals the highest probability of an attack (the highest percentage of opposing decisions). The ranking list can be adapted with respect to a few other factors relating to the dataset (label, total number of instances) and metrics (accuracy, precision, recall), and, depending on changes to the system, new types of alerts can be added [32]. The ranking can be combined with methods that reduce or reclassify a given list of rankings [52].

6. CONCLUSION

This article introduces the design of a WK-FNN hybrid model that warns of opposing decisions about anomalies in the computer network. The model consists of a classification module, an XOR block and a decision-making engine. In the classification module two high-precision binary classifiers work in parallel. The classifiers take into account 9 features with normalized instances to decide whether the network traffic is abnormal or not. The results of the decisions of the classifiers are passed on to the XOR block, where the exclusive or binary operation is carried out. Binary 1 triggers an additional anomaly alert, which is sorted into one of the predefined alert groups. The results show the presence of additional alerts related to the negligible alert, the potential threat, and the high-priority alarm.

Acknowledgement: A part of this research was presented at the 21st International Arab Conference on Information Technology, 6th of October 2020, Giza, Egypt.

REFERENCES

[1] D. Protic, "Neural cryptography," Military Technical Courier, vol. 64, no. 2, pp. 483–492, 2016.
[2] J. Sen and S. Mehtab, "Machine Learning Applications in Misuse and Anomaly Detection," 2009. Available: https://arxiv.org/ftp/arxiv/papers/2009/2009.06709.pdf.
[3] D. Dasgupta and H. Brian, "Mobile security agents for the network traffic analysis," In Proceedings of the DARPA Information Survivability Conference and Exposition II DISCEX'01, 2001, vol. 2, pp. 332–340.
[4] A. Kind, M. P. Stoecklin and X. Dimitropoulos, "Histogram-based traffic anomaly detection," IEEE Transactions on Network and Service Management, vol. 6, no. 2, pp. 110–121, June 2009.
[5] P. Čisar and S. Maravić Čisar, "EWMA statistics and fuzzy logic in function of network anomaly detection," Facta Universitatis, Series: Electronics and Energetics, vol. 32, no. 2, pp. 249–265, June 2019.
[6] M. H. Bhuyan, D. K. Bhattacharyya and J. K. Kalita, "Network Anomaly Detection: Methods, Systems and Tools," IEEE Communications Surveys & Tutorials, vol. 16, no. 1, pp. 303–336, First quarter 2014.
[7] V. Hodge and J. Austin, "A survey on outlier detection methodologies," Artificial Intelligence Review, vol. 22, no. 2, pp. 85–126, 2004.
[8] T. Nguyen and G. Armitage, "A Survey of Techniques for Internet Traffic Classification using Machine Learning," IEEE Communications Surveys & Tutorials, vol. 10, no. 4, pp. 56–76, 2008.
[9] S. Omar, A. Ngadi and H. H. Jebur, "Machine Learning Techniques for Anomaly Detection: An Overview," International Journal of Computer Applications, vol. 79, no. 2, pp. 33–41, October 2013.
[10] C. Jie, L. Jiawei, W. Shulin and Y. Sheng, "Feature selection in machine learning: A new perspective," Neurocomputing, vol. 300, pp. 70–79, 26 July 2018.
[11] D. Protic, "Review of KDD CUP '99, NSL-KDD and Kyoto 2006+ Datasets," Military Technical Courier, vol. 66, no. 3, pp. 580–595, 2018.
[12] B. Bohara, J. Bhuyan, F. Wu and J. Ding, "A Survey on the Use of Data Clustering for Intrusion Detection System in Cybersecurity," Int. J. Netw. Secur. Appl., vol. 12, no. 1, pp. 1–18, Jan 2020.
[13] A. Thakkar and R. Lohiya, "A Review of the Advancement in the Intrusion Detection Datasets," International Conference on Computational Intelligence and Data Science (ICCIDS 2019), Procedia Computer Science, vol. 167, pp. 636–645, 2020.
[14] A. Khraisat, I. Gondal, P. Vamplew and J. Kamruzzaman, "Survey of intrusion detection systems: techniques, datasets and challenges," Cybersecurity, pp. 2–20, 2019.
[15] S. Khalid, T. Khalil and S. Nasreen, "A survey of feature selection and feature extraction techniques in machine learning," In Proceedings of the 2014 Science and Information Conference, 2014, pp. 372–378.
[16] O. Osanaiye, O. Ogundile, F. Aina and A. Periola, "Feature selection for intrusion detection system in a cluster-based heterogeneous wireless sensor network," Facta Universitatis, Series: Electronics and Energetics, vol. 32, no. 2, pp. 315–330, June 2019.
[17] M. Bahrololum, E. Salahi and M. Khaleghi, "Machine Learning Techniques for Feature Reduction in Intrusion Detection Systems: A Comparison," In Proceedings of the 2009 Fourth International Conference on Computer Sciences and Convergence Information Technology, 2009, pp. 1091–1095.
[18] Y.-G. Cheong, K. Park, H. Kim, J. Kim and S. Hyun, "Machine Learning Based Intrusion Detection Systems for Class Imbalanced Datasets," Journal of the Korea Institute of Information Security and Cryptology, vol. 27, no. 6, pp. 1385–1395, 2017.
[19] D. Protic and M. Stankovic, "Detection of Anomalies in the Computer Network Behaviour," European Journal of Engineering and Formal Sciences, vol. 4, no. 1, pp. 7–13, 2020.
[20] M.-Y. Su, "Real-time anomaly detection systems for Denial-of-Service attacks by weighted k-nearest neighbor classifier," Expert Systems with Applications, vol. 38, no. 4, pp. 3492–3498, April 2011.
[21] J. Dhar, A. Shukla, M. Kumar and P. Gupta, "A Weighted Mutual k-Nearest Neighbour for Classification Mining," arXiv:2005.08640 [cs.LG], 14 May 2020. Available: https://arxiv.org/abs/2005.08640.
[22] C. Callegari, S. Giordano and M. Pagano, "Neural network based anomaly detection," In Proceedings of the 2014 IEEE 19th International Workshop on Computer Aided Modeling and Design of Communication Links and Networks (CAMAD), 2014, pp. 310–314.
[23] F. Haddadi, S. Khanchi, M. Shetabi and V. Derhami, "Intrusion Detection and Attack Classification Using Feed-Forward Neural Network," In Proceedings of the 2010 Second International Conference on Computer and Network Technology, 2010, pp. 262–266.
[24] B. Subba, S. Biswas and S. Karmakar, "A Neural Network based system for Intrusion Detection and attack classification," In Proceedings of the 2016 Twenty Second National Conference on Communication (NCC), 2016, pp. 1–6.
[25] D. Protic and M. Stankovic, "A Hybrid Model for Anomaly-Based Intrusion Detection in Complex Computer Networks," In Proceedings of the 21st International Arab Conference on Information Technology, 6th of October 2020, Giza, Egypt, pp. 1–8.
[26] S. K. Gutam and H. Om, "Computational neural network regression model for host based intrusion detection system," Perspectives in Science, vol. 8, pp. 93–95, September 2016.
[27] M. Odiathevar, W. K. G. Seah and M. Frean, "A Hybrid Online Offline System for Network Anomaly Detection," In Proceedings of the 2019 28th International Conference on Computer Communications and Networks (ICCCN), 2019, pp. 1–9.
[28] L. Li, Y. Yu, S. Bai, Y. Hou and X. Chen, "An Effective Two-Step Intrusion Detection Approach Based on Binary Classification and k-NN," IEEE Access, vol. 6, pp. 12060–12073, 2018.
[29] J. Griffin, "All about network alerts + Best tools," SolarWinds, October 29, 2020. Available: https://logicalread.com/network-alerts/.
[30] F. Ullah and M. Ali Babar, "Architectural Tactics for Big Data Cybersecurity Analytic Systems: A Review," The Journal of Systems and Software, vol. 151, pp. 81–118, 2019.
[31] S. Allier et al., "A framework to compare alert ranking algorithms," In Proceedings of the 19th Working Conference on Reverse Engineering (WCRE), IEEE, 2012.
[32] N. Zhao, P. Jin, L. Wang, X. Yang, R. Liu, W. Zhang, K. Sui and D. Pei, "Automatically and Adaptively Identifying Severe Alerts for Online Service Systems," In Proceedings of the IEEE INFOCOM, 2020.
[33] W. Alhakami, "Alerts Clustering for Intrusion Detection Systems: Overview and Machine Learning Perspectives," International Journal of Advanced Computer Science and Applications, vol. 10, no. 5, pp. 573–582, 2019.
[34] J. Song, H. Takakura, Y. Okabe, M. Eto, D. Inoue and K. Nakao, "Statistical Analysis of Honeypot Data and Building of Kyoto 2006+ Dataset for NIDS Evaluation," In Proceedings of the 1st Workshop on Building Analysis Datasets and Gathering Experience Returns for Security, Salzburg, April 10-13, 2011, pp. 29–36.
[35] K. Demertzis, "The Bro Intrusion Detection System," Project: Machine Learning to Cyber Security, 2018, DOI: 10.31140/RG.2.2.35333.40168.
[36] R. McCarthy, "Network analysis with the Bro security monitor," 2014. Available: https://www.admin-magazine.com/Archive/2014/24/Network-analysis-with-the-Bro-Network-Security-Monitor, retrieved 7 November 2021.
[37] KDD Cup '99 dataset. [Internet] http://kdd.ics.uci.edu/dataset/kddcup'99/kddcup'99.html, 2018.
[38] M. Ring, S. Wunderlich, D. Scheuring, D. Landes and A. Hotho, "A Survey of Network-based Intrusion Detection Data Sets," arXiv:1903.02460v2 [cs.CR], 6 Jul 2019, pp. 1–17.
[39] Y. Maleh, "Security and Privacy Management, Techniques, and Protocols," IGI Global, USA, 2018, pp. 266–267.
[40] D. Protic and M. Stankovic, "Anomaly-Based Intrusion Detection: Feature Selection and Normalization Instance to the Machine Learning Model Accuracy," European Journal of Engineering and Formal Sciences, vol. 1, no. 3, pp. 43–48, 2018.
[41] M. Zhao and J. Chen, "Improvement in comparison of weighted k nearest neighbor classifiers for model selection," Journal of Software Engineering, vol. 10, pp. 109–118, 2016.
[42] M. Faryaneh, "Weighted k-nearest neighbors (WKNN)," MATLAB Central File Exchange. Available: https://www.mathworks.com/matlabcentral/fileexchange/74111-weighted-k-nearest-neighbors-wknn.
[43] W. F. Schmidt, M. A. Kraaijveld and R. P. W. Duin, "Feed forward neural networks with random weights," Delft University of Technology, Faculty of Applied Physics, The Netherlands, 1992, 0-8186-2915-0/92, IEEE, pp. 1–4.
[44] D. Protic, "Feedforward neural networks: The Levenberg-Marquardt optimization and the optimal brain surgeon pruning," Military Technical Courier, vol. 63, no. 3, pp. 11–28, 2015.
[45] K. Levenberg, "A method for the solution of certain problems in least squares," Quarterly of Applied Mathematics, vol. 5, pp. 164–168, 1944.
[46] D. Marquardt, "An algorithm for least-squares estimation of nonlinear parameters," SIAM Journal on Applied Mathematics, vol. 11, no. 2, pp. 431–441, 1963.
[47] C. Ambedkar and V. K. Babu, "Detection of Probe Attacks Using Machine Learning Techniques," International Journal of Research Studies in Computer Science and Engineering, vol. 2, no. 3, pp. 25–29, 2015.
[48] M. Kurhade and R. Wankhade, "An Overview on Decision Making Under Risk and Uncertainty," International Journal of Science and Research, vol. 5, no. 4, pp. 416–422, April 2016.
[49] D. Pamucar, D. Bozanic and A. Randjelovic, "Multi-criteria decision-making: An example of sensitivity analysis," Serbian Journal of Management, vol. 12, no. 1, 2017.
[50] A. Ramos, M. Lazar, R. F. Filho and J. J. P. C. Rodrigues, "A security metric for evaluation of collaborative intrusion detection systems in wireless sensor networks," In Proceedings of the 2017 IEEE International Conference on Communications (ICC), 2017, pp. 1–6.
[51] L. Zomlot, "Handling uncertainty in intrusion analysis," PhD Thesis, 2014. DOI: 10.13140/RG.2.1.4936.4326.
[52] T. K. Ho, J. J. Hull and S. N. Srihari, "Decision Combination in Multiple Classifier Systems," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 16, no. 1, pp. 66–75, January 1994.