

FACTA UNIVERSITATIS   
Series: Electronics and Energetics Vol. 30, No 4, December 2017, pp. 571 - 584

DOI: 10.2298/FUEE1704571V 

TOWARD ACOUSTIC NOISE TYPE DETECTION 

BASED ON QQ PLOT STATISTICS*
 

Sanja Vujnović, Aleksandra Marjanović, Željko Đurović, 

Predrag Tadić, Goran Kvaščev 

University of Belgrade, School of Electrical Engineering, Belgrade, Serbia 

Abstract. Fault detection and state estimation using acoustic signals is a procedure 

highly affected by ambient noise. This is particularly pronounced in an industrial 

environment where noise pollution is especially strong. In this paper a noise detection 

algorithm is proposed and implemented. This algorithm can identify the times in which 

the recorded acoustic signal is influenced by different types of noise in the form of 

unwanted impulse disturbance or speech contamination. The algorithm compares 

statistical parameters of the recordings by generating a series of QQ plots and then 

using appropriate stochastic signal analysis tools such as hypothesis testing. The main
purpose of this algorithm is to eliminate noisy signals and to collect a set of noise-free

recordings which can then be used for state estimation. The application of these 

techniques in a real industrial environment is extremely complex because sound 

contamination usually tends to be intense and nonstationary. The solution described in 

this paper has been tested on a specific problem of acoustic signal isolation and noise 

detection for a coal grinding fan mill in a thermal power plant in the presence of intense

contaminating sound disturbances, mainly impulse disturbance and speech contamination. 

Key words: Acoustic signal, QQ plot, noise detection, predictive maintenance 

1. INTRODUCTION 

It is well known that the largest financial loss for modern industrial plants is due to 

inefficient or untimely maintenance [2]. This is especially true for power plants which are 

designed to remain in operation for many decades after their construction. Therefore, it is only

logical that there is a significant amount of research done in an attempt to prolong the 

working life of the plant, improve the quality of its operation [3] and reduce unnecessary 

losses [4]. With this in mind, the fact that predictive maintenance has become a very 

                                                           
Received November 16, 2016; received in revised form February 21, 2017 

Corresponding author: Sanja Vujnović  

School of Electrical Engineering, University of Belgrade, Kralja Aleksandra Blvd. 73, 11120 Belgrade, Serbia 
(E-mail: svujnovic@etf.bg.ac.rs) 

* An earlier version of this paper received the Best Section Paper Award at the Automatics Section of the 3rd International
Conference on Electrical, Electronic and Computing Engineering IcETRAN 2016, Zlatibor, Serbia, 13-16 June,
2016 [1]



popular area of research is not so surprising. Crucial aspects of predictive maintenance 

are fault detection and state estimation, i.e. the estimate of whether the fault has occurred 

somewhere within the system or whether certain components are worn and the 

maintenance needs to be done in order to replace them.  

Accelerometers are the sensors most commonly used for implementing predictive

maintenance algorithms on rotating machinery. The logic behind this is sound: as the 

fault occurs within the rotating element or as the wear of some components becomes 

pronounced, the vibration of the machine is sure to change accordingly [5]. The sensors can 

measure this vibration and algorithms can be constructed which can, based on the change in 

vibration signal, detect the amount of wear of certain components. These techniques are 

widely used in the industry with much success [6]; however, an alternative was
presented in the early 1990s. This alternative proposes the use of acoustic signals for the same

purpose. It has been shown that sound recordings can be as informative as vibration signals 

when it comes to state estimation of components [7], but acoustic sensors (microphones) 

are cheaper to obtain and are contactless, which is a very important feature for certain types 

of processes. One major drawback of using microphones for predictive maintenance is the 

fact that they are very sensitive to ambient noise. This makes them less than ideal for
use in an industrial environment, which is usually heavily polluted with contaminating noise.

For this reason microphones are still rarely used for predictive maintenance in real 

industrial environments. 

One way to significantly increase the applicability of acoustic signals for this purpose 

is developing an algorithm capable of filtering out the acoustic noise caused by the 

surrounding events. Many preprocessing algorithms have been developed in recent years
for the purpose of fault detection and state estimation. Using one of the standard frequency

filters is usually not applicable because it is very difficult (if not impossible) to determine 

the frequencies on which the noise is dominant. Even if that can be established, usually the 

useful part of the signal exists on the same frequencies as well, so filtering out the noise 

would significantly damage the informative part of the signal. Impulse disturbance in time 

domain, for example, is equally pronounced on all frequencies, so it cannot be filtered using 

traditional algorithms.  

Taking this into consideration one can easily conclude that standard frequency 

domain analysis is not reliable enough for noise detection in acoustic signals. Therefore 

advanced procedures should be used for this purpose, such as statistical analysis of the 

signal. Statistical parameters of the recorded signal can be very informative in this case 

because different statistical behavior is expected when the noise occurs and when the 

signal is in its nominal form. One of the standard tools used for statistical comparison and
analysis is the QQ plot, which has been shown to be quite effective in this case [8].

The purpose of the algorithm proposed in this research is not removal, but rather 

detection of noise. The entire recording is separated into windowed signals, and each 

windowed segment is tested for noise. This is done by comparing the statistical distribution 

of the recorded signal against the statistical distribution of the signal in nominal working 

condition. The comparison is conducted using QQ plots and a Neyman-Pearson hypothesis

test. The noisy sequences are discarded and those which are classified as nominal are saved 

for the purpose of state estimation or some other predictive maintenance procedure. 



The algorithm developed in this research is intended as part of a larger acoustic-based state
estimation and fault detection system for rotating elements in thermal power plants.
It has been tested on real recordings taken in the thermal power plant
Kostolac A1 in Serbia, on a specific fan mill which is part of the coal grinding subsystem. It
has been shown that state estimation of impact plates within a mill is possible using only
recordings from a microphone placed in the vicinity of the mill [9]. However, it has also

been shown that noise can significantly influence the classification results. The purpose of 

this algorithm is to conduct signal preprocessing, so that the noise-free samples of the 

acoustic signals can be used for state estimation of impact plates of the mill.  

This paper is structured as follows. Section 2 contains a theoretical description of the
algorithm used, mainly QQ plots and the Neyman-Pearson method of hypothesis testing.
Section 3 describes the real industrial coal grinding subsystem in thermal

power plants on which this algorithm has been tested. In Section 4 the detailed results of the 

algorithm are given. Here, the algorithm has been tested on nominal and noisy signals. 

Furthermore, the effect of changing certain parameters of the algorithm has been
examined, as well as an upgrade of the algorithm which enables it to be used for classification

and not just noise detection. Finally, the conclusions are presented in Section 5. 

2. QQ PLOT AS A TOOL FOR NOISE DETECTION 

In nominal, stationary operation of the system it is assumed that the statistical 

parameters of the measured signals will remain constant. If, on the other hand, an event 

occurs that causes a deviation from nominal state (e.g. nonstationary ambient noise), 

statistical parameters of the recorded signals are expected to change in a certain way. 

Therefore, the probability distribution of the recorded signal in nominal regime is going 

to be different from the distribution of the signal which is polluted with noise. This 

change is going to depend on the duration and the type of noise, so the statistical 

parameters can be used not only for noise detection, but for noise classification as well. 

2.1. QQ plot 

A very efficient graphical tool used to compare the expected and obtained
probability distributions is the QQ plot [10]. This graph is obtained by plotting
quantiles of the measured signal against the quantiles of the expected probability
distribution. If the two distributions are similar, all the points in the QQ plot will approximately
lie on the line $y = x$. Figure 1 shows a QQ plot of an experimentally obtained zero mean
unit variance Gaussian distribution against its theoretical expectation.

This type of data inspection allows not only the comparison of two
probability distributions, but also the identification of the distribution of the recorded data.
For example, if outliers occur at the ends of the $y = x$ line, the measured
distribution has larger (or smaller) tails than the expected distribution. If all points lie on a
straight line, but its angle is not 45°, then the variance of the expected distribution is not the same
as that of the measured signal.
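The construction and the slope rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the helper names `qq_points` and `fitted_slope`, the plotting positions $(i - 0.5)/n$ and the sample sizes are choices made for this example.

```python
import random
from statistics import NormalDist

def qq_points(samples, dist=NormalDist()):
    """Return (theoretical, empirical) quantile pairs for a QQ plot."""
    ordered = sorted(samples)
    n = len(ordered)
    # Plotting positions (i - 0.5) / n are one common convention (assumed here).
    theoretical = [dist.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))

def fitted_slope(points):
    """Least-squares slope of empirical versus theoretical quantiles."""
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_e = sum(e for _, e in points) / n
    num = sum((t - mean_t) * (e - mean_e) for t, e in points)
    den = sum((t - mean_t) ** 2 for t, _ in points)
    return num / den

rng = random.Random(0)
unit = [rng.gauss(0.0, 1.0) for _ in range(2000)]  # matches the expected N(0, 1)
wide = [rng.gauss(0.0, 2.0) for _ in range(2000)]  # same shape, larger variance

slope_unit = fitted_slope(qq_points(unit))  # close to 1: points near y = x
slope_wide = fitted_slope(qq_points(wide))  # close to 2: steeper than 45 degrees
```

For the unit-variance sample the points stay near the $y = x$ line, while doubling the standard deviation roughly doubles the slope, which is the "angle is not 45°" case mentioned in the text.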


Fig. 1 Experimentally obtained Gaussian samples plotted against the theoretical distribution (QQ plot of normal data quantiles versus normal theoretical quantiles).

Fig. 2 Contaminated Gaussian distribution in time domain (upper left) with the appropriate
QQ plot (upper right) and Laplace distributed sample data in time domain (lower
left) with its QQ plot (lower right). Time-domain panels show x[n] versus n [sample]; QQ panels show data quantiles versus normal theoretical quantiles.

Using these rules one can infer how the shape of the measured probability distribution
differs from the expected distribution. For example, a Gaussian signal polluted with noise
is expected to exhibit large tails on the QQ plot, as in Fig. 2 (top). On the other hand, if
the distribution of the experimentally obtained signal is significantly different in nature than


the expected distribution, one can expect deviation from the $y = x$ line for both lower
and higher values of quantiles. This is shown in Fig. 2 (bottom), where Laplace
distributed experimental samples are plotted against the Gaussian distribution. The graph
indicates that the largest samples exceed the values a Gaussian distribution would
predict, and that the plot curves away from the line for the lower quantiles as well.

If the measured samples $x_1, x_2, \dots, x_n$ form a distribution $F(x)$, an ordered
non-decreasing sequence $x_{(i)}$ can be obtained, where $x_{(i)} \le x_{(j)}$ for $i < j$. Here, $n$ represents the
number of samples taken. By observing the ordered sequence $x_{(i)}$, the formula for conditional
probability can be obtained [8] which calculates the probability that measurement $x$ will
have the rank $i$ in the said sequence:

$$P(i \mid x) = \binom{n-1}{i-1} F^{i-1}(x) \bigl(1 - F(x)\bigr)^{n-i} \qquad (1)$$
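Eq. (1) is a binomial probability: for a measurement to have a given rank, exactly rank minus one of the remaining samples must fall below it, each with probability equal to the cdf value at that measurement. A minimal Python sketch of this reading follows; the function name `rank_probability`, the argument names and the numeric values are assumptions chosen for illustration.

```python
from math import comb

def rank_probability(i, n, F_x):
    """Probability that a measurement whose cdf value is F_x has rank i among n samples."""
    return comb(n - 1, i - 1) * F_x ** (i - 1) * (1.0 - F_x) ** (n - i)

n = 10
F_x = 0.25  # hypothetical value of the cdf at the measurement

probs = [rank_probability(i, n, F_x) for i in range(1, n + 1)]
total = sum(probs)                              # the ranks 1..n are exhaustive
most_likely_rank = probs.index(max(probs)) + 1  # rank with the largest probability
```

Because the ranks are mutually exclusive and exhaustive, the probabilities sum to one, which is a quick sanity check on any implementation of Eq. (1).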

2.2. Hypothesis testing 

QQ plots in this research are used to represent the relationship between the measured 

signal distribution and the distribution of the signal in nominal working conditions. For 

this reason hypothesis testing is implemented in order to decide, based on the available 

data, whether the assumption of nominal working conditions is correct. If not, then the 

signal is considered polluted by noise and is discarded. 

The noise detection algorithm developed in this research relies heavily on Eq. (1). In 

order to successfully implement it several initial calculations need to be performed. First 

the expected probability distribution in nominal regime (when there is no noise) needs to be 

established. Then, after calculating the nominal probability density function $f_{nom}$, the discriminant
boundaries should be determined. If all the points of the QQ plot lie within these boundaries,
then the recorded signal is in nominal working condition, i.e. there is no noise. If, on the other
hand, points on the QQ plot fall beyond the calculated boundaries, the signal is considered
contaminated, and the recorded samples are dismissed.

There are two objectives which must be taken into account when establishing valid 

bounds on the QQ plot. The first objective is maximization of the probability that the 

noise-free recordings will be classified as valid. The second objective is minimization of 

the probability that faulty recordings will be falsely classified as valid. Therefore, a tradeoff 

needs to be made, and as a solution a variation of the Neyman-Pearson method [11,12] for
hypothesis testing has been chosen. This means that the probability $\alpha$ for the desired
efficiency under nominal conditions has been fixed. In the literature this value is usually
adopted in the range between     and     . In this paper the value        has been taken.
Therefore, lower and upper boundaries ($a_i$ and $b_i$) are calculated so that the following
condition is satisfied:

$$\int_{a_i}^{b_i} f(x \mid i)\,dx = F(b_i \mid i) - F(a_i \mid i) = \alpha \qquad (2)$$

where the probability density function $f(x \mid i)$ can be expressed using the Bayesian formula:

$$f(x \mid i) = \frac{P(i \mid x)\, f(x)}{P(i)} \qquad (3)$$
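As a worked step, the Bayesian formula can be combined with the binomial rank probability of Eq. (1). Assuming that every rank is equally likely a priori, so that $P(i) = 1/n$, the result is the standard density of the $i$-th order statistic. The symbols $F$, $f$, $n$ and $i$ (cdf, pdf, number of samples and rank) are notation assumed for this sketch.

```latex
f(x \mid i) = \frac{P(i \mid x)\, f(x)}{P(i)}
            = n \binom{n-1}{i-1} F^{i-1}(x)\,\bigl(1 - F(x)\bigr)^{n-i} f(x)
            = \frac{n!}{(i-1)!\,(n-i)!}\, F^{i-1}(x)\,\bigl(1 - F(x)\bigr)^{n-i} f(x)
```

Written this way, the boundaries depend on the nominal distribution only through $F$ and $f$, so they can be computed once during the learning phase.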



3. CASE STUDY 

Coal fueled thermal power plants play a very important role in energy production 

worldwide and are the number one energy provider in Serbia. For that reason an increase
in productivity and working life of an entire plant, as well as its subsystems, is of great
economic importance. The coal grinding subsystem is one of the key parts of a thermal power
plant and is responsible for pulverization of coal so that it can be used in a burner system.

In the thermal power plant Kostolac A1 in Serbia, fan mills used for coal pulverization have
ten impact plates which rotate around the center. Pulverization occurs as a result of friction
between the plates and the chunks of coal within the mill. When the coal is ground into a
fine powder it is transported into a burner system where it is used as fuel. The particles
which are not small enough return to the mill where they are additionally pulverized.

After several weeks the impact plates within the mill become worn due to constant impact
with coal chunks and rock, and the efficiency of the mill starts to decrease. This is when
maintenance needs to be performed; otherwise more serious problems and malfunctions will
occur. Algorithms which can detect the moment maintenance is needed based on the
recorded acoustic signals have already been developed. They are, however, unable to
perform their function when noisy measurements are provided, which often happens
with acoustic signals in a real industrial environment.

Mills in thermal power plants produce high intensity noise and they are located in the 

vicinity of other mills of the coal grinding subsystem. Therefore, the acoustic environment 

in which the recordings are made is extremely complex. Even so, the
frequency features of this noise are very informative for state estimation of impact plates
within the mills. However, since the area around the mill contains a large number of
other actuators, valves, pipes and pumps, additional works such as welding, repairs, maintenance
and the like are quite common. At the same time, the sound recording is affected by
sporadic impacts of larger chunks of coal. These occurrences contaminate the acoustic recording
and, considering that their statistics are not included in the training sets used for impeller
state estimation algorithms, they can cause the algorithm to make a wrong decision or, at
the very least, cause a large time delay in making a correct decision. For this reason it is of
great importance to develop techniques for detection and, if possible, classification of
contamination in the acoustic recording.

Acoustic signals used to demonstrate the results of the proposed algorithm are recorded 

in different acoustic surroundings of the mill. One part of these recordings is taken in 

nominal working conditions in which, other than the noise from the mills and other rotating 

elements, there are no other sources of contamination. The second group of recordings 

consists of nominal sound sources as well as the sound of people talking in the vicinity of 

the microphone. The third group of signals contains nominal sound as well as the sound 

produced during welding and repair of the steam lines near the mill. 

The noise detection algorithm developed in this research has been tested on real 

acoustic signals recorded in thermal power plant Kostolac A1 in Serbia. There are 10 

impact plates within the mill for which the noise detection algorithm has been tested and 

the recorded signal has the sampling frequency of      . The length of the obtained 
recording is approximately 20 minutes. This recording consists of intervals in which the 

system is in nominal regime, as well as intervals when the artificially created noise has 

been used to pollute the recording.  



4. RESULTS 

The proposed algorithm is tested in several steps. First the learning part of the 

algorithm is conducted in which the recordings in nominal regime are analyzed. In this 

way the nominal probability density function $f_{nom}$, as well as the discriminant boundary for
nominal recordings, are obtained. After that, the algorithm is tested on both contaminated
and nominal samples in order to determine how prone it is to false classification. The
effect of window length on the proposed algorithm is analyzed as well. Finally, an attempt

has been made to classify the obtained noise and to determine whether the impulse 

disturbance or speech contamination has occurred. 

4.1. Nominal recordings 

As stated earlier, the initial part of the algorithm is a learning process in which a
sufficiently long signal in the nominal regime is used to approximate the nominal probability
density function. The Hilbert transform of the signal is computed first in order to
obtain its envelope, and the density is estimated from that envelope.
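The envelope extraction can be illustrated with a small, dependency-free Python sketch. It builds the analytic signal with a direct $O(n^2)$ discrete Fourier transform and takes its magnitude; in practice a library routine such as `scipy.signal.hilbert` would normally be used instead. The function name `envelope` and the test signal (an amplitude-modulated tone) are assumptions made for this example and are not taken from the recordings.

```python
import cmath
import math

def envelope(x):
    """Envelope |x + j*H{x}| computed through a direct (O(n^2)) DFT."""
    n = len(x)
    spectrum = [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    # Analytic signal: keep DC (and Nyquist for even n), double the positive
    # frequencies and zero the negative ones.
    gain = [0.0] * n
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
    for k in range(1, (n + 1) // 2):
        gain[k] = 2.0
    analytic = [
        sum(gain[k] * spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
            for k in range(n)) / n
        for t in range(n)
    ]
    return [abs(z) for z in analytic]

# Hypothetical test signal: a carrier whose amplitude varies slowly.
n = 128
modulation = [1.0 + 0.5 * math.cos(2 * math.pi * 2 * t / n) for t in range(n)]
signal = [m * math.cos(2 * math.pi * 16 * t / n) for t, m in enumerate(modulation)]

env = envelope(signal)
max_error = max(abs(e - m) for e, m in zip(env, modulation))
```

The recovered envelope is non-negative by construction, which is the property the one-sided boundary later relies on.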

There are several ways to approximate the probability density function (pdf) of the
obtained sequence. One is by observing the scaled histogram of the signal, and the other is
using the method of kernel functions. The latter method is chosen in order to obtain a
smoother version of the estimate without a significant increase in computational complexity.
For pdf estimation an Epanechnikov kernel function is used, since it is commonly applied
in the literature and is optimal in the sense of minimizing the asymptotic mean integrated squared error. As expected,
the pdf estimate obtained in this way roughly resembles the shape inferred from the
histogram.
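A kernel density estimate with the Epanechnikov kernel can be sketched as follows. This is a minimal illustration, not the authors' code: the bandwidth `h`, the synthetic exponential samples standing in for envelope values, and the function names are all assumptions.

```python
import random

def epanechnikov(u):
    """Epanechnikov kernel: 0.75 * (1 - u^2) on [-1, 1], zero elsewhere."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0

def kde(x, samples, bandwidth):
    """Kernel density estimate evaluated at the point x."""
    n = len(samples)
    return sum(epanechnikov((x - s) / bandwidth) for s in samples) / (n * bandwidth)

# Hypothetical positive-valued data standing in for envelope samples.
rng = random.Random(1)
samples = [rng.expovariate(4.0) for _ in range(200)]
h = 0.1  # hypothetical bandwidth

# Midpoint-rule integral of the estimate over its whole support.
lo, hi = min(samples) - h, max(samples) + h
steps = 2000
dx = (hi - lo) / steps
area = sum(kde(lo + (j + 0.5) * dx, samples, h) for j in range(steps)) * dx
```

The estimate integrates to approximately one, as a density should; the bandwidth controls how closely it follows the histogram.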

After estimating the pdf of a noise-free signal, the next step is to determine the boundaries of the
QQ plot from Eq. (2). Since all the samples of the signal envelope obtained with the Hilbert transform are
positive, and a noisy signal is expected to have a larger variance and a greater
mean value than the noise-free signal, a slight simplification of (2) can be
implemented, for the sake of easier numerical calculations:

$$\int_{0}^{b_i} f(x \mid i)\,dx = \alpha \qquad (4)$$

The lower classification boundary does not need to be determined because when the noise
occurs, the points on the QQ plot are expected to drift above the $y = x$ line. Therefore, Eq. (4)
is used for the purpose of noise detection and calculation of the upper boundary $b_i$, where $\alpha$ is the probability fixed for the nominal regime. The resulting QQ plot
of the samples in nominal regime and the calculated boundary are shown in Fig. 3.
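One way to evaluate the one-sided condition numerically is sketched below. It assumes that the conditional density in Eq. (4) is the density of the order statistic of a given rank, whose cdf is a binomial tail sum in the nominal cdf value. The probability level 0.95, the window size 20 and the exponential quantile function are hypothetical stand-ins, since the paper's own values are not recoverable here, and the function names are illustrative.

```python
from math import comb, log

def order_stat_cdf(u, i, n):
    """P(i-th smallest of n samples <= x), written in terms of u = F(x)."""
    return sum(comb(n, j) * u ** j * (1.0 - u) ** (n - j) for j in range(i, n + 1))

def upper_bound_level(i, n, alpha, tol=1e-12):
    """Solve order_stat_cdf(u, i, n) = alpha for u by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if order_stat_cdf(mid, i, n) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def upper_bounds(n, alpha, inv_cdf):
    """Upper boundary for every rank, mapped through the nominal quantile function."""
    return [inv_cdf(upper_bound_level(i, n, alpha)) for i in range(1, n + 1)]

# Hypothetical setup: 20 samples per window, level 0.95, and a unit
# exponential distribution playing the role of the nominal distribution.
n, alpha = 20, 0.95
bounds = upper_bounds(n, alpha, lambda u: -log(1.0 - u))
```

The boundary grows with the rank, so it traces a curve lying above the $y = x$ line on the QQ plot; in the actual algorithm the quantile function would come from the estimated nominal pdf.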

 
Fig. 3 QQ plot of nominal recorded quantiles with respect to nominal expected quantiles, with the discriminant boundary. The plot shows nominal data quantiles versus $F_{nom}^{-1}$.

4.2. Noisy recordings 

Testing of the algorithm as a tool for noise detection is conducted on a part of the
signal which is 12 seconds long and whose Hilbert envelope is shown in Fig. 4. This

signal contains dominant sections of nominal regime (blue), sections contaminated with 

speech (green) and samples which contain impulse disturbance (red). In this way all the 

aspects of noise detection algorithm are tested. The Hilbert transform is applied in order 

to obtain an envelope of the signal. 

 
Fig. 4 Part of the recording on which the algorithm has been preliminarily tested.
Blue represents the nominal regime, green represents the part of the signal
contaminated with speech, and red represents the part of the signal contaminated
by impulse disturbance.

The test recording has been separated into smaller segments using a window
of 1 s with an overlap of 50%. Each window has been tested for noise, and the
noisy recordings have been dismissed. All the windows which include only the nominal

behavior without the noise have QQ plots which resemble the shape shown in Fig. 3. All 

the points of the plot are below the discriminant boundary and are therefore classified as 

noise-free samples. 
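The windowing and the per-window decision can be sketched as follows. This is a simplified illustration: the function names, the toy signal and the constant boundary values are assumptions, and the decision rule follows the statement that a window is noise-free when all points of its QQ plot lie below the discriminant boundary.

```python
def split_windows(signal, window_len, overlap=0.5):
    """Split a sequence into fixed-length windows with fractional overlap."""
    hop = max(1, int(round(window_len * (1.0 - overlap))))
    return [signal[s:s + window_len]
            for s in range(0, len(signal) - window_len + 1, hop)]

def is_nominal(window, upper_bounds):
    """A window is nominal when every ordered sample stays below its bound."""
    return all(x <= b for x, b in zip(sorted(window), upper_bounds))

def keep_nominal(signal, window_len, upper_bounds, overlap=0.5):
    """Return only the windows classified as noise-free."""
    return [w for w in split_windows(signal, window_len, overlap)
            if is_nominal(w, upper_bounds)]

# Toy envelope: low nominal level with a short burst in the middle.
toy_signal = [0.1] * 8 + [0.9] * 4 + [0.1] * 8
toy_bounds = [0.5] * 4  # hypothetical boundary, one value per rank

windows = split_windows(toy_signal, window_len=4, overlap=0.5)
kept = keep_nominal(toy_signal, window_len=4, upper_bounds=toy_bounds)
```

With a 50% overlap every sample except those at the edges belongs to two windows, so a short burst removes several neighbouring windows rather than one.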

The effect of speech contamination on the QQ plot depends heavily on the percentage 

of contaminated signal which is contained within the window, as shown in Fig. 5. When
the windowed signal consists exclusively of speech contaminated samples (Fig. 5,
bottom), its QQ plot has quantiles which lie on an approximately straight line with an angle
larger than 45°. This indicates that the variance of the recorded signal, as well as its mean
value, is larger than expected. Also, most of the samples lie above the discriminant line,
which means that the algorithm has detected the noise. The situation is not so clear when
only part of the examined window contains speech contaminated samples. In
that case the angle of the plot is smaller and, depending on the amount of speech included
in the window, sometimes all the quantiles lie below the discriminant line. This means
that the contamination has not been detected (Fig. 5, top).

 
[Fig. 4 plot: $x_{hilbert}$ versus $t$ [sec]; legend: nominal, impulse disturbance, speech contamination]

  
Fig. 5 Speech contaminated samples in time domain (left) and the appropriate QQ plot (right).
Upper figures show the behavior of the plot when only a small part of the speech
contamination is encompassed in the window. Central figures show the behavior when
about 50% of the window contains contamination, while lower figures show what
happens when the contamination is present in the entire windowed signal.

With impulse disturbance the problem becomes much simpler and the algorithm 

manages to detect the contamination regardless of the percentage of noisy samples in the 

window. The nature of impulse disturbance is so abrupt that even a small number of 

samples encompassed within a window is enough to significantly change the statistical 

parameters. The appropriate QQ plot of this is shown in Fig. 6. 

[Fig. 5 panels: $x_{hilbert}$ versus $t$ [sec] (left) and QQ plots against $F_{nom}^{-1}$ (right)]

  
Fig. 6 Samples which contain impulse disturbance in time domain (left)  

and the appropriate QQ plot (right). 

The classification results of the algorithm are presented in Table 1. While classifying 

the nominal samples and samples which contained impulse disturbance the algorithm has
achieved an accuracy of 100%, while speech contamination has a lower detection
rate. This is due to the fact that the statistical parameters of the windowed signal do
not vary considerably with respect to the nominal regime when only a small part of the
window contains speech contamination. This is precisely what happened in the 2
windowed parts of the signal which were wrongly classified.

Table 1 Results of the noise detection algorithm

                        Nominal      Speech          Impulse
                        recordings   contamination   disturbance
Classified as nominal   13 (100%)    2 (25%)         0
Classified as noisy     0            6 (75%)         4 (100%)

 
4.3. Length of the window adjustment 

The previous analysis suggests that the proposed algorithm easily detects impulse 

disturbances, but speech contamination can be somewhat more elusive. In the given 


example, out of 8 windowed signals contaminated with speech, the algorithm cannot 

correctly classify two of them. The problematic windowed signals are at the beginning and 

the ending of the speech sequence and incorrect classification is due to the fact that there is 

a small percentage of contaminated samples inside the window. One way to correct this 

error is by changing the length of the window. The noise detection results as the length of 

the window is changed are given in Table 2. 

Table 2 Variable window length tested on speech contaminated signals

Window length   Classified as nominal   Classified as noisy   Total number of windowed signals
1.5 s           1 (17%)                 5 (83%)               6
1 s             2 (25%)                 6 (75%)               8
0.5 s           2 (13%)                 13 (87%)              15
0.1 s           31 (37%)                52 (63%)              83

One thing which is obvious from the results is that the number of speech
contaminated windowed signals increases as the length of the window decreases. This is
important for statistical significance of the experiment. However, with a smaller number of
samples inside the window, the QQ plots are not as representative as they are for a larger
number of samples. The table shows that for window sizes between 1.5 s and 0.5 s only
one or two windowed signals are wrongly classified as nominal, and those correspond to
the beginning or the end of the sequence, as discussed previously. Therefore a smaller
window length tends to yield statistically better results, because a higher percentage of
signals is correctly classified as noisy.

By continuing to decrease the length of the window, however, the algorithm starts to 

behave inconsistently. For window length of 0.1s the percentage of misclassified signals 

drastically increases. This is due to several factors. First of all, QQ plots have fewer 

samples and are therefore less accurate. Secondly, the dynamics of speech is such that 

usually the gaps between the words, and sometimes even within a single word, are larger 

than 0.1 s. Therefore there is a significant number of windowed signals which do not
contain any information about the speech. Furthermore, while other window lengths
correctly classify all nominal recordings and all impulse disturbance recordings, for
a window length of 0.1 s misclassification occurs not only for speech contaminated signals, but for
nominal signals as well.

4.4. Noise detection and classification 

From Figs. 5 and 6 it is clear that the two different types of noise present themselves quite

differently on the QQ plot. With this in mind it might be possible to classify which type 

of noise has occurred when the algorithm detects the presence of contamination. The way 

in which this can be done is by determining another classification line, as in Eq. (4), but 

this time with respect to speech contaminated signals, rather than nominal recordings. In 

this way two classification lines are obtained, one which separates nominal recordings
from the contaminated ones, and the other which determines whether contaminated recordings
contain impulse or speech disturbance, as shown in Fig. 7. In the upper graph it can be seen

that nominal recordings do not trigger any errors. Speech contaminated recordings can be 

seen in the lower left part of the figure, and they fit ideally between two classification 



lines. Impulse disturbance, on the other hand, has the quantiles above both discrimination 

lines, as can be seen in the lower right part of the figure. 
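The two-line decision can be sketched as a simple three-way rule. The sketch assumes that the second classification line lies above the first at every rank; the function name, the labels and the numeric boundary values are hypothetical and serve only to illustrate the logic.

```python
def classify_window(window, nominal_bounds, speech_bounds):
    """Three-way decision based on two upper classification lines."""
    ordered = sorted(window)
    # Any quantile above the second (outer) line indicates impulse disturbance.
    if any(x > b for x, b in zip(ordered, speech_bounds)):
        return "impulse disturbance"
    # Quantiles above the first line but below the second indicate speech.
    if any(x > b for x, b in zip(ordered, nominal_bounds)):
        return "speech contamination"
    return "nominal"

nominal_bounds = [0.2, 0.3, 0.4, 0.5]  # hypothetical first classification line
speech_bounds = [0.5, 0.7, 0.9, 1.1]   # hypothetical second classification line

labels = [
    classify_window([0.1, 0.2, 0.3, 0.4], nominal_bounds, speech_bounds),
    classify_window([0.3, 0.4, 0.6, 0.8], nominal_bounds, speech_bounds),
    classify_window([0.1, 0.2, 0.3, 1.5], nominal_bounds, speech_bounds),
]
```

Checking the outer line first means that a single large impulse sample is enough to label the window as impulse disturbance, which matches the observation that impulses are detected regardless of how few samples they occupy.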

This upgraded algorithm for noise detection and classification has been tested on the 

recording from Fig. 4 and the results are shown in Table 3. As can be seen, the impulse 

disturbance has been impeccably classified as such. Nominal recordings have a high 

percentage of nominal classification as well. Speech still has the lowest detection and 

classification percentage due to the facts discussed earlier. 

Table 3 Results of the noise detection algorithm

                              Nominal      Speech          Impulse
                              recordings   contamination   disturbance
Classified as nominal         100%         25%             0
Classified as noisy           0%           55%             0
Classified as impulse noise   0%           20%             100%

 
Fig. 7 QQ plot with 2 classification lines. When the samples of a QQ plot go above the  

red line, the noise has been detected. However, if samples are above the black line, 

this means that impulse disturbance has occurred, and when they are between the 

red and blue classification lines the speech contamination has occurred. 

[Fig. 7 panels: nominal data quantiles, speech data quantiles and impulse data quantiles, each plotted against $F_{nom}^{-1}$]

5. CONCLUSION 

In this paper an algorithm was presented which is capable of detecting the occurrence 

of noise in acoustic signals and is able to classify this noise with a high percentage of
accuracy. The main tool used for this purpose is a QQ plot with probability density

function estimates and hypothesis testing algorithms. This research has been conducted 

with a purpose of making acoustic signals more broadly usable in the industry as a tool 

for predictive maintenance and state estimation of machines.  

The algorithm has been tested in a real industrial environment in thermal power plant 

Kostolac A1 in Serbia, and is shown to be capable of detecting whether the noise has 

occurred, and to classify whether the impulse disturbance or speech contamination is in 

question. Furthermore, the influence of the length of the window used on the efficiency 

of the algorithm is tested as well. 

The rate of successful detection and classification is much lower for speech signals than for impulse
disturbance, because the intensity of the speech, as well as the words that are spoken,
directly influence the amount of contamination of the nominal signal. Therefore, if someone
speaks quietly or makes long pauses while speaking, chances are that the proposed
algorithm will not manage to detect all the polluted parts of the signal. Also, the percentage
of contamination which is included in the analyzed window affects the detectability of the
contamination, so the beginning and the end of a speech contaminated sequence may not
always be detectable. This can be improved by increasing the overlap between the windows
and decreasing the size of the window, but only up to a point.

The algorithm proposed in this paper is an initial study of a preprocessing
tool that should be capable of detecting and isolating acoustic noise in an industrial
environment, with the purpose of making acoustic recordings more suitable for use in
industrial predictive maintenance algorithms. Further research will address
robustification of the algorithm and improvement of speech detection, possibly by using
correlation analysis or similar tools. Also, pdf estimation of noisy signals based
on their QQ plots might yield more robust results as well.

Acknowledgement: This paper is a result of activities within the projects supported by Serbian 

Ministry of Education and Science III42007 and TR32038. 

REFERENCES 

[1] S. Vujnović, A. Al-Hasaeri, P. Tadić and G. Kvaščev, "Acoustic noise detection for state estimation," In Proceedings of the 3rd International Conference on Electrical, Electronic and Computing Engineering (IcETRAN 2016), Zlatibor, Serbia, June 13-16, 2016, AUI4.6 1-5.

[2] R. K. Mobley, An introduction to predictive maintenance, 2nd ed. Amsterdam, Netherlands: Butterworth-Heinemann, 2002.

[3] M. A. Stošović, M. Dimitrijević, S. Bojanić, O. Nieto-Taladriz, V. Litovski, "Characterization of nonlinear loads in power distribution grid," Facta Universitatis, Series: Electronics and Energetics, vol. 29, no. 2, pp. 159-175, 2016.

[4] D. Stevanović, P. Petković, "Utility needs smarter power meters in order to reduce economic losses," Facta Universitatis, Series: Electronics and Energetics, vol. 28, no. 3, pp. 407-421, 2015.

[5] M. J. Crocker, Handbook of noise and vibration control, Hoboken, New Jersey: John Wiley & Sons, 2007.

[6] Z. Su, P. Wang, X. Yu, Z. Lv, "Experimental investigation of vibration signal of an industrial tubular ball mill: Monitoring and diagnosing," Miner Eng, vol. 21, no. 10, pp. 699-710, 2008.

[7] N. Baydar, A. Ball, "A comparative study of acoustic and vibration signals in detection of gear failures using Wigner-Ville distribution," Mech Syst Signal Pr, vol. 15, no. 6, pp. 1091-1107, 2001.

[8] G. S. Kvascev, Z. M. Djurovic, B. D. Kovacevic, "Adaptive recursive M-robust system parameter identification using the QQ-plot approach," IET control theory & applications, vol. 5, no. 4, pp. 579-593, 2011.

[9] S. Vujnovic, Z. Djurovic, G. Kvascev, "Fan mill state estimation based on acoustic signature analysis," Control Engineering Practice, vol. 57, pp. 29-38, 2016.

[10] J. J. Filliben, "The Probability Plot Correlation Coefficient Test for Normality," Technometrics, vol. 17, no. 1, pp. 111-117, 1975.

[11] K. Fukunaga, Introduction to statistical pattern recognition, 2nd ed. San Diego, California: Academic Press Professional, 1990.

[12] S. Theodoridis, K. Koutroumbas, Pattern recognition, 3rd ed. Orlando, Florida: Academic Press, 2006.