FACTA UNIVERSITATIS
Series: Mechanical Engineering Vol. 17, No 3, 2019, pp. 309-320
https://doi.org/10.22190/FUME190415036S
© 2019 by University of Niš, Serbia | Creative Commons License: CC BY-NC-ND

Original scientific paper

ARTIFICIAL NEURAL NETWORK APPLICATION FOR THE TEMPORAL PROPERTIES OF ACOUSTIC PERCEPTION

Miloš Simonović 1, Marko Kovandžić 1, Vlastimir Nikolić 1, Mihajlo Stojčić 2, Darko Knežević 2

1 Faculty of Mechanical Engineering, University of Niš, Niš, Serbia
2 Faculty of Mechanical Engineering, University of Banja Luka, Banja Luka, Republika Srpska, Bosnia and Herzegovina

Abstract. Though acoustic perception is well established in the literature, it seems to be insufficiently implemented in practice. Experimental results are excellent, but many issues arise when it comes to application in real conditions. Artificial neural networks make acoustic signal processing very convenient from the mathematical point of view. However, a great deal of work has to be done to make this possible. The procedure includes data acquisition, filtering, feature extraction and feature selection. These techniques require far more resources than the artificial neural network itself, and this is a limiting factor for implementation. The paper investigates the complete procedure of acoustic perception, in terms of time, in order to identify its limitations.

Key Words: Perception, Temporal Properties, Localization, Filtering, Neural Networks

1. INTRODUCTION

Acoustic perception is a good alternative to visual perception in engineering applications with respect to simplicity, reliability and price. There are many techniques of acoustic observation, and each of them assumes specific preconditions for its implementation. Because of these preconditions, each method provides limited results. It is necessary to combine several methods of acoustic observation in order to overcome these limitations.
Filtering, for example, is an indispensable method of signal processing in real conditions, where disturbances are present [1]. The function of a filter is to suppress the disturbances while leaving the signal unchanged. In this way, the filter enhances the discriminative capacity (the amount of useful information) of the signal.

Received April 15, 2019 / Accepted July 28, 2019
Corresponding author: Miloš Simonović
Faculty of Mechanical Engineering, University of Niš, A. Medvedeva 14, 18000 Niš, Serbia
E-mail: milos.simonovic@masfak.ni.ac.rs

In order to act on an acoustic object, it is necessary to identify it. The procedure is called acoustic recognition because it assumes that what is currently being received corresponds in some way to something that has already been processed in the past [2]. The problem belongs to a general category of artificial intelligence problems, namely pattern recognition. In the experiment, artificial neural networks were chosen, as a proven pattern recognition tool, for processing acoustic signals [3]. Though neural networks can be trained to perform filtering [4], preprocessing of the signal before it is presented to a neural network is unavoidable. Besides filtering, the procedure includes normalization (scaling) [5], feature extraction and feature selection [6]. The last two decrease computational complexity by combining several features into a new one with a higher discriminative capacity.

Determination of an object's position, relative to some reference frame, based on acoustic signals is called acoustic localization. The procedure is implemented, in various forms, in science and practice [7-12]. From the mathematical point of view, acoustic localization falls into two categories. Near field acoustic localization is performed if the sound source is in the surroundings of the microphones, and far field localization when the source is far away from them.
Far field localization is computationally much simpler but yields less information, only the direction of the sound source. Near field localization provides the spatial coordinates of the sound source using the time difference of arrival (TDOA) between microphones [13] as a reference. The analytical solution requires solving a system of hyperbolic equations, which is not a trivial problem even for a minimal configuration [14]. In the experiment, the problem is solved by processing TDOAs with a feed-forward neural network. Again, the signals were previously processed using different techniques, including those for obtaining the TDOAs [15, 16].

2. ACOUSTIC PERCEPTION

The experiment investigates two elementary processes of acoustic perception separately, namely acoustic recognition and acoustic localization. In both, an artificial neural network is employed as the signal processing tool because of its simplicity, universality and excellent performance. Neural networks are built of simple processing elements, called artificial neurons, which are inspired by biological nerve cells. Neurons are spatially organized in layers, and different layers may perform different transformations on the input signals. The neurons within a layer are not interconnected and, depending on the way they are connected between layers, there are several network topologies. Neural networks are able to establish a complex nonlinear relationship between input and output variables by adjusting the weights between neurons. The adjustment is done not by explicit programming but through an adaptive method called a learning algorithm, using a training data set as a reference. Thanks to massive parallelism in data processing, neural networks are very fast in the exploitation phase. The use of neural networks is spreading in modern engineering, permitting the investigation of a wide range of problems [17], and it also demonstrates superior accuracy with respect to alternative methods, as evident in [18].
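The layered structure described above can be illustrated with a minimal sketch of a forward pass. It is not the network used in the experiment: the layer sizes, the random weights, the bias terms and all function names are assumptions made only for illustration, and the sigmoid is used as a typical activation function.

```python
import math
import random

def sigmoid(y):
    """Logistic activation applied to one neuron's weighted input."""
    return 1.0 / (1.0 + math.exp(-y))

def layer_forward(x, weights, biases):
    """Propagate the outputs x of one layer to the next layer.

    weights[j][i] connects neuron i of the previous layer to neuron j
    of the next one; neurons inside a layer are not interconnected.
    The bias terms are an assumption of this sketch.
    """
    return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]

def network_forward(x, layers):
    """Apply every (weights, biases) pair in turn, from input to output."""
    for weights, biases in layers:
        x = layer_forward(x, weights, biases)
    return x

def random_layers(sizes, seed=0):
    """Small random weights for layer sizes such as [4, 5, 3] (illustrative only)."""
    rng = random.Random(seed)
    return [([[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)],
             [0.0] * n_out)
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

# Four inputs, one hidden layer of five neurons, three outputs.
outputs = network_forward([0.1, 0.4, -0.2, 0.7], random_layers([4, 5, 3]))
```

In this sketch the weights are random, so the outputs carry no meaning; a learning algorithm would adjust them against a training data set.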
The most important temporal characteristic of an artificial neural network is its temporal resolution. It is a measure of precision with respect to the operating time. Temporal resolution is limited by the available computational power and by the number of calculations. The transfer of signals from a layer with i neurons to a layer with j neurons is described by [19]

$$Y_j = W_{ij} X_i, \qquad X_j = f(Y_j) \tag{1}$$

According to matrix algebra, the computational complexity of the first equation is O(i×j) and that of the second, as an element-wise function, is O(j). The total computational complexity of the transition between two layers is therefore O(i×j).

Both experiments of acoustic perception deal with corrupted signals. In order to cancel the negative effect of disturbances, a second-order digital IIR filter was employed [13]. The general transfer function of the filter is

$$H(z) = \frac{b_0 + b_1 z^{-1} + b_2 z^{-2}}{1 + a_1 z^{-1} + a_2 z^{-2}} \tag{2}$$

The computational cost of the digital filter per sample is proportional to its order. Over the whole length of the signal it is O(N×r), or simply O(N) because r << N, where N is the signal length and r is the filter order. The experiment searches for an efficient procedure of filter design with the goal of improving the accuracy of acoustic perception.

The duration of the training phase has no influence on the performance of the neural network in the exploitation phase, but training still requires some resources. Most of the literature on neural networks deals with computational problems, but when it comes to practical implementation, the crucial phase of training is the acquisition of training samples. Without accurate and properly sampled data, it is not possible to perform a correct training procedure. There is no general rule for data acquisition and feature selection because every implementation requires specific solutions and an innovative approach.
The experiment searches for efficient methods of acquiring data for the purpose of acoustic perception. The solutions were limited to regular, simple and cheap equipment.

2.1. Acoustic recognition

Categorization (taxonomy) is essential for acoustic recognition, as for all cognitive processes. It provides generalization rules that are used for making decisions [20]. Categories (classes) are groups of objects that have similar characteristics in some frame of reference. A categorization, or division of objects into classes, enables the observer to predict unobserved characteristics of an object based on previous experience. The process in which general rules are derived from specific examples is called abstraction. The taxonomy of phenomena is never unambiguous. Apart from the properties of the object, it depends on the properties of the observer and on the observer's experience.

Several cues of a sound can be used for acoustic categorization. In the living world, most acoustic sensations are characterized by pitch, timbre, loudness and duration. Engineering practice suggests different perceptual qualities of sound for different applications. For instance, temporal characteristics, like variance, zero-crossing rate and silence ratio, in combination with spectral properties, like harmonic ratio and sub-band energy, are used for discrimination between speech and music. The most important perceptual property of sound is surely the frequency spectrum (envelope) and its derivatives (power spectral density, spectral centroid, spectral irregularity, odd and even harmonics). It is perceived as timbre or tone color by living beings. The easiest way to obtain the frequency spectrum of a sample, s[k], is to process it using the fast Fourier transform [21]

$$S_k = \sum_{n=0}^{N-1} s_n e^{-i 2\pi k n / N}, \qquad 0 \le k \le N-1 \tag{3}$$

For real-valued signals, the first half of the transform is the complex conjugate (mirror image) of the second half. That is why one of the halves can be neglected without loss of information. The fast Fourier transform is a standard tool of digital signal processing because it evaluates Eq. (3) with a lower computational complexity, O(N log N), than the direct evaluation of the discrete Fourier transform, O(N²). At the same time, the procedure provides a satisfactory result. The obtained frequency spectrum was presented to a feed-forward neural network for pattern recognition.

2.2. Near field acoustic localization

People and animals are able to point to the horizontal direction a sound is coming from using the slightly different signals that arrive at each ear [13]. For the vertical direction, spectrum features produced by a sound reflector (the pinna) are used as the auditory cue. Artificial devices perform localization based on different acoustic cues. The working principle depends strongly on the number of microphones. Monaural localization (by one microphone) is performed based on the energy drop or spectrum deformation as the sound propagates through the medium. The localization ability of binaural systems can be established by a learning procedure through the repetition of movement. An alternative is employing the head related transfer function (HRTF), which captures the transformations of a sound wave propagating from the source to the microphone. But the most frequent and most valuable cue for acoustic localization is the lag between the signals collected at different spatial positions (TDOA). The basic approach to estimating TDOA is to use the cross-correlation function [13]

$$R_{ij}(l) = \frac{1}{N} \sum_{k=0}^{N-1} s_i[k] \, s_j[k-l] \tag{4}$$

and to take the argument l that maximizes its value within the range of possible lags

$$t_{ij} = \frac{1}{f_s} \, \underset{l}{\mathrm{argmax}} \, R_{ij}[l], \qquad -\frac{l_{max}}{2} \le l \le \frac{l_{max}}{2} \tag{5}$$

where N is the signal length, f_s is the sampling frequency and l_max is the range of expected lags.
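The time-domain procedure of Eqs. (4) and (5) can be sketched as follows. This is a minimal illustration, not the implementation used in the experiment. The function names, the parameter `max_lag` (the half-width of the lag search range, in samples), the treatment of out-of-range samples as zeros and the sign convention (a positive result means that signal i is a delayed copy of signal j) are all assumptions of the sketch.

```python
def cross_correlation(s_i, s_j, lag):
    """Cross-correlation of two equally long signals at one integer lag.

    Computes (1/N) * sum_k s_i[k] * s_j[k - lag]; samples whose index
    falls outside the signal are treated as zero (assumption).
    """
    n = len(s_i)
    total = 0.0
    for k in range(n):
        m = k - lag
        if 0 <= m < n:
            total += s_i[k] * s_j[m]
    return total / n

def estimate_tdoa(s_i, s_j, fs, max_lag):
    """Return the lag, in seconds, that maximizes the cross-correlation.

    Integer lags from -max_lag to +max_lag samples are tested; fs is
    the sampling frequency in Hz.
    """
    best_lag = max(range(-max_lag, max_lag + 1),
                   key=lambda lag: cross_correlation(s_i, s_j, lag))
    return best_lag / fs

# Synthetic check: the same short click reaches microphone i
# three samples later than microphone j.
click = [1.0, 0.5, -0.3]
s_j = [0.0] * 32
s_i = [0.0] * 32
s_j[10:13] = click
s_i[13:16] = click
tdoa = estimate_tdoa(s_i, s_j, fs=1000.0, max_lag=5)  # 3 samples at 1 kHz
```

The lag search range should cover the largest delay that the geometry of the microphone array allows; a wider range only adds computation and the risk of spurious maxima.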
The range of expected lags has to be chosen in accordance with the measuring range of the experimental setup. Since the expression in the bracket has N² multiplications, N-1 additions and one division, its computational complexity is O(N²). The sum has to be evaluated T+1 times, in addition to searching for the maximum in the range of possible lags. The final estimate of the computational complexity of the cross-correlation procedure with the maximum allowed lag T is O(N²×T). A good approximation of the cross-correlation function can be obtained using the inverse discrete Fourier transform

$$R_{ij}(l) = \frac{1}{K} \sum_{f=0}^{K-1} \psi(f) \, R_{ij}(f) \, e^{j 2\pi f l / K} \tag{6}$$

where R_ij(f) is the cross-power spectral density (XPSD)

$$R_{ij}(f) = S_i(f) \, S_j^{*}(f) \tag{7}$$

Since the frequency spectrum S(f) is half the length of the original signal s[k], and the direct cross-correlation procedure has the computational complexity O(N²×T), Eq. (6) can save a lot of computational effort. To be applicable, it has to provide an acceptable approximation error.

The function ψ(f) is called a windowing function; its role is to highlight some of the spectrum features in order to improve its discriminative capacity. Different windowing functions (Table 1) are intended for different purposes, but the choice among them is still ambiguous.

Table 1 Windowing functions

Window              Window function
Cross correlation   $\psi_{CC}(f) = 1$
Roth window         $\psi_{Roth}(f) = 1 / S_{ii}(f)$
PHAT window         $\psi_{PHAT}(f) = 1 / |S_{ij}(f)|$
SCOT window         $\psi_{SCOT}(f) = 1 / \sqrt{S_{ii}(f) \, S_{jj}(f)}$

3. EXPERIMENTAL SETUP

The study comprised two separate experiments. In both cases, the acoustic signals were processed using a regular processor, Intel(R) Celeron(R) CPU N3350 @ 1.10 GHz.

3.1. Acoustic recognition

For the training of the feed-forward neural network in acoustic recognition, 500 sound samples were recorded and collected from the Internet.
The samples were processed by a human listener, using specially designed software capable of playing, visualizing and separating the parts of interest from the rest of the content. In the end, each of the samples, 1 s long at the sampling frequency of 44.1 kHz, contained only consistent content that could be uniquely identified with one label. To simulate different levels of abstraction, sound samples were chosen from three categories. The first category was made of 32 recorded sound samples, all produced by the same cricket. This category was the most specific of the three in the implemented category-abstraction space. The second category consisted of 261 sound samples produced by different individual flies that belong to several subgroups and families. This category represented the middle of the category-abstraction space. The last category was made of 392 acoustic samples of different origins, from human voices, animals and natural phenomena to different engines, vehicles, machines, musical instruments and other technical devices. These samples were grouped in a category simply named "Sound", which represented the most abstract category in the category-abstraction space.

The amplitude-frequency spectrum, as the recognition cue, was calculated by Eq. (3), and the recognition was performed using a feed-forward neural network with the sigmoid activation function and the back-propagation algorithm with momentum. The number of neurons in the input layer, for the constant sampling frequency, was determined by the signal length. The number of outputs was equal to 3, which is the number of signal categories. The rest of the network configuration and the training parameters were obtained as a result of examination. The network was tested with 200 samples not used in the training procedure.

3.2. Near field acoustic localization

The acoustic source was driven along the training path suspended on three strings. The opposite end of each string was wrapped around a pulley driven by a step motor. The three pulleys were placed at the vertices of a horizontal equilateral triangle with an edge length of approximately 4.35 m, above the acoustic source, building a simple routing mechanism (Fig. 1). Coordinated winding and unwinding of the pulleys was used to reach any specified location of the sound source within a selected range. The step motors were driven by an Arduino CNC driver, and the driver was controlled from a PC through a USB connection and an ATmega328P microcontroller.

Fig. 1 Sketch of experimental setup for near field acoustic localization

A random number generator was employed for selecting 500 spatial locations within a cubic space with an edge length of 1.6 m, below the three pulleys. The locations were intended for training the feed-forward neural network in near field acoustic localization. Before training was performed, the locations were ordered using a genetic algorithm with the objective of minimizing the training route. The source was stopped at each of the 500 spatial locations and a sound sample was recorded for a period of 3 s. For this purpose, an array of 8 low-cost mini spy microphones was designed and connected to a PC through an 8-channel TerraTec EWS88 MT sound card. The microphones were placed at the vertices of a cube with an edge length of 2 m, symmetrically around the moving zone of the sound source, with a purposely chosen tolerance of ±20 mm. Parabolic reflector dishes were mounted on each microphone as mechanical amplifiers of the audio signal. The first second of each acoustic sample was recorded before the sound source was turned on, as a representative of the noise that exists in the room independently of the sound source activity. The rest of each signal was recorded for a period of 2 s after the emitter was activated, as a representative of the corrupted signal.
The samples were recorded by cheap equipment in a highly reverberant room full of interfering sound sources (fans and step motors), so they were expected to be corrupted with a high level of noise. To cancel the negative effect of disturbances on the TDOA estimation accuracy, the signals were filtered by a suitable second-order IIR filter. The filter was designed through the iterative steps of an evolutionary strategy in order to minimize the mean absolute TDOA estimation error over the recorded collection of samples. The error was calculated as the difference between the TDOA estimated using the preprocessed signal and the theoretical TDOA calculated from the sound speed and the geometrical relations between the acoustic components. Besides the filter coefficients, the rest of the preprocessing procedure was configured by the same optimization. Finally, the genotype of the complete preprocessing consisted of 8 variables. The first five of them were the digital filter coefficients, while one more real variable was employed for determining the optimal range of lags to be tested in the cross-correlation procedure. Two integer variables were used to choose among the windowing functions and nonlinear operators. The algorithm was started with an initial population of 50 individuals, each of which represented one preprocessing configuration. The genotypes of the initial individuals were chosen randomly within a logical range of values. The termination condition was formulated as a maximum number of successive generations with no improvement in the mean absolute TDOA error. The best configuration was employed for preprocessing in the experiment of acoustic localization.

TDOAs were processed using a feed-forward neural network. The number of neurons in the input layer was determined by the number of employed microphones.
In order to achieve the best accuracy, all pairs of microphones, including the redundant ones, were employed as inputs

$$n = M(M-1)/2 \tag{8}$$

where n is the number of inputs and M is the number of microphones. The number of outputs was always 3 because the spatial position is determined by 3 independent coordinates. The rest of the network configuration (the number of hidden layers and of artificial neurons in them) and the training parameters were obtained as the result of examination. Network performance was tested at 126 spatial locations from the same space, which were not used in the training procedure.

4. EXPERIMENTAL RESULTS

In both experiments, the processing time of the neural network and of filtering was hardly noticeable in comparison with the time taken by the other preprocessing techniques. This is in accordance with the theoretical predictions about the computational complexity of these procedures.

4.1. Acoustic recognition

The best recognition accuracy was achieved using a feed-forward neural network with 50 neurons in a single hidden layer. The network was trained using the learning coefficient 0.025, the momentum factor 0.999 and 400 training epochs. The overall accuracy was around 92% (Fig. 2). The result was assessed as satisfactory since it was achieved in the presence of disturbances.

Fig. 2 Confusion matrix as result of the acoustic recognition

In terms of duration, the most important phase of preprocessing for acoustic recognition was the fast Fourier transform. Fig. 3 shows the duration of the fast Fourier transform procedure with respect to the signal length, while Fig. 4 shows the recognition error with respect to the same parameter.

Fig. 3 Fast Fourier transform duration with respect to the signal length

Fig. 4 Mean squared error with respect to the signal length

4.2. Near field acoustic localization

The geometrical relations between the experimental components and the realized spatial positions of the acoustic source were precisely measured using the Total Station Sokkia SET630R. The instrument provides laser measurement of distances with an accuracy of ±3 mm at the used range of lengths, as well as memorization and automatic data transfer to a PC through the RS-232 port. After the realized positions were compared with the given coordinates, the resulting mean absolute error achieved by the routing mechanism over the whole collection of audio samples was approximately 10 mm. The accuracy of the mechanism was assessed as satisfactory for near field acoustic localization because the diameter of the sound source was approximately 40 mm. Since the training positions were randomly chosen, the realized positions, precisely determined by the total station, were adopted as the reference for calculating the theoretical (reference) TDOAs.

The path of the acoustic source between the training positions was optimized using an evolutionary algorithm. The procedure reduced the total length of the training path by a factor of 4 to 5. The shorter training path not only reduced the time required for the routing mechanism to complete it but also improved the positioning accuracy. The reason is that the routing mechanism used in the experiment, governed by the winding strings, made a larger error over a longer movement.

All the collected audio samples contained a part recorded before the sound source was activated and a part recorded after it was started. The amplitude-frequency spectra of both parts were evaluated directly, using the fast Fourier transform, and averaged over the whole collection of samples. Subtracting the frequency spectrum of the noise from the frequency spectrum of the corrupted signal resulted in the frequency spectrum of a clear signal, without noise.
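The averaging and subtraction of spectra described above can be sketched as follows. This is an illustration under stated assumptions, not the code used in the experiment: the function names are hypothetical, the direct O(N²) discrete Fourier transform stands in for an FFT routine only to keep the sketch free of dependencies, negative differences are clipped to zero, and the synthetic tones replace the recorded samples.

```python
import cmath
import math

def amplitude_spectrum(samples):
    """Magnitudes of the first half of the DFT of a real-valued signal.

    Direct O(N^2) evaluation; an FFT routine returns the same values faster.
    """
    n = len(samples)
    return [abs(sum(s * cmath.exp(-2j * math.pi * k * i / n)
                    for i, s in enumerate(samples)))
            for k in range(n // 2)]

def mean_spectrum(recordings):
    """Average the amplitude spectra of equally long recordings, bin by bin."""
    spectra = [amplitude_spectrum(r) for r in recordings]
    return [sum(column) / len(spectra) for column in zip(*spectra)]

def subtract_noise(corrupted, noise):
    """Spectral subtraction; negative differences are clipped to zero (assumption)."""
    return [max(c - n, 0.0) for c, n in zip(corrupted, noise)]

def ratio_db(signal_amplitude, noise_amplitude):
    """Amplitude ratio expressed in decibels."""
    return 20.0 * math.log10(signal_amplitude / noise_amplitude)

# Synthetic data: background noise as a tone in bin 5, and a source tone
# in bin 12 that is ten times weaker.
N = 64
noise = [math.sin(2 * math.pi * 5 * i / N) for i in range(N)]
source = [0.1 * math.sin(2 * math.pi * 12 * i / N) for i in range(N)]
corrupted = [a + b for a, b in zip(noise, source)]

noise_spec = mean_spectrum([noise])
clean_spec = subtract_noise(mean_spectrum([corrupted]), noise_spec)
snr_db = ratio_db(max(clean_spec), max(noise_spec))  # close to -20 dB here
```

With real recordings, each list passed to `mean_spectrum` would hold the noise-only or the corrupted segments of all samples, so that random fluctuations average out before the subtraction.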
The estimated ratio between the dominant frequencies of the clear signal and the noise was around 0.1, which corresponds to an SNR of -20 dB on the logarithmic scale. According to the literature [22], the minimum SNR that provides a meaningful TDOA estimation is in the range between -13 dB and -13.5 dB. The evaluated SNR suggested that the audio samples collected in the experiment of acoustic localization contained too much noise to be useful for TDOA estimation.

The same conclusion was obtained from the dependence of the mean absolute TDOA estimation error on the sample length (Fig. 5). The diamonds represent the error obtained using raw signals, without any preprocessing, for 8 different lengths of the acoustic sample. The TDOAs were calculated by cross-correlation in the time domain. The approximation line between them was obtained using a two-term exponential function. The curve shows that the mean absolute TDOA error increases with the signal length, which contradicts the logical assumption that more data should provide a better result. The influence of preprocessing on the mean absolute TDOA estimation error is demonstrated by the cyan curve marked with triangles. The approximation line was obtained in the same way as the previous one. The line decreases steadily with the length of the acoustic samples. The results presented in Fig. 5 proved the necessity of preprocessing in the acoustic localization procedure for the employed collection of samples.

According to the graph of the processed signal in Fig. 5, the length of 80 ms was adopted as optimal for obtaining TDOAs in the experiment of acoustic localization. Further increase of the signal length, despite the higher computational complexity, gave no significant improvement in accuracy. The minimal TDOA estimation error achieved in the experiment was approximately 0.13 ms.
At the sound velocity of 334.33 m/s, corresponding to the temperature of 5°C that prevailed during the experiment, this mean absolute error corresponds to a length of approximately 45 mm. This was assessed as satisfactory, also with regard to the diameter of the acoustic source.

Fig. 5 Average TDOA estimation error with respect to the sample length

The average duration of TDOA estimation, which included preprocessing (filtering) and calculating the lags between microphones, was just under 0.1 s per location. Fig. 6 shows the TDOA processing time, for 8 microphones, with respect to the signal length, while Fig. 7 shows the TDOA processing time with respect to the number of microphones. Both graphs were obtained with the signal length of 0.8 s.

Fig. 6 TDOA processing time with respect to the signal length

After testing different configurations, an artificial neural network with 10 neurons in a single hidden layer was adopted. The network was trained using the learning coefficient 0.7, the momentum factor 0.9 and 4000 training epochs. The final result is presented in Fig. 8. The average deviation from the actual path was 35.7 mm, which is even lower than the mean absolute error of the input TDOAs. This accuracy is attributed to the redundancy of the microphones.

Fig. 7 TDOA processing time with respect to the number of microphones

Fig. 8 Actual path and estimated paths of acoustic source

5. CONCLUSION

The most demanding procedure of acoustic recognition was the fast Fourier transform. On the described processor, it took about 1/500 of the duration of the signal itself, which leaves the possibility of employing the rest of the computational power for improving the recognition accuracy. One such technique is overlapping the audio signals, which raises the temporal resolution. The processing time of near field localization is determined by the complexity of the cross-correlation, Eq. (4). On the employed processor, it was at the limit of real-time application.

The experiment confirms the theoretical assumption that the temporal resolution of acoustic perception by artificial neural networks strongly depends on the feature extraction procedure. The paper indicates crucial implementation problems of acoustic perception, which are omitted in the literature, and gives some solutions.

Acknowledgements: This paper presents the results of the research conducted within the project "Research and development of new generation machine systems in the function of the technological development of Serbia" funded by the Faculty of Mechanical Engineering, University of Niš, Serbia.

REFERENCES

1. McLoughlin, I., Zhang, H., Xie, Z., Song, Y., Xiao, W., Phan, H., 2017, Continuous robust sound event classification using time-frequency features and deep learning, PLoS ONE, 12(9), pp. 1-19.
2. McAdams, S., 1993, Recognition of sound sources and events, Oxford University Press.
3. Bishop, C., 1995, Neural networks for pattern recognition, Oxford University Press, Oxford.
4. Michaelides, P.G., Tsionas, E.G., Vouldis, A.T., Konstantakis, K.N., Patrinos, P., 2018, A Semi-Parametric Non-linear Neural Network Filter: Theory and Empirical Evidence, Computational Economics, 51(3), pp. 637-675.
5. Choi, J.Y., Hu, E.R., Perrachione, T.K., 2018, Varying acoustic-phonemic ambiguity reveals that talker normalization is obligatory in speech processing, Attention, Perception & Psychophysics, 80(3), pp. 784-797.
6. Flasinski, M., 2016, Introduction to Artificial Intelligence, Springer, Cham.
7. Osamu, I., Masafumi, T., Tetsuya, N., 2003, Sound Source Localization Using a Pinna-based Profile Fitting Method, IEICE Transactions, pp. 263-266.
8. Johnson, M.L., 2015, Systems and methods of processing information regarding weapon fire location using projectile shockwave and muzzle blast times of arrival data, Retrieved from http://search.ebscohost.com/login.aspx?direct=true&db=edspgr&AN=edspgr.08995227&site=eds-live (last access: 01.03.2019).
9. Rowell, C.R., 2014, Three-Dimensional Volcano-Acoustic Source Localization at Karymsky Volcano, Kamchatka, Russia, Journal of Volcanology and Geothermal Research, 283, pp. 101-115.
10. Martín, S.R., Genescà, M., Romeu, J., Clot, A., 2016, Aircraft localization using a passive acoustic method. Experimental test, Aerospace Science and Technology, 48, pp. 246-253.
11. Grabowski, K., 2016, Time-distance domain transformation for Acoustic Emission source localization in thin metallic plates, Ultrasonics, 68, pp. 142-149.
12. Tan, C., 2016, A low-cost centimeter-level acoustic localization system without time synchronization, Measurement, 78, pp. 73-82.
13. Kovandžić, M., Nikolić, V., Al-Noori, A., Ćirić, I., Simonović, M., 2017, Near field acoustic localization under unfavorable conditions using feedforward neural network for processing time difference of arrival, Expert Systems with Applications, 7(1), pp. 138-146.
14. Park, C., Jeon, J., Kim, Y., 2014, Localization of a sound source in a noisy environment by hyperbolic curves in quefrency domain, Journal of Sound and Vibration, 333, pp. 5630-5640.
15. Hing, C.S., 2005, A comparative study of two discrete-time phase delay estimators, IEEE Transactions on Instrumentation and Measurement, 54, pp. 2501-2504.
16. Khaddour, H., 2011, A Comparison of Algorithms of Sound Source Localization Based on Time Delay Estimation, Elektrorevue, 2(1), pp. 31-37.
17. Babic, M., Calì, M., Nazarenko, I. et al., 2019, Surface roughness evaluation in hardened materials by pattern recognition using network theory, International Journal on Interactive Design and Manufacturing, 13(1), pp. 211-219.
18. Fragassa, C., Babic, M., Bergmann, C., Minak, G., 2019, Predicting the tensile behaviour of cast alloys by a pattern recognition analysis on experimental data, Metals, 9(5), 557.
19. Rojas, R., 1996, Neural Networks, Springer.
20. George, I., Cousillas, H., Richard, J., Hausberger, M., 2008, A Potential Neural Substrate for Processing Functional Classes of Complex Acoustic Signals, PLoS ONE, 3(5), pp. 1-10.
21. Smith, W.S., 1997, Digital signal processing, California Technical Publishing, San Diego.
22. Dhull, S., Arya, S., Sahu, O.P., 2010, Comparison of time-delay estimation techniques in acoustic environment, International Journal of Computer Applications, 8(9), pp. 29-31.