Journal of Env. Geogr. Vol. I. No. 3-4. pp. 15-20

APPLICATION OF SELF-ORGANIZING NEURAL NETWORKS FOR THE DELINEATION OF EXCESS WATER AREAS

Szántó, G.¹ – Mucsi, L.¹ – van Leeuwen, B.¹

¹ Department of Physical Geography and Geoinformatics, University of Szeged, Hungary

Abstract

Artificial Neural Networks (ANNs) are being applied more and more widely. An ANN is an information processing system consisting of numerous simple processing units (neurons) that are arranged in layers and have weighted connections to each other. In the present study the possible application of an unsupervised neural network model, the self-organizing map (SOM), to the delineation of excess water areas has been examined. A self-organizing map can project the high-dimensional data of large databases onto a low-dimensional data space. It is able to form homogeneous clusters within a data set, so it can be applied effectively to the classification of multispectral satellite images. The classification was carried out for an area of 88 km² to the south of Hódmezővásárhely in south-eastern Hungary, which is frequently inundated by excess water. The intensity values of the pixels measured in six bands of a Landsat ETM image taken on 23rd April 2000 were used as input data. To perform the classification, three neural network models of different sizes were created, which classified the pixels of the satellite image into 9, 12 and 16 clusters. From these clusters three thematic maps were created, on which different types of excess water areas were delineated. The validation of the results showed that the applied neural network model is suitable for the delimitation of excess water areas and could be an alternative to traditional classification methods.
Keywords: Artificial Neural Networks (ANNs), excess water, multispectral classification

INTRODUCTION

Artificial Neural Networks (ANNs) are being applied more and more widely. An ANN is an information processing system consisting of numerous simple processing units (neurons) that are arranged in layers and have weighted connections to each other. Their construction and operating principle are based on biological neural networks, and their significant feature is adaptiveness, i.e. they solve problems by learning from examples rather than by explicit programming. Since several types exist, the term ‘neural network’ denotes a family of models rather than one concrete procedure. Their fields of application are varied: pattern association, classification, optimization and similarity identification. They are applied in many fields of science, including geography, where their significance stems from the sharp increase in the amount of geographical data. Data collection techniques such as multi- and hyperspectral remote sensing are now widely used, so the resolution of the data is increasing rapidly in both geometric and attribute space. Neural networks could be an effective alternative for the analysis of such high-dimensional data. Their application in geography is discussed in more detail by Agarwal P. et al. (2008) and Hewitson B. C. et al. (1994).

The aim of this study is to delineate excess water areas on the basis of satellite images by using a neural network. The study area is in the vicinity of the settlement Batida, situated on the left bank of the River Tisza, to the south of Hódmezővásárhely. The term excess water has been defined in a number of ways; the main points were summarized by Rakonczai J. et al. (2001) as follows: ‘Excess water is a kind of surplus water on the surface of a certain (drainage) area or in the pores of the arable land/near-surface formations, that inhibits the growth of vegetation and damages the man-made buildings.’ Excess water is a yearly recurring problem which endangers 45% of the area of the country and 60% of the arable land, more than 4 million hectares altogether. The exact delineation of these areas is therefore highly important, and remote sensing methods provide the most objective way of doing so (Rakonczai J. et al. 2001).

The delineation of excess water areas was carried out by the classification of medium spatial resolution multispectral satellite images. Classification is a fundamental procedure in the processing of multispectral satellite images, through which the pixels of the image are grouped according to their spectral features by mathematical methods. The result is a thematic map, which presents the information stored in the satellite image in a more expressive way. The classification was performed with one type of neural network, the so-called self-organizing map, which creates classes from the training samples by unsupervised learning.

Several examples of the application of neural networks in multispectral classification can be mentioned. According to Awad M. (2010), multispectral classification carried out by self-organizing maps is more precise than Isodata classification. In the opinion of Aitkenhead M. J. et al. (2007), ANNs are a quick and accurate method for mapping land cover change. Pacifici F. et al. (2009) carried out urban land use classification on the basis of the textural analysis of very high resolution satellite images, performed by neural networks. From Hungary, Barsi Á. (1997) classified a Landsat TM image with one type of neural network.
In his opinion, this method provides results as accurate as those of the traditional methods.

ARTIFICIAL NEURAL NETWORKS (ANNS)

The structure of Artificial Neural Networks (ANNs) resembles that of the human brain in that knowledge is stored in connected processing units (neurons). A processing unit transforms the weighted sum of its incoming inputs by means of an activation function. The most commonly used activation functions are the linear function, the sigmoid function and the step function. The result is sent to other neurons through the outgoing connections of the neuron (Fig. 1).

The neurons are arranged into layers. Every net has an input layer for feeding in the input data and an output layer for presenting the results. Between these layers there may be a number of hidden layers (Fig. 2).

Fig. 2 Structure of an Artificial Neural Network (ANN)

The net learns by modifying the weights between the neurons. Supervised and unsupervised learning methods can be distinguished. In supervised learning the training set includes both the input samples and the corresponding output samples. During an iterative process, the weights of the connections between the neurons are adjusted so that the net produces the expected output for each input sample. In unsupervised learning only the set of input samples is known, and the output neurons compete for the input samples on the basis of a similarity measure. The weight vectors of the winning neurons are adjusted according to the input samples assigned to them. With such nets, regularities in the distribution of the sample data can be detected.

There are several types of ANNs, which may differ in certain elements from the general model.
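The processing unit described in this section, a weighted sum of inputs passed through an activation function, can be sketched in a few lines of Python. This is only an illustration: the function names, the input values and the weights are assumptions chosen for the example and do not come from the study.

```python
import math


def linear(x):
    """Linear (identity) activation."""
    return x


def sigmoid(x):
    """Sigmoid activation, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def step(x, threshold=0.0):
    """Step activation: 1 at or above the threshold, otherwise 0."""
    return 1.0 if x >= threshold else 0.0


def neuron_output(inputs, weights, activation):
    """Apply the activation function to the weighted sum of the inputs."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return activation(weighted_sum)


if __name__ == "__main__":
    inputs = [1.0, 2.0]      # illustrative input values (assumption)
    weights = [0.5, -0.25]   # illustrative connection weights (assumption)
    for activation in (linear, sigmoid, step):
        print(activation.__name__, neuron_output(inputs, weights, activation))
```

The same weighted sum yields different outputs depending on the activation function chosen, which is why the activation function is part of the definition of a given net.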
ANNs have several advantages over traditional methods: their application does not depend on the statistical distribution of the input data, they are not sensitive to incomplete or noisy samples, and they are able to process huge amounts of data.

Fig. 1 Sketch of a neuron

Self-organizing map

The self-organizing map (SOM, Kohonen map) applied in this research was created by Kohonen (Kohonen T. 2001), and it is the most widespread ANN that carries out unsupervised learning. It classifies n-dimensional input samples (n>2) by means of unsupervised learning and assigns them to the elements of a lower-dimensional output layer. Similar samples are associated with neighbouring elements of the output layer, i.e. apart from the distribution of the samples in the input space, the map also learns the topology between them. The self-organizing map performs data clustering and dimension reduction at the same time. It is therefore suitable for solving different problems and can be an alternative to other methods, e.g. principal component analysis and k-median clustering.

Self-organizing maps are made up of two layers, the input and the output (or Kohonen) layer, which are fully connected to each other (Fig. 3). Data is fed into the input layer, which has as many neurons as there are input variables. Classification takes place in the Kohonen layer; the number of classes created during the learning process equals the number of neurons located here. In this layer the neurons are arranged in a 1D, 2D or 3D topology that defines which neurons are neighbours. The 2D topology is the most widespread, with the elements arranged in a square grid or a hexagonal pattern. Each neuron has an n-element weight vector, where n equals the dimension of the input vector.
The initial weight vectors of the neurons are usually set to random numbers or, to accelerate the learning process, placed along the first two principal component vectors of the sample data.

Fig. 3 Self-organizing map (Kohonen T. 2001)

The net learns according to the Kohonen rule, by which the processing units learn competitively. For each input sample the model searches for the neuron whose weight vector is most similar to it, i.e. the winning neuron. Similarity is usually measured by Euclidean distance. The model shifts the weight vectors of the winning neuron and of the neurons within a certain topological neighbourhood of it towards the input sample. The degree of modification at time t is determined by the Kohonen rule (Borgulya 1998):

$$\Delta W_j(t) = \eta(t)\, h_{cj}(t)\, \bigl(X(t) - W_j(t)\bigr)$$

where
- $W_j$ is the weight vector of the j-th element
- $\eta$ is the learning rate, which decreases in time
- $h_{cj}$ is the neighbourhood function, which decreases with distance from the winning neuron c
- $X$ is an input sample vector.

At the beginning of the learning process a larger learning rate and neighbourhood are used, which allow large-scale modifications and ensure that similar input samples are assigned to neighbouring neurons. As the learning rate and the neighbourhood decrease, the model is fine-tuned towards the end of the learning process. The learning algorithm of a self-organizing map can be described as follows (Hewitson et al. 1994):
1. Initialization of the net by specifying the geometry and the number of neurons.
2. Setting the initial weight vectors of the neurons.
3. Presenting a sample case to the net.
4. Determination of the winning neuron for the sample.
5. Modification of the weight vectors of the winning neuron and the topologically neighbouring neurons according to the Kohonen rule.
6. Slight reduction of the learning rate and the neighbourhood.
7. Repetition of the last four steps until convergence is reached.

The created model can be visualized in different ways, which makes the analysis of the results possible. To examine the distribution of the input samples in the data space, the positions of the neurons in the data space, their distances from each other, or the component planes can be visualized. The component planes represent the strength of the neuron weights for each variable. By examining the similarity of the component planes, connections between the variables can be detected.

Self-organizing maps can be used together with other visualization tools, so in geographical applications they can be connected to geographical maps or integrated into geographical information systems.

STUDY AREA AND DATA USED

The 88 km² study area is situated in the vicinity of the settlement Batida, to the south of the town of Hódmezővásárhely, in the southern part of Tiszántúl (the region east of the River Tisza) in the Great Hungarian Plain. The area is covered by young alluvial deposits, on which vertisols and fluvisols were formed. Most of it is under agricultural cultivation. There are several abandoned river meanders and point bars in the study area.

Classification was performed on the basis of a medium resolution Landsat 7 ETM satellite image taken on 23rd April 2000. For the validation of the results, colour infrared aerial photographs of the Lower Tisza Valley region taken by the ARGOS Studio of VITUKI Plc. on 23rd March 2000 were used.

Numerous software tools exist for the training and visualization of self-organizing maps; they are included in some large software packages as well as in software made for this specific purpose (e.g. SOM_PAK developed by Kohonen). Their integration with geographical information systems is not yet widespread (Coleman A. M. 2008), but certain programs offer such tools (e.g. IDRISI). Matlab was chosen for our analysis, as it provides more complex analysis and visualization methods. Matlab is a mathematical software system with its own programming language, developed for numerical computation. It is applied in many fields and a large number of modules are available for the different applications. In our research the tools of the Neural Network Toolbox were applied, which can be used for the design, simulation and visualization of different types of neural networks, including self-organizing maps.

PROCESS OF ANALYSIS

Matlab offers many parameterization options for the design of a self-organizing map. The size of the net, the type of topology and neighbourhood function, the size of the neighbourhood and the number of training iterations can be set. Three nets of different sizes were created for the classification, with output layers of 3x3, 3x4 and 4x4 processing units arranged in a 2D hexagonal topology. The process of the analysis is presented through the example of the net consisting of 4x4 neurons (Fig. 4).

Fig. 4 Self-organizing map consisting of 4x4 neurons

The intensity values of the pixels measured in six bands (blue, green, red, near infrared and two medium infrared bands) of the Landsat ETM satellite image were used as input data. The training process consisted of one thousand iterations, that is, the whole set of samples was fed one thousand times to each of the three nets. After training, each of the trained nets was run on the input data (simulation). Figure 5 shows how many input samples were assigned to each neuron, i.e. how many pixels were sorted into each class, for the net consisting of 4x4 neurons.
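The training procedure (Kohonen rule and the seven-step algorithm) can be sketched in plain Python as follows. This is a minimal illustration and not the Matlab Neural Network Toolbox implementation used in the study: it assumes a rectangular grid instead of the hexagonal topology, a Gaussian neighbourhood function, linear decay of the learning rate and radius, sample-by-sample updates, and synthetic six-band data. All function and parameter names (`train_som`, `best_matching_unit`, `eta0`, `epochs`) are hypothetical.

```python
import math
import random


def best_matching_unit(weights, x):
    """Index of the neuron whose weight vector is closest to x (Euclidean)."""
    return min(
        range(len(weights)),
        key=lambda j: sum((w - v) ** 2 for w, v in zip(weights[j], x)),
    )


def train_som(samples, rows, cols, epochs=50, eta0=0.5, seed=0):
    """Train a rows x cols self-organizing map with the Kohonen rule."""
    rng = random.Random(seed)
    dim = len(samples[0])
    # Steps 1-2: fix the geometry and draw random initial weight vectors
    # from the value range of each input variable.
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    low = [min(s[k] for s in samples) for k in range(dim)]
    high = [max(s[k] for s in samples) for k in range(dim)]
    weights = [[rng.uniform(low[k], high[k]) for k in range(dim)] for _ in grid]

    radius0 = max(rows, cols) / 2.0
    total = epochs * len(samples)
    t = 0
    for _ in range(epochs):
        order = list(range(len(samples)))
        rng.shuffle(order)
        for i in order:
            x = samples[i]                      # step 3: present a sample
            c = best_matching_unit(weights, x)  # step 4: winning neuron
            # Step 6: learning rate and neighbourhood radius shrink over time
            # (linear decay is an assumption of this sketch).
            progress = t / total
            eta = eta0 * (1.0 - progress)
            radius = max(radius0 * (1.0 - progress), 0.5)
            # Step 5: Kohonen rule, W_j += eta * h_cj * (X - W_j)
            for j, w in enumerate(weights):
                d2 = (grid[j][0] - grid[c][0]) ** 2 + (grid[j][1] - grid[c][1]) ** 2
                h = math.exp(-d2 / (2.0 * radius ** 2))
                for k in range(dim):
                    w[k] += eta * h * (x[k] - w[k])
            t += 1  # step 7: repeat with the next sample
    return weights


if __name__ == "__main__":
    # Synthetic six-band "pixels": a dark, water-like group and a bright group.
    rng = random.Random(1)
    dark = [[rng.uniform(5, 25) for _ in range(6)] for _ in range(30)]
    bright = [[rng.uniform(180, 220) for _ in range(6)] for _ in range(30)]
    som = train_som(dark + bright, rows=4, cols=4)
    clusters = [best_matching_unit(som, p) for p in dark + bright]
    print(sorted(set(clusters)))
```

After training, calling `best_matching_unit` for every pixel corresponds to the simulation step, and counting how often each neuron index occurs gives the kind of per-neuron totals shown in Fig. 5.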
The further evaluation of the results was performed in ArcGIS, and included the creation of thematic maps by merging the clusters according to the appropriate theme, and the creation of the required legend.

Fig. 5 Number of inputs added to the individual neurons

RESULTS

For the delineation of excess water areas it was practical to merge the clusters into a small number of classes, because the individual clusters could have been identified only with the help of detailed field work. By analysing the component planes, it could be determined which neurons the pixels with different reflectance features had been assigned to. On this basis the following 5 classes could be separated: open water surface, dry soil, water saturated soil, vegetation and vegetation in water. Figure 6 shows that pixels with similar reflectance values were projected onto neighbouring neurons, which is a characteristic of the self-organizing map. The classes created were also represented in a thematic map (Fig. 7).

Fig. 6 Position of the classes of excess water mapping

On the thematic maps created from the results of the three neural networks, open water surfaces, and even surfaces that are not entirely covered with water, are well separable. Table 1 shows how large the differences were in the extent of the different land cover types. It was not possible to quantify the accuracy of the results owing to the lack of appropriate field survey data. However, comparison of the thematic maps with aerial photographs showed that extensive excess water areas could be delineated in all three cases (Fig. 8). The differences are more considerable in the delineation of the transitional classes, as it is difficult to determine what moisture content marks the boundary to another class.
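Merging the clusters into thematic classes and computing the share of each class in the study area (the kind of figures reported in Table 1) can be sketched as follows. The lookup table from cluster index to class and the pixel list are invented for illustration; the study does not list which neurons were assigned to which class, and the merging itself was done in ArcGIS rather than in Python.

```python
from collections import Counter


def merge_clusters(cluster_ids, cluster_to_class):
    """Replace each pixel's cluster index by the name of its thematic class."""
    return [cluster_to_class[c] for c in cluster_ids]


def area_shares(class_labels):
    """Percentage of pixels in each class (pixels are assumed equal in area)."""
    counts = Counter(class_labels)
    total = len(class_labels)
    return {name: 100.0 * n / total for name, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical assignment of five clusters to the five classes of the paper.
    cluster_to_class = {
        0: "open water surface",
        1: "water saturated soil",
        2: "vegetation in water",
        3: "dry soil",
        4: "vegetation",
    }
    cluster_ids = [0, 0, 1, 2, 3, 3, 3, 4]  # invented per-pixel cluster indices
    labels = merge_clusters(cluster_ids, cluster_to_class)
    for name, share in area_shares(labels).items():
        print(f"{name}: {share:.2f}%")
```

Because all pixels of one image cover the same ground area, the percentage of pixels in a class equals its share of the mapped area.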
In our opinion, the application of more neurons makes it possible to determine the boundary between these transitional classes more precisely.

Fig. 7 Classification of the settlement Batida and its surroundings

Table 1 Differences in the extent of the classes for the three models

                        3x3       3x4       4x4
Open water surfaces     2.03%     1.74%     1.45%
Water saturated soil    13%       19.65%    15.41%
Vegetation in water     12%       10.2%     9.62%
Dry soil                49.3%     41.48%    45.15%
Vegetation              23.65%    26.93%    28.35%

Fig. 8 Comparison of the delimited excess water areas with aerial photographs

CONCLUSIONS

As a result of this research, it can be concluded that the type of neural network applied is suitable for the thematic classification of satellite images. With this method excess water areas were successfully delimited in the study area. To examine the effectiveness of the method, the results will need to be compared with those obtained from traditional classification methods. A great asset of self-organizing maps is that they can be extended, so that they can handle data from different sources simultaneously. Thus, besides the spectral information of the satellite images, other data could also be used for the classification, for instance elevation models and other thematic layers, e.g. soil and geomorphological maps. The shape recognition capability of self-organizing maps could make it possible to identify frequently inundated landforms, for example abandoned river meanders and point bars.

REFERENCES

Agarwal P. – Skupin A. 2008. Self-Organising Maps: Applications in Geographic Information Science. New Jersey: John Wiley & Sons Ltd. 214 p
Aitkenhead M. J. – Lumsdon P. – Miller D. R. 2007. Remote sensing-based neural network mapping of tsunami damage in Aceh, Indonesia. Disasters 31/3: 217-226
Awad M. 2010. An Unsupervised Artificial Neural Network Method for Satellite Image Segmentation. The International Arab Journal of Information Technology 7/2: 199-205
Barsi Á. 1997. Landsat-felvétel tematikus osztályozása neurális hálózattal. Geodézia és Kartográfia 49/4: 21-28
Borgulya I. 1998. Neurális hálók és Fuzzy-rendszerek. Budapest: Dialóg Campus Kiadó. 230 p
Coleman A. M. 2008. An adaptive landscape classification procedure using geoinformatics and artificial neural networks. Unpublished MSc Thesis, Amsterdam. 195 p
Hewitson B. C. – Crane R. G. 1994. Neural Nets: Applications in Geography. Dordrecht: Kluwer Academic Publishers. 194 p
Kohonen T. 2001. Self-Organizing Maps. Berlin: Springer Verlag. 501 p
Miller D. M. – Kaminsky E. J. – Rana S. 1995. Neural Network Classification of Remote-Sensing Data. Computers and Geosciences 2/3: 377-386
Pacifici F. – Chini M. – Emery W. J. 2009. A neural network approach using multi-scale textural metrics from very high-resolution panchromatic imagery for urban land-use classification. Remote Sensing of Environment 113: 1276-1292
Rakonczai J. – Mucsi L. – Szatmári J. – Kovács F. – Csató Sz. 2001. A belvizes területek elhatárolásának módszertani lehetőségei. I. Földrajzi Konferencia, Szeged