CHEMICAL ENGINEERING TRANSACTIONS VOL. 61, 2017 A publication of The Italian Association of Chemical Engineering Online at www.aidic.it/cet Guest Editors: Petar S Varbanov, Rongxin Su, Hon Loong Lam, Xia Liu, Jiří J Klemeš Copyright © 2017, AIDIC Servizi S.r.l. ISBN 978-88-95608-51-8; ISSN 2283-9216

Vegetation Interpretation and Classification from High Resolution True Colour Map Images of Shanghai City

Jing Zhang a,b, Yongjian Huang a,*, Maohua Wang a, Mingquan Wang a
a Shanghai Carbon Data Research Center, Shanghai Advanced Research Institute, Chinese Academy of Sciences
b College of Communication and Information Engineering, Shanghai University
huangyj@sari.ac.cn

Vegetation interpretation and classification is the core work of regional ecological monitoring and carbon sink calculation. Unlike traditional remote sensing data, high resolution true colour maps provide a new possibility for vegetation interpretation and classification, especially for the scattered, small-scale vegetation distributions of city regions. In Hue, Saturation and Value (HSV) colour space, a colour model and a texture model are combined to extract vegetation features, and the weights of the two features are adjusted jointly to achieve a better identification effect. Based on the nearest neighbour method, the model matches the features of candidate images against typical training vegetation samples. The main research contributions and innovations are as follows: (i) The research uses high resolution true colour map images to provide real-time and more convenient data, making the study less limited by the low spatial resolution of sensing images. (ii) It explores in an effective way the vegetation cover of city regions, which is dispersed in size, variable in type and difficult to locate precisely.
(iii) The results of simulation show that the method is feasible and that the feature-weighted model improves the precision to around 83.3 % by adjusting the weight parameters, much better than a single feature model. (iv) Combined with the annual net primary productivity (NPP) values of different vegetation types, the carbon storage of the carbon sinks in an area of 23,373 m2 is calculated, ranging from 9,000 to 12,000 kg, providing a new way to track carbon footprints in city regions.

1. Introduction
On the carbon source side, the process of industrialization indicates that optimized promotion of the industrial structure is the most effective approach to slowing the rapid growth of carbon emissions (Du et al., 2016). On the carbon sink side, the world's oceans, plants, soils on land, and numerous other less significant carbon pools within the global carbon cycle steadily absorb and store carbon. Therefore, a good view of the vegetation resources of an area, such as coverage and plant types, has great value for calculating carbon flux and emissions and for evaluating the ecological environment. In recent years, the analysis of satellite and aerial remote sensing images, whose image data include multispectral information, has been the main research method for investigating vegetation. Shoshany (2000) reviewed the utility of spectral, temporal and spatial data for identifying Mediterranean vegetation land regions. The development of multispectral sensors containing key spectral bands, such as WorldView, has brought about unique opportunities for those wishing to classify vegetation at the species level (Dlamini, 2010). But hyperspectral data suffer from low spatial resolution and high expense, and their accuracy for relatively sparse vegetation does not reach 80 % (Mansour et al., 2012). In recent years, map images taken by high spatial resolution satellites have shown special advantages as global satellite technology improves.
As high resolution true colour images, they can show ground conditions in detail. True colour map images can offer vegetation distribution information more freely, especially for the separable and complex distributions of city areas. From map images, points of interest such as vegetation distribution range and type are extracted and calculated, to obtain precise and real-time vegetation data and to support further research on carbon footprints.

DOI: 10.3303/CET1761076 Please cite this article as: Zhang J., Huang Y., Wang M., Wang M., 2017, Vegetation interpretation and classification from high resolution true colour map images of Shanghai city, Chemical Engineering Transactions, 61, 469-474 DOI:10.3303/CET1761076

2. Research methodology
2.1 Survey region and data source
This study chooses Chongming county, Shanghai, as the research region. The map images are downloaded from Google Maps. The highest image level is 19 and its image scale is 1:1,128.5. The image resolution is 0.2986 m per pixel.
2.2 Image segmentation
The map image adopted in this paper has high resolution and pixel density; the level-19 image is up to 23,552*18,688 pixels. Considering machine efficiency and experimental time, the high-resolution map image is cut into smaller images with fewer pixels. The segmented images are then used to identify vegetation, extract vegetation information, determine the vegetation type and mark different types with different colours. In this research, the original image is segmented by rows and columns into 1,556 smaller images of 512*512 pixels each. After image filtering and colour enhancement, we take a 16*16 pixel square as a unit (about 22.8 m2), convenient for marking. That means there are 1,024 units in each 512*512 pixel image, and the vegetation recognition and classification of the following research operates on these units. Figure 1 shows the segmentation process of map image 1.
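The two-level segmentation above (512*512 tiles, then 16*16 units) can be sketched as follows. This is a minimal NumPy-based illustration with function names of our own choosing, not the authors' implementation:

```python
import numpy as np

def cut_into_tiles(image, tile=512):
    """Cut a large map image (H x W x 3 array) into tile x tile sub-images
    by rows and columns; partial tiles at the edges are discarded."""
    h, w = image.shape[:2]
    return [image[r:r + tile, c:c + tile]
            for r in range(0, h - tile + 1, tile)
            for c in range(0, w - tile + 1, tile)]

def cut_into_units(tile_image, unit=16):
    """Split one 512 x 512 tile into 16 x 16 pixel units (1,024 per tile);
    at 0.2986 m/pixel each unit covers roughly 22.8 m2."""
    h, w = tile_image.shape[:2]
    return [tile_image[r:r + unit, c:c + unit]
            for r in range(0, h, unit)
            for c in range(0, w, unit)]
```

Applied to the level-19 image, the first call yields the smaller 512*512 images and the second yields the 1,024 marking units per image.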
Figure 1: Map image 1 segmentation process: a) image segmentation by 16*16 pixels; b) segmentation sketch.

2.3 Feature extraction
Among common image features (colour, texture, shape and spatial relationship), the first two are the most important for vegetation (Di et al., 2015). In this paper we try to recognize and classify vegetation from map images using colour and texture features. The colour histogram is used to represent the colour feature; in each unit, a 256-dimension colour feature vector is obtained after calculation and normalization (Manjunath et al., 2001). The Tamura texture feature (Tamura et al., 1978) mainly contains six components: coarseness, contrast, directionality, line-likeness, regularity and roughness. Many image recognition studies have found that three of these components, namely coarseness, contrast and directionality, are the most important for image identification (Wang, 2012). The colour image is first converted to greyscale to calculate the coarseness, contrast and directionality of each unit, and these are then normalized into a three-dimensional texture feature vector. It is assumed that each 16*16 unit, whose real area is 22.8 m2, contains only one type of vegetation. Figure 2 shows the colour and texture feature values.

Figure 2: Colour and texture feature values of map image 1: a) map image 1; b) colour feature value; c) texture feature value.

2.4 Image matching
2.4.1 Sample training
According to the requirements of target identification, images of typical areas are chosen manually from the images cut to 512*512 pixels. These images contain only one of the two special vegetation types: vegetation Y and vegetation G.
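The per-unit colour and texture features of Section 2.3 can be sketched as below. The HSV quantisation scheme (16 hue x 4 saturation x 4 value bins) and the three texture measures are simplifications of our own for illustration; the paper itself uses a 256-bin colour histogram and the original Tamura coarseness, contrast and directionality formulas:

```python
import colorsys
import numpy as np

def colour_feature(unit_rgb):
    """256-dimension colour feature of one 16 x 16 unit: each pixel is
    converted to HSV and counted into 16 x 4 x 4 = 256 bins, then the
    histogram is normalised to sum to 1."""
    hist = np.zeros(256)
    for r, g, b in unit_rgb.reshape(-1, 3).astype(float) / 255.0:
        h, s, v = colorsys.rgb_to_hsv(r, g, b)
        idx = (min(int(h * 16), 15) * 16
               + min(int(s * 4), 3) * 4
               + min(int(v * 4), 3))
        hist[idx] += 1
    return hist / hist.sum()

def texture_feature(unit_rgb):
    """3-dimension texture vector: crude proxies for Tamura coarseness,
    contrast and directionality, computed on the greyscale unit."""
    grey = unit_rgb.astype(float).mean(axis=2)
    gy, gx = np.gradient(grey)
    coarseness = np.abs(np.diff(grey, axis=1)).mean()   # mean local variation
    contrast = grey.std()                               # grey-level spread
    directionality = np.abs(np.arctan2(gy, gx)).mean()  # mean gradient angle
    return np.array([coarseness, contrast, directionality])
```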
We split them again into thousands of pieces of 16*16 pixels, the same size as the unit image object. Colour and texture features are then extracted from these samples and represented with a 256-dimension colour feature vector and a 3-dimension texture feature vector. The colour and texture feature reference values are obtained by averaging the 1,600 samples of each typical vegetation type.
2.4.2 The improved nearest neighbour method
The nearest neighbour method, first proposed in 1968, is a very common and important statistical recognition method. It relies on the shortest Euclidean distance to make judgements: the distance is calculated between the inspected sample and all the training samples, and the nearest training sample is located. The inspected sample is then judged to belong to the same kind of vegetation as the located training sample.
In this paper, a small change is made to the nearest neighbour method, because only a very small number of vegetation types can be recognized through manual visual interpretation, and map images contain many other ground features that are difficult to identify. It is important to decrease storage space and improve calculation speed when mining knowledge from massive data. One solution is to retain only valid samples: new sample sets are generated from the existing sample set, and the sets which can classify all the original test samples correctly are retained. At the same time, a threshold method is introduced to extend the range of promising goals: within the threshold range, the inspected sample belongs to the same vegetation type as the training samples. The threshold is determined jointly by the colour and texture feature values of the training samples. The main steps of the algorithm are as follows: (i) Define two sample sets, estore and garbagestore, where estore stores all samples and garbagestore is initially a null set.
(ii) Move any one sample chosen from estore into garbagestore as the first sample; then take the ith sample and classify it using the sample set remaining in estore; if the result is wrong, move that sample into garbagestore as well, and so on. (iii) The algorithm stops when no further sample is moved into garbagestore; otherwise step (ii) is repeated.
2.5 Single feature recognition model
Since the colour and texture features are, as mentioned above, effective features for identifying vegetation, we use each of them separately to recognize vegetation on the basis of the improved nearest neighbour method. In fact, the images to be identified suffer from brightness variance even though they have been pre-processed in advance. The colour feature is easily influenced by brightness, leading to larger or smaller feature values when extracted, so the recognition effect is affected. Likewise, when vegetation accounts for a large part of the image but the plants have similar textures, the texture feature difference is small, and it is hard to distinguish different vegetation types. From Figures 3a) and 3b), it can be seen that the recognition effect of a single feature model is poor: neither the colour feature nor the texture feature alone can provide precise recognition.
2.6 Texture and colour weight model
Considering that a single feature offers limited information and cannot fully display the characteristics of an image object, multiple features are integrated to build a texture and colour weight model. In this paper, a weight model is designed to combine the colour feature with the texture feature of vegetation, as given in Eq(1) and Eq(2).
d = ‖T‖2 a_T + ‖C‖2 b_C (1)
d(i, j) = ‖T(i, j)‖2 a_T + ‖C(i, j)‖2 b_C (2)
where T and C represent the mean texture feature vector and mean colour feature vector of the training samples, d denotes the comprehensive feature value of the training samples, (i, j) is the position of a unit image object in the image, d(i, j) is the comprehensive feature value of the unit image object at position (i, j), ‖T(i, j)‖2 and ‖C(i, j)‖2 represent the 2-norms of its texture and colour feature vectors, and a_T, b_C are the weights of the texture and colour feature vectors.
We still take map image 1 as the test image. Figure 3 shows the recognition effect of the different models.

Figure 3: Recognition effect images from recognition models: a) colour feature model; b) texture feature model; c) texture-colour features model.

Table 1: Recognition accuracy of different feature models

Model                  Colour feature   Texture feature   Texture-colour features
Recognition accuracy   67.7 %           50.9 %            83.3 %

3. Net primary production in different vegetation types
The photosynthesis and respiration of plants are important vectors in the carbon-oxygen cycle of the planet, connecting carbon sources and storage sinks. Net primary productivity (NPP) is the organic matter accumulated by plants through photosynthesis and respiration per unit time and unit area. It directly reflects the productivity of plant communities under natural conditions, and it plays an important role in carbon sources, carbon sinks and the regulation of ecosystems, as well as in climate change and the global carbon cycle (Pan et al., 2015). Different vegetation has different productivity according to its own biological characteristics (it is assumed in this paper that all plants grow in habitable zones). The vegetation types of the research region and their annual average NPP values (Chen et al., 2002; Sun et al., 2000) are shown in Table 2.
Table 2: Simulated and measured annual NPP values of different vegetation types

Vegetation type   Annual NPP/(gC·m-2) from models and simulations   Annual NPP/(gC·m-2) from measurement
Cultivated land   573.1                                             532.9
Shrub             379.9                                             364.0

Estimating NPP with models has become an important and widely accepted approach, owing to the difficulty of measuring NPP values directly and completely on regional or global scales. To allow for the error, the means of the modelled and measured NPP values are taken to calculate the carbon storage of the vegetation recognized from the map images in this paper.

4. Result analysis
We pick another test image, map image 2, for identification and classification; it includes more abundant vegetation types than image 1. The identification and classification effect maps are then obtained. To reach the optimal recognition and classification effect, the weights and parameters are adjusted to improve the recognition accuracy, and the weights are analysed for both map images. As described in the sections above, the nearest neighbour method is adopted to match and classify the pre-treated map images. Figure 4 contains the original map image 2 as well as its recognition effect images from the single feature models and the texture-colour features model.

Figure 4: Recognition effect images of map image 2 from feature models: a) map image 2; b) colour feature model; c) texture feature model; d) texture-colour features model.

Comparing the three effect images of map image 2, two generated from the single feature models and one from the texture-colour feature model, it can be seen that with only one feature the recognition accuracy is obviously lower than with the texture and colour weight model. This is because, when vegetation accounts for a large part of the image, the texture feature difference between different vegetation types is small.
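Putting Eq(1), Eq(2) and the thresholded nearest-neighbour matching together, the classification of a single unit can be sketched as follows. The threshold value and the helper names here are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def comprehensive_value(tex_vec, col_vec, a_t=0.5, b_c=0.5):
    """Eq(1)/Eq(2): weighted sum of the 2-norms of the texture and colour
    feature vectors, d = ||T||2 * a_T + ||C||2 * b_C."""
    return a_t * np.linalg.norm(tex_vec) + b_c * np.linalg.norm(col_vec)

def classify_unit(tex_vec, col_vec, references, a_t=0.5, b_c=0.5, threshold=0.2):
    """Match one unit against per-class reference features (the mean texture
    and colour vectors of the training samples).  The unit is assigned to the
    nearest class whose comprehensive value lies within the threshold;
    otherwise it is labelled 'other' (non-vegetation ground features)."""
    d_unit = comprehensive_value(tex_vec, col_vec, a_t, b_c)
    best, best_gap = "other", threshold
    for name, (t_ref, c_ref) in references.items():
        gap = abs(d_unit - comprehensive_value(t_ref, c_ref, a_t, b_c))
        if gap < best_gap:
            best, best_gap = name, gap
    return best
```

Sweeping a_t over 0.3-0.8 with b_c = 1 - a_t reproduces the kind of weight study reported in Table 3.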
It is also easy to see that there are more misjudgements at boundaries, on account of the complex surface features at borders. During the recognition and classification process, different colour and texture weights have different effects. To get a better result, various values of a_T and b_C (the weights of the texture and colour features respectively, with a_T + b_C = 1) are substituted into the map images for recognition and classification. The recognition accuracies of the two images are then obtained through statistics and analysis. The optimal effect image and the recognition accuracy as the parameters vary are shown in Figure 4c) and Table 3.

Table 3: Recognition precision

Map image     a_T value   Vegetation Y   Vegetation G   Others   Recognition accuracy
Map image 1   0.8         765            222            37       74.7 %
              0.7         799            186            39       78.0 %
              0.6         831            153            40       81.2 %
              0.5         853            131            40       83.3 %
              0.4         847            94             83       82.7 %
              0.3         683            76             265      66.7 %
Map image 2   0.8         343            541            140      74.8 %
              0.7         310            592            122      77.3 %
              0.6         294            614            116      80.7 %
              0.5         246            673            105      81.6 %
              0.4         318            563            143      79.2 %
              0.3         387            458            179      72.1 %

From the data in Table 3, it can be seen that the recognition and classification effect of the map images is good when the weighted value of the colour feature is 0.4-0.6 and the weighted value of the texture feature is 0.4-0.6. The figures and the data in Table 3 also show that one feature alone cannot recognize vegetation effectively, because the texture or colour feature values of different types are too close. Through visual interpretation, vegetation G and Y are identified as cultivated land and shrub. Combined with the annual NPP values of the different vegetation types, we work out the carbon storage of the vegetation in a map image with a pixel size of 512*512, a resolution of 0.2986 m/pixel and a scale of 1:1,128.5. Table 4 gives the carbon storage of the vegetation in the map images, each covering an area of 23,373 m2.
Table 4: Carbon storage (kg) of vegetation from different models

Map image     Colour feature model   Texture feature model   Texture-colour features model
Map image 1   6,518.54               5,799.86                11,942.15
Map image 2   10,074.82              4,970.82                9,009.15

From Table 4, we can see that the carbon storage estimates differ considerably between models, which means a more precise recognition model has great value for carbon storage calculation.

5. Conclusions
In this study, high resolution map images are used; compared with remote sensing images and UAV images, they offer more available data and higher spatial resolution. Vegetation information is extracted from them with feature models. The experiments show that, in HSV colour space, the texture-colour weight model provides higher recognition accuracy than a single feature model. Through the weight model, a rough recognition and classification is obtained, with up to 83.3 % recognition accuracy. Combined with net primary productivity values from models and measurements, the carbon storage is further estimated. Taking this research as an example, we can know the vegetation coverage, types and carbon storage in a map image. On this basis, the study of vegetation recognition and classification can be extended to larger areas such as the whole of Shanghai: once higher resolution map images of city areas are processed in batches, a rough coverage and classification of Shanghai can be obtained and the total carbon storage of its vegetation worked out. In future studies, we will continue to increase the accuracy of recognition and classification and make finer distinctions between vegetation types, such as trees, bushes, crops and grass. The final results will be combined with net primary productivity to estimate vegetation carbon storage and track the carbon footprints of vegetation in city areas.
Acknowledgments
This paper is supported by Shanghai Science and Technology Committee key R&D projects (No.15DZ1170600); National Key R&D Program (No.2016YFA060260; No.2016YFA0602602); Chinese Academy of Sciences STS network projects (No.KFJ-EW-STS-140); Chinese Academy of Sciences Youth Innovation Promotion Association Funding.

References
Du Y.W., Huang T.Z., Song S.B., Li C.H., 2016, Study on the effects of the industrial structure evolution on carbon emissions: a case study, Chemical Engineering Transactions, 51, 1177-1182, DOI:10.3303/CET1651197.
Shoshany M., 2000, Satellite remote sensing of natural Mediterranean vegetation: a review within an ecological context, Progress in Physical Geography, 24(2), 153-178.
Dlamini W.M., 2010, Multispectral detection of invasive alien plants from very high resolution 8-band satellite imagery using probabilistic graphical models, DigitalGlobe 8 Bands Research Challenge, 1-17.
Mansour K., Mutanga O., Everson T., 2012, Remote sensing based indicators of vegetation species for assessing rangeland degradation: opportunities and challenges, African Journal of Agricultural Research, 7(22), 3261-3270.
Di P.H., Wang X., 2015, The research on the feature extraction of sunflower leaf rust characteristics based on color and texture feature, 2015 International Conference on Computational Intelligence and Communication Networks (CICN), Jabalpur, 457-460.
Manjunath B.S., Ohm J.R., Vasudevan V.V., Yamada A., 2001, Color and texture descriptors, IEEE Transactions on Circuits and Systems for Video Technology, 11(6), 703-715.
Tamura H., Mori S., Yamawaki T., 1978, Textural features corresponding to visual perception, IEEE Transactions on Systems, Man and Cybernetics, 8(6), 460-473.
Wang S.J., 2012, Application of Tamura texture feature to classify underwater targets, Applied Acoustics, 31(2), 135-139.
Pan S.F., Tian H., Lu C., Dangal S.R.S., Liu M., 2015,
Net primary production of major plant functional types in China: vegetation classification and ecosystem simulation, Acta Ecologica Sinica, 35(2), 28-36.
Chen L.J., Liu G.H., Li H.G., 2002, Estimating net primary productivity of terrestrial vegetation in China using remote sensing, Journal of Remote Sensing, 6(2), 129-135.
Sun R., Zhu Q.J., 2000, Distribution and seasonal change of net primary productivity in China from April 1992 to March 1993, Acta Geographica Sinica, (01), 36-45.