IHSCICONF 2017 Special Issue, Ibn Al-Haitham Journal for Pure and Applied Science
https://doi.org/10.30526/2017.IHSCICONF.1785
For more information about the Conference please visit the websites: http://www.ihsciconf.org/conf/ www.ihsciconf.org
Computer | 523

Comparison of Features Extraction Algorithms Used in the Diagnosis of Plant Diseases

Mohammed A. Hussein (mohammed.alsraf@gmail.com)
Amel H. Abbas (dr.amelhussein2017@uomustansiriyah.edu.iq)
Computer Science / College of Science / University of Mustansiriyah

Abstract

The detection of diseases affecting plants is very important because it relates to food security, and threats to food security are a serious danger to human life. A disease diagnosis system involves a series of steps: image acquisition, pre-processing, segmentation, feature extraction (the subject of this paper) and finally classification. Feature extraction is a very important stage in any diagnostic system; it can be compared to the spine of such a system. The reason for this importance is that feature extraction strongly affects the operation and accuracy of classification. Selecting the right features leads to high diagnostic accuracy, and a poor selection lowers it.
The proposed system collects images of diseases of different crops (rice, cotton and tomato). The images are cropped, resized to a fixed size, and enhanced using fuzzy histogram equalization (FHE); they are then segmented using color-based K-means. Finally, the feature extraction methods are compared: Percentage of Leaf Area Infected (PI), texture-based features, color moments, features obtained by the Color Co-occurrence Method, and shape-based features. We found that using four methods together (PI, texture-based features, color moments and shape-based features) produces excellent results.

Keywords: Plant Leaf, Feature Extraction, Texture, Color Moments, Segmentation.

1. Introduction

Many diseases affect crops and cause significant production losses, which threatens food security. Visual examination with the naked eye is the most widely used detection method. It leaves considerable room for error, because farmers try to identify the disease by visual inspection alone; in some cases they resort to experts, which costs a lot of time, effort and money. A further problem in Iraq is that most crop fields are located in rural areas, so farmers must travel long distances to find experts [16]. Image processing offers accuracy and high speed, and it avoids the large expense and delay involved in bringing in experts [17].

Feature extraction is an important, key task in a plant disease detection algorithm.
Feature extraction reduces the amount of data required to describe the large amount of information contained in an image. A large amount of information consumes memory and processor time, so we need the smallest quantity of information that can still describe the image as well as possible. This can be achieved by feature extraction. Given the importance of feature extraction in the classification of plant diseases, a lot of research has focused on this subject, for example [7], [13], [14]. There are many ways to extract features; this research focuses on comparing the most important methods used for this purpose:

1- Percentage of Leaf Area Infected (PI)
2- Texture-Based Features, Gray Level Co-Occurrence Matrix (GLCM)
3- Color Moments (CM)
4- Features obtained by Color Co-occurrence Method (CCM)
5- Shape based Features (SF)

The steps followed for the proposed comparison process appear in Figure 1.

Figure (1): Research Block Diagram

2. Image Acquisition

This paper uses images of the most important agricultural products in Iraq (rice, cotton and tomato). The images were obtained from the Internet. Twelve images were used: two diseases for each of the three crop types, with two images per disease. All images are in the RGB color space and in JPG format. Figure (2) shows some examples of these images.
Figure (2): (a) Rice brown spot, (b) Rice leaf smut, (c) Cotton bacterial blight, (d) Cotton spider mite, (e) Tomato septoria leaf spot and (f) Tomato target spot

3. Preprocessing

In the pre-processing phase, a set of operations is performed to prepare the image for the segmentation process. These operations include:

3.1 Crop the image: The captured image contains about 30% infected plant leaf information; the remaining 70% is unimportant because it represents the background. This background consumes memory unnecessarily, and it also consumes CPU processing time during segmentation. Because storage efficiency and processing speed are important, we cut out the relevant portion of the image by cropping, as shown in Figure (3).

3.2 Image Resize: All images are resized to a fixed size (300*400). The same size is used for all imported images because the accuracy of feature extraction is affected if the images have different sizes.

Figure (3): (a), (b) rice diseases after resize and crop, (c), (d) cotton diseases after resize and crop, (e) and (f) tomato diseases after resize and crop

3.3 Image enhancement: To enhance the image, this paper uses fuzzy histogram equalization, based on Equation (1); the result is shown in Figure 4.
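As a rough illustration of the fuzzy histogram used by FHE, the following Python sketch accumulates a membership value for every gray level close to each pixel value. The triangular membership function and its half-width `spread=4` are assumptions chosen for illustration, as are the function and parameter names; the sketch covers only the histogram computation, not the equalization of the sub-histograms.

```python
def fuzzy_histogram(pixels, levels=256, spread=4):
    """Accumulate a fuzzy histogram over integer gray levels.

    Each pixel with gray level g contributes max(0, 1 - |g - i| / spread)
    to every bin i. The triangular membership and `spread` are assumptions
    made for this sketch.
    """
    hist = [0.0] * levels
    for g in pixels:
        # Only bins within the support of the triangle receive a contribution.
        lo = max(0, g - spread + 1)
        hi = min(levels - 1, g + spread - 1)
        for i in range(lo, hi + 1):
            hist[i] += max(0.0, 1.0 - abs(g - i) / spread)
    return hist


# A single pixel of gray level 10 spreads its count over levels 7..13.
example = fuzzy_histogram([10])
```

In a crisp histogram the same pixel would add 1 to bin 10 only; here neighbouring bins also receive a share, which is how the fuzzy histogram accounts for inexact gray-level values.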
Fuzzy histogram equalization (FHE) was proposed for image contrast enhancement [1]. FHE consists of two stages. First, a fuzzy histogram is computed based on fuzzy set theory, which handles the inexactness of gray-level values better than classical crisp histograms. Second, the fuzzy histogram is divided into two sub-histograms based on the median value of the original image, and the two are equalized independently to preserve image brightness. The fuzzy histogram is computed as

$$ h(i) \leftarrow h(i) + \sum_{x} \sum_{y} \mu_{\tilde{I}(x,y)\,i} \tag{1} $$

where $h(i)$ is a sequence of real numbers and $\mu_{\tilde{I}(x,y)\,i}$ is the fuzzy membership function.

Figure (4): (a), (b) rice diseases after FHE, (c), (d) cotton diseases after FHE, (e) and (f) tomato diseases after FHE

4. Segmentation

Image segmentation is an important and fundamental process, because feature extraction depends on it. Segmentation separates the region of interest (ROI) from the rest of the image. This research uses the color-based K-means segmentation algorithm, based on Equation (2); the result is shown in Figure 5. Color was chosen because plant diseases can be clearly distinguished by their colors, which makes it easier to separate the affected parts from the rest of the image. K-means is one of the simplest unsupervised learning algorithms that solve the well-known clustering problem [3].
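A minimal sketch of color-based K-means may help here. It clusters a handful of RGB pixel values into two groups and reports the sum of squared distances to the assigned centers, the quantity that the K-means objective of Equation (2) measures. The pixel values, the function names and the deterministic choice of initial centers are all assumptions made for illustration; the paper does not state which color space or initialization was used.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two color tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest(p, centers):
    """Index of the center closest to point p."""
    return min(range(len(centers)), key=lambda c: squared_distance(p, centers[c]))


def kmeans(points, k, max_iters=50):
    # Evenly spaced initial centers: a simple deterministic choice (assumption).
    centers = [points[i * len(points) // k] for i in range(k)]
    for _ in range(max_iters):
        labels = [nearest(p, centers) for p in points]
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                # New center is the per-channel mean of the cluster members.
                new_centers.append(tuple(sum(ch) / len(members) for ch in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep the center of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    labels = [nearest(p, centers) for p in points]
    objective = sum(squared_distance(p, centers[lab]) for p, lab in zip(points, labels))
    return centers, labels, objective


# Made-up RGB values: three greenish (healthy) and three brownish (lesion) pixels.
pixels = [(30, 120, 40), (35, 125, 45), (28, 118, 38),
          (120, 70, 20), (125, 75, 25), (118, 68, 22)]
centers, labels, objective = kmeans(pixels, k=2)
```

On these values the greenish and brownish pixels end up in separate clusters, which is the behaviour that color-based segmentation of diseased regions relies on.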
The K-means clustering algorithm is used to segment the leaf image into one cluster [4, 6] or more than one [5]. It minimizes the objective function

$$ J(V) = \sum_{i=1}^{C} \sum_{j=1}^{C_i} \left( \lVert x_i - v_j \rVert \right)^2 \tag{2} $$

where $\lVert x_i - v_j \rVert$ is the Euclidean distance between $x_i$ and $v_j$, $C_i$ is the number of data points in the $i$th cluster, and $C$ is the number of cluster centers.

Figure (5): (a), (b) rice diseases after K-means segmentation, (c), (d) cotton diseases after K-means segmentation, (e) and (f) tomato diseases after K-means segmentation.

5. Features Extraction

5.1 Percentage of Leaf Area Infected (PI): Image segmentation by the method described above labels some pixels as belonging to the diseased part of the leaf, while the others lie in the healthy part. We can therefore measure the area of the diseased part and the total area of the leaf, and then estimate the percentage of infected area using Equation (3), as in [7]:

$$ PI = \frac{A_D}{A_L} \times 100 \tag{3} $$

where $A_D$ is the area of the diseased part and $A_L$ is the total leaf area. PI has been used as a feature in many previous studies, on crops including cassava [7], pomegranate [8] and sugarcane [9] leaf images.

5.2 Texture-Based Features: For feature extraction based on image texture, we used the gray level co-occurrence matrix (GLCM) [10, 11].
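To make the co-occurrence idea concrete, the sketch below builds a symmetric, normalized GLCM for a tiny made-up image and computes the four texture features named in this section (contrast, correlation, energy and homogeneity). The offset `(dx, dy)` stands in for a distance and direction; the function names and the exact feature formulas are assumptions based on commonly used definitions, not the authors' implementation.

```python
def glcm(image, dx=1, dy=0, levels=None):
    """Symmetric, normalized co-occurrence matrix for pixel offset (dx, dy)."""
    rows, cols = len(image), len(image[0])
    if levels is None:
        levels = max(max(row) for row in image) + 1
    counts = [[0] * levels for _ in range(levels)]
    for y in range(rows):
        for x in range(cols):
            x2, y2 = x + dx, y + dy
            if 0 <= x2 < cols and 0 <= y2 < rows:
                i, j = image[y][x], image[y2][x2]
                counts[i][j] += 1
                counts[j][i] += 1  # count the pair in both directions (symmetric)
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("no pixel pairs for this offset")
    return [[c / total for c in row] for row in counts]


def texture_features(p):
    """Contrast, correlation, energy and homogeneity of a normalized GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    mu = sum(i * v for i, _, v in cells)                # GLCM mean
    var = sum(v * (i - mu) ** 2 for i, _, v in cells)   # GLCM variance
    covariance = sum(v * (i - mu) * (j - mu) for i, j, v in cells)
    return {
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "correlation": covariance / var if var > 0 else float("nan"),
        "energy": sum(v ** 2 for _, _, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
    }


# Two gray levels alternating horizontally: every neighbouring pair differs.
image = [[0, 1],
         [0, 1]]
features = texture_features(glcm(image))
```

Because every horizontal neighbour pair in this image has different gray levels, the contrast is high and the correlation is negative; a uniform region would instead give zero contrast and an energy of 1.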
The co-occurrence matrix $C(i,j)$ counts the number of co-occurrences of pixels with gray levels $i$ and $j$, respectively, at a given distance $d$. The matrix [12] is given by:

$$ C(i,j) = \operatorname{Card}\{((x_1,y_1),(x_2,y_2)) : f(x_1,y_1)=i,\ f(x_2,y_2)=j,\ (x_2,y_2)=(x_1,y_1)+(d\cos\theta,\ d\sin\theta)\} \tag{4} $$

where $f(x,y)$ is the gray level of the pixel at $(x,y)$ and $d$ is the distance defined in polar coordinates $(d,\theta)$ with discrete length and orientation. $\theta$ takes the values 0°, 45°, 90°, 135°, 180°, 225°, 270° and 315°. Card{} represents the number of elements present in the set. Equations (5), (6), (7) and (8) give the features extracted from the GLCM [13].

Contrast feature

$$ \text{Contrast} = \sum_{i,j=0}^{N-1} P_{ij}\,(i-j)^2 \tag{5} $$

Correlation feature

$$ \text{Correlation} = \sum_{i,j=0}^{N-1} P_{ij}\,\frac{(i-\mu)(j-\mu)}{\sigma^2} \tag{6} $$

Energy feature

$$ \text{Energy} = \sum_{i,j=0}^{N-1} P_{ij}^{2} \tag{7} $$

Homogeneity feature

$$ \text{Homogeneity} = \sum_{i,j=0}^{N-1} \frac{P_{ij}}{1+(i-j)^2} \tag{8} $$

Where:
$P_{ij}$ = element $i,j$ of the normalized symmetrical GLCM
$N$ = number of gray levels
$\mu$ = the GLCM mean (an estimate of the intensity of all pixels in the relationships that contributed to the GLCM)
$\sigma^2$ = the variance of the intensities of all reference pixels in the relationships that contributed to the GLCM

calculated as:

$$ \mu = \sum_{i,j=0}^{N-1} i\,P_{ij}, \qquad \sigma^2 = \sum_{i,j=0}^{N-1} P_{ij}\,(i-\mu)^2 $$

5.3 Color Moments: To extract features based on color [28], we used four color moments (mean, standard deviation, kurtosis and skewness), as in [14]. In the following, $p_k$ is the value of the $k$-th pixel and $n$ is the number of pixels.

Mean

$$ m = \frac{1}{n}\sum_{k=1}^{n} p_k \tag{9} $$

Standard Deviation

$$ s = \sqrt{\frac{1}{n}\sum_{k=1}^{n} (p_k - m)^2} \tag{10} $$

Kurtosis

$$ \text{Kurtosis} = \frac{\frac{1}{n}\sum_{k=1}^{n} (p_k - m)^4}{s^4} \tag{11} $$

Skewness

$$ \text{Skewness} = \frac{\frac{1}{n}\sum_{k=1}^{n} (p_k - m)^3}{s^3} \tag{12} $$

5.4 Color Co-occurrence: This method uses the same texture features as Section 5.2, but instead of the gray-level image it uses only the H (hue) component of the image in the Hue, Saturation, Value (HSV) color space.

5.5 Shape Features: Various processes and techniques for shape representation are summarized in [15].
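For the color moments of Section 5.3, a small worked sketch may be useful. It computes the mean, standard deviation, kurtosis and skewness of a list of pixel values, using population (divide-by-n) moments and the standardized forms of kurtosis and skewness. Those normalization choices and the function name are assumptions for illustration; the paper does not specify them.

```python
import math


def color_moments(values):
    """Mean, standard deviation, kurtosis and skewness of pixel values.

    Uses population moments (division by n); this is an assumption of the sketch.
    """
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if sd == 0:
        # A constant channel has no spread, so the higher moments are undefined.
        return {"mean": mean, "sd": 0.0,
                "kurtosis": float("nan"), "skewness": float("nan")}
    third = sum((v - mean) ** 3 for v in values) / n
    fourth = sum((v - mean) ** 4 for v in values) / n
    return {"mean": mean, "sd": sd,
            "kurtosis": fourth / sd ** 4, "skewness": third / sd ** 3}


# Mostly dark pixels with one bright value: a right-skewed distribution.
moments = color_moments([0, 0, 0, 4])
```

For this input the mean is 1, the variance is 3, the skewness is 2/√3 (about 1.15) and the kurtosis is 21/9 (about 2.33). In practice the function would be applied to each color channel of the segmented image separately.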
Shape descriptors such as the area of an object, the number of objects, the centroid of an object and the perimeter are important characteristics for describing object shape, and they are used in this research.

6. Compare Results of Features Extraction

In this research, we used images of three types of crops, with two types of diseases per crop and two images for each disease. The research compares five types of feature extraction methods: PI (Table 1), GLCM (Table 2), Color Moments (Table 3), Color Co-occurrence Method (Table 4) and Shape based Features (Table 5).

1- Percentage of Leaf Area Infected (PI)

Table (1): Percentage of Leaf Area Infected (PI) results

| Disease | PI | Image No. |
|---|---|---|
| Rice Brown Spot | 6.1586 | 1 |
| Rice Brown Spot | 5.0381 | 2 |
| Rice Leaf Smut | 10.8805 | 1 |
| Rice Leaf Smut | 17.7698 | 2 |
| Cotton Bacterial Blight | 8.6376 | 1 |
| Cotton Bacterial Blight | 8.989 | 2 |
| Cotton Spider Mite | 47.1679 | 1 |
| Cotton Spider Mite | 66.5836 | 2 |
| Tomato Septoria Leaf Spot | 8.746 | 1 |
| Tomato Septoria Leaf Spot | 14.4644 | 2 |
| Tomato Target Spot | 21.4469 | 1 |
| Tomato Target Spot | 10.4172 | 2 |

2- Texture-Based Features

Table (2): Texture-Based Features (Homogeneity, Energy, Correlation and Contrast) results

| Disease | Homogeneity | Energy | Correlation | Contrast | Image No. |
|---|---|---|---|---|---|
| Rice Brown Spot | 0.98 | 0.86 | 0.89 | 0.08 | 1 |
| Rice Brown Spot | 0.99 | 0.86 | 0.88 | 0.07 | 2 |
| Rice Leaf Smut | 0.95 | 0.72 | 0.8 | 0.24 | 1 |
| Rice Leaf Smut | 0.94 | 0.62 | 0.88 | 0.31 | 2 |
| Cotton Bacterial Blight | 0.99 | 0.79 | 0.92 | 0.08 | 1 |
| Cotton Bacterial Blight | 0.98 | 0.84 | 0.9 | 0.18 | 2 |
| Cotton Spider Mite | 0.84 | 0.38 | 0.76 | 2.33 | 1 |
| Cotton Spider Mite | 0.84 | 0.29 | 0.86 | 1.6 | 2 |
| Tomato Septoria Leaf Spot | 0.98 | 0.79 | 0.95 | 0.07 | 1 |
| Tomato Septoria Leaf Spot | 0.94 | 0.74 | 0.83 | 0.44 | 2 |
| Tomato Target Spot | 0.95 | 0.59 | 0.94 | 0.21 | 1 |
| Tomato Target Spot | 0.98 | 0.82 | 0.97 | 0.06 | 2 |

3- Color Moments

Table (3): Color Moments Features (Mean, Standard Deviation, Kurtosis and Skewness)

| Disease | Mean | SD | Kurtosis | Skewness | Image No. |
|---|---|---|---|---|---|
| Rice Brown Spot | 6.1586 | 23.36 | 24.0356 | 4.5139 | 1 |
| Rice Brown Spot | 5.0381 | 19.677 | 23.8195 | 4.5248 | 2 |
| Rice Leaf Smut | 10.881 | 28.26 | 16.5848 | 3.5172 | 1 |
| Rice Leaf Smut | 17.77 | 39.89 | 15.8211 | 3.3723 | 2 |
| Cotton Bacterial Blight | 8.6376 | 27.33 | 20.4906 | 3.9337 | 1 |
| Cotton Bacterial Blight | 8.989 | 34.11 | 26.9211 | 4.6906 | 2 |
| Cotton Spider Mite | 47.168 | 72.71 | 3.3453 | 1.3519 | 1 |
| Cotton Spider Mite | 66.584 | 79.24 | 1.9852 | 0.7062 | 2 |
| Tomato Septoria Leaf Spot | 8.746 | 28.3 | 19.621 | 3.9957 | 1 |
| Tomato Septoria Leaf Spot | 14.464 | 39.37 | 15.4118 | 3.5136 | 2 |
| Tomato Target Spot | 21.447 | 45.86 | 7.09 | 2.2133 | 1 |
| Tomato Target Spot | 10.417 | 34.26 | 20.4903 | 4.114 | 2 |

4- Features obtained by Color Co-occurrence Method

Table (4): Features obtained by Color Co-occurrence Method for hue component only (Contrast, Correlation, Energy and Homogeneity)

| Disease | Contrast | Correlation | Energy | Homogeneity | Image No. |
|---|---|---|---|---|---|
| Rice Brown Spot | 0.757 | 0.81 | 0.71 | 0.956 | 1 |
| Rice Brown Spot | 0.7752 | 0.78 | 0.6873 | 0.952 | 2 |
| Rice Leaf Smut | 1.675 | 0.75 | 0.434 | 0.894 | 1 |
| Rice Leaf Smut | 1.982 | 0.75 | 0.332 | 0.88 | 2 |
| Cotton Bacterial Blight | 0.395 | 0.94 | 0.749 | 0.979 | 1 |
| Cotton Bacterial Blight | 0.423 | 0.81 | 0.828 | 0.974 | 2 |
| Cotton Spider Mite | 1.405 | 0.69 | 0.297 | 0.89 | 1 |
| Cotton Spider Mite | 0.926 | 0.75 | 0.39 | 0.924 | 2 |
| Tomato Septoria Leaf Spot | 0.642 | 0.82 | 0.611 | 0.954 | 1 |
| Tomato Septoria Leaf Spot | 1.507 | 0.75 | 0.337 | 0.893 | 2 |
| Tomato Target Spot | 0.33 | 0.81 | 0.799 | 0.975 | 1 |
| Tomato Target Spot | 0.458 | 0.82 | 0.694 | 0.972 | 2 |

5- Shape based Features

Table (5): Features obtained by Shape (Area, Perimeter, Number of Objects and Centroid) results

| Disease | Area | Perimeter | Number of objects | Centroid (x, y) | Image No. |
|---|---|---|---|---|---|
| Rice Brown Spot | 10468 | 3269 | 369 | 192.07, 148.68 | 1 |
| Rice Brown Spot | 9391.1 | 2620 | 268 | 182.72, 185.96 | 2 |
| Rice Leaf Smut | 23624 | 9636 | 877 | 213.26, 118.19 | 1 |
| Rice Leaf Smut | 34355 | 14181 | 1240 | 218.29, 107.3 | 2 |
| Cotton Bacterial Blight | 14195 | 2174 | 193 | 237.28, 209.29 | 1 |
| Cotton Bacterial Blight | 11032 | 2793 | 341 | 234.97, 162.8 | 2 |
| Cotton Spider Mite | 53336 | 15025 | 2106 | 230.72, 164.82 | 1 |
| Cotton Spider Mite | 68006 | 12773 | 1547 | 188.47, 154.48 | 2 |
| Tomato Septoria Leaf Spot | 15501 | 3852 | 274 | 210.45, 108 | 1 |
| Tomato Septoria Leaf Spot | 27024 | 10266 | 818 | 229.41, 141.42 | 2 |
| Tomato Target Spot | 29170 | 1533 | 205 | 264.45, 150.77 | 1 |
| Tomato Target Spot | 21604 | 3112 | 110 | 164.62, 186.76 | 2 |

7. Discussion of Results

Each method is assessed by the number of its features that give good results, where a good result means a feature that shows a clear difference between the two diseases of the same crop. If a method gives good results for all of its features in both images, we say the method is 100% good; if it gives good results for three features out of four, we say it is 75% good, and so on.

7.1 Rice disease
1. The PI feature gave good results by 100%
2. The GLCM features gave good results by 75%
3. The Color Moment features gave good results by 75%
4. The Color Co-occurrence features gave good results by 50%
5. The Shape features gave good results by 75%

7.2 Cotton disease
1. The PI feature gave good results by 100%
2. The GLCM features gave good results by 75%
3. The Color Moment features gave good results by 100%
4. The Color Co-occurrence features gave good results by 50%
5. The Shape features gave good results by 75%

7.3 Tomato disease
1. The PI feature gave good results by 100%
2. The GLCM features gave good results by 50%
3. The Color Moment features gave good results by 50%
4. The Color Co-occurrence features gave good results by 50%
5. The Shape features gave good results by 75%

7.4 All diseases

To calculate the accuracy of each method over all diseases, we sum its percentages for the three crops and divide by three (the number of crops). The results over all diseases were as follows: PI (100%), GLCM (66.6%), color moments (66.6%), Color Co-occurrence (50%), Shape Features (75%).

8. Conclusion

This research found that combining the features PI, GLCM, color moments and shape features gives excellent results for all types of plants considered in this research.
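The averaging rule of Section 7.4 can be checked with a few lines of Python. The per-crop percentages below are copied from Sections 7.1 to 7.3; the dictionary layout and variable names are only illustrative.

```python
# Per-crop scores (percent) as listed in Sections 7.1, 7.2 and 7.3.
scores = {
    "PI":                  {"rice": 100, "cotton": 100, "tomato": 100},
    "GLCM":                {"rice": 75,  "cotton": 75,  "tomato": 50},
    "Color moments":       {"rice": 75,  "cotton": 100, "tomato": 50},
    "Color co-occurrence": {"rice": 50,  "cotton": 50,  "tomato": 50},
    "Shape":               {"rice": 75,  "cotton": 75,  "tomato": 75},
}

# Section 7.4: sum the three per-crop values and divide by the number of crops.
overall = {method: sum(per_crop.values()) / len(per_crop)
           for method, per_crop in scores.items()}

for method, value in overall.items():
    print(f"{method}: {value:.1f}%")
```

Applied to the listed per-crop values, this rule reproduces the reported overall figures for PI, GLCM, color co-occurrence and shape features. For color moments it yields (75 + 100 + 50) / 3 = 75.0%, which differs from the 66.6% stated in Section 7.4, so either that figure or one of the per-crop values may need to be rechecked.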
9. Future Work

Future work will compare other feature extraction methods that have not yet been used in this type of research.

References

[1] V. Magudeeswaran and C. Ravichandran (2013), "Fuzzy Logic-Based Histogram Equalization for Image Contrast Enhancement", Mathematical Problems in Engineering, vol. 2013, Article ID 891864, 10 pages.
[2] D. Sheet, H. Garud, A. Suveer, J. Chatterjee and M. Mahadevappa (2010), "Brightness Preserving Dynamic Fuzzy Histogram Equalization", IEEE Trans. Consumer Electronics, 56, no. 4, 2475-2480.
[3] D. Majumdar, D. Kumar, A. Chakraborty and D. Dutta (2014), "Detection & Diagnosis of Plant Leaf Disease Using Integrated Image Processing Approach", International Journal of Computer Engineering and Applications, VI-III.
[4] S. Sannaki, V. Rajpurohit, V. Nargund, A. Kumar and P. Yallur (2010), "Leaf Disease Grading by Machine Vision and Fuzzy Logic", Int. J. Comp. Tech. Appl., 2 (5), 1709-1716:2229-609.
[5] D. Bashish, M. Braik and S. Bani-Ahmad (2010), "A Framework for Detection and Classification of Plant Leaf and Stem Diseases", IEEE.
[6] M. Badnake and P. Deshmukh (2012), "Infected Leaf Analysis and Comparison by Otsu Threshold and k-means Clustering", International Journal of Advanced Research in Computer Science and Software Engineering, 2, 3.
[7] K. Powbunthorn, W. Abudullakasim and J. Unartngam (2012), "Assessment of the Severity of Brown Leaf Spot Disease in Cassava using Image Analysis", The International Conference of the Thai Society of Agricultural Engineering.
[8] S. Sannaki, V. Rajpurohit, V. Nargund, A. Kumar and P. Yallur (2010), "Leaf Disease Grading by Machine Vision and Fuzzy Logic", Int. J. Comp. Tech. Appl., 2 (5), 1709-1716:2229-609.
[9] S. Patil and S. Bodhe (2011), "Leaf Disease Severity Measurement Using Image Processing", International Journal of Engineering and Technology, 3 (5), 297-301.
[10] D. Majumder and B. Chanda (2007), "Digital Image Processing and Analysis", Prentice Hall of India Private Limited.
[11] R. Haralick, K. Shanmugam and I. Dinstein (1973), "Textural Features for Image Classification", IEEE Trans. Syst. Man Cybern., 3, 6, 610-621, Nov. 1973.
[12] Zhang and F. Zhang (2008), "Congress on Image and Signal Processing", IEEE Computer Society, 773-776.
[13] K. Xu and Y. Li (2005), "An image search approach based on local main color feature and texture feature", Journal of Xi'an Shiyou University (Natural Science Edition), 20 (2), 77-79.
[14] H. Yu, M. Li, H. Zhang and J. Feng (2002), "Color Texture Moments for Content-Based Image Retrieval", Proc. IEEE Intl Conf. on Image Processing, pp. 929-932.
[15] S. Patil and S. Bodhe (2011), "Leaf Disease Severity Measurement Using Image Processing", International Journal of Engineering and Technology, 3 (5), 297-301.
[16] M. Saade (2012), "Iraq Agriculture Sector Note", Food and Agriculture Organization of the United Nations (FAO).
[17] S. Pramod, A. Sushil, S. Dhanashree, D. Omkar and G. Utkarsha (2013), "Automatic Detection and Classification of Plant Disease through Image Processing", International Journal of Advanced Research in Computer Science and Software Engineering, 3, 7.