IHJPAS. 36(1)2023
This work is licensed under a Creative Commons Attribution 4.0 International License.

Studying the Classification of Texture Images by K-Means of Co-Occurrence Matrix and Confusion Matrix

Abstract
In this research, a group of gray texture images from the Brodatz database was studied by building a feature database of the images using the gray-level co-occurrence matrix (GLCM), with a distance of one pixel between the paired pixels and four angles (0, 45, 90, 135). The K-means classifier was used to classify the images into a number of classes, from two to eight, for each angle used in the co-occurrence matrix. The distribution of the images over the classes was compared between every two methods (the projection of one class onto another); the distribution of images was uneven, with one class being the dominant one. The classification results were studied for all cases using the confusion matrix between every two cases (two different angles with the same number of classes), and the agreement percentage between the classification results of the various methods was calculated.
Keywords: K-Means, Feature Extraction, Confusion Matrix, Agreement Percent, Class Projection.

1. Introduction
Image processing is one of the primary areas employed in various applications. It may be characterized as a method through which primitive images are improved. The sources of these images are cameras, remote sensors on satellites, or images used for medical diagnostics. In recent years, advancements in image processing have been made using a variety of methods, and their influence on the capability to improve images, whether for military reconnaissance missions, space probes, or other applications, has been significant. Image processing comprises several distinct methods that can be distinguished from one another, such as feature extraction and classification.
These features carry important and unique information about digital images, and together they help their respective classifiers obtain the best possible results [1-2].

doi.org/10.30526/36.1.2894
Article history: Received 19 June 2022, Accepted 7 August 2022, Published in January 2023.
Ibn Al-Haitham Journal for Pure and Applied Sciences. Journal homepage: jih.uobaghdad.edu.iq

Haider S. Kaduhm, Department of Physics, College of Education for Pure Sciences, Ibn Al-Haitham, University of Baghdad, Baghdad, Iraq. haidar.sadek1204a@ihcoedu.uobaghdad.edu.iq
Hameed M. Abduljabbar, Department of Physics, College of Education for Pure Sciences, Ibn Al-Haitham, University of Baghdad, Baghdad, Iraq. hameed.m.aj@ihcoedu.uobaghdad.edu.iq

The Brodatz texture database is one of the well-known global databases. It was built from the Brodatz album, so it is considered an actual benchmark for evaluating algorithms used in the segmentation and classification of texture images, because it contains homogeneous and heterogeneous materials in addition to large-scale patterns [3-4]. Texture is a feature that allows images to be extracted and organized for use in many applications, and it provides spatial information about an image's hue or intensity. The texture of isolated points cannot be explained; it is determined by the spatial organization of gray-level values in a neighborhood. Texture features describe how the image behaves. To derive these features, feature extraction techniques that help classify and identify images must be applied. Dimensionality reduction is a subset of feature extraction; its primary objective is to collect the essential features of the raw data and to represent them in a space with fewer dimensions.
When the original data is too large to be handled, the raw data in this approach is converted to a reduced description of the features, and dealing with the actual data (big data carrying more information than is needed) becomes unnecessary [5]. Texels are what a texture is composed of. A distinct definition can be given to texels: the basic units used to describe the homogeneity of images, as they appear with a certain extent and regularity [6-7]. They are the information provided by the intensity coordination in the image, or the spatial arrangement of colors [5]. Textures can describe surfaces and the properties of aerial or satellite images, biomedical images, and other types of images. With these essential properties, texture can be utilized in many applications, such as assessing product quality in industrial monitoring, finding land resources by remote sensing, and medical diagnosis using computed tomography. Thus, the texture of an image can be defined as the spatial contrast function of pixel intensities (gray values) [8-9]. Image texture describes the spatial arrangement of colors in an image; spatial variation in pixel intensities (gray values) is also used to define it. Image texture has various uses and has been the topic of much investigation by many academics, because regular images usually do not have complex backgrounds and include less information about textures. Therefore, one evident use of image texture is to select areas based on their textural qualities [9-10]. Image classification techniques are categorized into supervised and unsupervised methods [11]. Unsupervised classification aims to break the extensive data in the image into smaller units with similar characteristics [12]; it is an aggregation principle used to reveal the clustering structure of a data set [13-14].
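As a concrete illustration of such unsupervised clustering, the K-means procedure used later in this work can be sketched as follows. This is a minimal sketch, not the authors' implementation: the function names and the farthest-point initialization are our own choices, made to keep the example deterministic.

```python
import numpy as np

def init_centroids(X, k):
    # Farthest-point initialization (an assumption of this sketch):
    # start from sample 0, then repeatedly take the sample farthest
    # from the centroids chosen so far.
    idx = [0]
    for _ in range(k - 1):
        d = np.min(np.linalg.norm(X[:, None] - X[idx][None], axis=2), axis=1)
        idx.append(int(d.argmax()))
    return X[idx].astype(float).copy()

def kmeans(X, k, n_iter=100):
    """Minimal K-means: assign each sample to its nearest centroid,
    recompute centroids as class means, repeat until they settle."""
    centroids = init_centroids(X, k)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # Distance of every sample to every centroid, shape (n, k).
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        new_centroids = centroids.copy()
        for c in range(k):
            members = labels == c
            if members.any():
                new_centroids[c] = X[members].mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids
```

In this research each sample would be the feature vector of one Brodatz image, and k would range from two to eight.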
The Isodata algorithm and the K-means algorithm are considered the most popular methods used in unsupervised classification; they are also widely used in satellite data analysis [15]. In supervised classification methods, a group called the training sample, also known as the input of the analyst, plays a prominent role in the accuracy of classification, as these samples are the main factor and of high importance in supervised classification [16-18]. In statistical methods, the adopted idea is that the image's texture is determined using statistics computed over selected features from among a large group of local image characteristics. The human visual system differentiates one texture from another based on statistical features, which include first-order statistics, second-order statistics such as the gray-level co-occurrence matrix, and higher-order statistics such as the autocorrelation function [19-20]. One of the most important statistical methods is the gray-level co-occurrence matrix (GLCM), considered one of the oldest methods for extracting texture features [21]. The co-occurrence matrix Co(i, j) is used to calculate these features: it counts the number of pixel pairs in which the gray levels i and j co-occur within a certain distance [21-23]. The result is a matrix for a given distance d and direction, over every pair of intensity levels observed in the image. Classifying fine textures requires small values of the distance d, while coarse textures require a considerable distance [24]. After creating the GLCM, statistical characteristics are calculated; for example, six statistical properties can be inferred from the GLCM (contrast, correlation, energy, homogeneity, entropy, and maximum probability) [21].
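As a concrete illustration of how such a co-occurrence matrix is built, the following sketch (our own minimal implementation, not the authors' code) counts pixel pairs separated by a given distance along one of the four standard angles:

```python
import numpy as np

def glcm(img, levels, angle_deg, distance=1, symmetric=True):
    """Gray-level co-occurrence matrix: count pixel pairs (i, j)
    separated by `distance` along `angle_deg` (0, 45, 90 or 135)."""
    # Pixel offset (row, col) for the four standard angles.
    offsets = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}
    dr, dc = offsets[angle_deg]
    dr, dc = dr * distance, dc * distance
    co = np.zeros((levels, levels), dtype=np.int64)
    rows, cols = img.shape
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                co[img[r, c], img[r2, c2]] += 1
    if symmetric:
        co = co + co.T  # count each pair in both directions
    return co
```

For a real 8-bit Brodatz image, `levels` would be 256 and the loop would typically be replaced by a vectorized or library routine; the nested loops here are kept for clarity.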
This research studies the features of texture images taken from the Brodatz database for different angles, and the impact of the angle on the classification results in terms of the distribution of images over the various classes and the percentage of agreement between the methods. The whole image was used in extracting the features and in the classification process. Texture analysis and classification have been studied thoroughly because of their importance; the following is some of the research about them. The local binary pattern (LBP) texture analysis approach and the gray-level co-occurrence matrix (GLCM) method were tested and contrasted against one another. They were applied, with new integration schemes, to texture analysis of a wide range of collected industrial samples. The obtained findings were satisfactory and demonstrated effectiveness in identifying complicated industrial products with varying color and pattern distributions [24]. The spatial gray-level method was used for texture analysis. The author proposed a minor modification replacing the usual co-occurrence matrices with sum and difference histograms, since the sum and difference of two random variables over the same data are related to each other and determine the principal axes of their associated joint probability function. He presented two texture classifiers. In the first, the sum and difference histograms were treated as feature vector components, giving a fast execution that avoids the explicit evaluation of the feature vector. The other was a classifier based on general measures extracted from the histograms. He showed that the advantages of the proposed method over traditional spatial gray-level dependence analysis were the reduction of computation time as well as of memory storage [25].
A mixture was suggested based on both the co-occurrence matrix (GLCM) and the random threshold vector (RTV) matrix, due to the need for a higher texture-classification accuracy than the random threshold vector (RTV) alone provides. The methods were applied to different data sets, such as Brodatz and Outex. First, the first dimension of the feature vector was calculated; then the inverse was estimated from the co-occurrence matrices by applying the RTV method. The resulting vectors had two dimensions: one was the entropy of the proposed co-occurrence matrix, and the other was the threshold dimension. Their proposed approach showed a high-accuracy classification of textures and contained the significant discriminatory information required for successful analysis [26].

2. Methodology
The following steps were adopted in this research:
- Using the Brodatz image database as a sample to test the algorithm.
- Building a database for each angle (0, 45, 90, and 135) of the gray-level co-occurrence matrix (GLCM) method, using a distance equal to one pixel. The features adopted in describing the images were extracted using the following equations [27]:

Contrast = \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} (i - j)^2 \, p(i, j)   (1)

Correlation = \frac{1}{\sigma_x \sigma_y} \left[ \sum_{i=1}^{N} \sum_{j=1}^{N} (i \, j) \, p(i, j) - \mu_x \mu_y \right]   (2)

Entropy = -\sum_{i=1}^{N} \sum_{j=1}^{N} p(i, j) \log(p(i, j))   (3)

Homogeneity = \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} \frac{p(i, j)}{1 + |i - j|}   (4)

Standard Deviation \sigma = \sqrt{\frac{1}{N-1} \sum_{i=1}^{N} |Co_i - \mu|^2}   (5)

where \mu is the mean of the co-occurrence matrix, \mu = \frac{1}{N} \sum_{i=1}^{N} Co_i.

Third Momentum (M_3) = \left( \frac{1}{N-1} \sum_{i=1}^{N} |Co_i - \mu|^2 \right)^3   (6)

Normalized \sigma = 1 - \frac{1}{1 + (S_{Co}/255)^2}   (7)

Uniformity (U_f) = \sum_i \sum_j \left( \frac{Co_{ij}}{\sum_i \sum_j Co_{ij}} \right)^2   (8)

where p(i, j) is the probability of the co-occurrence matrix element value Co_{ij}.

- Classifying the Brodatz database using the K-means classifier for a number of classes.
- Calculating the confusion matrix between each two classification methods for all angles 0, 45, 90, and 135.
- Calculating the agreement percent between the classification results of each two methods (two different angles).
- Studying the distribution of the classified images between methods (different angles) by calculating the projection of one method onto the other, as shown in Figure 1.

Figure 1. The confusion matrix between two methods: its rows give the projection of the first method onto the second, and its columns give the projection of the second method onto the first.

The confusion matrix was used to calculate the agreement percent of the classification results between each two methods, using Equation 9:

Agreement Percent (AP) = \frac{\sum_{i}^{N} Conf_{ii}}{\sum_{ij}^{N} Conf_{ij}}   (9)

3. Results and Discussion
Brodatz images were used as samples to study the GLCM method in distinguishing texture images, where 112 gray images were used. Figure 2 shows a sample of these images.
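A few of the feature statistics from the methodology, and the agreement percent of Equation 9, can be sketched as follows. This is a minimal illustration: the function names are ours, and the agreement computation assumes the class labels of the two methods have already been matched (K-means assigns arbitrary label numbers, so the confusion matrix must be label-aligned before reading its diagonal).

```python
import numpy as np

def glcm_features(co):
    """A few of the GLCM statistics listed above, computed from a
    co-occurrence matrix normalized to a joint probability p(i, j)."""
    p = co / co.sum()
    i, j = np.indices(p.shape)
    contrast = np.sum((i - j) ** 2 * p)              # Eq. 1
    entropy = -np.sum(p[p > 0] * np.log(p[p > 0]))   # Eq. 3
    homogeneity = np.sum(p / (1 + np.abs(i - j)))    # Eq. 4
    uniformity = np.sum(p ** 2)                      # Eq. 8
    return contrast, entropy, homogeneity, uniformity

def agreement_percent(labels_a, labels_b, k):
    """Eq. 9: sum of the confusion-matrix diagonal over its total,
    i.e. the fraction of images given the same class by both methods
    (assuming the two label sets are already aligned)."""
    conf = np.zeros((k, k), dtype=int)
    for a, b in zip(labels_a, labels_b):
        conf[a, b] += 1
    return np.trace(conf) / conf.sum()
```

The off-diagonal rows and columns of `conf` are exactly the projections of one method onto the other discussed below.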
Figure 2. A set of samples taken from the Brodatz database

The GLCM was calculated for the angles (0, 45, 90, and 135) with a one-pixel distance to study the effect of the angle change on the ability of the GLCM to distinguish texture images. A database was created for each image and each angle used in the gray-level co-occurrence matrix, using the eight features illustrated in the methodology as a pattern describing the texture of the images. The K-means classifier was used to classify the images into two to eight classes. The confusion matrix (Figure 1) was used to analyze the results, intersecting the results of each method with the other methods.

Figure 3 shows the relationship between the number of classes and the percentage of agreement for every pair of angles. The general behavior for all selected angle pairs is a rate that decreases with the number of classes. Still, the behavior of the (135-45) and (0-90) pairs differs from the others, because in each pair the angles are perpendicular to each other. For the (45-135) and (0-90) pairs, the percentage of agreement remained around 90%, for up to seven classes between the two angles (45-135) and up to five classes between the two angles (0-90). For the rest of the angle pairs, the percentage of agreement started decreasing from the third class onward. When the number of classes was small, the percentage of agreement between the methods was very high, but this did not persist: with the increase in the number of classes, the percentage of agreement began to decrease, and the speed of the reduction depended on whether the angle formed between the two angles used in the classification is a multiple of 90.

Figure 3.
The distribution of the number of classes and the agreement percentage for each pair of angles

The confusion matrix was used to study the distribution of the images of each method relative to the second method (its projection) for the various adopted classes, as shown in Figure 1. The projection of the first method onto the second and of the second method onto the first were calculated for all classes and angles adopted in the research, as shown in Figure 4.

Figure 4. The projection of each method onto the other, using different classes (left: the projection of the first method onto the second; right: the projection of the second method onto the first)

It can be seen from the confusion matrices that there is a problem with the classification results: in most cases, one class contains the highest number of images. This method distributes the images over the classes such that one class, usually the one in the middle, contains most of the images regardless of their arrangement. This uneven distribution is one of the defects of the method, as one of the classes is dominant and takes the largest number of images. This behavior is identical across all the projection results for the different classes and angles.

4. Conclusions
Depending on the obtained results, we can reach the following conclusions:
- The percentage of agreement between the classification results when using the GLCM method is high when the number of classes is small, and it decreases when the number of classes is increased.
- When the two angles used for the GLCM form a right angle, the percentage of agreement decreases less than in the rest of the cases.
- When using the GLCM, the classification results in one of the classes being the dominant one in terms of the number of images, and it is usually the one arranged in the middle.

References
1. Mayada, J. K.; Emad, K.
J.; Decision Tree for Image Classification; Iraqi Commission for Computers and Informatics: Baghdad, 2013.
2. Latef, A. A. A.; Image Retrieval Based on Coefficient Correlation Index. Ibn AL-Haitham J. Pure Appl. Sci. 2017, 25.
3. Farhan, A. H.; Kamil, M. Y.; Texture Analysis of Breast Cancer Using Mammogram; Mustansiriyah University, College of Science: Baghdad, 2020.
4. Zhang, X.; Cui, J.; Wang, W.; Lin, C.; A Study for Texture Feature Extraction of High-Resolution Satellite Images Based on a Direction Measure and Gray Level Co-Occurrence Matrix Fusion Algorithm. Sensors 2017, 17, 1474.
5. Wirth, M. A.; Texture Analysis. Univ. Guelph: Guelph, ON, Canada, 2004.
6. Chang, T.; Kuo, C. C.; Texture Analysis and Classification with Tree-Structured Wavelet Transform. IEEE Trans. Image Process. 1993, 2, 429–441.
7. AL-Bassam, H. F. A.; A Texture Analysis System Based on Spatial Frequency and Attributes for Image Classification; University of Baghdad, College of Science, Department of Physics: Baghdad, 2019.
8. Naghashi, V.; Co-Occurrence of Adjacent Sparse Local Ternary Patterns: A Feature Descriptor for Texture and Face Image Retrieval. Optik (Stuttg.) 2018, 157, 877–889.
9. Warner, T. A.; Foody, G. M.; Nellis, M. D.; The SAGE Handbook of Remote Sensing; Sage Publications, 2009. ISBN 1412936160.
10. Mohammed, M. A.; Naji, T. A. H.; Abduljabbar, H. M.; The Effect of the Activation Functions on the Classification Accuracy of Satellite Image by Artificial Neural Network. Energy Procedia 2019, 157, 164–170.
11. Akey Sungheetha, D. J.; An Efficient Clustering-Classification Method in an Information Gain NRGA-KNN Algorithm for Feature Selection of Micro Array Data. Life Sci. J. 2013, 10.
12. Sharma, A. R.; Beaula, R.; Marikkannu, P.; Sungheetha, A.; Sahana, C.; Comparative Study of Distinctive Image Classification Techniques. In Proceedings of the 2016 10th International Conference on Intelligent Systems and Control (ISCO); IEEE, 2016, 1–8.
13. Jain, M.; Tomar, P. S.; Review of Image Classification Methods and Techniques. Int. J. Eng. Res. Technol. 2013, 2, 852–858.
14. Abduljabbar, H. M.; Hatem, A. J.; Al-Jasim, A. A.; Desertification Monitoring in the South-West of Iraq Using Fuzzy Inference System. NeuroQuantology 2020, 18, 1.
15. Abburu, S.; Golla, S. B.; Satellite Image Classification Methods and Techniques: A Review. Int. J. Comput. Appl. 2015, 119.
16. Mohammed, M. A.; Hatem, A. J.; Change Detection of the Land Cover for Three Decades Using Remote Sensing Data and Geographic Information System. In Proceedings of the AIP Conference Proceedings; AIP Publishing LLC, 2020, 2307, 20029.
17. Zhang, J.; Tan, T.; Brief Review of Invariant Texture Analysis Methods. Pattern Recognit. 2002, 35, 735–747.
18. Julesz, B.; Caelli, T.; On the Limits of Fourier Decompositions in Visual Texture Perception. Perception 1979, 8, 69–73.
19. Haralick, R. M.; Statistical and Structural Approaches to Texture. Proc. IEEE 1979, 67, 786–804.
20. Abaas Hussain, L. H.; Correction of Non-Uniform Illumination for Biological Images Using Morphological Operation Assessing with Statistical Features Quality. Ibn AL-Haitham J. Pure Appl. Sci. 2017, 29, 81–90.
21. Hussein, M. A.; Abbas, A. H.; Comparison of Features Extraction Algorithms Used in the Diagnosis of Plant Diseases. Ibn AL-Haitham J. Pure Appl. Sci. 2018, 523–538.
22. Materka, A.; Strzelecki, M.; Texture Analysis Methods: A Review. Tech. Univ. Lodz, Inst. Electron., COST B11 report, Brussels, 1998, 10, 4968.
23. Suresh, A.; Shunmuganathan, K. L.; Image Texture Classification Using Gray Level Co-Occurrence Matrix Based Statistical Features. Eur. J. Sci. Res. 2012, 75, 591–597.
24. Akhloufi, M. A.; Maldague, X.; Larbi, W. Ben; A New Color-Texture Approach for Industrial Products Inspection. J. Multimed. 2008, 3.
25. Unser, M.; Sum and Difference Histograms for Texture Classification. IEEE Trans. Pattern Anal. Mach. Intell. 1986, 118–125.
26. Rezaei, M.; Saberi, M.; Ershad, S. F.; Texture Classification Approach Based on Combination of Random Threshold Vector Technique and Co-Occurrence Matrixes. In Proceedings of the 2011 International Conference on Computer Science and Network Technology; IEEE, 2011, 4, 2303–2306.
27. Ali, A. H.; Abdulsalam, S. I.; Nema, I. S.; Detection and Segmentation of Ischemic Stroke Using Textural Analysis on Brain CT Images. Int. J. Sci. Eng. Res. 2015, 6, 396–400.