CHEMICAL ENGINEERING TRANSACTIONS
VOL. 51, 2016
A publication of The Italian Association of Chemical Engineering
Online at www.aidic.it/cet
Guest Editors: Tichun Wang, Hongyang Zhang, Lei Tian
Copyright © 2016, AIDIC Servizi S.r.l., ISBN 978-88-95608-43-3; ISSN 2283-9216

The Image Retrieval Method Based on the Homolographic Block Color Histogram

Gongwen Xu^a, Lina Xu^a, Xiaomei Li^b, Wenjing Qi^a*
^a Shandong Jianzhu University, School of Computer Science and Technology
^b Cancer Center of the Second Hospital, Shandong University
qiwj@sdjzu.edu.cn

An image retrieval method based on the homolographic block color histogram (HBCH) is proposed in this paper. First, the selection of the color space and the color quantization method are introduced. Then the new feature extraction method is presented. When describing the color distribution of an image, the traditional global histogram ignores the spatial distribution of the different colors, so the objects in the image cannot be described correctly. In the HBCH method, spatial information is introduced by segmenting the image into equal-area rings before the features are extracted. Finally, the HBCH index and the similarity computation are introduced and implemented. Experiments show that the HBCH method is more effective and faster than the traditional one.

1. Introduction
In recent years, with the rapid development of the Internet and multimedia information, the number of digital images has grown at an unprecedented speed. As shared data, these images contain a large amount of useful information, and they play an important role in all kinds of trades in people's daily life (Smeulders, 2000). Many kinds of demand for processing image information have therefore arisen. However, this huge quantity of image information is scattered without order across many fields, and no general image retrieval method can be used in every field.
So these huge image resources cannot serve us efficiently. To obtain the right information from disordered image databases, the study of image retrieval has become an important topic in computer science (Xu et al., 2014). Because of the great commercial value and broad development prospects of Content Based Image Retrieval (CBIR), CBIR has, after about ten years of research, been applied in everyday systems (Xu et al., 2014; Yuan, 2015). A series of CBIR systems for different application backgrounds have been developed by corporations and scientific organizations, such as QBIC (Query by Image Content), developed by the IBM Almaden Research Center; Photobook, developed by the MIT Media Lab; and MARS (Multimedia Analysis and Retrieval System), developed by the University of Illinois (Jiang et al., 2005).

Image retrieval developed from computer vision and image processing. With the emergence of large quantities of images, retrieval methods that depend on words can no longer satisfy people's needs. A new retrieval method is expected, and its key principle is how to describe the image accurately. A good image description is a firm basis for image retrieval, so describing the image information comprehensively is the most important part of image retrieval. In this paper, a new feature extraction method, HBCH (homolographic block color histogram), is proposed, in which the spatial information of the image is introduced by segmenting the image into equal-area rings.

The remainder of this paper is organized as follows. Section 2 introduces the selection of the color space and the color quantization method. The HBCH method is proposed in Section 3, including the image segmentation method and the feature extraction method. Section 4 describes the realization of image retrieval based on HBCH.
Section 5 reports the image retrieval experiments and the performance evaluation. Finally, conclusions and future work are given in Section 6.

Please cite this article as: Xu G.W., Xu L.N., Li X.M., Qi W.J., 2016, The image retrieval method based on the homolographic block color histogram, Chemical Engineering Transactions, 51, 403-408 DOI:10.3303/CET1651068

2. Selection of color space and color quantization
2.1 Selection of color space
There are many color space models in the imaging field, and the color histogram of an image differs from one color space to another. The choice of color space has a direct effect on the extraction of color features. An ideal color space needs three properties: completeness, consistency, and uniqueness. Completeness means that the color space can describe all the colors that we can sense. Consistency means that the measured difference and the perceived difference between colors are consistent. Uniqueness means that different colors are perceived differently from each other.

The RGB (Red, Green, Blue) color space and the HSV (Hue, Saturation, Value) color space are the ones most often used in research, and the former is the most popular. RGB is the original format of the digital image; it uses the R, G, and B color components to represent a color. Each component is represented by 8 bits, so the RGB color space can represent $2^{24} = 16{,}777{,}216$ colors (Wang et al., 2011). The RGB space is widely used in image processing because images are captured and displayed as RGB values. Nevertheless, RGB values do not express the color information of an image directly, and they differ greatly from what the eye perceives. It is hard to obtain the perceptual attributes of an image from RGB values, so in applications the RGB space is usually converted to another color space.
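As an illustration of such a conversion, the following minimal sketch maps 8-bit RGB components to hue, saturation, and value with Python's standard `colorsys` module. The helper name `rgb8_to_hsv` and the choice to report hue in degrees are assumptions made for this example, not part of the original method.

```python
import colorsys


def rgb8_to_hsv(r, g, b):
    """Convert 8-bit R, G, B components to (hue in degrees, saturation, value)."""
    # colorsys works on floats in [0, 1] and returns hue as a fraction of a turn.
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return h * 360.0, s, v


if __name__ == "__main__":
    print(2 ** 24)                    # number of colors addressable with 3 x 8 bits
    print(rgb8_to_hsv(255, 0, 0))     # pure red
    print(rgb8_to_hsv(200, 200, 50))  # a yellowish color
```

Scaling the hue to degrees is only a convenience here; any consistent hue unit would serve.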
The HSV color space uses hue, saturation, and value to describe the color information of an image according to the characteristics of human vision, so it matches human visual perception well. The hue (H) of a color refers to which pure color it resembles. All tints, tones and shades of red have the same hue. Hues are described by a number that specifies the position of the corresponding pure color on the color wheel, as a fraction between 0 and 1. The saturation (S) of a color describes how white the color is. The value (V) of a color, also called its lightness, describes how dark the color is. Because the HSV color space is close to human perception, it is widely used in computer vision research. Converting from the RGB space to the HSV space is easy, so the HSV color space is selected for image processing in this paper.

2.2 Color Quantization
An image, especially a colorful one, contains a great deal of color information. High-dimensional features cause great trouble in image processing, so color quantization and dimension reduction are needed. If the range over which the colors are defined differs, the quantization of H, S, and V also differs (Ko et al., 2005; Zhang et al., 2015). Based on research and comparison of color models, the hue is divided into 8 sections, and the saturation and the value into 3 sections each. The quantized values are denoted H', S', and V', and the quantization formulas, with H in degrees, are shown below.

$$H' = \begin{cases} 0, & H \in [316, 360] \cup [0, 20] \\ 1, & H \in [21, 40] \\ 2, & H \in [41, 75] \\ 3, & H \in [76, 155] \\ 4, & H \in [156, 190] \\ 5, & H \in [191, 270] \\ 6, & H \in [271, 295] \\ 7, & H \in [296, 315] \end{cases} \tag{1}$$

$$S' = \begin{cases} 0, & S \in [0, 0.2) \\ 1, & S \in [0.2, 0.7) \\ 2, & S \in [0.7, 1] \end{cases} \tag{2}$$

$$V' = \begin{cases} 0, & V \in [0, 0.2) \\ 1, & V \in [0.2, 0.7) \\ 2, & V \in [0.7, 1] \end{cases} \tag{3}$$

With the quantization above, the three color components can be merged into a single one-dimensional value K. The formula is shown below.

$$K = H' Q_s Q_v + S' Q_v + V' \tag{4}$$

In the formula, $Q_s$ and $Q_v$ are the numbers of quantization levels of S' and V', respectively. Here $Q_s = 3$ and $Q_v = 3$, so the formula becomes

$$K = 9H' + 3S' + V' \tag{5}$$

That is, the weight of H' is 9, that of S' is 3, and that of V' is 1, so K takes values in [0, 71]. The image color can then be expressed with 72 dimensions, which reduces the dimensionality of the color features effectively.

3. Features extraction based on HBCH
The traditional histogram feature extraction method is simple and convenient, but it only counts the number of pixels of each color in the image, so the spatial information of the colors is not expressed. The traditional method can retrieve an image approximately, but it cannot retrieve the objects in the image, and the retrieval result is unsatisfactory: two images may have similar color histograms although their spatial layouts are quite different. To overcome this problem, the color information of the image is extracted by HBCH, which makes up for the lack of spatial information in the traditional method (Chen and Wang, 2002).

3.1 Image segmentation with equal area rings
In the traditional method, the image is divided equally into m x n blocks, which ignores the main part of the image. Usually the center of an image, not the surrounding background, is what interests people. To highlight the center, an image segmentation method based on equal-area rings is adopted, which is a fixed division of the image space. First, all the images in this work are normalized to 256 x 256 pixels. Then the physical center point O, with coordinates (128, 128), is selected. An image is divided into N+1 blocks: one circle, N-1 rings, and one remainder. In this paper, N is set to 3, so the image is divided into 4 blocks: 1 circle, 2 rings, and 1 remainder.
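The quantization of Section 2.2 and the ring segmentation just described can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, real-valued hues are binned with half-open intervals, hue bin 1 is assumed to end at 40 so that the bins tile the hue circle, and the ring radii are assumed to be $R\sqrt{j/N}$ with $R = 128$, which gives the circle and the rings equal areas (the text does not state the radii).

```python
import numpy as np

# Lower edges of hue bins 1..7 and of the wrap-around part of bin 0, in degrees.
# The edge 41 assumes bin 1 runs up to 40 so that the bins tile the hue circle.
HUE_EDGES = np.array([21, 41, 76, 156, 191, 271, 296, 316])
SV_EDGES = np.array([0.2, 0.7])  # shared by saturation and value


def quantize_hsv(h, s, v):
    """Map hue (degrees, 0..360) and saturation/value (0..1) to K in 0..71."""
    h_q = np.digitize(h, HUE_EDGES) % 8  # hues >= 316 wrap round to bin 0
    s_q = np.digitize(s, SV_EDGES)
    v_q = np.digitize(v, SV_EDGES)
    return 9 * h_q + 3 * s_q + v_q  # K = 9H' + 3S' + V'


def ring_labels(size=256, n=3):
    """Label pixels 0..n: 0 = centre circle, 1..n-1 = rings, n = remainder."""
    centre = size / 2.0
    yy, xx = np.mgrid[0:size, 0:size]
    dist2 = (xx - centre) ** 2 + (yy - centre) ** 2
    # Assumed radii: R * sqrt(j / n) with R = size / 2, so the circle and
    # each ring cover the same area, pi * R^2 / n.
    edges2 = (size / 2.0) ** 2 * np.arange(1, n + 1) / n
    return np.digitize(dist2, edges2)


if __name__ == "__main__":
    labels = ring_labels()
    print(np.bincount(labels.ravel()))  # pixel count of each of the 4 blocks
    print(quantize_hsv(300.0, 0.9, 0.9))
```

Both functions accept whole arrays as well as scalars, so a 256 x 256 HSV image can be quantized in one call and then split into blocks with the label map.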
This method emphasizes the main part of the image and reduces the influence of the subordinate part. As shown in Figure 1, the image is segmented into 4 parts.

Figure 1: Equal area rings segmentation

3.2 Features Extraction Based on HBCH
The color histogram is a numerical representation of the color distribution. To obtain better results, the histogram is normalized. The formula is shown below.

$$H(k) = g_k / G \tag{6}$$

k: image color value, in the range [0, 71];
g_k: number of pixels whose color value is k;
G: total number of image colors, 72.

Applying the above formula to each sub-block gives the color feature CF_i of that block.

$$CF_i = \{H(1), H(2), H(3), \ldots, H(k)\}, \quad i = 1, 2, 3, 4 \tag{7}$$

The color features of the four blocks are combined into one HBCH feature vector, which carries the spatial information of the colors.

$$CF = (CF_1, CF_2, CF_3, CF_4) \tag{8}$$

As the formula shows, the color feature vector CF contains four color features CF_i. Each color feature contains 72 values, which describe the color information of the corresponding block. Because the image is segmented before the color features are extracted, spatial relations are included to a certain degree, and the feature carries not only the global color information but also the spatial information. Features extracted in this way benefit the experiments and make the results more accurate.

4. The realization of image retrieval based on HBCH
4.1 HBCH index
After the color information has been extracted from the image database, an image database index is built using database clustering index technology. The steps of the HBCH index algorithm are shown below.
(1) Segment each image with the equal-area rings method.
(2) Calculate the normalized color histogram of each block, take it as local feature data, and combine the four histograms to construct the image's HBCH feature vector CF.
(3) Order the local feature data of each image in the database by physical storage path and image file name.
(4) Create the clustered HBCH index, in which the local feature data of each image is stored on the same row as the image's physical storage path, as shown in Table 1.

Table 1: Image clustering index
Image physical storing path/name    Image feature data
C:\Corel\0.jpg                      0.0411068 … 0.0423566
C:\Corel\1.jpg                      0.0459620 … 0.0370021
…                                   …
C:\Corel\999.jpg                    0.0326671 … 0.0530126

Once the index is built, retrieval becomes simple. When a query image is submitted, its feature data is calculated and compared with the data in the HBCH index table. The features of the images in the database need not be extracted again, so retrieval is faster.

4.2 Similarity calculation of HBCH feature
After the HBCH features have been extracted and the feature index built, the key to retrieval is the definition of similarity between the query image and the database images. The Euclidean distance is used to measure the similarity. Let D be the Euclidean distance, Q the query image, and L an image in the database. If D is lower than a certain threshold, images Q and L are considered similar. As mentioned before, the color feature vector has four components. The HBCH feature vector of the query image Q is CF'.

$$CF' = (CF'_1, CF'_2, CF'_3, CF'_4) \tag{9}$$

The HBCH feature vector of the database image L is CF. Each block of the image is assigned a weight $w_i$, defined by the user according to the query image. For example, the weight of the central circle can be raised to give prominence to the central part of the image. The weights sum to 1. The HBCH feature similarity of the two images is $D_c(Q, L)$.

$$D_c(Q, L) = \sqrt{\sum_{i=1}^{4} w_i \, (CF'_i - CF_i)^2} \tag{10}$$

The query image is compared with the data from the index file, and retrieval stops when the Euclidean distance $D_c(Q, L)$ falls below a threshold, which can be set by experiment.
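The weighted comparison against the index can be sketched as follows. This is a minimal sketch under stated assumptions rather than the authors' code: each HBCH vector is taken to be a 4 x 72 array, the weights are applied to the squared per-block distances inside a single square root, the index is represented as a plain list of (path, CF) pairs, and the names `hbch_distance` and `scan_index`, the example weights, and the random feature data are all hypothetical.

```python
import numpy as np


def hbch_distance(cf_query, cf_db, weights):
    """Weighted Euclidean distance between two HBCH vectors of shape (4, 72)."""
    diff = np.asarray(cf_query, dtype=float) - np.asarray(cf_db, dtype=float)
    block_sq = (diff ** 2).sum(axis=1)  # squared distance of each block
    return float(np.sqrt(np.dot(np.asarray(weights, dtype=float), block_sq)))


def scan_index(index, cf_query, weights, threshold):
    """Compare the query with index rows in order; stop early below threshold.

    `index` is a list of (path, CF) pairs, a stand-in for the index table.
    Returns the compared (path, distance) pairs, smallest distance first.
    """
    compared = []
    for path, cf_db in index:
        d = hbch_distance(cf_query, cf_db, weights)
        compared.append((path, d))
        if d < threshold:
            break  # a sufficiently similar image was found
    return sorted(compared, key=lambda item: item[1])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    index = [(f"img{i}.jpg", rng.random((4, 72))) for i in range(5)]
    query = index[2][1]
    weights = [0.4, 0.3, 0.2, 0.1]  # illustrative weights favouring the centre
    print(scan_index(index, query, weights, threshold=1e-6)[0])
```

Because the scan stops at the first row under the threshold, rows later in the index are never compared; whether that is acceptable depends on how the threshold is chosen.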
When the distance is lower than the threshold, the database image is regarded as the most similar one, and the retrieval is over. Using a threshold shortens the retrieval time greatly, so the user can find the desired image quickly. After the images have been compared with the query image, their information, such as path, name and distance, is arranged in order of HBCH feature similarity: the larger the value of $D_c(Q, L)$, the later the image is ranked.

The image retrieval algorithm involves the selection of the color space, color quantization, equal-area rings segmentation, HBCH feature extraction, and the building of the HBCH index. The steps of the image retrieval algorithm based on the HBCH index are shown below.
(1) Pre-process all the images. The images in the database are segmented into 4 blocks with the equal-area rings method. The color histogram of each sub-block is calculated and normalized. The normalized histograms of the four blocks are combined to build the image's HBCH feature vector CF.
(2) Every image in the database, with its physical storage path, file name, and HBCH feature vector CF, is sorted to generate the HBCH index file.
(3) The user chooses one image from the database as the query image, selects the image database path and the HBCH index method, and sets the weight of each component according to the image attributes.
(4) Extract the HBCH feature vector CF' from the query image.
(5) Read the CF data of an image from the HBCH index file. The weights are applied to the components of the CF' and CF feature vectors, and the Euclidean distance $D_c(Q, L)$ of the two images is calculated.
(6) When $D_c(Q, L)$ falls below the threshold, retrieval is finished. The images that took part in the comparison are sorted by similarity and returned to the user. Otherwise, retrieval goes on until the most similar image is found.

5. Experimental results and analysis
The Corel image database is used in this experiment; it contains 1000 images of dinosaurs, flowers, buildings, buses, elephants and so on.
The query image is selected from the image database, and the top 50 images returned are taken as the retrieval result. The average precision and retrieval time of the global color histogram and the HBCH index are compared in Table 2.

Table 2: Performance comparison between global histogram and HBCH
Image class   Average precision             Retrieval time (ms)
              Global histogram   HBCH       Global histogram   HBCH
elephant      0.43               0.85       25113              5012
bus           0.45               0.70       22057              4582
building      0.51               0.75       19526              3981
flower        0.56               0.89       19309              4137
dinosaur      0.65               0.87       26588              5620

As shown in Table 2, the retrieval time of the traditional method is about 5 times that of the HBCH index. The average precision of the new method also exceeds that of the traditional one. HBCH is therefore a retrieval method with high efficiency and precision that can retrieve images quickly.

Figure 2: Retrieval results based on global histogram
Figure 3: Retrieval results based on HBCH index

In this experiment, the elephant images in the database are selected, and the image in Figure 1 is used as the query image. The retrieval results of the global color histogram method are compared with those of the new HBCH index method in Figure 2 and Figure 3. In Figure 2, the results based on the global color histogram show that images 3, 5, 7, 9, 13, 14, and 15 are obviously different from the query image. Images 10, 12, and 16 show elephants, but the number of elephants does not match the query image exactly. The precision of the top 16 images is only 56.25%, so the retrieval error is fairly large. The results based on the HBCH index method are better than the traditional ones: there are fewer unsuitable images, and the ranking is more in keeping with people's intuitive judgment. The precision of the top 16 images is 87.5%, and the retrieval performance is clearly improved.
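The figures quoted in this section can be checked with a little arithmetic. The sketch below assumes that precision means the fraction of returned images that are suitable, and that only the seven obviously different images in Figure 2 count as unsuitable; the helper name `precision` is hypothetical. The retrieval times are copied from Table 2.

```python
def precision(returned, unsuitable):
    """Fraction of the returned images that are suitable."""
    return (returned - unsuitable) / returned


# Retrieval times in ms from Table 2: (global histogram, HBCH).
TIMES_MS = {
    "elephant": (25113, 5012),
    "bus": (22057, 4582),
    "building": (19526, 3981),
    "flower": (19309, 4137),
    "dinosaur": (26588, 5620),
}

if __name__ == "__main__":
    for name, (t_global, t_hbch) in TIMES_MS.items():
        print(f"{name}: {t_global / t_hbch:.2f}x")  # ratio behind "about 5 times"
    print(precision(16, 7))  # seven unsuitable images among the top 16
    print(precision(16, 2))  # two unsuitable images among the top 16
```

Under these assumptions, seven unsuitable images among 16 give 9/16 = 56.25%, and a precision of 87.5% corresponds to 14 suitable images out of 16.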
As the results show, the retrieval method based on the HBCH index is superior to the one based on the global color histogram in both precision and time.

6. Conclusions
Color is one of the most useful image features in content-based image retrieval applications. Starting from the characteristics of human vision and taking into account the influence of local features on the whole image, this paper addresses the drawback of the traditional global color histogram, which considers the color distribution of the whole image but ignores the color details, and proposes a new image retrieval method based on the HBCH index. The features of the database images do not need to be calculated each time a retrieval is submitted. When a query image is submitted by the user, the feature data is read from the index files and compared with the feature of the query image. The Euclidean distance is calculated, and when it is lower than the preset threshold, the comparison stops. The experimental results show that the method proposed in this paper gives better retrieval results and reduces the retrieval time.

Acknowledgements
This work is partially supported by A Project of Shandong Province Higher Educational Science and Technology Programs (J12LN31, J13LN11, J14LN59), the Shandong Xiehe College School Fund (XHXY201431), the Scientific Research Fund of the Second Hospital of Shandong University (S2015010004), and the Development Projects of Science and Technology of Shandong Province (2012GGX27073, 2014GGX101011, 2015GGX101018, 2016GGE27402).

References
Chen Y., Wang J.Z., 2002, A region-based fuzzy feature matching approach to content-based image retrieval, IEEE Transactions on Pattern Analysis & Machine Intelligence, 24(9), 1252-1267.
Jiang S.Q., Du J., Huang Q.M., Huang T.J., Gao W., 2005, Visual ontology construction for digitized art image retrieval, Journal of Computer Science & Technology, 20(6), 855-860. DOI: 10.1007/s11390-005-0855-x
Ko B., Byun H., 2005, Frip: a region-based image retrieval tool using automatic image segmentation and stepwise boolean and matching, IEEE Transactions on Multimedia, 7(1), 105-113. DOI: 10.1109/TMM.2004.840603
Smeulders A., 2000, Content-based image retrieval at the end of the early years, IEEE Transactions on Pattern Analysis & Machine Intelligence, 22(12), 1349-1380. DOI: 10.1109/34.895972
Wang J., Kong B., Jia Q.L., 2011, Color-based image retrieval, Computer Systems & Applications, 19(5), 530-535.
Xu G., Zhang Z., Yuan W., Xu L., 2014, On Medical Image Segmentation Based on Wavelet Transform, 2014 Fifth International Conference on Intelligent Systems Design and Engineering Applications (ISDEA), IEEE Computer Society, pp. 671-674.
Xu G., Zhang Z., Qi W., Liao M., Xu L., Zhao H., 2014, Image Automatic Annotation Based on the Similarity of Regions, Journal of Computational Information Systems, 21(10), 9397-9404.
Yuan B., 2015, Features Extraction of Moving Target Image Based on Nuclear Method and Spatio-temporal Correlation Theory, 46, 385-390. DOI: 10.3303/CET1546065
Zhang Z., He X., Sun X., Wang J., Wang F., 2015, Research on Steel Strip Image Segmentation Algorithm Based on Particle Swarm Optimization, 46, 205-210. DOI: 10.3303/CET1546035