CHEMICAL ENGINEERING TRANSACTIONS, VOL. 46, 2015
A publication of The Italian Association of Chemical Engineering, online at www.aidic.it/cet
Guest Editors: Peiyu Ren, Yanchang Li, Huiping Song
Copyright © 2015, AIDIC Servizi S.r.l., ISBN 978-88-95608-37-2; ISSN 2283-9216

An Improved Camshift Algorithm Based on Grabcut with a LBP Model of Correction Tracking Centroid

Xianggong Hong*, Xiying Zheng, Huimei Xiao, Zhiyi Xue
Nanchang University Information and Engineering College, Jiangxi, 330031, China.
393472615@qq.com

An analysis of the advantages and disadvantages of the traditional Camshift algorithm shows that background noise and interference from similarly colored objects strongly affect it and can easily cause tracking errors. This paper presents an improved Camshift algorithm based on Grabcut and an LBP centroid tracking correction model. The current frame is first enhanced by an enhancement coefficient. Grabcut object segmentation is then applied to obtain a pure histogram of the target object. When a similarly colored obstacle pulls the centroid away from the target, the LBP centroid tracking correction model moves the centroid to the uncovered part of the target object. In this way, the problems caused by background noise and by similarly colored obstacles can be addressed effectively. The tests presented in the paper indicate that the algorithm performs real-time tracking more steadily and accurately.

1. Introduction
Visual tracking is one of the most active research directions and plays a significant role in image processing and computer vision. Moving object tracking is widely used in many military and civilian fields, such as vision guidance, unmanned aerial vehicle tracking, security monitoring, public scene monitoring, intelligent transportation systems (ITS) and so on.
Currently, the commonly used methods for tracking moving targets include the particle filter, compressive sensing, background subtraction, adjacent frame differencing, optical flow, Camshift and so forth. None of these methods is free of defects. For example, the optical flow method, although it adopts a subtraction approach, has poor real-time performance, which easily leads to tracking failure in complex environments. The particle filter is strongly resistant to interference, but it suffers from particle degeneracy and is weak in stability. Background subtraction and adjacent frame differencing cannot be used when the background changes. Based on this analysis, this paper adopts the Camshift algorithm as the core of the tracking method.

This paper introduces an improved Camshift algorithm. Because the initially selected histogram is easily mixed with background noise, the image is enhanced as a whole, which increases the contrast between object and background. Next, the algorithm uses Grabcut to separate the target object and obtain a pure color histogram. To handle similarly colored obstacles, we first compute the LBP histogram of the search window and then obtain the discriminant coefficient through modeling. Using image analysis of the search window, the S-Grabcut algorithm extracts the uncovered target area and computes its centroid. When this centroid replaces the original one, object tracking is not affected by similarly colored obstacles moving in. For the fast occlusion problem, the paper applies the Kalman filter, which has been shown (Wang and Li, 2010) to improve real-time tracking accuracy.

Please cite this article as: Hong X.G., Zheng X.Y., Xiao H.M., Xue Z.Y., 2015, An improved camshift algorithm based on grabcut with a lbp model of correction tracking centroid, Chemical Engineering Transactions, 46, 373-378 DOI:10.3303/CET1546063

2. Components of the Model of Correction Tracking Centroid
2.1 Fundamentals of Grabcut
Grabcut (Rother et al., 2004) extends Graphcut from monochrome space to color space and uses a GMM (Gaussian mixture model) instead of a statistical histogram to model the foreground and background colors. Grabcut expresses the image as a vector z and defines its opacity array as the values α. If α is equal to 1, the corresponding pixel is foreground; if α is equal to 0, it is background. The parameter θ describes the statistical properties of the colors of the foreground and background regions. The values z are the pixel gray values. The Gibbs energy function of the image (Rother et al., 2004) is:

$$E(\alpha, k, \theta, z) = U(\alpha, k, \theta, z) + V(\alpha, z) \tag{1}$$

The Gaussian mixture model parameters θ are:

$$\theta = \{\pi(\alpha, k),\ \mu(\alpha, k),\ \Sigma(\alpha, k),\ \alpha = 0, 1,\ k = 1 \ldots K\} \tag{2}$$

2.2 LBP algorithm
Texture is an inherent characteristic of a surface and can be regarded as a pattern of variation in gray-level space. The basic idea of the LBP (local binary patterns) algorithm is to use the gray value of the center pixel as a threshold and compare it with the neighboring pixels, which yields a binary code that expresses the local texture features and reflects the texture information of the region. To adapt to texture features at different scales, T. Ojala et al. proposed the "uniform pattern" method (rendered as "equivalent model" in some translations), which reduces the dimension of the LBP operator, and named it $LBP_{P,R}^{riu2}$ (Ojala et al., 1996; Ojala et al., 2002). With this improvement, the number of binary pattern types is greatly reduced without losing edge information.
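The thresholding idea behind LBP can be illustrated with a short Python sketch. It covers only the basic 3×3 operator, not the $LBP_{P,R}^{riu2}$ variant. The function names `lbp_code` and `lbp_histogram`, the clockwise bit order and the use of `>=` in the comparison are assumptions made for illustration, since the paper does not specify them.

```python
def lbp_code(img, r, c):
    """Basic 8-neighbour LBP code of pixel (r, c) in a 2-D list of gray values."""
    center = img[r][c]
    # Neighbours visited clockwise, starting at the top-left pixel (assumed order).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # A neighbour at least as bright as the center contributes a 1 bit.
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img, bins=256):
    """Histogram of LBP codes over all interior pixels of the image."""
    hist = [0] * bins
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist


# A 3x3 patch has exactly one interior pixel, so its histogram has one entry.
patch = [
    [6, 5, 2],
    [7, 6, 1],
    [9, 8, 7],
]
code = lbp_code(patch, 1, 1)
hist = lbp_histogram(patch)
```

For this patch the neighbours 6, 7, 8, 9 and 7 are at least as large as the center value 6, so bits 0, 4, 5, 6 and 7 are set and the code is 241.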
2.3 S-Grabcut algorithm
Surf (Bay et al., 2006) extracts key points by approximating the determinant of the Hessian matrix, and then adds detailed information to these points by describing their main direction and building descriptor sub-vectors. It then compares the features of two images to find matching feature points, and finally establishes a correspondence between the scenes. To detect and segment non-overlapping regions under similar color interference, Surf first obtains the feature points matched between the template and the image in the current selection box, and the matched points are connected into a curve. Together with the current selection box, this curve is used as the initialization parameters of Grabcut. Grabcut then segments the image and extracts the non-overlapping target area, from which the centroid is computed. Combining Surf with Grabcut makes the Grabcut segmentation automatic, so that no manual operation is needed. This combination is therefore called the S-Grabcut algorithm. Experimental results show that it performs automatic segmentation effectively.

3. Principle and Implementation of the Improved Algorithm
3.1 Purifying the histogram of the target object
When the search window of the original Camshift algorithm is selected manually against a complex background, background noise is mixed into the color histogram. The search window then cannot converge optimally, which introduces unnecessary deviation. To solve this problem, we first extract the means of the target's three RGB channels and then enhance the current frame by the enhancement coefficient:

$$Z = \left\lfloor \frac{255}{\max(r_1, g_1, b_1)} \right\rfloor \tag{3}$$

$$R(i,j) = Z \cdot r(i,j), \quad G(i,j) = Z \cdot g(i,j), \quad B(i,j) = Z \cdot b(i,j) \quad (i \le W,\ j \le H) \tag{4}$$

Z is the enhancement coefficient.
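As a minimal Python sketch of Eqs. (3) and (4), the enhancement can be written as below. The function names `enhancement_coefficient` and `enhance` and the sample values are hypothetical. Clipping the scaled values at 255 is an assumption, because the paper does not say how overflow is handled.

```python
import math


def enhancement_coefficient(r1, g1, b1):
    """Eq. (3): Z = floor(255 / max(r1, g1, b1)), from the target's channel means."""
    largest = max(r1, g1, b1)
    if largest <= 0:
        raise ValueError("channel means must be positive")
    return math.floor(255 / largest)


def enhance(frame, z):
    """Eq. (4): multiply every channel of every pixel by Z.

    `frame` is a list of rows, each row a list of (r, g, b) tuples.
    Values are clipped at 255, which is an assumption of this sketch.
    """
    return [[tuple(min(255, z * v) for v in pixel) for pixel in row]
            for row in frame]


# Hypothetical channel means of the selected target region.
z = enhancement_coefficient(120.0, 100.0, 80.0)

# A tiny 1x2 frame: the second pixel overflows in the red channel and is clipped.
frame = [[(60, 50, 40), (130, 20, 10)]]
enhanced = enhance(frame, z)
```

With these sample means, 255 / 120 = 2.125, so Z = 2. Note that the floor makes Z equal to 1 whenever the largest channel mean exceeds 127.5, in which case the frame is left unchanged.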
$r_1$, $g_1$ and $b_1$ are the means of the three RGB channels, W and H are the width and height of the whole image, and r(i,j), g(i,j) and b(i,j) are the values of the corresponding pixels in the original image. We then perform foreground segmentation of the selected target object with Grabcut. A color histogram is built from the segmented target object and used as the initial tracking template of Camshift. Based on this histogram, Camshift and the Kalman filter iterate directly on the enhanced video to compute the position of the search window. Finally, the size and position of the search window are mapped back to the original video.

Figure 1: Grabcut flow chart

3.2 The model of track centroid correction
First, we monitor the size of the search window until it becomes stable in the enhanced video stream. If the search window sizes of three consecutive frames vary within the error range (the variation of the current frame's length and width is within 0.2 times the first frame's length and width), the third frame is extracted as the LBP template for target tracking, and its LBP histogram is computed. The feature counts are stored in different bins according to their gray-scale values. The target region of each current frame is then processed in real time in the same way as the template, which gives the LBP histogram of each frame and the values of its bins.

$$B_{max} = \max_i (B_{0i}), \quad \Delta b_i = |B_{0i} - B_{ni}|, \quad i \le N \tag{5}$$

$$a = \begin{cases} 1, & \Delta b_i \le 0.3 B_{max} \\ 0, & \Delta b_i > 0.3 B_{max} \end{cases} \tag{6}$$

N is the number of bins, i is the index of a bin, B is the value of a bin, a is the judgment coefficient, $B_0$ is the bin value of the template histogram, and $B_n$ is the bin value of the LBP histogram of the selected n-th frame.
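Equations (5) and (6) can be sketched in Python as follows. The names `bin_flags` and `judgment_coefficient` and the sample histograms are hypothetical. Eq. (6) is stated per bin, and the paper does not say how the per-bin results are combined into a single a. This sketch assumes that a = 1 only when every bin passes the threshold.

```python
def bin_flags(template_hist, frame_hist, ratio=0.3):
    """Per-bin test of Eq. (6): 1 where |B0_i - Bn_i| <= ratio * Bmax, else 0."""
    if len(template_hist) != len(frame_hist):
        raise ValueError("histograms must have the same number of bins")
    b_max = max(template_hist)          # Eq. (5): Bmax = max_i B0_i
    threshold = ratio * b_max
    return [1 if abs(b0 - bn) <= threshold else 0
            for b0, bn in zip(template_hist, frame_hist)]


def judgment_coefficient(template_hist, frame_hist, ratio=0.3):
    """Single coefficient a; requiring all bins to pass is an assumption."""
    return 1 if all(bin_flags(template_hist, frame_hist, ratio)) else 0


# Hypothetical 4-bin LBP histograms: Bmax = 100, so the threshold is 30.
template = [40, 100, 60, 20]
similar_frame = [45, 90, 70, 25]     # differences 5, 10, 10, 5
changed_frame = [45, 50, 95, 25]     # differences 5, 50, 35, 5

a_similar = judgment_coefficient(template, similar_frame)
a_changed = judgment_coefficient(template, changed_frame)
flags_changed = bin_flags(template, changed_frame)
```

Under this assumption the first frame gives a = 1 (texture matches the template) and the second gives a = 0, because two bins differ from the template by more than 0.3 Bmax.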
The tests show that when an obstacle of a dissimilar color enters, the search window becomes smaller, whereas when a similarly colored obstacle enters, the search window grows to include it. We therefore assume the following. When a is 0 and the search window becomes smaller, the model gives no feedback and tracking proceeds as in Camshift. When a is 0 and the search window becomes bigger (the current frame's length and width exceed 1.2 times the template's length and width), S-Grabcut segmentation starts. S-Grabcut then extracts the part of the target that is not covered by the similarly colored obstacle and computes its centroid position (x1, y1), which replaces the center position (x0, y0) of the current search window. This result is taken as the starting point of the iteration for the next frame. In short, we move the centroid onto the target object rather than the obstacle, so that similar color interference is eliminated. This also reduces the image storage consumed by matching and segmenting numerous pictures. As a result, the algorithm is improved in timeliness and accuracy.

Figure 2: Real-time extraction process of the correction model

4. Experimental Results and Analysis
In test 1, a human face is chosen as the target object. With no obstacle present, the traditional Camshift algorithm and the improved algorithm are applied to test how accurately the search window matches the target. The subject moves his head from side to side, so that we can compare how well the search window matches the object (the head) as it moves or changes size.
Figure 3: Pictures of test 1: (a) experimental pictures of this paper (b) experimental pictures of traditional Camshift

Table 1: The processing of experimental data from test 1

| Frame | Traditional Camshift: actual center a | Traditional Camshift: search window center a1 | Histogram after purification: actual center a | Histogram after purification: search window center a2 | Ratio of the two relative distances Δ |
|---|---|---|---|---|---|
| 10 | (103,354) | (121,355) | (103,354) | (115,356) | 0.6753 |
| 20 | (290,300) | (285,306) | (290,300) | (291,295) | 0.6530 |
| 30 | (440,337) | (437,338) | (440,337) | (437,337) | 0.9494 |
| 40 | (510,373) | (492,359) | (510,373) | (498,358) | 0.8425 |
| 50 | (284,312) | (295,310) | (284,312) | (289,310) | 0.4821 |
| 60 | (239,318) | (231,320) | (239,318) | (243,320) | 0.5418 |

Note: $$\Delta = \frac{\sqrt{(x_{a2} - x_a)^2 + (y_{a2} - y_a)^2}}{\sqrt{(x_{a1} - x_a)^2 + (y_{a1} - y_a)^2}}$$

The results of test 1 show that enhancing and purifying the target histogram helps to reduce the interference caused by background noise at the edges. Comparing the search window centroids with the actual centroids from frame 10 to frame 60, all the relative distance ratios are less than 1. This indicates that after the improvement the distance between the search window centroid and the actual centroid of the object is smaller than before. The fit between the search window and the object therefore increases, and the tracking effect is improved.

Test 2 compares the tracking performance of three tracking algorithms when the object is partially covered by a similarly colored obstacle. As before, the human face is the tracked object, and a hand serves as the similarly colored obstacle. In the test, the hand is placed at 2/3 of the face and then moved from one side to the other. We test traditional Camshift, the fusion algorithm of Camshift and Kalman, and the algorithm proposed in this paper on the same frames and compare them.
The results are as follows:

Figure 4: Pictures of test 2: (a) experimental pictures of traditional Camshift (b) experimental pictures of the fusion algorithm of Camshift and Kalman (c) experimental pictures of this paper

Table 2: The processing of experimental data from test 2

| Frame | Original: actual center a | Original: search window center a' | After image matting: actual center b | After image matting: search window center b' | Relative moving distance Δ |
|---|---|---|---|---|---|
| 10 | (344,299) | (89,181) | (345,300) | (90,182) | 1.414 |
| 25 | (311,319) | (122,160) | (276,320) | (87,161) | 35.0143 |
| 40 | (326,312) | (106,166) | (280,289) | (60,143) | 51.4296 |
| 55 | (329,302) | (99,174) | (317,195) | (87,67) | 107.6708 |
| 70 | (340,298) | (91,180) | (334,200) | (85,82) | 98.1835 |
| 85 | (341,295) | (91,174) | (321,229) | (63,108) | 71.6938 |
| 100 | (359,297) | (97,176) | (340,230) | (78,109) | 69.6419 |
| 115 | (349,296) | (88,177) | (343,289) | (82,170) | 9.2195 |
| 130 | (344,295) | (87,178) | (341,292) | (84,175) | 4.2426 |

Note: $$\Delta = \sqrt{(x_{a'} - x_{b'})^2 + (y_{a'} - y_{b'})^2}$$

In test 2, when the target (the face) is covered by a similarly colored obstacle (the hand), the traditional Camshift algorithm is strongly affected. When the hand enters, the search window deforms irregularly and the centroid is displaced completely. With the fusion algorithm of Camshift and Kalman, when the hand enters, the search window becomes smaller while still staying attached to the head. However, when the hand moves out, the centroid moves together with the hand and its position changes noticeably. In the 85th frame, the centroid leaves the target. With the improved algorithm, when the hand enters, the search window behaves as it does under the previous algorithms: it contains the same region (the frame expands). When the hand moves away from the face, after frame 55, the window does not follow it. The corrected centroid therefore stays on the face, and the search window stays there even when the hand moves out of the face region.
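The relative moving distance in Table 2 is the Euclidean distance between the two search window centers, and a few rows can be recomputed with a short Python sketch. The function name `relative_moving_distance` is hypothetical; the coordinates are copied from the table.

```python
import math


def relative_moving_distance(a_prime, b_prime):
    """Euclidean distance between search window centers a' and b' (Table 2 note)."""
    return math.hypot(a_prime[0] - b_prime[0], a_prime[1] - b_prime[1])


# (a', b') pairs for frames 10, 55 and 130, taken from Table 2.
rows = {
    10: ((89, 181), (90, 182)),
    55: ((99, 174), (87, 67)),
    130: ((87, 178), (84, 175)),
}
distances = {frame: relative_moving_distance(a, b) for frame, (a, b) in rows.items()}
```

The recomputed values agree with the tabulated 1.414, 107.6708 and 4.2426 to within rounding.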
The table and the relative movement distances derived from the centroid correction model show that when a similarly colored obstacle enters the video stream, the model does move the original centroid of the current frame, so the centroid stays within the target object instead of being pulled away by the obstacle and causing tracking mistakes. These results show that when the target is disturbed by similarly colored obstacles, the centroid correction method based on LBP and S-Grabcut decreases the probability of tracking errors and increases tracking robustness.

5. Conclusions
After analyzing the traditional Camshift algorithm, this paper has presented a Camshift algorithm based on Grabcut with an LBP model of correction tracking centroid, which can be applied to tracking moving target objects. The tracking video is enhanced by a fixed coefficient, and Grabcut is used to extract a pure color histogram of the object, which helps to reduce background noise and distraction. An LBP centroid tracking correction model is built to detect interference from similarly colored obstacles, and the S-Grabcut algorithm moves the original centroid to the uncovered part of the target object. Moreover, the automatic segmentation gives good results and the centroid is moved efficiently. According to the tests, the model is effective against similar color distraction and satisfies the requirements of real-time operation and stability. Finally, the Kalman filter is used to predict the location of the target object and to track it effectively when most of the object is covered or the object moves quickly. As a result, the robustness of the whole system is enhanced.

References
Bay H., Tuytelaars T., Gool L.V. 2006. SURF: Speeded up robust features. Computer Vision - ECCV 2006. Lecture Notes in Computer Science, Springer Berlin Heidelberg. Vol. 3951 (pp. 404-417). DOI: 10.1007/11744023_32.
Cheng Y.Z. 1995. Mean shift, mode seeking, and clustering. Pattern Analysis and Machine Intelligence, IEEE Transactions on. Vol. 17(8) (pp. 790-799). DOI: 10.1109/34.400568.
Exner D., Bruns E., Kurz D., Grundhoefer A., Bimber O. 2010. Fast and Robust CAMshift Tracking. Computer Vision and Pattern Recognition Workshops (CVPRW), 2010 IEEE Computer Society Conference on. (pp. 9-16). DOI: 10.1109/CVPRW.2010.5543787.
Fukunaga K., Hostetler L.D. 1975. The estimation of the gradient of a density function with applications in pattern recognition. Information Theory, IEEE Transactions on. Vol. 21(1) (pp. 32-40). DOI: 10.1109/TIT.1975.1055330.
Hou Y.C., Peng W.C. 2014. Distance Between Uncertain Random Variables. Mathematical Modelling and Engineering Problems. Vol. 1(1) (pp. 17-24).
Jianhui W., Guoyun Z., Longyuan G. 2013. Study The improved CAMSHIFT algorithm to detect the moving object in fisheye image. Mechatronic Sciences, Electric Engineering and Computer (MEC), Proceedings 2013 International Conference on. (pp. 1017-1020). DOI: 10.1109/MEC.2013.6885210.
Li Z.Y. 2011. Modified local entropy-based transition region extraction and thresholding. Applied Soft Computing. Vol. 11(8) (pp. 5630-5638). DOI: 10.1016/j.asoc.2011.04.001.
Ojala T., Pietikainen M., Harwood D. 1996. A comparative study for texture measures with classification based on feature distributions. Pattern Recognition. Vol. 29(1) (pp. 51-59). DOI: 10.1016/0031-3203(95)00067-4.
Ojala T., Pietikäinen M., Mäenpää T. 2002. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. Pattern Analysis and Machine Intelligence, IEEE Transactions on. Vol. 24(7) (pp. 971-987). DOI: 10.1109/TPAMI.2002.1017623.
Otsu N. 1979. A threshold selection method from gray-level histograms. IEEE Transactions on Systems, Man & Cybernetics. Vol. 9(1) (pp. 62-66). DOI: 10.1109/TSMC.1979.4310076.
Rother C., Kolmogorov V., Blake A. 2004. "Grabcut": Interactive foreground extraction using iterated graph cuts. ACM Transactions on Graphics (TOG). Vol. 23(3) (pp. 309-314). DOI: 10.1145/1015706.1015720.
Tu J.H. 2014. A Novel Building Boundary Extraction Method for High-Resolution Aerial Image. Review of Computer Engineer Studies. Vol. 1(2) (pp. 19-22).
Wang C., Zhu S.M. 2015. A Design of FPGA-Based System for Image Processing. Review of Computer Engineer Studies. Vol. 2(1) (pp. 23-28).
Wang X.Y., Li X.J. 2010. The study of Moving Target tracking based on Kalman-CamShift in the video. Information Science and Engineering (ICISE), 2010 2nd International Conference on. (pp. 1-4). DOI: 10.1109/ICISE.2010.5690826.
Zhao S., Gao Y. 2008. Establishing Point Correspondence Using Multidirectional Binary Pattern for Face Recognition. Pattern Recognition, 2008. ICPR 2008. 19th International Conference on. (pp. 1-4). DOI: 10.1109/ICPR.2008.4761248.