2) 3( 23المجلد مجلة ابن الھیثم للعلوم الصرفة والتطبیقیة 010 إقتراح خوارزمیة ھجینة إلستحداث مفتاح فھرسة لقاعدة البیانات بناءاً على محتویات الصورة جبار عماد كاظم،شھالء طالب عبد الوھاب جامعة بغداد، كلیة التربیة ابن الھیثم، قسم علوم الحاسبات ة، كلیة العلوم، قسم علوم الحاسبات الجامعة التكنلوجی الخالصة HYBRID Algor)(یتناول ھذا البحث اقتراح طریقة جدیدة ithm ر لحمایة الصور والوثائق من التزویر او التغیی م خالل ضغط البیانات والحصول على مفاتیح استرجاع وتحقق مبنیة على محتوی االلكتروني وذلك من ات الصورة من قی وخزنھا في قاعدة البیانات ومن ثم استخدامھا لتدقیق صحة بیانات ) االحمر واالزرق واالخضر(رقمیة لاللوان االساسیة ومقارنة مفاتیح قیم محتویاتھا مع ما مخزون في قاعدة نفسھا الطریقةبالوثیقة المطلوبة من خالل إعادة ضغط بیاناتھا بحث قد تم تطویرھا بناء على طرإن الطریقة ا. البیانات وھما ق سابقة في ضغط واسترجاع البیاناتائلمقترحة في ھذا ال لى خوارزمیة وذلك بالدمج بینھما للحصول ع D4 Lifting Scheme)(وطریقة Haar Lifting Scheme)( طریقة ت الضغط والخزن واالسترجاع لبیانات من دقة وصالحیة الطریقة المقترحة من خالل اجراء اختبارا تثبتلقد تم ال. ھجینة تویات ضغط تصل الى راسیة لمس توى 14مجموعة من الوثائق المھمة مثل الوثائق الد كما تمت مقارنة النتائج . مس ر Lifting Scheme (Haar(ق الضغط واالسترجاع االخرى ائالمستحصلة من الطریقة المقترحة مع مثیالتھا من ط Lifting sch eme and D4 أظھرت . على حده من حیث دقة اكتشاف التزویر او التغییر ومقدار المساحة الخزنیةكل ر ي ائنتائج االختبارات ان الطریقة المقترحة تتمیز على بقیة الط ق المستخدمة من حیث تحّسسھا البسط تعدیل او تغییر ف ي الوثائق تی الت ًا في مقدار المساحة المس اعھا في حین اظھرت تحّسن D4 Lifting)(خدمة مقارنة بطریقة راد استرج Scheme. IHJPAS IBN AL- HAITHAM J. FOR PURE & APPL. S CI. VOL.23 (3) 2010 Proposed Hybrid Algorithm for Generate Database Index Key Based on Image Contents S. T. Abdul wahab, E. K. Jabbar Departme nt of Computer Science, College of Education I bn Al-Haitham, Unive rsity of Baghdad Departme nt of Computer Science , Unive rsity of Technology Abstract This paper deals wit h proposing new lift ing scheme (HYBRID Algorit hm) that is capable of preventing images and documents which are fraud t hro ugh decomposing t here in t o t he real colors value arrays (red, blue an d green) t o creat e ret rieval keys for it s properties and store it in t he dat abase and then check the document originalit y by ret rieve the query image or document t hrough t he decomposition described above and compare the predict ed color values (ret rieval keys) of the query document wit h t hose stored in the dat abase. The proposed algorit hm has been develop ed fro m the two known lift ing schemes (Haar and D4) by merging them to find out HYBRID lift ing scheme. T he validit y and accuracy of the propo sed algorit hm have been evaluat ed t hrough experiment s wit h t he decomposition of database image con sist s of impo rtant documents like college certificat ion s up t o maximal decomposit ion level of 14. The t ests results using the HYBRID algorit hm were compared wit h t hat of the other metho ds (Haar and D4 Lifting scheme) in t erms of t he accuracy of discovering fo rgeries (ret rieval accuracy) and t he required store memory area. The results illustrat e t hat t he HYBRID algorit hm show better performance t han t he others in terms o f the sensit ivity to any ch ange in the retriev al documents. Also, HYBRID Algorit hm exhibits goo d improvement in t erm s of t he used memor y space compared to the result s obt ained by D4 Lift ing scheme. Ke y words: Image Retrieval; Fraud Docum ents Preven tion; HYBRID Algorithm. Introduction M ost of database sy stem use ima ges since they show a lar ge amount of information and are easy to st ore without data entry . The amount of digital images available for the office document is rap idly growin g b ecause of that t here is a gr eat need for efficient image indexin g and access tools. Document which is fraud involves efforts to obtain genuine identity documents through fraudulent means and the alteration of valid documents to be used for fraudulent p urp oses [1]. In recent y ears, fraud do cument has grown increasingly widesp read. The sop histication of these schemes has also grown, as document forgers increasingly use comp uter software and high-r esolution digital scanners to p lay their trade. Additionally , the Internet is being used more frequently to market fake documents to customers. M ost losses become the resp onsibility of the document issuer. Fraud do cument p revention is an imp ortant p art of p rotecting a database sin ce fraud Prevention can occur through a variety of ways [2]. IHJPAS Content-Based Image Retrieval (CBIR) is consider ed as the p rocess of retrieving desired images or do cuments from huge databases based on extracted features from the image itself (without indexing a key word). Features are derived directly from the images and they are extracted and analy zed by comp uter p rocessing [3]. CBIR is a bott leneck of the access for the multimedia d atabases that deal with text, audio, video and image data which could provide IBN AL- HAITHAM J. FOR PURE & APPL. S CI. VOL.23 (3) 2010 us with enormous amount of information. M any commercial and r esearch CBIR systems have been built and developed [4]. Content based image retrieval allowin g to automatically extract targets according to objective visual contents of image itself like color, texture and shap e [5]. In this work, Haar and D4 lifting schemes have been used to decompose color document images into multilevel scale and wavelet coefficients. New lifting scheme has been p rop osed throughout this p ap er. The new scheme p resents a p rogressive retrieval st rategy , which contributes to flexible comp romise between the retrieval accuracy and memory sp ace. The retrieval p erformances are comp ared with those of its classical counterp art in terms of retrieval accuracy and memory sp ace. Lifting Schemes The lifting scheme is a tool for constructing second-gener ation wavelets, which are no longer, dilates and translates of on e sin gle function. The lifting scheme can b e viewed as a p rocess of taking an existing wavelet and modify ing it by adding linear combinations of the scaling fun ction at t he same level of resolution [6].The wavelet l ifting scheme was d eveloped by Wim Sweldens and others. Wavelet liftin g scheme algor ithms have sever al advantages lik e [7]: 1. Allows fast er implementation of the wavelet transform. 2. Saves storage by p roviding an in- p lace calculation of the wavelet t ransform. 3. Simp lifies determining the inv erse wavelet t ransform. 4. Provides a natural way to introduce and think about wavelets. Wavelet algor ithms are recursive. So the output of one step of the algorithm beco mes the input for the next step . The initial input data set consists of 2 n elements. Each successive st ep op erates on 2 n-i elements, were i = 1 ... n-1. For examp le, if the initial data set contains 128 elements, t he wavelet transform will consist of seven steps on 128, 64, 32, 16, 8, 4, and 2 elements. stepj +1 follows stepj . If element i in step j is being up dated, the notation is step j ,i. The forward lifting scheme wavelet transform divides t he data set being p rocessed into an even half and an odd h alf [8]. The Lifting Scheme is a method for decomposing wav elet transforms into a set of st ages namely: Sp lit, Predict, and Up date [9]. A simp le lifting scheme forward transform is shown in Fig. (1). The lifting scheme forward st eps illust rated in Figure (1) can be exp lained as follows: 1. Split Step This divides the data set into odd and even indexed samples. The original si gn al is sp lit into even indexed samples, Sj ,2L, and the odd indexed samples, Sj ,2L+1, where j denotes the level of decomp osition and L = 0, 1, 2, …, 2 n-1 are the indices of the elements in the signal. The sp litting p rocess is sometimes referred to as the Lazy wavelet t ransform [10]. 2. Predict Step Predicts the odd elements from the even elements. The linear interp olation function "p redicts" that an odd element will be located at t he mid-p oint of a line between its two even neighbors. T he difference between the p redicted value and the actual value of the odd element replaces t he odd element [3]. IHJPAS The odd and even subsets are often h ighly correlated. Thus, it is p ossible to p redict one from the other. The predict step , where the odd value is "p redicted" from the even valu e is described by the equation 1:  ijijij evenPoddodd ,,,1  (1) IBN AL- HAITHAM J. FOR PURE & APPL. S CI. VOL.23 (3) 2010 3.Update Ste p The up date step rep laces the even elements with an average. The goal at this st ep is to maintain some global p rop erties of the original si gnal in the reduced set [7]. This result in a smoother inp ut for the next st ep of the next st ep of the wavelet transforms. The odd elements also rep resent an approximation of the original data set, which allows filters t o be const ructed. The up date p hase follows the predict p hase. The original value of the odd elements has been overwritt en by the difference between the odd element and its even "p redictor" as in equation 1. So in calculatin g an average the up date p hase must op erate on the differences that are st ored in the odd elements [8]:  ijijij oddUeveneven ,,,1  (2) Haar Lifting Scheme Haar lifting scheme p rediction st ep p redicts that t he odd element will be equal to the even element. The differ ence b etween the p redicted value (the even element) and the actual value of the odd element replaces the odd element. For the forward transform iteration j and element i, the new odd element, j+1,i would be [8]: ijijij evenoddodd ,,,1  (3) In the Haar lifting scheme, the up date step rep laces an even element with the average of the even/odd p air. The new odd value is got from old odd value and old even value as in equations 3. 2 ,, ,1 ijij ij oddeven even   (4) The original value of the oddj ,i element has been rep laced by the difference between this element and its even predecessor. T he new even valu e is got from old ev en value and old odd value as in equations 4. ijijij oddevenodd ,1,,1   (5) Subst itut ing this into the average, we get: 2 ,1,, ,1 ijijij ij oddeveneven even     (6) 2 ,1 ,,1 ij ijij odd eveneven    (7) The averages ( even elements) become the input for the next recursive st ep of the forward transform. This can be shown in Figure (2). The number of data elements p rocessed by the wavelet transform must be a power of two. If there are 2 n data elements, the first st ep of the forward transform will p roduce 2 n-1 averages and 2 n-1 differences (between the p rediction and the actual odd element value). These differences are sometimes referr ed to as wavelet coefficients. Figure (3) shows a 4- st ep s forward wavelet transform on a 16-element data set. One of the elegant features of the lifting sch eme is that the inverse transform is a mirror of the forward transform as shown in Fi gur e (4). In the case of the Haar transform, IHJPAS additions are substituted for subt ractions and subtractions for additions. The merge st ep replaces t he sp lit st ep . D4 Lifting Scheme The Daubechies D4 Lifting Scheme wavelet transforms are comp osed of Up date and Predict st eps. In this case a normalization st ep has been added as well. One forward transform st ep is shown in Figure (5) [3, 8]. IBN AL- HAITHAM J. FOR PURE & APPL. S CI. VOL.23 (3) 2010 The sp lit step divides the input data into even elements which are st ored in the f irst half of an N element array section ( S0 to Shalf-1) and odd elements which are st ored in the second half of an N element array section (Sh alf to SN-1). In the forward transform equations below the exp ression S[half+n] references an odd element and S[n] references an even element. Although the diagram above shows two normalization st ep s, in p ractice they are folded into a sin gle fun ction. Forward transform step equations: Up date 1 (U1): For n = 0 to half-1      13  halfSnSnS (8) Where : S[n]: one d imensional sequence of input samples Predict (P1):        1 4 23 0 4 3    halfSShaflShalfS (9) For n = 1 to half-1:        1 4 23 4 3    nSnSnhaflSnhalfS (10) Up date 2 (U2): For n = 0 to half-2      1 nhalfSnSnS (11)      halfShalfShalfS  11 (12) Normalize: For n = 0 to half-1    nSnS 2 13   (13)    halfnShalfnS  2 13 (14) Proposed Hybrid Lifting Scheme In archive sy stems with an amount of digital images that are rap idly growing, there is a great need for efficient ima ge indexing key as access tool in ord er to fully utilize this massive digital resource, and to retriev e stored data base images. The traditional index k ey in database like document numbers, document number and date, sequence number, codes and others is just codes t o images codin g since it does not extract from the image contents. So in this p aper we try to create index key from the document image itself [11]. In general, both Haar and D4 algorithms are used mainly for image decomp osition. Each one of them h as the sp ecial features for accur acy , retrieval and memory sp ace required. IHJPAS In the p rop osed algorithm, Ha ar and D4 algorithms have been used as tools for image k ey generation depends on its features through image multi decomposition until find a numerical value to be the unique image key consists of three p arts (red, green and blue color array). Then, this image key has to be stored with t he document image in the database in ord er to use it later on to discover about any changes on the document image st ored in the database to p revent fraud. The p rop osed HYBRID lifting scheme depends on merging of the lifting schemes (Haar and D4) equations in up date st ep s of D4 in order to find new algor ithm with new IBN AL- HAITHAM J. FOR PURE & APPL. S CI. VOL.23 (3) 2010 features aid for generation image key . Because we did notice that D4 a lgorithm n eeds huge memory to p erform its steps and to save last numbers and Haar algorithm is not very sensitive to small chan ges that may be happened in document image. While, our p rop osed algorithm needs l itt le st orage sp ace and very sensitive cap acity to small changes. Therefore, the followin g up date equation will be used:        2/nhalfSnSnS  (15) The above equation has been taken from Haar algorithm equation (7) with D4 up date 1 (U1) terms in equation (8) and replaces the equation (11) of D4 up date 2 (U2) with Haar up date equation (7). Figure (6) exp lains the structure of the p rop osed HYBRID algorithm U1H which contains the modification of D4 up date 1 and U2H contains t he modification of D4 up date 2. The following st eps rep resent the p rop osal algorithm index key generation. Algorithm: HYBRI D ALG ORITHM FOR G ENERATE IN DEX KEY B ASED ON IMAGE CON TENTS. Inputs : Database Docum ent Images. Output: Im age Index Key. B egin Image scanning; {Key genera tion} For Recurs ive=1 to 14 For i=1 to n {where n = imag e width * image height} Spli t (im age values ) {divide the image val ues to even and odd el ements} Next i For n =0 to hal f -1 {U 1 f or even elem ents}        2/nhalfSnSnS  Next n        1 4 23 0 4 3    halfSShaflShalfS {For the first val ue of odd elements only} For n =1 to half -1 {P1 f or odd el ements}        1 4 23 4 3    nSnSnhaflSnhalfS Next n For n =0 to half -2 {U 2 f or even elem ents}        2/nha lfSnSnS       ha lfShalfShalfS  11 Next n For n =0 to hal f -1 {N ormalization}      nSnS *51 7.0      nhalfShal fnS  *932.1 IHJPAS 1. Proposed Algorithm Imple mentation The imp lementation of the p rop osed algorithm is p erformed on Pentium IV PC with Visual b asic compiler and access database soft ware in two stages as follows: Stage 1: To create ind ex key for input document image. Stage 2: To discoverin g fr aud document image throu gh retrieval document image. IBN AL- HAITHAM J. FOR PURE & APPL. S CI. VOL.23 (3) 2010 Stage 1: As show in Fi gure (7), input new document image and p repare the document image to:  Comput e t he unique index key by the proposed algorithm.  Save t he input image an d it s key index in database. Stage 2: Ch eck the fraud in documents image when retrieval any document images as show in Figure (8):  Comput e t he unique index k ey to retriev al image.  Compare t his index key with st ored index key of t he document image.  If t he index key is t he same, then t he document image is not fraud (original), oth erwise, t he document image is fraud.  Display message box as result t o upper if st at ement. Applications and Re sults The exp eriments focus on discovering for geries in the document images throu gh the comp arison of the retrieval keys between the query document images and the original document images that p reviously stored in the database. Two imp ortant retrieval indicates have been taken into consideration throughout these exp eriments. The first one is the sensitivity to any change in the query documents image (retrieval accuracy ), and the other is the reduction in the memory sp ace required. In order to verify the validity , accuracy and cap ability of the p rop osed HYBRID algorithm for discover ing fraud documents, several imp ortant document images have been selected for this p urp ose. These documents include different college certifications. The exp erimental tests have been carried out in the simplest way so that the fraud document images can be created from the originals after chan ging the originality through addin g small dots or dashes or rep lacing some marks and n ames. The detection of the fraud documents will appear through an alarm installed in the program. The results obtained by using the p rop osed HYBRID algorithm are shown in Fi gur e (9). It can be noted the cap ability of the HYBRID algorithm to detect any change in the document ori ginality even if it is very small or negligible. In the other hand, the same exp eriments have been r epeated but using Haar lifting scheme. The r esults can be shown in Fi gure (10). It rev eals that t here is no d etection record ed for the fraud document. While, when D4 lifting scheme has been used for the same exp eriments, it detects t he fraud document but usin g high st orage memory sp ace for saving index keys comp ared with HYBRID algorithm as shown in Figur e (11). Table (1) sum marizes the results of three typ es of imp ortant document (college certification) usin g Haar, D4 and the p rop osed HYBRID algorithms resp ectively. The table illustrates that the HYBRID algorithm has very good results in terms of fraud detection with small storage sp ace for savin g index k eys. IHJPAS Conclusions Based on the above exp eriment results, the following conclusion can be drawn: 1. The prop osed HYBRID algor ithm shows accurate sensitivity against any change in the original do cuments comp ared to the other lifting methods. 2. The p rop osal Hyp er algorithm have best p erformance in terms of discovering fraud documents IBN AL- HAITHAM J. FOR PURE & APPL. S CI. VOL.23 (3) 2010 3. Harr lifting sch eme shows almost no sensitivity against document fraud. 4. D4 lifting scheme exh ibits accep table sensitivity against document fraud but with relatively high memory sp ace required for saving index keys comp ared Harr and HYBRID algorithm. 5. In terms of the memory sp ace required for sav ing index keys in the database, the HYBRID algorithm p resents less memory sp ace compared to D4 lifting scheme. Re ferences 1. Jim, K.(2008) ,Document Fraud Investigation,(http://www.honestrep orting.com/ articles/rep orts/Document _Fraud_Investigation.asp ). 2. Blanco, J. ( 2002), Guide to Fraud Prevention-Part 4”, (http://www.furninfo.com/abso- lutenm/temp lates /Article_Retailing. asp ?articleid=2359&zoneid=7). 3. Latha, Y. M .; Jinaga , B.C. and Reddy, V.S.K. (2008) “ A Precise Content-based Color Image Retrieval: Lifting Scheme”, ICGST-GVIP Journal, ISSN: 1687-398X , 8, Issue( 1): :25-32. 4. Jin, X. and French, J.C. ( 2003) ,Imp roving Image R etrieval Effectiveness via M ultip le Queries, University of Virginia, Charlott esville, , 86-93. 5. Tody , D. ( 2003) ,Simple Image Retrieval Interface Concepts and Issues”, MMDB’03, New Orleans, Louisian a, USA, 1-6. 6. Pradhan, B.; Sandeep, K.; M ansor, S.; Ramli, A., and Sharif, A.M . (2007), GIS Terrain Data Compression Using Lifting Scheme-A New Dimension, International Journal of The Computer, the Internet and M anagement, 15, ( 2): 9 -19. 7. Sp ires, W. ( 2005) ,Lossless Image Co mpression via the Lifting Scheme, University of Central Florida, 22. 8. Kap lan, i. ( 2003) ,Wavelets and Signal Processing, (http://www.bearcave.co m/misl/misl _tech/wavelets/). 9. Islam, S. ; Bhuyan, M .S.; M adesa, A.H. and Othman, M . ( 2007) ,Inv erse Discrete Wav elet Transform Processor for JPEG 2000, Proceedings of the International Conference on Electrical En gineering and Informatics Institute Technology Bandung, Indon esia, 240-243. IHJPAS 10. Tashakkori, R.; Ty ler, J.M .; Qi, X. and Sallee, C.W. ( 2008)” Effect of Scannin g of M edical Images on the Perfor mance of Lifting”, Dep t. of Computer Science, Ap p alachian State University , Boone, NC 28608, 1-10. 11. Rahma, A. and Jabbar, E. K. (2007) ”Prop osal Database Image Ind ex Key Extracting Algorithm Based on Image Contents(IIKE) “, Journal of Comp uters, Communication, Control and Sy st em Engineering University of Technology 7 (2): 60-67. IBN AL- HAITHAM J. FOR PURE & APPL. S CI. VOL.23 (3) 2010 Fi g. (1 ): Li fting scheme forward wavele t transform. Fi g. (2): Two ste ps in the wavelet li fting scheme Haar forward tran sform . Fi g. (3 ): 4 ste ps of a 16 elemen t wavele t transform. Split Predict Update even values odd values - + Sj - + Split Predict Update - + Split Predict Update Av erages Coefficients Sj forward transform step forward transform step forward transform step forward transform step 16 data elements 8 coeffic ients 4 coe fficie nts 2 coe fficie nts 1 coe fficie nts 8 averages 4 averages 2 averages 1 average Merge Upd at Predict even values odd values + - Sj IHJPAS Fi g. (4 ): Li fting scheme in verse transform. IBN AL- HAITHAM J. FOR PURE & APPL. S CI. VOL.23 (3) 2010 Fi g. (5 ): Forward transfo rm ste p of the li ftin g scheme ve rsion of the Daube chie s D4 Fi g. (6): HYBRID Li fting S cheme Fi g. (7 ): HYBRID Li fting S cheme Appli cation-First S tage. Split P Evenj+1 Oddj+1 - + Sj U - U N2 N1 Split P1 evenj+1 odd j+1 - + Sj U1H - U2H N2 N1 Input Image Compute Image Index key by using HYBRID Algorithm Databas e Input Query Image Retrieval Document Image from Databas e using HYBRID Algorithm Image Databas e Compute Image Index Key by using HYBRID Algorithm for Image Comparison of the Query Keys with the Stored Keys Field to decide Fraud or Original IHJPAS Fi g. (8 ) : HYBRID Li ftin g Scheme Appli cation -Se cond Stage . IBN AL- HAITHAM J. FOR PURE & APPL. S CI. VOL.23 (3) 2010 (a) Original Document. (b) Fraud Document (degree n o. 4 and 13). (c) Alarm Results. Fi g. (9): Expe rimental Results for the college ce rti fi cation u sing HYBRID algorithm. (a) Original Document. (b) Fraud Document (degree n o. 4 and 13). (c) Alarm Results. Fi g. (10): Expe riment Results for the col lege ce rti fi cation u sing Haar li fting scheme . The document is fraud. The document is ori ginal. IHJPAS (a) Origin al Document. (b) Fraud Document (degree no. 4 and 13). (c) Alarm Results. Fi g. (11): Expe riment Results for the college ce rti fi cation u sing D4 li fting scheme . IBN AL- HAITHAM J. FOR PURE & APPL. S CI. VOL.23 (3) 2010 Tabl e (1): Expe riment Results for the college ce rti fi cati ons. Notes: (f ) means fraud document c opy. The document is fraud. IHJPAS