Dr. Walid A. Mahmoud / Al-Khwarizmi Engineering Journal, vol. 1, no. 2, pp. 46-51 (2005)

FPGA Realization of Two-Dimensional Wavelet and Wavelet Packet Transform

Dr. Walid A. Mahmoud
Electrical Engineering Dept., College of Engineering, University of Baghdad

Dr. Mohammed N. Al-Turfi
Computer and Software Engineering Dept., College of Engineering, University of Al-Mustansirya

Abstract: The Field Programmable Gate Array (FPGA) is the most recent category of device used to implement Digital Signal Processing (DSP) applications. It has proved capable of handling such problems and meets the requirements of scalability, speed, size, cost, and efficiency. This paper proposes a new circuit design, implemented on an FPGA, for evaluating the coefficients of the two-dimensional Wavelet Transform (WT) and Wavelet Packet Transform (WPT). The WT and WPT coefficients are evaluated by filter tree decomposition using the 2-D discrete convolution algorithm. The logic circuits were built on an FPGA kit that uses the Spartan-IIE library and were implemented with the Xilinx Foundation Series 2.1I software.

Key words: FPGA, Wavelet Transform, Wavelet Packet Transform.

1: Introduction

In the last few years, wavelet analysis has taken its place in the analysis field and has proved its strength in handling a wide spectrum of tasks, especially in speech and image processing. The Wavelet Transform (WT) and the Wavelet Packet Transform (WPT) are among the most powerful tools in digital signal processing; they decompose the input signal in order to evaluate the information it carries. Since they represent magnitude against both time and frequency, the frequency components can be evaluated at any specific time with high accuracy. For this reason the wavelet coefficients must be evaluated as accurately, as quickly, and as simply as possible, which is a challenge [1,2].

FPGA technology offers a solution to this problem, because its high performance provides what is needed to overcome these difficulties. Over the last ten years, the companies that produce electronic devices have raced one another to take the lead in FPGA production because of the growing demand for this technology. Recently, FPGAs have enjoyed widespread use owing to their relatively high gate density, short design cycle, and low cost. They can be used in all applications that currently use Small-Scale Integrated circuits (SSI), Medium-Scale Integrated circuits (MSI), Large-Scale Integrated circuits (LSI), and Programmable Logic Devices (PLDs). They also replace Mask-Programmable Gate Arrays (MPGAs) in many applications that are limited to 10000 gates and do not need a very high operating speed.

Because the wavelet transform produces a small number of effective coefficients, it is well suited to implementation on an FPGA. And since the WT minimizes the number of operations needed to obtain the coefficients, it reduces the processing time, which in turn reduces complexity, cost, size, and power consumption [2,3,4].
This paper presents a new circuit design for calculating the two-dimensional wavelet transform coefficients. The design is based on the multi-resolution decomposition algorithm and uses the convolution approach to represent the Finite Impulse Response (FIR) filters that implement the wavelet decomposition tree. This approach is used, first, because it is easy to represent and implement; second, because it reconstructs the signal perfectly after decomposition when the inverse two-dimensional wavelet transform is applied; and third, because it is well suited to communication systems, where the filters inside the decomposition or reconstruction tree may be used to reduce noise or to perform other processing while the WT and WPT coefficients are evaluated [5,6].

The circuit operates at high speed (the operating frequency is 1 GHz), with a short processing time and high accuracy (16, 32, or 64 bits for the data and one sign bit). It receives data serially or in parallel and produces data serially or in parallel, which makes it easy to use for general purposes. An example and the experimental results obtained from the processor are given in this paper, together with circuit block diagrams that show the data flow and the results.

[Fig. (1): The time-scale representation in the wavelet plane. Axes: Time; Scale (frequency).]

2: Two-Dimensional Wavelet and Wavelet Packet Transform

In practice, the wavelet transform is of interest for the analysis of non-stationary signals because it provides an alternative to the classical Short Time Fourier Transform (STFT).
For some applications it is desirable to see the wavelet transform as a decomposition of the signal onto a set of basis functions; in fact, basis functions called wavelets always underlie wavelet analysis. They are obtained from a single prototype wavelet, called the mother function, by dilation and contraction (scaling) as well as shifting. The prototype wavelet can be thought of as a Band Pass Filter (BPF), and the constant Quality Factor (constant-Q) property of the other BPFs (wavelets) follows because they are scaled versions of the prototype. The notion of scale introduced in the WT is therefore an alternative to frequency, leading to the so-called time-scale representation, in which the signal is mapped onto the time-scale plane [5,6,7].

The Fourier transform, and hence the STFT, uses sines and cosines as basis functions because they are orthogonal (there is no correlation between them). The same applies to the WT: its basis functions must be orthogonal. One more condition, orthonormality, must hold in order to obtain perfect reconstruction [8,9]. Since wavelets are localized functions (they are zero outside a finite interval) with zero mean, many different basis vectors can be formed by dilation and translation of the mother function. A suitable basis must therefore satisfy compact support, vanishing moments, smoothness, and orthogonality [10,11].

For the orthogonality property, the condition

$\int \Phi(t)\,\Phi(t-m)\,dt = 0$ …(1)

must hold, where $\Phi(t)$ is the basis function and $m \neq 0$ is the amount of shifting, one interval at a time. For the orthonormality property, the $m = 0$ case of the same integral must equal one:

$\int \Phi^{2}(t)\,dt = 1$ …(2)

Orthonormal bases localize well in time and frequency, and they are related to special filters for sub-band coding.
These filters lead to exact waveform reconstruction without aliasing and without amplitude or phase distortion [10,11]. Vanishing moments and smoothness are especially important in two-dimensional applications: the amount of information is larger than in one-dimensional applications, so the interference between pieces of information is larger, and high localization in frequency is needed, which requires the higher-order derivatives to be zero. Fig. (2) shows the sub-band distribution of the frequency components of the 2-D signal in the wavelet plane.

[Fig. (2): Sub-band coding scheme in terms of MRA. Coding: x(n) is filtered by cascaded h(n) and g(n) stages, each followed by ↓2. Decoding: ↑2 followed by ĥ(n) and ĝ(n), with the two branches summed at each stage to recover x(n).]

In practice, the filter approach is the appropriate way to analyze the signal into its frequency components. The signal is decomposed into high-frequency and low-frequency components using the convolution approach [12,13]. At each level the input information is separated into approximation information

$A_{j-1}(n) = \sum_{k} f_{j}(k)\,h(k-2n)$ …(3)

and detail information

$D_{j-1}(n) = \sum_{k} f_{j}(k)\,g(k-2n)$ …(4)

where $j$ and $j-1$ denote the decomposition level and the level that follows it, respectively; $h(n)$ and $g(n)$ are the impulse responses of the Low Pass Filter (LPF) and the High Pass Filter (HPF), respectively; $A$ is the approximation information and $D$ is the detail information [1]. The reconstruction is made by inverting the above process.
This is done by:
1) Up-sampling, to undo the down-sampling.
2) Using ĥ(n) and ĝ(n), which are the inverses of h(n) and g(n) respectively.

So the reconstruction follows the formula

$f_{j+1}(n) = 2\left\{ \sum_{k} f_{j}(k)\,g(n-2k) + \sum_{k} f_{j}(k)\,h(n-2k) \right\}$ …(5)

This is called the filter tree decomposition algorithm, in which the sub-band coding scheme is used. Since it uses the convolution approach, the filters have a constant Q factor because constant filter taps are used, so the relative frequency band distribution is as shown in Fig. (3). If Finite Impulse Response (FIR) filters are assumed, it turns out that the HPF and LPF are related by

$h(L-1-n) = (-1)^{n}\,g(n)$ …(6)

where $L$ is the filter length and the factor $(-1)^{n}$ is the modulation that transforms the LPF into an HPF [13].

According to the filter tree decomposition algorithm, the classical wavelet transform is obtained by repeated decomposition (low-pass filtering and down-sampling). The decomposition is applied only to the low-frequency branch, since it is the more intelligible part and carries most of the information: the low-frequency components, with their narrow bandwidth, have the higher rate of information [4,8]. The analysis then requires computation of order O(N), i.e. it is linear in complexity with the decomposition level.

[Fig. (3): The relative frequency bands distribution. Sub-bands: LL, LH, HL, HH.]

The complete signal analysis is done by decomposing both the high-frequency and the low-frequency branches. This turns the Mallat decomposition tree into a Strictly Binary Tree (SBT), in which each branch is decomposed into two secondary branches. The analysis in this type of decomposition is of order O(N log2 N); it results in a completely evenly spaced frequency resolution and a tree-type logarithmic splitting and tiling of the time-scale plane [6].
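The filtering and down-sampling of Eqs. (3) and (4), the filter relation of Eq. (6), and the difference between the classical tree and the strictly binary (packet) tree can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the two-tap filter h = [1, 1] is chosen for simplicity and is not taken from the paper, no boundary extension is applied, and all function names are hypothetical.

```python
def mirror_filter(h):
    """Solve Eq. (6), h(L-1-n) = (-1)^n g(n), for the detail filter g."""
    L = len(h)
    return [(-1) ** n * h[L - 1 - n] for n in range(L)]


def analyze(f, h, g):
    """One level of Eqs. (3) and (4): filter, then keep every second output.

    With m = k - 2n the sums become sum_m f(2n + m) * h(m).
    No boundary extension is applied (assumption of this sketch).
    """
    L = len(h)
    n_out = (len(f) - L) // 2 + 1
    approx = [sum(f[2 * n + m] * h[m] for m in range(L)) for n in range(n_out)]
    detail = [sum(f[2 * n + m] * g[m] for m in range(L)) for n in range(n_out)]
    return approx, detail


def wavelet_tree(f, h, g, levels):
    """Classical WT: only the approximation branch is decomposed again."""
    details = []
    for _ in range(levels):
        f, d = analyze(f, h, g)
        details.append(d)
    return f, details


def packet_tree(f, h, g, levels):
    """WPT: every branch is decomposed, giving 2**levels leaf sub-bands."""
    nodes = [f]
    for _ in range(levels):
        nodes = [band for node in nodes for band in analyze(node, h, g)]
    return nodes


h = [1, 1]                      # assumed approximation filter taps
g = mirror_filter(h)            # detail filter taps from Eq. (6)
signal = [1, 2, 0, 1, 2, 1, 0, 3]
approx, details = wavelet_tree(signal, h, g, 2)
leaves = packet_tree(signal, h, g, 2)
```

With these assumed taps, `g` evaluates to `[1, -1]`, so each stage reduces to an addition and a subtraction of neighbouring samples. After two levels the classical tree holds one approximation band and two detail bands of different lengths, while the packet tree holds four leaf bands of equal length.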
The high-frequency components, which occupy a wide bandwidth, are decomposed by the same strategy (high-pass filtering and down-sampling), which guarantees a global view of the signal. This is done in order to locate the positions where the signal has, or may have, data. This kind of analysis is called the complete wavelet transform or Wavelet Packet Transform (WPT) [10,13]. Fig. (4) shows the signal tree decomposition using the WPT approach, while Fig. (5) shows the frequency distribution of the signal representation in Fig. (4).

[Fig. (4): The tree representation of the WPT. Nodes: X; XL, XH; XLL, XLH, XHL, XHH.]

[Fig. (5): The frequency representation of the WPT. Axes: Frequency; Scale.]

3: FPGA

The reason for choosing the FPGA to implement nearly all modern digital systems is its fitness for computationally expensive problems and their intensive need for parallel processing or pipelining [14]. Among the advantages of FPGAs are [15,16,17]:
1) The replacement of small-scale integrated (SSI) and medium-scale integrated (MSI) chips.
2) The availability of off-the-shelf parts.
3) Rapid turnaround.
4) Low risk.
5) Some FPGAs can be reprogrammed.
6) Relatively low cost and flexible design.

Their relatively long design cycle is a disadvantage, however, because the design process generally requires nine steps, as follows [15,16,17]:
1) Entering the design in the form of a schematic, netlist, logic expressions, or HDL (Hardware Description Language).
2) Simulating the design for functional verification.
3) Mapping the design into the FPGA architecture.
4) Placing and routing the FPGA design.
5) Extracting the delay parameters of the routed design.
6) Re-simulating for timing verification.
7) Generating the FPGA device configuration format.
8) Configuring or programming the device.
9) Testing the product for undesired functional behavior.

This long design cycle poses a challenging problem for the manufacturers and forces them to find suitable solutions. The solutions fall into four categories. The first is the schematic approach, which is used when the design elements and circuit branches are well defined and their functions are specified. The second uses the Very High-Speed Integrated Circuit Hardware Description Language (VHDL) and is suited to looping or iterative processes. The third uses the Finite State Machine (FSM), which is suited to control circuits with small numbers of inputs and outputs. Any of these categories can be combined to handle new tasks. For example, a control circuit implemented as an FSM can control the process flow of a function implemented using the schematic approach. The fourth category uses the high-level language called SYSTEM C4 or SYSTEM C5 [18]. This branch is a wild force PCI plug-in board with 5-15 Processing Elements (PEs), a 1 MByte embedded memory attached to each PE, and other interconnects such as FIFOs, crossbars, and Single Inline Modules Data (SIMD). The board is installed with its installation software after it has been placed in the motherboard of the personal computer. Kits of this kind operate at clock frequencies of 1, 3, 5, 10, 20, 33, or 66 MHz, with a large number of inputs and outputs and a wide range of function control and implementation [18].

4: FPGA Simulation of Two-Dimensional Wavelet and Wavelet Packet Transform

This paper presents a new FPGA digital circuit design for evaluating the coefficients of the two-dimensional wavelet transform. The simulation process passes through four stages, as follows [19,20].
The first stage is problem formulation and function establishment. In this stage the general features of the problem are identified in order to specify what its solution requires. Function establishment, in turn, is important for specifying the equipment and devices needed to implement the function and for defining the limitations and boundaries.

The second stage is overcoming the limitations and difficulties in a reasonable way while keeping an eye on the overall cost. The implementation of the function must be optimized as far as possible in order to specify the type of FPGA kit and its size, capabilities, frequency range, number of inputs and outputs, power consumption, scalability, and compatibility.

The third stage is operating the optimized design. Two important problems generally appear at this stage. The first, and the most important, is the timing problem. The second is the production of undesired results and values during operation and how to eliminate them; most of these values are caused by the timing problem.

The fourth stage is connecting the designed kit to the operating environment and examining its compatibility and the best ways of operating alongside the other system equipment. It is therefore preferable to choose a kit type as close as possible to that of the other system equipment in order to reduce compatibility problems, or, if it is possible and not costly, to change the whole system, since FPGA kits are relatively low in cost.

For the first stage, the problem is the evaluation of the coefficients of the two-dimensional Wavelet and Wavelet Packet Transform by filter tree decomposition using the convolution approach.
For the second stage, the FPGA kit chosen for the implemented design is the SPARTAN-IIE. The SPARTAN-IIE 1.8 V FPGA gives high performance, abundant logic resources, and a rich feature set, all at an exceptionally low price. The family has seven members, offering densities from 50000 to 600000 system gates with a wide operating frequency range (500 kHz - 2.5 GHz), and delivers more I/Os and other features per dollar than other FPGAs by combining advanced process technology with a streamlined architecture based on the proven Virtex-E. Features include Block RAM (288 Kbit), Distributed RAM (221 Kbit), 19 selectable I/O standards, and 4 DLLs (Delay Locked Loops). Fast, predictable interconnect means that successive design iterations continue to meet timing requirements.

The third stage depends on the type of application, its operating frequency, its operating speed, and its response time. For example, in a speech signal the highest frequency component does not exceed 6 kHz and the speech rate is relatively slow, whereas in target identification the image verification must be real-time and very accurate, with a very fast response to changes in direction. If a system does not need high speed and its circuit architecture is symmetric, the symmetric circuits can be combined into a few and the operations performed in sequence; otherwise, redundant circuits are built to perform parallel processing or pipelining to achieve the necessary operating speed.

The fourth stage is implementing the whole system, from its receiving point to its transmitting point, since the chosen kit is more than enough to handle this task in terms of size, speed, accuracy, and cost. Building the whole system on the same kit reduces compatibility problems to a minimum. This design highlights an important point: the use of multi-purpose components to perform multiple operations.
This is represented by the Full Adder/Subtractor circuit followed by a shift register. It reduces the number of devices used, which reduces the power consumption and hence the heat generation, the heat-sink size, and the die area, but it requires a control circuit to synchronize the circuit operations.

5: Demonstrated Example

An example is simulated on a Spartan-IIE FPGA platform using the schematic approach in the Xilinx Foundation 2.1I software. This software provides three types of design entry: schematic, VHDL, or Finite State Machine (FSM). It offers very high programming capability and specifies the finest details of the design, from the inputs, through the latching and interfacing parts, to the processing and evaluation, and on to the delivery of the results to the output circuit. It also covers the other necessities: the clock evaluation (1 GHz in the designed circuit), the synchronization problems, and the design problems (essential hazards, races, and oscillation). Most importantly, the software evaluates a problem and may suggest a solution for it.

A two-dimensional signal in the form of a 4*4 matrix is demonstrated in full, with its logic circuit representation and the corresponding results. In this example the full tree decomposition of the two-dimensional wavelet and wavelet packet transform is evaluated, taking into consideration the arrangement of the input and output data. The data are entered vector by vector. The output data are obtained as four 2*2 matrices, vector by vector, arranged as the HH, HL, LH, and LL matrices. Each result is represented as a sixteen-bit word.
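The sixteen-bit words listed in Table (1) suggest how signed results are written: a negative value appears with a leading sign bit followed by its magnitude (for example 1000000000000001 for -1), and negative outputs are accompanied by a second, bit-inverted word (1111111111111011 for -4). The Python sketch below reproduces both forms. It is only an illustration of the number format; the function names are hypothetical, and the encoding logic of the actual circuit is not described in the paper.

```python
WIDTH = 16  # word width of each result in the demonstrated example


def sign_magnitude(value, width=WIDTH):
    """Return the sign bit followed by the magnitude in width-1 bits."""
    if abs(value) >= 1 << (width - 1):
        raise ValueError("magnitude does not fit in the word")
    sign = "1" if value < 0 else "0"
    return sign + format(abs(value), "0{}b".format(width - 1))


def ones_complement(value, width=WIDTH):
    """Return plain binary for value >= 0, the inverted magnitude otherwise."""
    if abs(value) >= 1 << (width - 1):
        raise ValueError("magnitude does not fit in the word")
    mask = (1 << width) - 1
    bits = value if value >= 0 else ~(-value) & mask
    return format(bits, "0{}b".format(width))


def decode_sign_magnitude(word):
    """Inverse of sign_magnitude for a string of '0' and '1' characters."""
    magnitude = int(word[1:], 2)
    return -magnitude if word[0] == "1" else magnitude


word_minus_four = sign_magnitude(-4)      # sign bit plus magnitude
inverted_minus_four = ones_complement(-4)  # bit-inverted form
```

For non-negative values the two forms coincide, which matches the single word shown for each non-negative output in Table (1).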
Between any two cascaded stages there are latches, first to keep the stages synchronized while the data are transferred from one stage to the next, and second to hold the values until the circuit reaches a stable state when values change because of negative signs, races, essential hazards, or oscillation. Each stage operates asynchronously (combinational logic circuits with no delay or storage devices) in order to speed up the process, but the delivery from one stage to the next is done under full synchronization by the control circuit.

The input matrix is:

$X_{4*4} = \begin{bmatrix} 1 & 1 & 0 & 1 \\ 2 & 1 & 1 & 0 \\ 0 & 2 & 1 & 1 \\ 3 & 0 & 1 & 2 \end{bmatrix}$

The output matrices are:

$HH_{2*2} = \begin{bmatrix} 5 & 2 \\ 5 & 5 \end{bmatrix}$, $HL_{2*2} = \begin{bmatrix} -1 & 0 \\ -1 & -1 \end{bmatrix}$, $LH_{2*2} = \begin{bmatrix} 1 & 0 \\ 1 & -1 \end{bmatrix}$, $LL_{2*2} = \begin{bmatrix} -1 & -2 \\ -5 & 1 \end{bmatrix}$

[Fig. (6): The decomposition circuit of the 2-D wavelet packet transform. Blocks: two banks of eight 16-bit full adder/subtractors, a 5-bit synchronous counter (labels 12H and 15H), and a controller. Outputs: HH, HL, LH, and LL, each with elements R11, R12, R21, R22. The input data arrive from the external environment row by row, from the top (element 1,1) to the bottom (element 4,4).]

[Fig. (7): The reconstruction circuit of the 2-D wavelet packet transform. Blocks: two banks of eight 16-bit full adder/subtractors, a 5-bit synchronous counter (labels 12H and 15H), and a controller. Inputs: HH, HL, LH, and LL, each with elements R11, R12, R21, R22. The output data of the reconstruction circuit represent the input data of the decomposition circuit, from the top (element 1,1) to the bottom (element 4,4).]

Tables (1) and (2) show the decomposition and reconstruction results, respectively, for the 4*4 signal given above.
Table (1): The design result evaluation of the two-dimensional 4*4-point decomposition circuit

Data input | First stage | Second stage | Data output
0000000000000001 (1) | 0000000000000011 (3) | 0000000000000110 (6) | 0000000000000110
0000000000000010 (2) | 1000000000000001 (-1) | 0000000000000100 (4) | 0000000000000100
0000000000000000 (0) | 0000000000000011 (3) | 0000000000000101 (5) | 0000000000000101
0000000000000001 (1) | 0000000000000001 (1) | 0000000000000101 (5) | 0000000000000101
0000000000000010 (2) | 0000000000000000 (0) | 0000000000000000 (0) | 0000000000000000
0000000000000001 (1) | 0000000000000000 (0) | 1000000000000100 (-4) | 1000000000000100 1111111111111011
0000000000000000 (0) | 0000000000000100 (4) | 1000000000000011 (-3) | 1000000000000011 1111111111111100
0000000000000011 (3) | 1000000000000010 (-2) | 1000000000000011 (-3) | 1000000000000011 1111111111111100
0000000000000000 (0) | 0000000000000001 (1) | 0000000000000000 (0) | 0000000000000000
0000000000000011 (3) | 1000000000000001 (-1) | 1000000000000010 (-2) | 1000000000000010 1111111111111101
0000000000000001 (1) | 0000000000000100 (4) | 0000000000000001 (1) | 0000000000000001
0000000000000010 (2) | 0000000000000010 (2) | 0000000000000001 (1) | 0000000000000001
0000000000000001 (1) | 0000000000000001 (1) | 1000000000000010 (-2) | 1000000000000010 1111111111111101
0000000000000001 (1) | 0000000000000001 (1) | 0000000000000010 (2) | 0000000000000010
0000000000000000 (0) | 0000000000000100 (4) | 1000000000000011 (-3) | 1000000000000011 1111111111111100
0000000000000010 (2) | 0000000000000000 (0) | 0000000000000001 (1) | 0000000000000001

Table (2): The design result evaluation of the two-dimensional 4*4-point reconstruction circuit

Data input | First stage | Second stage | Data output
0000000000000110 | 0000000000000110 (6) | 0000000000000100 (4) | 0000000000000001 (1)
0000000000000100 | 0000000000000110 (6) | 0000000000001000 (8) | 0000000000000010 (2)
0000000000000101 | 0000000000000000 (0) | 0000000000000000 (0) | 0000000000000000 (0)
0000000000000101 | 0000000000001000 (8) | 0000000000000100 (4) | 0000000000000001 (1)
0000000000000000 | 0000000000000010 (2) | 0000000000001000 (8) | 0000000000000010 (2)
1000000000000100 1111111111111011 | 0000000000001000 (8) | 0000000000000100 (4) | 0000000000000001 (1)
1000000000000011 1111111111111100 | 0000000000000010 (2) | 0000000000000000 (0) | 0000000000000000 (0)
1000000000000011 1111111111111100 | 0000000000001000 (8) | 0000000000001100 (12) | 0000000000000011 (3)
0000000000000000 | 1000000000000010 (-2) | 0000000000000000 (0) | 0000000000000000 (0)
1000000000000010 1111111111111101 | 0000000000000010 (2) | 0000000000001100 (12) | 0000000000000011 (3)
0000000000000001 | 0000000000000000 (0) | 0000000000000100 (4) | 0000000000000001 (1)
0000000000000001 | 1000000000000100 (-4) | 0000000000001000 (8) | 0000000000000010 (2)
1000000000000010 1111111111111101 | 1000000000000010 (-2) | 0000000000000100 (4) | 0000000000000001 (1)
0000000000000010 | 0000000000000100 (4) | 0000000000000100 (4) | 0000000000000001 (1)
1000000000000011 1111111111111100 | 0000000000000010 (2) | 0000000000000000 (0) | 0000000000000000 (0)
0000000000000001 | 0000000000000000 (0) | 0000000000001000 (8) | 0000000000000010 (2)

The reconstructed results are scaled by a factor of 4, because the signal passes through two stages in both the high-frequency and the low-frequency branches. The data must therefore be divided by 4, i.e. shifted right by 2 bits.
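The figures in Tables (1) and (2) can be reproduced with a two-stage add/subtract scheme. The Python sketch below is one interpretation that is consistent with the tabulated values, not a description of the published circuit: it assumes the "Data input" column of Table (1) is read row by row as a 4*4 matrix, that the first stage combines vertically adjacent samples and the second stage combines horizontally adjacent first-stage results, and it uses neutral, hypothetical names (`ss`, `sd`, `ds`, `dd`) for the four output groups rather than the paper's sub-band labels.

```python
def decompose(x):
    """Two-stage sum/difference decomposition of a matrix with even sides."""
    rows, cols = len(x), len(x[0])
    # Stage 1: add and subtract vertically adjacent samples.
    s = [[x[i][j] + x[i + 1][j] for j in range(cols)] for i in range(0, rows, 2)]
    d = [[x[i][j] - x[i + 1][j] for j in range(cols)] for i in range(0, rows, 2)]

    # Stage 2: add (sign=+1) or subtract (sign=-1) horizontally adjacent results.
    def pair(m, sign):
        return [[r[j] + sign * r[j + 1] for j in range(0, cols, 2)] for r in m]

    return pair(s, +1), pair(s, -1), pair(d, +1), pair(d, -1)


def reconstruct(ss, sd, ds, dd):
    """Invert decompose(); each sum is 4x a sample, so shift right by 2 bits."""
    n, m = len(ss), len(ss[0])
    x = [[0] * (2 * m) for _ in range(2 * n)]
    for i in range(n):
        for j in range(m):
            a, b, c, e = ss[i][j], sd[i][j], ds[i][j], dd[i][j]
            x[2 * i][2 * j] = (a + b + c + e) >> 2
            x[2 * i][2 * j + 1] = (a - b + c - e) >> 2
            x[2 * i + 1][2 * j] = (a + b - c - e) >> 2
            x[2 * i + 1][2 * j + 1] = (a - b - c + e) >> 2
    return x


# "Data input" column of Table (1), taken row by row (assumption).
table_input = [
    [1, 2, 0, 1],
    [2, 1, 0, 3],
    [0, 3, 1, 2],
    [1, 1, 0, 2],
]
ss, sd, ds, dd = decompose(table_input)
restored = reconstruct(ss, sd, ds, dd)
```

Read group by group, `ss`, `sd`, `ds`, and `dd` give 6, 4, 5, 5, then 0, -4, -3, -3, then 0, -2, 1, 1, then -2, 2, -3, 1, which is the "Second stage" column of Table (1). Before the 2-bit shift the reconstruction sums equal four times the original samples, which is the "Second stage" column of Table (2); after the shift the input matrix is recovered.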
6: Conclusions

The aim of this paper is to propose a new circuit design for the implementation and evaluation of the two-dimensional wavelet and wavelet packet transform. The design adopts the DSP point of view, whereas nearly all other designs implement the circuits from the communication point of view; this type of design does not need a wide range of filter taps, which reduces the cost. Thanks to the wide capabilities of the kit, high speed, high accuracy, low cost, low power consumption, and ease of handling were achieved.

Another advantage is that the design overcomes a disadvantage of the conventional wavelet and wavelet packet transform coder-decoder, which cannot produce the coefficients until the signal is complete. The coefficients are now produced while the data are being received, with no need to wait. For example, if the signal is 8*8, the evaluation of the WT and WPT coefficients starts as soon as a 4*4 portion of the signal has been received, which fits the designed circuit.

7: References

[1] Jan E. Odegard and Ivan W. Selensnick, "Introduction to wavelet and wavelet transform", 1993.
[2] F. Argentini and G. Benelli, "IIR implementation of wavelet decomposition for digital filter analysis", reprinted from Electronic Letters, vol. 28, pp. 513-515, 27 February 1992.
[3] A. Aron and E. Rosenberg, "Speaker verification using pattern recognition approach", IEEE Transactions, vol. 64, pp. 475-487, April 1986.
[4] N. Rex Dixon and Thomas B. Martin, "Automatic speech and speaker recognition", a volume in the IEEE Press selected reprint series, 1989.
[5] L. R. Rabiner and R. W. Shafer, "Digital processing of speech signal", 1982.
[6] Lawrence Rabiner and Biing-Hwang Juang, "Fundamentals of speech signals", 1993.
[7] C. Cheng, "Wavelet signal processing of digital audio with application in electro-acoustic music", MSc thesis, Rice University, Department of Electrical Engineering, May 1996.
[8] G. R. Doddington, "Personal identity verification using voice", processing of electr.76, pp. 22-26, May 1976.
[9] R. J. Mammare, X. Zhang, and Ramachadram, "Robust speaker recognition", IEEE Signal Processing Magazine, pp. 58-71, September 1996.
[10] D. E. Newland, "An introduction to random vibration spectral and wavelet transform", third edition, 1994.
[11] A. Chinienti, I. Cibrario, and R. Picco, "Filter evaluation of wavelet transform", DSP 1997, vol. 56, pp. 633-636.
[12] Dusan Levicky, Emil Matus, and Peter Kral, "Wavelet transform analysis-synthesis-algorithms", Electrical Engineering, vol. 47, pp. 281-286, 1996.
[13] Mohammad N. Hussain, "Speaker Recognition Based Upon Phonemes Using Wavelet Packet Transform", MSc thesis, University of Baghdad, Department of Electrical Engineering, October 2000.
[14] V. Herrero, J. Cerda, R. Gadea, M. Martinez, and A. Sebestia, "Implementation of 1-D Daubechies Wavelet transform on FPGA", Group of Design of Digital Systems, Department of Electrical Engineering, June 2003.
[15] Piyush Jamkhandi, Amar Mukherjee, Kunal Mukherjee, and Robert Franceschini, "Parallel H/W-S/W architecture for computation of discrete wavelet transform using RMF algorithm", School of Electrical Engineering and Computer Science, July 2003.
[16] Sarin George Mathen, "Wavelet Transform based adaptive image compression on FPGA", thesis, University of Kansas, June 2000.
[17] www.Xilinix.com, "Spartan-IIE 1.8 V FPGA family: introduction and ordering information", July 2003.
[18] Pak K. Chan, "Digital Design Using Field Programmable Gate Array", University of California, October 2000.
[19] John E. Gilbert, "Wavelets: theory and applications on FPGA", September 2003.
[20] Miguel Figueroa and Chris Diorio, "A 200 MHz, 3 mW, 16-Tap Mixed-signal FIR Filter", University of Washington, September 2003.