

Dr. Walid A. Mahmoud/Al-khwarizmi Engineering Journal ,vol.1, no. 2,PP 46-51  (2005)   

                  
 

FPGA Realization of Two-Dimensional Wavelet and Wavelet Packet Transform

Dr. Walid A. Mahmoud
Electrical Engineering Dept.
College of Engineering
University of Baghdad

Dr. Mohammed N. Al-Turfi
Computer and Software Engineering Dept.
College of Engineering
University of Al-Mustansirya

Abstract: -

The Field Programmable Gate Array (FPGA) is among the most recent platforms adopted for implementing most Digital Signal Processing (DSP) applications. It has proved capable of handling such problems and meets the usual requirements of scalability, speed, size, cost, and efficiency.

In this paper a new circuit design is proposed and implemented on an FPGA for evaluating the coefficients of the two-dimensional Wavelet Transform (WT) and Wavelet Packet Transform (WPT).

In this implementation the WT and WPT coefficients are evaluated by filter tree decomposition using the 2-D discrete convolution algorithm. The design was realized on an FPGA kit, with the logic circuits built for the Spartan-IIE device library using the Xilinx Foundation Series 2.1I software.

Key words: -
FPGA, Wavelet Transform, Wavelet Packet Transform.

1: -Introduction: -
In the last few years, wavelet analysis has taken its place in the signal analysis field and has proved its strength and suitability for a wide spectrum of tasks, especially in speech and image processing.

The Wavelet Transform (WT) and the Wavelet Packet Transform (WPT) are among the most powerful tools in digital signal processing. They act as analysis tools that decompose the input signal in order to evaluate the information it carries. Because they provide a representation of time against frequency against magnitude, the frequency components can be evaluated at any specific time with high accuracy. For this reason the wavelet coefficients must be evaluated as accurately, as quickly, and as simply as possible, which is itself a problem [1,2].

FPGA technology offers a solution to this problem, because its high performance capabilities meet the requirements needed to overcome these difficulties.

Over the last ten years, the companies that produce electronic devices have been racing each other, and time, to take the lead in FPGA production, because of the new world demand for this type of technology.

Recently, Field Programmable Gate Arrays have enjoyed widespread use due to several advantages: relatively high gate density, short design cycle, and low cost. They can be used in all applications that currently use Small-Scale Integrated (SSI) circuits, Medium-Scale Integrated (MSI) circuits, Large-Scale Integrated (LSI) circuits, and Programmable Logic Devices (PLDs).

They also replace Mask-Programmable Gate Arrays (MPGAs) in many applications that are limited to 10000 gates and do not need a very high operating speed.

Because the wavelet transform produces a small number of effective coefficients, these coefficients are well suited to implementation on an FPGA. And since the WT minimizes the amount of processing needed to obtain the coefficients, it reduces the processing time, which in turn reduces complexity, cost, size, and power consumption [2,3,4].

In this paper a new circuit design is presented for calculating the two-dimensional wavelet transform coefficients. It is based on the multi-resolution decomposition algorithm, using the convolution approach to represent the Finite Impulse Response (FIR) filters that make up the wavelet transform decomposition tree.

This approach is used for three reasons. First, it is easy to represent and implement. Second, it gives perfect reconstruction of the signal after decomposition when the inverse two-dimensional wavelet transform is applied. Third, it is well suited to communication systems, because the filters inside the decomposition or reconstruction tree may be used to reduce noise or perform other processing while the coefficients of the WT and WPT are evaluated [5,6].

The circuit operates at high speed (the operating frequency is 1 GHz), with short processing time and high accuracy (16, 32, or 64 bits for the data plus one sign bit). It can receive data serially or in parallel and produce data serially or in parallel, which makes it easy to use for general purposes.

An example and the experimental results obtained from the processor are given in this paper, together with circuit block diagrams that show the data flow and results in the circuit.

 
[Figure 1: axes are Scale (frequency) versus Time]

Fig. (1): - The time-scale representation in the wavelet plane

2: -Two Dimensional Wavelet & Wavelet Packet Transform: -

In practice, the wavelet transform is of interest for the analysis of non-stationary signals because it provides an alternative to the classical Short Time Fourier Transform (STFT).

For some applications it is desirable to see the wavelet transform as a signal decomposition onto a set of basis functions. In fact, basis functions called wavelets always underlie the wavelet analysis. They are obtained from a single prototype wavelet, called the mother function, by dilation and contraction (scaling) as well as shifting. The prototype wavelet can be thought of as a Band Pass Filter (BPF), and the constant Quality Factor (constant-Q) property of the other BPFs (wavelets) follows because they are scaled versions of the prototype. So the notion of scale introduced in the WT is an alternative to frequency, leading to the so-called time-scale representation, in which the signal is mapped onto the time-scale plane [5,6,7].

 
The Fourier transform, and hence the STFT, uses sine and cosine as basis functions for analysis, since they are orthogonal (there is no correlation between them). The same must apply in the WT, which means that its basis functions must also be orthogonal. One more condition, orthonormality, must hold in order to get perfect reconstruction [8,9].

Since wavelets are localized functions (they are zero outside a finite interval) with zero mean, many different basis vectors can be formed by dilation and translation of the mother function. An ideal basis must therefore satisfy compact support, vanishing moments, smoothness, and orthogonality [10,11].

For the orthogonality property, the condition

$$\int \Phi(t)\,\Phi(t-m)\,dt = 0 \qquad (1)$$

must hold, where $\Phi(t)$ is the basis function and $m$ is the amount of shift, one interval at a time, with $m \neq 0$.

For the orthonormality property, the condition

$$\int \Phi(t)\,dt = 1 \qquad (2)$$

must hold.
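Conditions (1) and (2) can be checked numerically for a concrete basis function. The sketch below assumes the Haar scaling function (1 on [0, 1), 0 elsewhere) purely as an illustration; the paper does not name a specific function, and the helper names `phi`, `integrate`, and `shifted_product` are hypothetical.

```python
# Sketch only: the Haar scaling function is an assumed example,
# not a function specified in the paper.

def phi(t):
    """Haar scaling function: 1 on [0, 1), 0 elsewhere."""
    return 1.0 if 0.0 <= t < 1.0 else 0.0


def integrate(f, lo=-4.0, hi=4.0, steps=8000):
    """Midpoint Riemann sum of f over [lo, hi]."""
    dt = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * dt) for i in range(steps)) * dt


def shifted_product(m):
    """Left-hand side of Eq. (1): integral of phi(t) * phi(t - m)."""
    return integrate(lambda t: phi(t) * phi(t - m))


# Eq. (1): integer shifts m != 0 give zero overlap.
eq1_values = [shifted_product(m) for m in (-2, -1, 1, 2)]
# Eq. (2): the function integrates to one.
eq2_value = integrate(phi)

print(eq1_values, eq2_value)
```

For this particular function the shifted copies never overlap, so every product in Eq. (1) vanishes identically rather than by cancellation.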

Orthonormal bases give good localization in time and frequency, and they are related to special filters used for sub-band coding. These filters lead to exact waveform reconstruction without aliasing and without amplitude or phase distortion [10,11].

 
Vanishing moments and smoothness are very important, especially for two-dimensional applications. Since the amount of information is larger than in one-dimensional applications, the interference between pieces of information is also larger, so high localization in frequency is needed, which leads to making the higher derivatives equal to zero.

Fig. (2) shows the sub-band distribution of the frequency components of the 2-D signal in the wavelet plane.

 
[Figure 2: coding side, input x(n) passed through cascaded filter pairs h(n) and g(n), each followed by down-sampling by 2 (↓2); decoding side, up-sampling by 2 (↑2) followed by filter pairs ĥ(n) and ĝ(n) and adders, giving the output x(n)]

Fig. (2): - Sub-band coding scheme in terms of MRA

 
In practice, the filter approach is the appropriate way of analyzing a signal into its frequency components. It consists of decomposing the signal into high-frequency components and low-frequency components using the convolution approach [12,13].

At each level the input information is separated into approximation information

$$A_{j-1}(n) = \sum_{k} f_j(k)\, h(k-2n) \qquad (3)$$

and detail information

$$D_{j-1}(n) = \sum_{k} f_j(k)\, g(k-2n) \qquad (4)$$

where $j$ and $j-1$ denote the decomposition level and the level that follows it, respectively, and $h(n)$ and $g(n)$ are the impulse responses of the Low Pass Filter (LPF) and the High Pass Filter (HPF), respectively. $A$ is the approximation information and $D$ is the detail information [1].
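Equations (3) and (4) amount to a convolution that keeps only every second output. The sketch below is a minimal illustration, assuming two-tap Haar averaging filters (`h = [0.5, 0.5]` for the approximation branch and `g = [0.5, -0.5]` for the detail branch); these tap values and the function name `analyze` are assumptions, since the paper does not list its filter coefficients.

```python
# Sketch only: filter taps are assumed (Haar averaging), not taken from the paper.

def analyze(f, h, g):
    """One decomposition level.

    A(n) = sum_k f(k) * h(k - 2n)   (Eq. 3)
    D(n) = sum_k f(k) * g(k - 2n)   (Eq. 4)
    Taps outside the filter length are treated as zero.
    """
    approx, detail = [], []
    for n in range(len(f) // 2):
        a = d = 0.0
        for k in range(len(f)):
            tap = k - 2 * n
            if 0 <= tap < len(h):
                a += f[k] * h[tap]
                d += f[k] * g[tap]
        approx.append(a)
        detail.append(d)
    return approx, detail


h = [0.5, 0.5]    # assumed filter for the approximation branch
g = [0.5, -0.5]   # assumed filter for the detail branch
A, D = analyze([1, 1, 0, 1], h, g)
print(A, D)  # [1.0, 0.5] [0.0, -0.5]
```

The index `k - 2n` is what performs the down-sampling by 2: each output sample advances the filter window by two input samples.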

Reconstruction is done by inverting the above process: -

1) Up-sampling, to undo the down-sampling.

2) Using ĥ(n) and ĝ(n), which are the inverses of h(n) and g(n) respectively.

So the reconstruction follows the formula

$$f_j(n) = 2\left\{\sum_{k} A_{j-1}(k)\,\hat{h}(n-2k) + \sum_{k} D_{j-1}(k)\,\hat{g}(n-2k)\right\} \qquad (5)$$
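The reconstruction step of Eq. (5) can be sketched in the same style: each approximation and detail sample is placed at an even position (up-sampling) and spread by the synthesis filter, and the sum is scaled by 2. The synthesis taps below (`h_hat = [0.5, 0.5]`, `g_hat = [0.5, -0.5]`) are assumed Haar averaging filters, and the input sequences are hypothetical values chosen so that the result is easy to check by hand.

```python
# Sketch only: synthesis taps and input values are assumptions for illustration.

def synthesize(approx, detail, h_hat, g_hat):
    """f(n) = 2 * (sum_k A(k) h_hat(n - 2k) + sum_k D(k) g_hat(n - 2k))."""
    out = []
    for n in range(2 * len(approx)):
        acc = 0.0
        for k in range(len(approx)):
            tap = n - 2 * k          # index n - 2k performs the up-sampling
            if 0 <= tap < len(h_hat):
                acc += approx[k] * h_hat[tap] + detail[k] * g_hat[tap]
        out.append(2 * acc)
    return out


h_hat = [0.5, 0.5]
g_hat = [0.5, -0.5]
signal = synthesize([1.0, 0.5], [0.0, -0.5], h_hat, g_hat)
print(signal)  # [1.0, 1.0, 0.0, 1.0]
```

With these assumed filters, the approximation `[1.0, 0.5]` and detail `[0.0, -0.5]` are the pairwise means and half-differences of `[1, 1, 0, 1]`, so the sketch returns that sequence exactly.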

This is called the filter tree decomposition algorithm, and it implements the sub-band coding scheme. Since it uses the convolution approach with constant filter taps, the filters have a constant Q factor, so the relative distribution of the frequency bands is as shown in Fig. (3).

 
If we assume Finite Impulse Response (FIR) filters, it turns out that the HPF and LPF are related by the formula

$$h(L-1-n) = (-1)^{n}\, g(n) \qquad (6)$$

where $L$ is the filter length and the factor $(-1)^{n}$ is the modulation that transforms an LPF into an HPF [13].
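Solving Eq. (6) for g(n) gives g(n) = (-1)^n h(L-1-n), that is, the second filter is the first one reversed with alternating signs. The sketch below applies this to two tap sets; `[0.5, 0.5]` is an assumed Haar averaging filter and `[1, 2, 3, 4]` is an arbitrary, hypothetical 4-tap example used only to show the pattern.

```python
# Sketch only: tap values are illustrative assumptions, not the paper's filters.

def alternating_flip(h):
    """Return g with g(n) = (-1)**n * h(L - 1 - n), i.e. Eq. (6) solved for g."""
    L = len(h)
    return [(-1) ** n * h[L - 1 - n] for n in range(L)]


g2 = alternating_flip([0.5, 0.5])
g4 = alternating_flip([1, 2, 3, 4])
inner = sum(a * b for a, b in zip([1, 2, 3, 4], g4))

print(g2)     # [0.5, -0.5]
print(g4)     # [4, -3, 2, -1]
print(inner)  # 0
```

For an even filter length the two filters produced this way always have a zero inner product, which is the property the sub-band scheme relies on to separate the two branches.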

According to the filter tree decomposition algorithm, the classical wavelet transform is obtained by decomposition (low-pass filtering and down-sampling). This decomposition is applied only to the low-frequency branch, since it is the more intelligible part and carries most of the information. This low-frequency part, with its narrow bandwidth, has the higher information rate [4,8]. The analysis requires computation of order O(N), "or it is linear in complexity with the decomposition level".

 
[Figure 3: the frequency plane divided into LL, LH, HL, and HH sub-bands, with the LL quadrant divided again into LL, LH, HL, and HH]

Fig (3): - The relative frequency bands distribution

Complete signal analysis is done by decomposing both the high- and the low-frequency branches. This makes the Mallat decomposition tree a Strictly Binary Tree (SBT), in which each branch is decomposed into two secondary branches.

In this type of decomposition the analysis has complexity of order O(N log2 N), and it results in completely evenly spaced frequency resolution and a tree-type logarithmic splitting and tiling of the time-scale plane [6].

The decomposition of the high-frequency components, which occupy a wide bandwidth, is done with the same strategy (high-pass filtering and down-sampling), and it guarantees a global view of the signal. It is used to locate the positions where the signal has, or may have, data. This kind of analysis is called the complete wavelet transform or wavelet packet transform (WPT) [10,13].
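The difference between the two trees can be sketched in one dimension. The classical WT splits only the low branch at each level, while the WPT splits every branch, giving a strictly binary tree. The `split` step below assumes Haar-style pairwise means and half-differences, and the input sequence is a hypothetical 8-sample signal; neither is taken from the paper.

```python
# Sketch only: the Haar-style split and the sample signal are assumptions.

def split(x):
    """Split x into a low band (pairwise mean) and a high band (half-difference)."""
    low = [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) / 2 for i in range(0, len(x), 2)]
    return low, high


def wavelet_tree(x, levels):
    """Classical WT: only the low-frequency branch is split again."""
    bands = []
    for _ in range(levels):
        x, high = split(x)
        bands.append(high)
    bands.append(x)          # final approximation
    return bands


def packet_tree(x, levels):
    """WPT: every branch is split at every level."""
    nodes = [x]
    for _ in range(levels):
        nodes = [band for node in nodes for band in split(node)]
    return nodes


x = [1, 1, 0, 1, 2, 1, 1, 0]
wt_bands = wavelet_tree(x, 2)
wpt_bands = packet_tree(x, 2)

print([len(b) for b in wt_bands])   # [4, 2, 2]
print([len(b) for b in wpt_bands])  # [2, 2, 2, 2]
```

The WT output has bands of unequal length (coarser at low frequency), whereas the WPT leaves are all the same length, which corresponds to the evenly spaced frequency resolution described above.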

Fig (4) shows the tree decomposition of the signal using the WPT approach, while Fig (5) shows the frequency distribution of the signal representation in the previous figure.

                                                                    
[Figure 4: tree with root X, intermediate nodes XL and XH, and leaf nodes XLL, XLH, XHL, and XHH]

Fig. (4): - The tree representation of the WPT

[Figure 5: axes are Frequency versus Scale]

Fig. (5): - The freq. representation of the WPT

3: -FPGA: - 
The reason for choosing the FPGA to implement nearly all modern digital systems is its suitability for computationally expensive problems and its ability to meet their intensive need for parallel processing or pipelining [14].

The advantages of FPGAs include [15,16,17]: -

1) Replacement of small-scale integrated (SSI) and medium-scale integrated (MSI) chips.

2) Availability of parts off the shelf.

3) Rapid turnaround.

4) Low risk.

5) Some FPGAs can be reprogrammed.

6) Relatively low cost and flexible design.

Their relatively long design cycle is a disadvantage, however, because the design process generally requires nine steps, as follows [15,16,17]: -

1) Entering the design in the form of a schematic, netlist, logic expressions, or HDL (Hardware Description Language).

2) Simulating the design for functional verification.

3) Mapping the design onto the FPGA architecture.

4) Placing and routing the FPGA design.

5) Extracting the delay parameters of the routed design.

6) Re-simulating for timing verification.

7) Generating the FPGA device configuration format.

8) Configuring or programming the device.

9) Testing the product for undesired functional behavior.

This long design cycle poses a challenging problem for the production companies and forces them to find suitable solutions.

Solutions are divided into four categories. The first uses the schematic approach to design, which is used when the design elements and circuit branches are well defined and their functions are specified.

The second uses the Very High Speed Integrated Circuit Hardware Description Language (VHDL), and is used with looping or iterative processes.

The third uses the Finite State Machine (FSM), which is used for control circuits that have small numbers of inputs and outputs.

Any of the above categories can be combined to handle new jobs. For example, a control circuit can be implemented as an FSM to control the process flow of a function implemented using the schematic approach.

The fourth category uses a high-level language called SYSTEM C4 or SYSTEM C5 [18].

This branch uses a Wild Force PCI plug-in board with 5-15 Processing Elements (PEs), a 1 MByte embedded memory attached to each PE, and other interconnects such as FIFOs, crossbars, and Single Inline Modules Data (SIMD).

The board is installed with its installation software after being placed in the motherboard of a personal computer. Kits of this kind operate at clock frequencies of 1, 3, 5, 10, 20, 33, or 66 MHz, with a large number of inputs and outputs and a wide range of function control and implementation [18].

4: -FPGA Simulation of Two-Dimensional Wavelet and Wavelet Packet Transform: -

This paper presents a new FPGA digital circuit design for evaluating the coefficients of the two-dimensional wavelet transform. The simulation process passes through four stages, as follows [19,20].

Problem formulation and function establishment form the first stage. In this stage the general features of the problem are identified in order to specify what the solution needs. Function establishment, on the other hand, is important for specifying the equipment and devices needed to implement the function, and it is where the limitations and boundaries are defined.

The second stage consists of overcoming the limitations and difficulties in a reasonable way while keeping an eye on the overall cost. The implementation of the function must be optimized as far as possible in order to specify the FPGA kit type, size, capabilities, frequency ranges, number of inputs and outputs, power consumption, scalability, and compatibility.

Operating the optimized design on the kit is the third stage. Two important problems generally appear in this stage. The first, and the most important, is the timing problem. The second is the production of undesired results and values during operation, and how to get rid of them; most of these values are themselves caused by the timing problem.

The fourth stage consists of connecting the designed kit to its operating environment and examining its compatibility and the best ways of operating alongside other system equipment. It is therefore preferable to choose a kit type as close as possible to that of the other system equipment in order to reduce compatibility problems, or, if it is possible and not costly, to change the whole system, since FPGA kits are relatively low in cost.

For the first stage, the problem is the evaluation of the coefficients of the two-dimensional Wavelet and Wavelet Packet Transform by filter tree decomposition using the convolution approach.

For the second stage, the FPGA kit chosen for the design is the SPARTAN-IIE. The SPARTAN-IIE 1.8 V FPGA family gives high performance, abundant logic resources, and a rich feature set, all at an exceptionally low price. The family has seven members, offering densities from 50000 to 600000 system gates and a wide operating frequency range (500 kHz - 2.5 GHz). It delivers more I/Os and other features per dollar than other FPGAs by combining advanced process technology with a streamlined architecture based on the proven Virtex-E. Features include Block RAM (288 Kbit), Distributed RAM (221 Kbit), 19 selectable I/O standards, and 4 DLLs (Delay Locked Loops). Fast, predictable interconnect means that successive design iterations continue to meet timing requirements.

The third stage depends on the type of application, its operating frequency, its operating speed, and its response time.

For example, if we are dealing with a speech signal, the highest frequency component does not exceed 6 kHz and the speech rate is relatively slow, while in target identification the image verification must be real-time and very accurate, with a very fast response to changes of direction.

If the system does not need high speed and its circuit architecture has symmetry, the symmetric circuits can be combined into a few shared ones, and the operations are then performed in sequence. Otherwise, redundant circuits are built to perform parallel processing or pipelining in order to achieve the necessary operating speed.

The fourth stage consists of implementing the whole system, from its receiving point to its transmitting point, on the same kit, since the kit used is more than adequate for this task in terms of size, speed, accuracy, and cost. Building the whole system on the same kit reduces compatibility problems to a minimum.

This design highlights an important point: the use of multi-purpose units to perform several operations. This is represented by the Full Adder/Subtractor circuit followed by a shift register. It reduces the number of devices used, which reduces power consumption and hence heat generation, heat sink size, and die area, but it requires a control circuit to synchronize the circuit operations.
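The idea of one shared unit that either adds or subtracts under a control signal, followed by a shift, can be modeled in a few lines. This is a behavioral sketch only: it assumes 16-bit two's-complement words and the common "invert and add one" subtraction scheme, which is not necessarily the number format or gate-level structure used in the paper's circuit. All names here are hypothetical.

```python
# Behavioral sketch only: two's-complement arithmetic is an assumption,
# not the paper's documented number format.

WIDTH = 16
MASK = (1 << WIDTH) - 1


def to_signed(word):
    """Interpret a 16-bit word as a signed two's-complement integer."""
    return word - (1 << WIDTH) if word & (1 << (WIDTH - 1)) else word


def add_sub(a, b, subtract):
    """Shared unit: a + b when subtract == 0, a - b when subtract == 1."""
    b_in = (b ^ MASK) if subtract else b     # invert b for subtraction
    return (a + b_in + subtract) & MASK      # carry-in of 1 completes the negation


def shift_right(word, bits):
    """Arithmetic right shift that keeps the sign of the word."""
    return (to_signed(word) >> bits) & MASK


total = to_signed(add_sub(1, 2, 0))
difference = to_signed(add_sub(1, 2, 1))
quarter = to_signed(shift_right(12, 2))

print(total, difference, quarter)  # 3 -1 3
```

Because the same hardware path serves both operations, the only extra cost is the control bit, which is why a control circuit is needed to sequence the operations.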

 

5: -Demonstrated Example: - 

An example is simulated on a Spartan-IIE FPGA platform using the schematic approach in the Xilinx Foundation 2.1I software. This software provides three ways of entering a circuit design: by schematic, by Very High Speed Integrated Circuit Hardware Description Language (VHDL), or by Finite State Machine (FSM).

The software provides extensive capabilities for specifying the finest details of the design, from the inputs, through the latching and interfacing parts, to the processing and evaluation, and on to the delivery of the results to the output circuit. It also covers the other necessities: clock evaluation (1 GHz in the designed circuit), synchronization problems, and design problems (essential hazards, races, and oscillation). Most importantly, the software evaluates a problem and may suggest a solution for it.

An example of a two-dimensional signal in the form of a 4*4 matrix is demonstrated in full, with its logic circuit representation and the corresponding results.

In this example the full tree decomposition of the two-dimensional wavelet and wavelet packet transform is evaluated, taking into consideration the arrangement of the input and output data. The data is entered vector by vector. The output data is obtained as four 2*2 matrices, vector by vector, arranged as the HH, HL, LH, and LL matrices. Each result is sixteen bits wide.

Between any two cascaded stages there are latches, first to keep synchronization while transferring the data from one stage to the next, and second to hold the values until the circuit reaches a stable state in case values change due to negative signs, races, essential hazards, or oscillation.

Each stage operates asynchronously (as combinational logic, with no delay or storage devices) in order to speed up the process, but the delivery from one stage to the next must be done under full circuit control and synchronization.

The input matrix is: -

1   1   0   1
2   1   1   0
0   2   1   1
3   0   1   2      (4*4)

The output matrices are: -

HH =  -1  -2
      -5   1       (2*2)

HL =   1   0
       1  -1       (2*2)

LH =  -1   0
      -1  -1       (2*2)

LL =   5   2
       5   5       (2*2)
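The sub-band values of this example can be checked with a short sum-and-difference sketch. It assumes that the circuit computes unnormalized Haar-style sums and differences over each 2*2 block (which is what adder/subtractor units would naturally produce). The mapping of each sign pattern to the names LL, LH, HL, and HH is an assumed convention, and `haar2d_sums` is a hypothetical helper name.

```python
# Sketch only: the unnormalized Haar interpretation and the sub-band naming
# are assumptions used to check the example, not the paper's stated method.

def haar2d_sums(x):
    """Split a matrix into four half-size sub-bands using sums and differences."""
    rows, cols = len(x) // 2, len(x[0]) // 2
    ll = [[0] * cols for _ in range(rows)]
    lh = [[0] * cols for _ in range(rows)]
    hl = [[0] * cols for _ in range(rows)]
    hh = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            a, b = x[2 * i][2 * j], x[2 * i][2 * j + 1]
            c, d = x[2 * i + 1][2 * j], x[2 * i + 1][2 * j + 1]
            ll[i][j] = a + b + c + d   # sum of the block
            hl[i][j] = a - b + c - d   # difference across columns
            lh[i][j] = a + b - c - d   # difference across rows
            hh[i][j] = a - b - c + d   # diagonal difference
    return ll, lh, hl, hh


x = [
    [1, 1, 0, 1],
    [2, 1, 1, 0],
    [0, 2, 1, 1],
    [3, 0, 1, 2],
]
LL, LH, HL, HH = haar2d_sums(x)
print("LL", LL)  # [[5, 2], [5, 5]]
print("LH", LH)  # [[-1, 0], [-1, -1]]
print("HL", HL)  # [[1, 0], [1, -1]]
print("HH", HH)  # [[-1, -2], [-5, 1]]
```

Only additions and subtractions are needed, so every output is an integer and no multiplier is required.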


[Figure 6: block diagram with two banks of eight 16-bit Full Adder/Subtractor units, a 5-bit synchronous counter (labels 12H and 15H, Controller = 1), and output registers R11, R12, R21, and R22 for each of the HH, HL, LH, and LL sub-bands. The input data arrives from the external environment row by row, from the top "element 1,1" to the bottom "element 4,4".]

Fig (6): - The Decomposition circuit of 2D wavelet Packet Transform

 
[Figure 7: block diagram with two banks of eight 16-bit Full Adder/Subtractor units, a 5-bit synchronous counter (labels 12H and 15H, Controller = 1), and input registers R11, R12, R21, and R22 for each of the HH, HL, LH, and LL sub-bands. The output data from the Reconstruction circuit represents the data that was input to the Decomposition circuit, from the top "element 1,1" to the bottom "element 4,4".]

Fig (7): - The Reconstruction circuit of 2D wavelet Packet Transform

Tables (1) and (2) show the decomposition and reconstruction results, respectively, for the 4*4 signal given above.

Table (1): - The design result evaluation of the two dimensional 4*4 points decomposition circuit

Data input              First stage              Second stage             Data output
0000000000000001 (1)    0000000000000011 (3)     0000000000000110 (6)     0000000000000110
0000000000000010 (2)    1000000000000001 (-1)    0000000000000100 (4)     0000000000000100
0000000000000000 (0)    0000000000000011 (3)     0000000000000101 (5)     0000000000000101
0000000000000001 (1)    0000000000000001 (1)     0000000000000101 (5)     0000000000000101
0000000000000010 (2)    0000000000000000 (0)     0000000000000000 (0)     0000000000000000
0000000000000001 (1)    0000000000000000 (0)     1000000000000100 (-4)    1000000000000100 / 1111111111111011
0000000000000000 (0)    0000000000000100 (4)     1000000000000011 (-3)    1000000000000011 / 1111111111111100
0000000000000011 (3)    1000000000000010 (-2)    1000000000000011 (-3)    1000000000000011 / 1111111111111100
0000000000000000 (0)    0000000000000001 (1)     0000000000000000 (0)     0000000000000000
0000000000000011 (3)    1000000000000001 (-1)    1000000000000010 (-2)    1000000000000010 / 1111111111111101
0000000000000001 (1)    0000000000000100 (4)     0000000000000001 (1)     0000000000000001
0000000000000010 (2)    0000000000000010 (2)     0000000000000001 (1)     0000000000000001
0000000000000001 (1)    0000000000000001 (1)     1000000000000010 (-2)    1000000000000010 / 1111111111111101
0000000000000001 (1)    0000000000000001 (1)     0000000000000010 (2)     0000000000000010
0000000000000000 (0)    0000000000000100 (4)     1000000000000011 (-3)    1000000000000011 / 1111111111111100
0000000000000010 (2)    0000000000000000 (0)     0000000000000001 (1)     0000000000000001
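Negative results in Table (1) are listed with two 16-bit patterns, for example 1000000000000100 and 1111111111111011 for -4. These match, respectively, a sign-and-magnitude encoding (one sign bit plus a 15-bit magnitude) and a one's-complement encoding of the same value. The sketch below reproduces both patterns; reading the table this way is an interpretation, and the function names are hypothetical.

```python
# Sketch only: interpreting the two patterns as sign-magnitude and
# one's complement is an assumption based on the bit patterns themselves.

def sign_magnitude(value, width=16):
    """One sign bit followed by the magnitude in width - 1 bits."""
    sign = "1" if value < 0 else "0"
    return sign + format(abs(value), "0{}b".format(width - 1))


def ones_complement(value, width=16):
    """Non-negative values unchanged; negative values have every bit inverted."""
    if value >= 0:
        return format(value, "0{}b".format(width))
    return format(((1 << width) - 1) ^ abs(value), "0{}b".format(width))


for v in (-4, -3, -2, 6):
    print(v, sign_magnitude(v), ones_complement(v))
```

For non-negative values the two encodings coincide, which is consistent with the table listing a single pattern for those rows.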

 

Table (2): - The design result evaluation of the two dimensional 4*4 points reconstruction circuit

Data input                             First stage              Second stage             Data output
0000000000000110                       0000000000000110 (6)     0000000000000100 (4)     0000000000000001 (1)
0000000000000100                       0000000000000110 (6)     0000000000001000 (8)     0000000000000010 (2)
0000000000000101                       0000000000000000 (0)     0000000000000000 (0)     0000000000000000 (0)
0000000000000101                       0000000000001000 (8)     0000000000000100 (4)     0000000000000001 (1)
0000000000000000                       0000000000000010 (2)     0000000000001000 (8)     0000000000000010 (2)
1000000000000100 / 1111111111111011    0000000000001000 (8)     0000000000000100 (4)     0000000000000001 (1)
1000000000000011 / 1111111111111100    0000000000000010 (2)     0000000000000000 (0)     0000000000000000 (0)
1000000000000011 / 1111111111111100    0000000000001000 (8)     0000000000001100 (12)    0000000000000011 (3)
0000000000000000                       1000000000000010 (-2)    0000000000000000 (0)     0000000000000000 (0)
1000000000000010 / 1111111111111101    0000000000000010 (2)     0000000000001100 (12)    0000000000000011 (3)
0000000000000001                       0000000000000000 (0)     0000000000000100 (4)     0000000000000001 (1)
0000000000000001                       1000000000000100 (-4)    0000000000001000 (8)     0000000000000010 (2)
1000000000000010 / 1111111111111101    1000000000000010 (-2)    0000000000000100 (4)     0000000000000001 (1)
0000000000000010                       0000000000000100 (4)     0000000000000100 (4)     0000000000000001 (1)
1000000000000011 / 1111111111111100    0000000000000010 (2)     0000000000000000 (0)     0000000000000000 (0)
0000000000000001                       0000000000000000 (0)     0000000000001000 (8)     0000000000000010 (2)

 
The output results here are multiplied by a factor of 4, because the decomposition is carried out for two levels only, in the high- and the low-frequency branches. So the data must be divided by 4 here, or equivalently shifted right by 2 bits.
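The factor of 4 and its removal by a 2-bit shift can be illustrated with a sum-and-difference inverse. The sketch assumes the sub-bands were produced by unnormalized sums and differences over 2*2 blocks; the sub-band naming, the sample values, and the function name `inverse_sums` are assumptions for illustration only.

```python
# Sketch only: assumes LL = a+b+c+d, LH = a+b-c-d, HL = a-b+c-d, HH = a-b-c+d
# for each 2*2 block [[a, b], [c, d]]; this convention is an assumption.

def inverse_sums(ll, lh, hl, hh):
    """Rebuild the matrix with adders only; every sample comes out 4x too large."""
    rows, cols = len(ll), len(ll[0])
    out = [[0] * (2 * cols) for _ in range(2 * rows)]
    for i in range(rows):
        for j in range(cols):
            s, v, h, d = ll[i][j], lh[i][j], hl[i][j], hh[i][j]
            out[2 * i][2 * j] = s + v + h + d          # 4a
            out[2 * i][2 * j + 1] = s + v - h - d      # 4b
            out[2 * i + 1][2 * j] = s - v + h - d      # 4c
            out[2 * i + 1][2 * j + 1] = s - v - h + d  # 4d
    return out


scaled = inverse_sums(
    [[5, 2], [5, 5]],      # LL
    [[-1, 0], [-1, -1]],   # LH
    [[1, 0], [1, -1]],     # HL
    [[-1, -2], [-5, 1]],   # HH
)
restored = [[value >> 2 for value in row] for row in scaled]  # divide by 4

print(scaled[0])    # [4, 4, 0, 4]
print(restored)     # [[1, 1, 0, 1], [2, 1, 1, 0], [0, 2, 1, 1], [3, 0, 1, 2]]
```

Each of the two passes (forward and inverse) contributes a factor of 2 per dimension when no normalization is applied, and since every scaled value is an exact multiple of 4 the shift loses no information.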

 

6: -Conclusions: - 

The aim of this paper is to propose a new circuit design for implementing and evaluating the two-dimensional wavelet and wavelet packet transform.

In this design the DSP point of view was adopted, whereas in nearly all other designs the circuits were implemented from the communication point of view. This type of design needs no wide-range filter tap design, which reduces the cost.

Because of the wide capabilities of the kit, high speed, high accuracy, low cost, low power consumption, and ease of handling were achieved.

A further advantage is that the design overcomes a disadvantage of the conventional wavelet and wavelet packet transform coder-decoder, which cannot produce the coefficients until the signal is complete. Here the coefficients are produced while the data is being received, with no need to wait.

For example, if the signal is 8*8, the evaluation of the WT and WPT coefficients starts as soon as a 4*4 part of the signal has been received, which fits the designed circuit.

 
7: -References: -  

[1] Jan E. Odegard & Ivan W. Selesnick, 1993, "Introduction to wavelet and wavelet transform".

[2] F. Argentini, G. Benelli, reprinted from Electronics Letters, vol. 28, pp. 513-515, 27th February 1992, "IIR implementation of wavelet decomposition for digital filter analysis".

[3] A. Aron & E. Rosenberg, IEEE Transactions, vol. 64, pp. 475-487, April 1986, "Speaker verification using pattern recognition approach".

[4] N. Rex Dixon & Thomas B. Martin, 1989, a volume in the IEEE Press selected reprint series, "Automatic speech and speaker recognition".

[5] L. R. Rabiner & R. W. Shafer, 1982, "Digital processing of speech signal".

[6] Lawrence Rabiner & Biing-Hwang Juang, 1993, "Fundamentals of speech signals".

[7] C. Cheng, MSc thesis, Rise University, Department of Electrical Engineering, May 1996, "Wavelet signal processing of digital audio with application in electro-acoustic music".

[8] G. R. Doddington, processing of electr. 76, pp. 22-26, May 1976, "Personal identity verification using voice".

[9] R. J. Mammare, X. Zhang, & Ramachadram, IEEE Signal Processing Magazine, pp. 58-71, September 1996, "Robust speaker recognition".

[10] D. E. Newland, third edition, 1994, "An introduction to random vibration spectral and wavelet transform".

[11] A. Chinienti, I. Cibrario, R. Picco, DSP 1997, vol. 56, pp. 633-636, "Filter evaluation of wavelet transform".

[12] Dusan Levicky, Emil Matus, Peter Kral, Electrical Engineering, 1996, vol. 47, pp. 281-286, "Wavelet transform analysis-synthesis-algorithms".

[13] Mohammad N. Hussain, MSc thesis, University of Baghdad, Department of Electrical Engineering, October 2000, "Speaker Recognition Based Upon Phonemes Using Wavelet Packet Transform".

[14] V. Herrero, J. Cerda, R. Gadea, M. Martinez, A. Sebestia, Group of Design of Digital Systems, Department of Electrical Engineering, June 2003, "Implementation of 1-D Daubechies Wavelet transform on FPGA".

[15] Piyush Jamkhandi, Amar Mukherjee, Kunal Mukherjee, Robert Franceschini, School of Electrical Engineering and Computer Science, July 2003, "Parallel H/W-S/W architecture for computation of discrete wavelet transform using RMF algorithm".

[16] Sarin George Mathen, thesis, University of Kansas, June 2000, "Wavelet Transform based adaptive image compression on FPGA".

[17] www.Xilinix.com, July 2003, "Spartan-IIE 1.8 V FPGA family: introduction and ordering information".

[18] Pak K. Chan, University of California, October 2000, "Digital Design Using Field Programmable Gate Array".

[19] John E. Gilbert, September 2003, "Wavelets: theory and applications on FPGA".

[20] Miguel Figueroa, Chris Diorio, University of Washington, September 2003, "A 200 MHz, 3mW, 16-Tap Mixed-signal FIR Filter".