AP04_1.vp 1 Introduction The objective of an image compression algorithm is to exploit the redundancy in an image such that a smaller number of bits can be used to represent the image while main- taining an “acceptable” visual quality for the decompressed image. The redundancy of an image resides in the correlation of the neighboring pixels. For a color image, there is also a correlation, which can be exploited, between the color com- ponents [1, 2]. Different applications require different data rates for the compressed images and different visual qualities for the decompressed images. In some applications when browsing is required or transmission bandwidth is limited, progressive transmission is used to send images in such a way that a low quality version of the image is transmitted first at a low data rate [3]. Gradually, additional information is trans- mitted to progressively refine the image. A specific coding strategy known as embedded rate scalable coding is well suited for progressive transmission [4]. In embedded coding, all the compressed data is embedded in a single bit stream. The decompression algorithm starts from the beginning of the bit stream and can terminate at any data rate. A decom- pressed image at that data rate can then be reconstructed. In embedded coding, any visual quality requirement can be ful- filled by transmitting the truncated portion of the bit stream. To achieve the best performance the bits that convey the most important information need to be embedded at the begin- ning of the compressed bit stream [4]. The remainder of this paper is organized as follows: in Section 2, the classifications of the image coder are pre- sented. Section 3 presents the discrete wavelet transform. Section 4 presents imbedded image coding in the wavelet do- main. The objective and subjective image quality measures are presented in Section 5. Section 6 provides a review of the main algorithms and recent publications in embedded scalable grayscale image compression. Embedded scalable color image compression codecs are presented in Section 7. In Section 8 the JPEG2000 still image compression standard is presented. Finally, conclusions are drawn in Section 9. 2 Image coder classifications Image coders can be classified according to the organiza- tion of the compressed bit stream into scalable image coders © Czech Technical University Publishing House http://ctn.cvut.cz/ap/ 3 Czech Technical University in Prague Acta Polytechnica Vol. 44 No. 1/2004 Wavelet-Based Embedded Rate Scalable Still Image Coders: A review Farag Ibrahim Younis Elnagahy, B. Šimák Embedded scalable image coding algorithms based on the wavelet transform have received considerable attention lately in academia and in industry in terms of both coding algorithms and standards activity. In addition to providing a very good coding performance, the embedded coder has the property that the bit stream can be truncated at any point and still decodes a reasonably good image. In this paper we present some state-of-the-art wavelet-based embedded rate scalable still image coders. In addition, the JPEG2000 still image compression standard is presented. Keywords: JPEG2000, embedded image coding, grayscale, color, scalable, progressive, compression, discrete wavelet transform. Fig. 1: Non-scalable image coder and non-scalable image coders. In non-scalable image coders, the complete image is only obtained by decoding the entire compressed bit stream. Truncating the compressed bit stream during the decoding process will produce an incompletely re- constructed image, as shown in Fig. 1. Scalable image coders are coders that allow the image data to be compressed once and then decompressed at multiple data rates or decom- pressed at different resolutions of the image. Resolution re- fers to the size of the reconstructed image. A scalable image coder that is able to decode different resolutions of the image from the compressed bit stream is called a resolution scalable coder, while a scalable image coder that has the ability to de- code a full resolution image with a certain bit rate from the compressed bit stream is called a rate (SNR) scalable coder. Both types of image scalable (resolution/SNR) coders are fur- ther classified into embedded and non-embedded image coders. In an embedded resolution scalable image coder, lower resolution of the image is obtained by just decoding the first portion of the compressed bit stream, as shown in Fig. 2. In a non-embedded resolution scalable image coder, lower reso- lution of the image is obtained by decoding certain portions of the compressed bit stream, as shown in Fig. 3. In both cases no complete decoding of the whole compressed bit stream is needed. In an embedded rate (SNR) scalable image coder, a lower quality of the image is obtained by just decoding the first portion of the compressed bit stream. By decod- ing more and more bits we can achieve a higher quality 4 © Czech Technical University Publishing House http://ctn.cvut.cz/ap/ Acta Polytechnica Vol. 44 No. 1/2004 Czech Technical University in Prague Fig. 2: Embedded resolution scalable image coder Fig. 3: Non-embedded resolution scalable image coder image (higher SNR), as shown in Fig. 4. In non-embedded rate (SNR) scalable image coder, however, a lower quality of the image is obtained by decoding certain portions of the compressed bit stream, as shown in Fig. 5. In both cases no complete decoding of the whole compressed bit stream is needed. It is important to note that if a compression tech- nique produces an embedded bit stream, this implies that the technique is scalable, because embedded bit streams can be truncated at any point during decoding. However, not all scalable compression techniques are embedded. Resolution scalability can be achieved by encoding whole wavelet sub-bands one after the other, without interleaving bits from different sub-bands. SNR scalability can be achieved by distributing encoded bits from one sub-band into the whole bit stream in an optimal way. Another very interesting property of some coders is the ability to decode only certain regions of an image; this property is called random access. A coder can support the decoding of arbitrarily shaped re- gions or of any rectangular shaped region. Region of interest encoding and decoding is also related to random access. For region of interest encoding, only an arbitrarily shaped region of the input image is encoded, and the rest of the image, which does not fall into the region of interest, is discarded. Random access can be achieved by independently encoding portions of the whole image. All image coders can support random access by tiling the image and encoding each tile independently of the rest. © Czech Technical University Publishing House http://ctn.cvut.cz/ap/ 5 Czech Technical University in Prague Acta Polytechnica Vol. 44 No. 1/2004 Fig. 4: Embedded SNR scalable image coder Fig. 5: Non-embedded SNR scalable image coder 3 Discrete wavelet transform A wavelet transform corresponds to two sets of analy- sis/synthesis digital filters, g/~g and h/~h, where h is a low pass filter (LPF) and g is a high pass filter (HPF). In two dimensions, filtering is usually applied in both horizontally and vertically. Filtering in one direction results in decompos- ing the image in two components. There is a total of four components after vertical and horizontal decompositions. We will refer to these components as image sub-bands, LL, HL, LH, and HH. By using filters g and h, an image can be decomposed into four sub-bands. This is the first level of the wavelet transform (Fig. 6). One of the four sub-bands, the LL sub-band, will contain low pass information, which is es- sentially a low resolution version of the image. Sub-band HL will contain low pass information vertically and high pass information horizontally, and sub-band LH will contain low pass information horizontally and high pass informa- tion vertically. Finally, sub-band HH will contain high pass information in both directions [5,6]. The operations can be repeated on the low-low (LL) sub-band for some number of levels, D, producing a total of 3-D+1 sub-bands whose samples represent the original image. The total number of samples (coefficients) in all sub-bands is identical to that in the original image. Thus, a typical 2-D discrete wavelet trans- form used in image processing will generate a hierarchical pyramidal structure, as shown in Fig. 7. The LL sub-band at the highest level can be classified as most important, and the other ,details’ sub-bands can be classified as of lesser importance, with the degree of importance decreasing from the top of the pyramid to the sub-bands at the bottom. The in- verse wavelet transform is obtained by reversing the trans- form process and replacing the analysis filters by the synthesis filters and using up-sampling (Fig. 8). An example illustrating three level octave-band decomposition is shown in Fig. 9. 4 Embedded image coding The wavelet transform can decorrelate the image pixel values and result in frequency and spatial-orientation separa- tion. The transform coefficients in each sub-band exhibit unique statistical properties that can be used for encoding the image. The sub-band LL can be encoded alone and is added to the encoded bit stream information about the sub- -bands HL, LH, HH. Then the decoder is responsible for deciding whether to decode only LL or all sub-bands. Sub- -band data are usually real numbers, therefore they require a significant number of bits in their representation, and do not directly offer any form of compression. The first approach towards compressing wavelet data is to apply a quantizer in all the coefficients, where uniform and dead-zone quantizers are the most common choices. After quantization an entropy coder is applied to compress the quantization indices. In embedded coding, a wavelet transform is usually used to decorrelate the image pixels and achieve frequency and spatial-orientation separation. Assuming that the wavelet transform � �c p� � , where p is the collection of image pixels and c is the collection of transformed coefficients, is unitary, the distortion of the image after decompression is � � � � � �D p D c D ci i � �� . The distortion measure is defined as the summation of the distortion at each pixel. The greatest distortion reduction can be achieved if the transformed coefficient with the largest magnitude is coded with infinite precision. Thus attempts have been made to encode the transformed pixels with larger magnitudes first. Furthermore, in order to distribute the bits strategically such that the decoded image will look “natural’’ at any data rate, progressive refinement or bit-plane coding is used. Thus in the coding procedure, multiple passes through the data are adopted. In the first pass, those transformed pixels that are larger than a pre-selected threshold are added to the significance list and coded to the precision of the threshold. In the following passes, the threshold is halved and the pixels already in the list are refined to one more bit of precision. Meanwhile more transformed pixels that are larger than the (halved) threshold are added to the list. It is critical that the positions of the enlisted pixels be encoded ef- ficiently. It has been observed that large number of transform coefficients is insignificant in comparison with the threshold. These coefficients will be quantized to zero, which will not reduce the distortion. Thus spending more bits on coding these insignificant coefficients results in lower efficiency [7]. Imbedded image coding makes more efficient use of a com- munications channel, because images become visible more quickly, reducing the apparent retrieval time. Moreover, pro- gressive transmission enables efficient browsing of image databases: based on a quickly obtained preview of an image, a user might decide to terminate the transmission and proceed to the next image. Alternatively, a user might select a low or medium resolution default for image retrieval. In this way, images of less value are not transmitted at full resolution. Because of the small volume of data required to transmit rela- tively high quality previews of an image, the savings in access time can be significant when browsing large databases of images. 5 Measures of image quality Two mathematical formulas are commonly used to com- pare the various image compression techniques; they are the Mean Square Error (MSE) and the Peak Signal to Noise Ratio (PSNR). The MSE is the cumulative squared error between the compressed and the original image, whereas PSNR in decibels (dB) is a measure of the peak error. The mathemati- cal formulae for the two are: � � � �� �MSE MN I x y I x y x N y M � � �� ��1 2 11 , , , PSNR MSE � � � � � � �10 2552 log . Where I( x, y) is the original image, I’( x, y) is the approxi- mated version (which is actually the decompressed image) and M, N are the dimensions of the images. A lower value for MSE means less error, and as seen from the inverse relation between MSE and PSNR, this translates to a high value of PSNR. Logically, a higher value of PSNR is good because it means that the ratio of signal-to-noise is higher. Here, the ‘signal’ is the original image, and the ‘noise’ is the error in 6 © Czech Technical University Publishing House http://ctn.cvut.cz/ap/ Acta Polytechnica Vol. 44 No. 1/2004 Czech Technical University in Prague reconstruction. The nominator (255) in formula 2 is the maxi- mum decimal value of an unsigned 8-bit image. A high PSNR value does not always correspond to an image with perceptually high quality. Another measure of image quality is the mean opinion score, where the performance of a com- pression process is characterized by the subjective quality of the decoded image. For example, a five-point scale such as very annoying, annoying, slightly annoying, perceptible but not annoying, and imperceptible might be used to character- ize the impairments in the decoder output [8]. More details about the performance of image quality measures can be found in [9]. © Czech Technical University Publishing House http://ctn.cvut.cz/ap/ 7 Czech Technical University in Prague Acta Polytechnica Vol. 44 No. 1/2004 Fig. 8: One level of the wavelet transform reconstruction (a) Single level decomposition (b) Two levels decomposition (c) Three levels decomposition Fig. 7: Pyramidal decomposition of an image Fig. 6: One level of wavelet transform decomposition 6 Wavelet-based embedded rate scalable grayscale image compression algorithms Several techniques have been proposed to achieve rate scalability in still image compression. Two of the most impor- tant techniques are Shapiro’s Embedded Zerotree Wavelet (EZW) [4], and Said and Pearlman’s Set Partitioning in Hierarchical Trees (SPIHT) [10]. Both make use of “spatial orientation trees” (SOT). Spatial orientation trees are struc- tures that use quad-tree representations of sets of wavelet coefficients that belong to different sub-bands, but have the same spatial location. This structure, which can be efficiently represented by one symbol, has been used extensively in rate scalable image and video compression. EZW, SPIHT, and other recent works are described in the following sub-sections. 6.1 Embedded zerotree wavelet (EZW) algorithm [4] The EZW algorithm is based on four key concepts: � A discrete wavelet transform or hierarchical sub-band decomposition. � Prediction of the absence of significant information across scales by exploiting the self-similarity inherent in images. � Entropy-coded successive approximation quantization. � Universal lossless data compression, which is achieved via adaptive arithmetic coding [11]. The Embedded zerotree wavelet (EZW) algorithm ex- ploits the interdependence between the coefficients of the wavelet decomposition of an image by grouping them into SOTs. This is based on the following observations: � Most of the energy is concentrated at the low frequency sub-bands of the wavelet decomposition. � If the magnitude of a wavelet coefficient in the lowest sub-band of a decomposition is insignificant with respect to a threshold, then it is more likely that wavelet coefficients having the same spatial location in different sub-bands will also be insignificant. � The standard deviation of the wavelet coefficients de- creases when proceeding from the lowest to the highest sub-bands of the wavelet pyramid. The combination of the above observations allows for the coding of a large number of insignificant wavelet coefficients by coding only the location of the root coefficient to which the entire set of coefficients is related. Such a set is commonly referred to as a “zerotree”. The SOT used in EZW is shown in Fig. 10. The root of the tree is the coefficient located in the lowest sub-band (LL) of the decomposition. Its descendants are all the other coefficients in the tree. The wavelet coeffi- cient in the LL sub-band has three children. Other coeffi- cients, except for those in the highest frequency sub-bands, have four children. The EZW algorithm consists of successive approximation of wavelet coefficients, and can be summarized as follows: Let f (m; n) be a grayscale image, and let W [f (m; n)] be the coefficients of its wavelet decomposition. 1) Set the threshold � �� �� �T W f m n� max ; 2 2) Dominant pass: � Compare the magnitude of each wavelet coefficient in a tree, starting with the root, to threshold T. � If the magnitudes of all the wavelet coefficients in the tree are smaller than threshold T, then the entire tree structure (that is, the root and all its descendants) is represented by one symbol. This symbol is known as the Zerotree (ZTR) symbol. � Otherwise, the root is said to be “significant" (when its magnitude is greater than T), or “insignificant” (when its magnitude is less than T). A significant coefficient is repre- sented by one of two symbols, POS or NEG, depending on whether its value is positive or negative. The magnitude of significant coefficients is set to zero to facilitate the formation of Zerotree structures. An insignificant coef- ficient is represented by the symbol IZ “isolated zero” if it has some significant descendant. � This process is carried out such that all the coefficients in the tree are examined for possible sub-zerotree structures. 3) Subordinate pass: The significant wavelet coefficients in the image are refined by determining whether their magni- tudes lie within the interval [T; 3T/2), represented by the symbol LOW, or the interval [3T/2; 2T), represented by the symbol HIGH. 4) Set T T� 2, and go to Step 2. Only the coefficients that have not yet been found to be significant are examined. This coding strategy is iterated until the target data rate is achieved. The order in which the coefficients are examined is 8 © Czech Technical University Publishing House http://ctn.cvut.cz/ap/ Acta Polytechnica Vol. 44 No. 1/2004 Czech Technical University in Prague (a) Original image (b) Three level octave-band decomposition (c) Synthesized image Fig. 9: An example of wavelet decomposition predefined and known by the decoder, as shown in Fig. 11. The initial threshold is encoded in the header of the bit stream, followed by the resulting symbols from the dominant and subordinate passes, which are entropy coded using an arithmetic coder [11]. An important element of EZW is the “significance map". The significance map is used to code the position of the significant coefficients in the wavelet decom- position. The simplest significance map is that where a signifi- cant coefficient is accompanied by the actual coordinates of its location. In EZW, the significance map is determined by hav- ing both the encoder and the decoder know the scanning or- der of the coefficients. In essence, the dominant pass of EZW determines of the significance map, whereas the subordinate pass refines the wavelet coefficients that have been found sig- nificant. In consequence, the reduction of distortion during the dominant pass is smaller than during the subordinate pass. EZW was developed for grayscale images. However it has been used for color images by using EZW on each of the color components separately. 6.2 Set partitioning in hierarchical trees (SPIHT) [10] Said and Pearlman [10] investigated different tree struc- tures that improved the quality of the decomposition. Using Set Partitioning in Hierarchical Trees (SPIHT), the coeffi- cients are divided into partitioning subsets using a known ordering in the SOT. Then, each subset is classified as signifi- cant or insignificant according to a pre-specified rule based on the magnitude of the coefficients in the subset. In the SOT, the descendants (a group of 2×2 adjacent coefficients) corre- spond to the pixels of the same spatial orientation in the next finer level of the decomposition. Fig. 12 shows how the spatial orientation tree is defined in a pyramid constructed with recursive four-sub-band splitting. One wavelet coefficient in the LL sub-bands (noted with “*”) does not have a child. The other coefficients, except for those in the highest fre- quency sub-bands, have four children. In contrast to Shapiro’s Zerotree, one coefficient in each group does not have descen- dants. SPIHT groups the wavelet coefficients into three lists: a list of insignificant sets (LIS), a list of insignificant pixels (LIP), and a list of significant pixels (LSP). The SPIHT algo- rithm can be summarized as follows: 1) Initialize the LIS to the set of sub-tree descendants of the nodes in the highest level, the LIP to the nodes in the highest level, and the LSP to an empty list. 2) Sorting pass: � Traverse the LIP testing the magnitude of its elements against the current threshold and representing its signifi- cance by 0 or 1. If a coefficient is found to be significant, its sign is coded and is moved to the LSP. � Examine the LIS and check the magnitude of all the coefficients in that set. If a particular set is found to be significant, it is then partitioned into subsets and tested for significance. Otherwise, a single bit is appended to the bit stream to indicate an insignificant set. 3) Refinement pass: Examine the LSP, excluding coefficients added during the sorting pass. This pass is accomplished by using progressive bit plane coding of the ordered coef- ficients. 4) Set T T� 2, and go to Step 2. The process is repeated until the target data rate is achieved. It is important to note that the locations of the coefficients being refined or classified are never explicitly transmitted. This is because all branching decisions made by the encoder as it searches throughout the coefficients are appended to the bit stream. The output of the sorting-refine- ment process is then entropy coded [11]. 6.3 Space frequency quantization (SFQ) algorithm [12] In [12], a wavelet-based image compression scheme is presented. This iterative algorithm seeks to minimize the distortion measure (MSE) by successively pruning a tree for a given target rate. The survivor nodes in the tree are quantized and sent in the data stream along with significance map infor- © Czech Technical University Publishing House http://ctn.cvut.cz/ap/ 9 Czech Technical University in Prague Acta Polytechnica Vol. 44 No. 1/2004 Fig. 10: Parent-descendant relationships in the spatial-orienta- tion tree of the EZW algorithm Fig. 11: Scanning order of the sub-bands for encoding a signifi- cance map (EZW) Fig. 12: Parent-descendant relationships in the spatial-orienta- tion tree in the SPIHT algorithm mation. To further compress the stream, this map is predicted on the basis of statistical information. The spatial wavelet co- efficient tree structure used in the SFQ algorithm is shown in Fig. 13. A spatial wavelet coefficient tree is defined as the set of coefficients from different sub-bands that represent the same spatial region in the image. The arrows in Fig. 13 identify the parent-children dependencies in a tree. The low- est frequency sub-band of the decomposition is represented by the root nodes (top) of the tree, the highest frequency sub-bands by the leaf nodes (bottom) of the tree, and each parent node represents a lower frequency component than its children. Except for a root node, which has only three children nodes, each parent node has four children nodes, the 2×2 region of the same spatial location in the immedi- ately higher frequency sub-band. The SFQ coder has the goal of jointly finding the best combination of spatial zerotree quantization choice and the scalar frequency quantizer choice. The block diagram of this coder is shown in Fig. 14 (assuming a 2-level wavelet decomposition). The SFQ para- digm is conceptually simple: throw away, i.e. quantize to zero, a subset of the wavelet coefficients, and use a single sim- ple uniform scalar quantizer on the rest. All nine possible complementary subsets for each depth-2 spatial wavelet coef- ficient tree are listed in the spatial zerotree quantization block, and there are three possible scalar quantizer choices in the frequency scalar quantization block. The results show a gain of about 0.5 dB in PSNR versus SPIHT for grayscale images. 6.4 Multi-scale zerotree wavelet entropy (MZTE) coding algorithm [13] This paper describes the texture representation scheme adopted for MPEG-4 synthetic/natural hybrid coding (SNHC) of texture maps and images. The scheme is based on the concept of the multiscale zerotree wavelet entropy (MZTE) coding technique. MZTE was rated as one of the top five schemes in terms of compression efficiency in JPEG2000 November 1997, from among 27 submitted proposals. This scheme provides many levels of scalability layers in terms of either spatial resolution or picture quality. The MZTE coding technique is based on zerotree entropy (ZTE) coding, but it utilizes a new framework to improve and extend the ZTE method to achieve a fully scalable yet very efficient coding technique. In this scheme, the low-low sub-band is separately encoded. To achieve a wide range of scalability levels effi- ciently, as needed by the application, the other sub-bands are encoded using the multi-scale zerotree entropy coding scheme. This multi-scale scheme provides a very flexible approach to support the right tradeoff between layers and types of scalability, complexity, and coding efficiency for any 10 © Czech Technical University Publishing House http://ctn.cvut.cz/ap/ Acta Polytechnica Vol. 44 No. 1/2004 Czech Technical University in Prague Fig. 13: SFQ spatial wavelet coefficient tree Fig. 14: Block diagram of the SFQ coder multimedia application. Fig. 15 shows the concept of this technique. The wavelet coefficients of the first spatial (and/or qual- ity) layer are first quantized with the quantizer Q0. These quantized coefficients are scanned using the zerotree concept, and then the significant maps and quantized coefficients are entropy coded. The output of the entropy coder at this level, BS0, is the first portion of the bitstream. The quantized wave- let coefficients of the first layer are also reconstructed and sub- tracted from the original wavelet coefficients. These residual wavelet coefficients are fed into the second stage of the coder, in which the wavelet coefficients are quantized with Q1, zerotree scanned, and entropy coded. The output of this stage, BS1, is the second portion of the output bitstream. The quantized coefficients of the second stage are also recon- structed and subtracted from the original coefficients. As shown in Fig. 15, N+1 stages of the scheme provide N+1 layers of scalability. Each level represents one layer of SNR quality, spatial scalability, or a combination of both. In MZTE, the wavelet coefficients are scanned in either the tree-depth scanning for each scalability layer or in the sub-band by sub-band fashion, from the lowest to the highest frequency sub-bands. The wavelet coefficients are quantized by a uni- form and midstep quantizer with a dead zone equal to the quantization step size as closely as possible to each scalability layer. The results show that MTZE performs about 2.5 dB better than baseline JPEG for the luminance component, and about 6 dB better for the chrominance components. 6.5 Embedded block coding with optimized truncation of the embedded bit-streams (EBCOT) algorithm [14–16] These papers describe the image compression scheme adopted for JPEG2000. The scheme offers rate and spa- tial scalability with excellent compression performance. In embedded block coding with optimized truncation of the embedded bit-streams (EBCOT), each sub-band is partition- ed into relatively small blocks of wavelet coefficients, called code blocks. EBCOT generates an embedded bit stream separately for each code block. These embedded bit streams can be truncated independently to different lengths. The code-blocks are of 32×32 or 64×64 samples each, and pro- vide random access to the image. The actual block coding algorithm which generates a separate embedded bit stream for each code block is combined with bit plane coding and adaptive arithmetic coding. Each code block is again divided into 2-D sequences of sub-blocks, whose size is 16×16. Then, for each bit plane, the significance map of sub-blocks is firstly encoded through quadtree coding. Then, four differ- ent primitive coding operations are used to encode those significant sub-blocks. The operations are zero coding (ZC), run-length coding (RLC), sign coding (SC) and magnitude refinement (MR). Finally, EBCOT organizes the final big stream in layers with optimized truncation so as to make it both resolution and SNR scalable. The layered bit stream may be truncated at any point during decoding. EBCOT uses the Daubechies (9,7) wavelet filter [17] with a five level transform, though JPEG2000 can use other filters. The results show performance gains of about 0.5 dB against SPIHT at various data rates. 7 Wavelet-based embedded rate scalable color image compression algorithms Both EZW and SPIHT were originally designed for gray- scale images. A straightforward application to color images is to code the transformed data from each spectral channel independently without exploiting any correlation that might exist between the spectral channels. The sub-sections below describe the EZW and SPIHT algorithms for color image compression. 7.1 Color image compression using an embedded rate scalable approach (CEZW) [7] This paper presents a wavelet based coding algorithm for color images using a luminance/chrominance color space. Data rate scalability is achieved by using an embedded coding scheme, which is similar to Shapiro’s embedded zerotree wavelet (EZW) algorithm [4]. In a luminance/chrominance color space, the three color components have little statistical correlation. However, it is observed that at the spatial loca- tions where chrominance signals have large transitions it is highly likely that the luminance signal will have large transi- tions. This interdependence between the color components is exploited in this algorithm. In this algorithm, the YUV space is used. The algorithm is developed on the basis of Shapiro’s algorithm. The SOT is established as follows: the original SOT structure in Shapiro’s algorithm is applied to all three- -color components. Each chrominance node is also a child node of the luminance node of the same location. Thus each chrominance node has two parent nodes: one is of the same chrominance component in a lower frequency sub-band; the other node is of the luminance component. A diagram of the SOT is shown in Fig. 16, where the tree is developed on the basis of the tree structure in Shapiro’s algorithm, and YUV color space is used. In this algorithm, the coding strategy is similar to Shapiro’s algorithm, except for the following changes in the dominant pass. (1) The luminance component is first scanned. For each luminance pixel, all descendents, including those of the luminance component and those of the chrominance components, are examined and appropriate © Czech Technical University Publishing House http://ctn.cvut.cz/ap/ 11 Czech Technical University in Prague Acta Polytechnica Vol. 44 No. 1/2004 Fig. 15: MZTE encoding structure symbols are assigned. (2) The two chrominance components are alternately scanned. The pixels that have already been encoded while scanning the luminance component are not examined. The subjective experiments showed that this algo- rithm produces images with the same quality as those from Said and Pearlman’s algorithm (http://ipl.rpi.edu:80/SPIHT/) at the same data rate. However the peak - signal - to - noise (PSNR) ratio is lower than that from Said’s algorithm, where the PSNR is obtained as � � � � � �� � PSNR MSE Y MSE U MSE V � � � � � � � � �10 255 3 2 lg / 7.2 Embedded color image coding using SPIHT with partially linked spatial orientation trees (CSPIHT) [18] This paper describes a variation of the set partitioning in hierarchical trees (SPIHT) scheme for color image coding. By using partially linked spatial orientation tree structures across different spectral planes, the new color-SPIHT scheme is able to embed both chrominance and luminance data in the coded bit stream. The performance is comparable to that of a SPIHT-based coding scheme but with significantly lower computational complexity. In CSPIHT, SPIHT’s SOT structure [10] is used for all spectral planes. To create a com- prehensive SOT structure across the different spectral planes, luminance nodes in the LL sub-band that do not have any offspring will have descendants in the LL sub-bands of the chrominance planes, as shown in Fig. 17. The results obtained are compared to Said and Pearlman’s algorithm (SPIHT KLT) (http://ipl.rpi.edu:80/SPIHT/), which performs the Karhunen–Loeve Transform (KLT) [19] on the spectral com- ponents of the image before coding the decorrelated color planes independently using the SPIHT scheme. For the case of four and five-level DWT decomposition, CSPIHT has sig- nificantly better PSNR performance than SPIHT KLT, espe- cially at low bit-rates (up to 6 dB improvement in performance). 8 JPEG2000 still image compression standard JPEG2000 [20] is the new international standard for still image compression. The JPEG2000 standard is based on wavelet/sub-band coding techniques [17, 21] and supports lossy and lossless compression of single-component (e.g., grayscale) and multi-component (e.g., color) imagery. In order to facilitate both lossy and lossless coding in an efficient manner, reversible integer-to-integer [22–24] and nonrevers- ible real-to-real transforms are employed. To code transform data, the codec makes use of bit-plane coding techniques. For entropy coding, a context-based adaptive binary arithmetic coder [11] is used. In addition to this basic compression func- tionality, however, numerous other features are provided, including: 1) progressive recovery of an image by fidelity or resolution; 2) region of interest coding, whereby different parts of an image can be coded with differing fidelity; 3) random access to particular regions of an image without needing to decode the entire code stream; 4) a flexible file format with provisions for specifying opacity information and image sequences and 5) good error resilience. Due to its excellent coding performance and many attrac- tive features, JPEG2000 has a very large potential application base. Some possible application areas include: image archiv- ing, Internet, web browsing, document imaging, digital pho- tography, medical imaging, remote sensing, and desktop publishing. The JPEG2000 core compression algorithm is primarily based on embedded block coding with optimized 12 © Czech Technical University Publishing House http://ctn.cvut.cz/ap/ Acta Polytechnica Vol. 44 No. 1/2004 Czech Technical University in Prague Fig. 16: Parent descendent relations in CEZW algorithm Fig. 17: Linking SOT structures in different spectral planes in the CSPIHT scheme truncation (EBCOT) of the embedded bit-streams [14–16]. The EBCOT algorithm provides superior compression per- formance and produces a bit-stream with features such as res- olution and SNR scalability and random access. 8.1 JPEG2000 codec structure The general structure of the JPEG2000 codec is shown in Fig. 18. The input to the encoding process is an image consisting of one or more components. Before any further processing takes place, an image is partitioned into one or more disjoint rectangular regions called tiles. This is done when the original image is quite large in comparison to the amount of memory available to the codec. In the preprocessing step, each image component has its sample values adjusted by an additive bias, in a process called DC level shifting. The bias is chosen such that the resulting sample values have a nominal dynamic range (approximate- ly) centered about zero. The RGB color space is transformed to YCrCb color space in the forward intercomponent trans- form step. In the forward intracomponent transform step, transforms that operate on individual components can be applied. The particular type of operator employed for this purpose is the wavelet transform. The resulting wavelet coef- ficients are quantized in the quantization step. A different quantizer is employed for the coefficients of each sub-band. In the case of lossless coding, reversible transforms must be employed and all quantizer step sizes are forced to be one. In the tier-1 coding step, the quantizer indices for each sub- -band are partitioned into code blocks and each of the code blocks is independently embedded coded. The coding is per- formed using the bit-plane coder. There are three coding passes per bit plane. These passes are the significance pass, the refinement pass, and the cleanup pass, respectively. In the tier-2 encoding step, the coding pass information generated during tier-1 is packaged into data units called packets, in a process referred to as packetization. The packetization pro- cess imposes a particular organization on coding pass data in the output code stream. This organization facilitates many of the desired codec features cited above. The rate control can be achieved through two distinct mechanisms: 1) the choice of quantizer step sizes, and 2) the selection of the subset of coding passes to include in the code stream. The decoder structure essentially mirrors that of the en- coder. That is, with the exception of rate control, there is a one-to-one correspondence between the functional blocks in the encoder and decoder. Each functional block in the decoder either exactly or approximately inverts the effects of its corresponding block in the encoder. Since the tiles are coded independently of one another, the input image is (con- ceptually, at least) processed one tile at a time. In the sections that follow, each of the above processes is briefly explained. For more details about the whole processes, see [20]. 8.2 Preprocessing/postprocessing The codec expects its input sample data to have a nominal dynamic range that is approximately centered about zero. The preprocessing stage of the encoder simply ensures that this expectation is met. Suppose that a particular component has P bits/sample. The samples may be either signed or un- signed, leading to a nominal dynamic range of [�2P �1, 2P �1� 1] or [0, 2P�1], respectively. If the sample values are unsigned, the nominal dynamic range is clearly not centered about zero. Thus, the nominal dynamic range of the samples is adjusted by subtracting a bias of 2P �1 from each of the sample values. If the sample values for a component are signed, the nominal dynamic range is already centered about zero, and no pro- cessing is required. The postprocessing stage of the decoder essentially undoes the effects of preprocessing in the encoder. If the sample values for a component are unsigned, the origi- nal nominal dynamic range is restored. Lastly, in the case of lossy coding, clipping is performed to ensure that the sample values do not exceed the allowable range. 8.3 Intercomponent transform In the encoder, the preprocessing stage is followed by the forward intercomponent transform stage. Here, an inter- component transform can be applied to the tile-component data. Such a transform operates on all of the components together, and serves to reduce the correlation between com- ponents, leading to improved coding efficiency. Only two intercomponent transforms are defined in the base-line JPEG-2000 codec: the irreversible color transform (ICT) and the reversible color transform (RCT). The ICT is nonrevers- ible and real-to-real in nature, while the RCT is reversible © Czech Technical University Publishing House http://ctn.cvut.cz/ap/ 13 Czech Technical University in Prague Acta Polytechnica Vol. 44 No. 1/2004 (a) Encoder (b) Decoder Fig. 18: The general structure of the JPEG2000 codec and integer-to-integer. Both of these transforms essentially map image data from the RGB to the YCrCb color space. The ICT may only be used in the case of lossy coding, while the RCT can be used in either the lossy or lossless case. After the intercomponent transform stage in the encoder, data from each component is treated independently. The ICT is nothing more than the classic RGB to YCrCb color space transform. The RCT is simply a reversible integer-to-integer approximation to the ICT (similar to that proposed in [24]). The inverse intercomponent transform stage in the decoder essentially undoes the effects of the forward intercomponent transform stage in the encoder. 8.4 Intracomponent transform The intercomponent transform stage in the encoder is followed by the intracomponent transform stage. In this stage, transforms that operate on individual components can be applied. The particular type of operator employed for this purpose is the wavelet transform. Through the application of the wavelet transform, a component is split into numer- ous frequency bands (i.e., sub-bands). Due to the statistical properties of these sub-band signals, the transformed data can usually be coded more efficiently than the original un- transformed data. Both reversible integer-to-integer [22, 23, 25–27] and non-reversible real-to-real wavelet transforms are employed by the baseline codec. The inverse intracomponent transform stage in the decoder essentially undoes the ef- fects of the forward intracomponent transform stage in the encoder. 8.5 Quantization/dequantization In the encoder, after the tile-component data has been transformed (by intercomponent and/or intracomponent transforms), the resulting coefficients are quantized. Quantization allows greater compression to be achieved, by representing transform coefficients with only the minimal precision required to obtain the desired level of image qual- ity. Quantization of transform coefficients is one of the two primary sources of information loss in the coding path. Trans- form coefficients are quantized using scalar quantization with a deadzone. A different quantizer is employed for the coeffi- cients of each sub-band, and each quantizer has only one parameter, its step size. In the decoder, the dequantization stage tries to undo the effects of quantization. Unless all of the quantizer step sizes are less than or equal to one, the quantization process will normally result in some information loss, and this inversion process is only approximate. The base- line codec has two distinct modes of operation, referred to as the integer mode and the real mode. In the integer mode, in- teger-to-integer transforms are employed. The quantization step is fixed to one. Lossy coding is still achieved by discard- ing bit-planes. In the real mode, real-to-real transforms are employed. Quantization steps are chosen in conjunction with rate control. In this mode, discarding bi-planes or changing the size of the quantization step, or both, achieves lossy compression. 8.6 Tier-1 coding After quantization is performed in the encoder, tier-1 cod- ing takes place. This is the first of two coding stages. The quantizer indices for each sub-band are partitioned into code blocks and each of the code blocks is independently coded. The coding is performed using the bit-plane coder. For each code block, an embedded code is produced, comprised of numerous coding passes. The output of the tier-1 encoding process is, therefore, a collection of coding passes for the various code blocks. On the decoder side, the bit-plane cod- ing passes for the various code blocks are input to the tier-1 decoder, these passes are decoded, and the resulting data is assembled into sub-bands. The tier-1 coding process essentially involves bit-plane coding. After all of the sub-bands have been partitioned into code blocks, each of the resulting code blocks is independ- ently coded, using a bit-plane coder. There are three coding passes per bit plane. These passes, in order are as follows: 1) significance, 2) refinement, 3) cleanup. All three types of coding passes scan the samples of a code block in the same fixed order. The bit-plane encoding pro- cess generates a sequence of symbols for each coding pass. Some or all of these symbols may be entropy coded. For the purposes of entropy coding, a context-based adaptive binary arithmetic coder is used [28]. For each pass, all of the symbols are either arithmetically coded or raw coded (i.e., the binary symbols are emitted as raw bits with simple bit stuffing). The arithmetic and raw coding processes both ensure that certain bit patterns never occur in the output, allowing such patterns to be used for error resilience purposes. The following subsec- tions present these three passes. 8.6.1 Significance pass The first coding pass for each bit plane is the significance pass. This pass is used to convey the significance and (as nec- essary) sign information for samples that have not yet been found to be significant and are predicted to become signifi- cant during the processing of the current bit plane. The samples in the code block are scanned in a predefined order. If a sample has not yet been found to be significant, and is predicted to become significant, the significance of the sam- ple is coded with a single binary symbol. If the sample also happens to be significant, its sign is coded using a single binary symbol. If the most significant bit plane is being processed, all samples are predicted to remain insignificant. Otherwise, a sample is predicted to become significant if any 8-connected neighbor has already been found to be significant. As a consequence of this prediction policy, the sig- nificance and refinement passes for the most significant bit plane are always empty (and need not be explicitly coded). The symbols generated during the significance pass may or may not be arithmetically coded. If arithmetic coding is employed, the binary symbol conveying significance informa- tion is coded using one of nine contexts. The particular context used is selected on the basis of the significance of the sample’s 8-connected neighbors and the orientation of the sub-band with which the sample is associated (e.g., LL, LH, HL, HH). In the case that arithmetic coding is used, the sign of a sample is coded as the difference between the actual and predicted sign. Otherwise, the sign is coded directly. Sign prediction is performed using the significance and sign infor- mation for 4-connected neighbors. 14 © Czech Technical University Publishing House http://ctn.cvut.cz/ap/ Acta Polytechnica Vol. 44 No. 1/2004 Czech Technical University in Prague 8.6.2 Refinement pass The second coding pass for each bit plane is the refine- ment pass. This pass signals subsequent bits after the most significant bit for each sample. If a sample was found to be sig- nificant in a previous bit plane, the next most significant bit of that sample is conveyed using a single binary symbol. Like the significance pass, the symbols of the refinement pass may or may not be arithmetically coded. If arithmetic coding is em- ployed, each refinement symbol is coded using one of three contexts. The particular context employed is selected accord- ing to whether the second MSB position is being refined and the significance of 8-connected neighbors. 8.6.3 Cleanup pass The third (and final) coding pass for each bit plane is the cleanup pass. This pass is used to convey significance and (as necessary) sign information for those samples that have not yet been found to be significant and are predicted to remain insignificant during the processing of the current bit plane. Conceptually, the cleanup pass is not much different from the significance pass. The key difference is that the cleanup pass conveys information about samples that are predicted to remain insignificant, rather than those that are predicted to become significant. Algorithmically, however, there is one important difference between the cleanup and significance passes. In the case of the cleanup pass, samples are sometimes processed in groups, rather than individually as with the significance pass. 8.7 Tier-2 coding In the encoder, tier-1 encoding is followed by tier-2 en- coding. The input to the tier-2 encoding process is the set of bit-plane coding passes generated during tier-1 encoding. In tier-2 encoding, the coding pass information is packaged into data units called packets, in a process referred to as packetization. The resulting packets are then output to the final code stream. The packetization process imposes a partic- ular organization on coding pass data in the output code stream. This organization facilitates many of the desired codec features, including rate scalability and progressive re- covery by fidelity or resolution. In the decoder, the tier-2 decoding process extracts the various coding passes from the code stream (i.e., depacketization) and associates each coding pass with its corresponding code block. 8.8 Rate control In the encoder, rate control can be achieved through two distinct mechanisms: 1) the choice of quantizer step sizes, 2) the selection of the subset of coding passes to include in the code stream. When the integer coding mode is used (i.e., when only integer-to-integer transforms are employed) only the first mechanism may be used, since the quantizer step sizes must be fixed at one. When the real coding mode is used, then either or both of these rate control mechanisms may be em- ployed. When the first mechanism is employed, the quantizer step sizes are adjusted in order to control the rate. As the step sizes are increased, the rate decreases, at the cost of greater distortion. Although this rate control mechanism is conceptually simple, it does have one potential drawback. Every time the quantizer step sizes are changed, the quantizer indices change, and tier-1 encoding must be performed again. Since tier-1 coding requires a considerable amount of computation, this approach to rate control may not be practical in computationally-constrained encoders. When the second mechanism is used, the encoder can elect to discard coding passes in order to control the rate. The encoder knows the contribution that each coding pass makes to the rate, and can also calculate the distortion reduction associated with each coding pass. Using this information, the encoder can then include the coding passes in order of decreasing dis- tortion reduction per unit rate until the bit budget has been exhausted. This approach is very flexible in that different distortion metrics can be easily accommodated (e.g., mean squared error, visually weighted mean squared error, etc.). For a more detailed treatment of rate control, the reader is referred to [14] and [20]. 8.9 Region of interest coding The codec allows different regions of an image to be coded with differing fidelity. This feature is known as region- -of-interest (ROI) coding. When an image is to be coded with an ROI, some of the transform coefficients are identified as being more important than the others. The coefficients of greater importance are referred to as ROI coefficients, while the remaining coefficients are known as background coeffi- cients. ROI coefficients are encoded with greater precision than the background coefficients. For more information on ROI coding, the reader is referred to [29, 30]. 9 Conclusions This paper has presented a review on some embedded rate scalable image codecs. The embedded zerotree wavelet (EZW) coder is often used as a performance reference. Simi- lar visual performance, with the added advantage of low arithmetic complexity, is achieved by set partitioning in the hierarchical trees algorithm (SPIHT). These algorithms use a tree structure or lists to detect and exploit similarities. The precise rate control that is achieved with these algorithms is a distinct advantage. The user can choose a bit rate and encode the image to exactly the desired bit rate. EZW and SPIHT offer only SNR scalability. The complexity of the SFQ algorithm lies mainly in the iterative zerotree pruning stage of the encoder, which can be substantially reduced with fast heuristics based on models rather than actual R-D data, which is expensive to compute. The SFQ coder has important char- acteristics. First, SFQ is built around a linear transform that allows signal energy to be compacted both in frequency and in space, and quantization modes designed to match this characterization. Second, SFQ provides a framework for opti- mizing (in the rate-distortion sense) the application of the quantization modes available to it. The EBCOT image compression algorithm offers state- -of-the-art compression performance together with an unprecedented set of bit-stream features, including resolution scalability, SNR scalability and a “random access” capability. All features can coexist-exist within a single bit-stream without substantial sacrifices in compression efficiency. The MZTE al- © Czech Technical University Publishing House http://ctn.cvut.cz/ap/ 15 Czech Technical University in Prague Acta Polytechnica Vol. 44 No. 1/2004 gorithm describes the texture representation scheme adopted for MPEG-4 synthetic/natural hybrid coding (SNHC) of tex- ture maps and images. The scheme is based on the concept of the multi-scale zerotree wavelet entropy (MZTE) coding tech- nique, which provides many levels of scalability layers in terms of either spatial resolutions or picture quality. MZTE, with three different modes (single-Q, multi-Q, and bilevel), pro- vides much improved compression efficiency and fine-grad- ual scalabilities, which are ideal for hybrid coding of texture maps and natural images. The MZTE scheme has been adopted as the baseline technique for the visual texture cod- ing profile in both the MPEG-4 video group and SNHC group. MZTE was also rated as one of the top five schemes in terms of compression efficiency in the JPEG2000 November 1997 evaluation, among 27 submitted proposals. CEZW is a coding algorithm for color images that uses a luminance/chrominance color space. In CEZW algorithm data rate scalability is achieved by using an embedded coding scheme, which is similar to the EZW algorithm. The CEZW algorithm does not require image-dependent transforms such as KL transforms to decorrelate the color components. A spatial-orientation tree that links not only the frequency bands but also the color channels is used for scanning the wavelet coefficients, such that the interdependence be- tween different the color components in the LC spaces is automatically exploited. The CSPIHT scheme offered a sim- ple solution to embed both luminance and chrominance data into a single coded data stream. If an image is able to go through a maximum of levels of DWT decomposition, CSPIHT provides comparable performance to SPIHT KLT in terms of quality of reconstruction. For much fewer than decomposition levels for the same image, CSPIHT offers much better quality of reconstruction without the need to perform the computationally intensive KLT on the spectral components of the image. The advantage in the performance of CSPIHT lies in the low bit-rate coding and DWT maps with large LL sub-band dimensions. The JPEG2000 standard for still image compression is presented. The JPEG2000 standard supports lossy and loss- less compression of single-component and multi-component imagery. In addition to this basic compression functionality, it has many other features such as: progressive recovery of an image by fidelity or resolution, region of interest coding, whereby different parts of an image can be coded with differ- ing fidelity, random access to particular regions of an image without needing to decode the entire code stream, a flexible file format with provisions for specifying opacity information and image sequences, and good error resilience. The performance of all mentioned algorithms can be improved by using efficient sign coding and estimation of zero-quantized coefficients, as presented in [31]. References [1] Netravali A. N., Rubinstein C. B.: “Luminance Adaptive Coding of Chrominance Signals.” IEEE Transactions on Communications, COM 27, 1979, p. 703–710. [2] Limb J. O., Rubinstein C. B.: “Plateau Coding of the Chrominance Component of Color Picture Signals.” IEEE Transactions on Communications, COM 22, 1974, p. 812–820. [3] Rabbani M., Jones P. W.: Digital image compression tech- niques. Bellingham, Washington: SPIE Optical Engi- neering Press, 1991. [4] Shapiro J. M.: “Embedded Image Coding Using Zero- trees of Wavelet Coefficients.” IEEE Transactions on Sig- nal Processing, Vol. 41 (1993), p. 3445–3462. [5] Shen K., Delp E. J.: “Wavelet Based Rate Scalable Video Compression.” IEEE Transactions on Circuits and Systems for Video Technology, Vol. 9 (1999), p. 109–122. [6] Frazier M. W.: An introduction to wavelets through linear al- gebra. Springer-Verlag New York Inc., 1999. [7] Shen K., Delp E. J.: Color Image Compression Using an Em- bedded Rate Scalable Approach. Proc. IEEE Int. Conf. Im- age Processing, Santa Barbara, CA, Oct. 26–29, 1997, III-34–III-37. [8] Bhaskaran V., Konstantinides K.: Image and video com- pression standards: Algorithms and architectures (second edi- tion). Kluwer Academic Publisher, 1997. [9] Eskicioglu A. M., Fisher P. S.: “Image Quality Measures and Their Performance.” IEEE Transactions on Communi- cations, Vol. 43 (1995), p. 2959–2965. [10] Said A., Pearlman W. A.: “A New, Fast, and Efficient Im- age Codec Based on Set Partitioning in Hierarchical Trees.” IEEE Transactions on Circuits and Systems for Video Technology, Vol. 6 (1996), p. 243–250. [11] Witten H., Neal R., Cleary J. G.: “Arithmetic Coding for Data Compression.” Comm. ACM, Vol. 30 (1987), p. 520–540. [12] Xiong Z., Ramchandran K., Orchard M. T.: “Space- -Frequency Quantization for Wavelet Image Coding.” IEEE Transaction on Image Processing, Vol. 6 (1997), p. 677–693. [13] Sodagar I. et al: “Scalable Wavelet Coding for Syn- thetic/Natural Hybrid Images.” IEEE Transactions on Circuits and Systems for Video Technology, Vol. 9 (1999), p. 244–254. [14] Taubman D.: “High performance scalable image com- pression with EBCOT.” Proc. of IEEE International Con- ference on Image Processing, Kobe (Japan), Vol. 3 (1999), p. 344–348. [15] Taubman D.: “High performance scalable image com- pression with EBCOT.” IEEE Trans. on Image Processing, Vol. 9 (2000), p. 1158–1170. [16] Taubman D. et al: “Embedded block coding in JPEG 2000.” Signal Processing: Image Communication, Vol. 17 (2002), p. 49–72. [17] Antonini M. et al: “Image Coding Using Wavelet Trans- form.” IEEE Transactions on Image Processing, Vol. 1 (1992), p. 205–220. [18] Kassim A. A., Lee W. S.: “Embedded Color Image Coding Using SPIHT with Partially Linked Spatial Ori- entation Trees.” IEEE Transactions on Circuits and Systems for Video Technology, Vol. 13 (2003), p. 203–206. [19] Kouassi R. K. et al: “Application of the Karhunen-Loeve Transform for Natural Color Images Analysis.” Conf. Record 3, 1st Asilomar Conf. Signals, Systems & Computers, Vol. 2 (1997), p. 1740–1744. [20] ISO/IEC 15444-1: Information technology – JPEG2000 image coding system – Part 1: Core coding system, 2000. 16 © Czech Technical University Publishing House http://ctn.cvut.cz/ap/ Acta Polytechnica Vol. 44 No. 1/2004 Czech Technical University in Prague [21] Lewis A. S., Knowles G.: “Image compression using the 2-D wavelet transform.” IEEE Trans. on Image Processing, Vol. 1 (1992), p. 244–250. [22] Calderbank A. R. et al: “Wavelet transforms that map in- tegers to integers.” Applied and Computational Harmonic Analysis, Vol. 5 (1998), p. 332–369. [23] Adams M. D.: Reversible Integer-to-Integer Wavelet Trans- forms for Image Coding. Ph.D. thesis, Department of Elec- trical and Computer Engineering, University of British Columbia, Vancouver, BC, Canada, Sept. 2002, Avail- able online from http://www.ece.uvic.ca/ mdadams. [24] Gormish M. J. et al: Lossless and nearly lossless compression of high-quality images. Proc. of SPIE, San Jose, CA, USA, 3025, 1997, p. 62–70. [25] Chao H., Fisher P., Hua Z.: An approach to integer wavelet transforms for lossless image compression. Proc. of Inter- national Symposium on Computational Mathematics, Guangzhou, China, 1997, p. 19–38. [26] Adams M. D.: Reversible wavelet transforms and their appli- cation to embedded image compression. M.Sc. thesis, Depart- ment of Electrical and Computer Engineering, University of Victoria, Victoria, BC, Canada, Jan. 1998, Available from http://www.ece.uvic.ca/ mdadams. [27] Adams M. D., Kossentini F.: “Reversible integer-to-inte- ger wavelet transforms for image compression: Perfor- mance evaluation and analysis.” IEEE Trans. on Image Processing, Vol. 9 (2000), p. 1010–1024. [28] ISO/IEC 14492-1: Lossy/lossless coding of bi-level im- ages, 2000. [29] Christopoulos C., Askelof J., Larsson M.: “Efficient methods for encoding regions of interest in the upcom- ing JPEG2000 still image coding standard.” IEEE Signal Processing Letters, Vol. 7 (2000), p. 247–249. [30] Askelof J., Carlander M. L., Christopoulos C.: “Region of interest coding in JPEG2000.” Signal Processing: Image Communication, Vol. 17 (2002), p. 105–111. [31] Deever A. T., Hemami S. S.: “Efficient Sign Coding and Estimation of Zero-Quantized Coefficients in Embedded Wavelet Image Codecs.” IEEE Transactions on Image Pro- cessing, Vol. 12 (2003), p. 420–430. Doc. Ing. Boris Šimák, CSc. e-mail: simak@feld.cvut.cz Eng. Farag Ibrahim Younis Elnagahy, MSc. E-mail: faragelnagahy@hotmail.com Department of Telecommunications Engineering Czech Technical University in Prague Faculty of Electrical Engineering Technická 2 166 27 Praha 6, Czech Republic © Czech Technical University Publishing House http://ctn.cvut.cz/ap/ 17 Czech Technical University in Prague Acta Polytechnica Vol. 44 No. 1/2004 1Simak 4 5 6 7 8 9 10 11 12 13 14 15 16_17