FACTA UNIVERSITATIS 
Series: Electronics and Energetics Vol. 31, N

o
 3, September 2018, pp. 461-485 

https://doi.org/10.2298/FUEE1803461A 

EFFICIENT IMAGE COMPRESSION AND DECOMPRESSION 

ALGORITHMS FOR OCR SYSTEMS

 

Boban Arizanović, Vladan Vučković 

Faculty of Electronic Engineering, Computer Department, Niš, Serbia 

Abstract. This paper presents an efficient new image compression and decompression 

methods for document images, intended for usage in the pre-processing stage of an 

OCR system designed for needs of the “Nikola Tesla Museum” in Belgrade. Proposed 

image compression methods exploit the Run-Length Encoding (RLE) algorithm and an 

algorithm based on document character contour extraction, while an iterative scanline 

fill algorithm is used for image decompression. Image compression and decompression 

methods are compared with JBIG2 and JPEG2000 image compression standards. 

Segmentation accuracy results for ground-truth documents are obtained in order to 

evaluate the proposed methods. Results show that the proposed methods outperform 

JBIG2 compression regarding the time complexity, providing up to 25 times lower 

processing time at the expense of worse compression ratio results, as well as 

JPEG2000 image compression standard, providing up to 4-fold improvement in 

compression ratio. Finally, time complexity results show that the presented methods 

are sufficiently fast for a real time character segmentation system. 

Key words: Image processing, image compression, image decompression, OCR, 

machine-typed documents, machine- printed documents 

1. INTRODUCTION 

Character segmentation still presents a considerable challenge in image processing 

and other related computer science fields [1], and is a very important pre-processing stage 

in Optical Character Recognition (OCR) systems [2-4]. Character segmentation and 

character recognition [5-8] have been important subjects of research for many years [9]. 

Outside of the OCR systems scope, much recent work deals with extracting characters 

from natural and other non-document images. Taking the recent work into consideration, 

it is noticeable that the difficulty of character segmentation is usually underestimated 

compared to the process of character recognition [10,11]. Related works may be 

classified into those that analyze the character segmentation approach in natural images 

[12-14], and others that deal with character segmentation in document images. The 

                                                           

Received December 22, 2017; received in revised form April 30, 2018 
Corresponding author: Vladan Vuĉković  

Faculty of Electronic Engineering, Computer Department, P.O. Box 73, 18000 Niš, Serbia  

(E-mail: vladanvuckovic24@gmail.com) 


462 B. ARIZANOVIĆ, V. VUĈKOVIĆ 
 

second group includes machine-printed documents [10,15-18], where the document 

structure and the shape of its elements are regular, and handwritten documents where 

character segmentation is challenged due to irregular document structure [11,19-29]. Old 

machine-typed documents are of particular significance because important historical 

documents are often in this form [8,30-32]. 

Recent research includes all levels of character segmentation. Many approaches for 

skew estimation, as a part of a skew correction process, are modifications of the Hough 

transform [33-35], with some based on correlation functions or straight line fitting [36,37]. 

Analyses on document image binarization parameters showed that the Otsu method and 

other Otsu-based methods give the best results on average [32]. A learning-based approach 

for finding the best binarization parameters was presented in [38]. Document image 

compression and decompression methods can also be exploited in the pre-processing stage 

of the character segmentation system, in order to efficiently store the document images. A 

survey of image compression algorithms used in wireless multimedia sensor networks 

(WMSN) was presented in [39]. Compression of large Arabic textual images based on 

pattern segmentation is achieved using the approach proposed in [40]. Genetic algorithm 

based on discrete wavelet transformation information for fractal image compression was 

presented in [41]. Combination of the lapped transform and Tucker decomposition, named 

as hyperspectral image compression, was proposed in [42]. A lossy image compression 

technique based on singular value decomposition (SVD) and wavelet difference reduction 

(WDR) was proposed in [43]. Taking the character segmentation into account, many 

methods have been proposed. A technique based on searching for connected regions in the 

spatial domain performed on a binary image was proposed in [44]. A character 

segmentation method based on Gaussian low-pass filter and innovational Laplace-like 

transform was proposed in [45]. Segmentation process adapted for real time tasks is 

proposed in [46] and is based on the Bayes theorem in order to exploit prior knowledge. A 

novel approach was proposed in [47] based on the usage of contour curvature of letters for 

identifying the writer of ancient inscriptions and Byzantine codices, without requiring 

learning algorithms or a database. Diverse methods for segmentation of handwritten 

documents are proposed [26]. Some techniques exploit clustering in the process of 

segmentation [28,48]. Gabor filter for feature extraction and Fisher classifier for feature 

classification were exploited in [49]. To solve the problem of touching characters in 

handwritten documents, self-organizing maps, SVM classifiers, and Multi-Layer Perceptron 

are used [21,27,50,51]. For natural images, tensor voting and the three-color bar code for 

segmentation have been combined [14,52]. 

This paper presents further improvements of the authors‟ character segmentation 

approach, which forms part of a real time OCR system for the needs of the “Nikola Tesla 

Museum” in Belgrade [53-56]. This paper presents pre-processing methods for document 

image compression and decompression, which take place after the image binarization. 

The proposed compression algorithms are based on the RLE data compression algorithm 

and document character contour extraction, while decompression algorithm exploits the 

scanline fill algorithm. Together with skew estimation and correction [54], and the image 

filtering stage, which concludes with image binarization process, this pre-processing is 

executed independently before the actual segmentation stage. This offers the opportunity 

to prepare document images for further processing later, to store document images in a 

compressed form, to use the improved character segmentation and recognition software 

independently from the pre-processing stage, and also to test independently the character 


 Efficient Compression and Decompression Algorithms for OCR Systems 463 

 
segmentation and recognition stages. The results show that the proposed image 

compression and decompression methods perform up to 25 times faster than JBIG2 

compression at the expense of much lower compression ratio, and are better than JPEG2000 

image compression in its lossless mode, giving up to 4-fold improved compression ratio. 

Compression quality proved to be unimportant with regards to character segmentation 

accuracy. Additionally, the evaluated image compression and decompression methods 

proved to be quite efficient and suitable for use in real time system. 

This paper is organized as follows: Section 2 provides a description of the related works 

which deal with image compression, as well as a description of the previously proposed 

character segmentation approach. Section 3 offers a theoretical foundation for bi-level 

image compression standards and JPEG/JPEG 2000 image compression standards used for 

comparison with the proposed algorithms, as well as a description of the RLE algorithm 

and a scanline fill algorithm used for the proposed methods. Section 4 provides the 

complete description of the proposed image compression and decompression methods. 

Pseudo-codes for the proposed image compression and decompression methods are given in 

Section 5, including the suggestion for the optimal implementation. In Section 6, a large set 

of experimental results for image compression methods, obtained on different PC machines 

is provided. Image compression and decompression results from the aspect of compression 

ratio and time complexity are analyzed in Section 6, including segmentation accuracy 

results for compressed document images. Finally, discussion of the extended real time 

character segmentation method, results, and future work are given in Section 7. 

2. RELATED WORKS 

This section gives more detailed descriptions of other image compression methods 

and authors‟ existing character segmentation approach. 

A novel universal algorithm for lossless chain code compression with a new chain 

code binarization scheme was proposed in [57]. The compression method is based on the 

RLE algorithm and the modified LZ77 algorithm. Compression consists of three modes: 

RLE, LZ77, and COPY mode. The runs of the 0-bits are compressed using RLE, the 

simplified LZ77 algorithm handles the repetitions within the bit stream, and COPY mode 

is used if the aforementioned two methods are unsuccessful. On average, this method 

achieves better compression results than state-of-the-art methods. 

An image compression technique for video surveillance based on dictionary learning 

was presented in [58]. The main concept exploits the camera‟s being stationary, giving 

image samples a high level of similarity. The algorithm transforms images over sparsely 

tailored, over-complete dictionaries previously learned directly from image samples, and 

thus the image can be approximated with fewer coefficients. Results show that this 

method outperforms JPEG and JPEG2000 in terms of both image compression quality 

and compression ratio. 

An image compression technique which combines the properties of predictive coding 

and discrete wavelet coding was proposed in [59]. To reduce inter-pixel redundancy, the 

image data values are pre-processed using predictive coding. The difference between the 

predicted and the original values are transformed using discrete wavelet coding. A non-

linear neural network predictor is used in the predictive coding system. Results show that 

this method performs as well as JPEG2000. 


464 B. ARIZANOVIĆ, V. VUĈKOVIĆ 
 

A multiplier-less efficient and low complexity 8-point approximate Discrete Cosine 

Transform (DCT) for image compression was proposed in [60]. An efficient Graphics 

Processing Unit (GPU) implementation for the presented DCT is provided. It is shown to 

outperform other approximate DCT transforms in JPEG-like image compression. 

The image compression and decompression methods proposed in this research are 

intended for usage in the authors‟ existing character segmentation approach. Vuĉković 

and Arizanović [53] proposed an efficient character segmentation method for machine-

typed documents and machine-printed documents based on the usage of projection profiles. 

The method consists of pre-processing and segmentation logic. The pre-processing of the 

character segmentation is focused on manual document skew correction [54], document 

image grayscale conversion (to perform the document image binarization), and noise 

reduction. This paper provides an extension of the pre-processing by adding the image 

compression/decompression to enable the efficient and independent document image 

storage before the segmentation stage. Segmentation logic is semi-automatic and consists 

of line, word, and character segmentation. All segmentation levels use the modified 

projection profiles technique. A new method for segmentation of words into characters 

based on decision-making logic is the core of the segmentation logic. This method 

gradually eliminates the possibility for big segmentation errors by determining the 

number of characters in a word using word width and the assumed average character 

width for a given document image. Computational efficiency is achieved using the linear 

image representation, with further implementation optimization using pointer arithmetic 

and highly-optimized low level machine code. The provided results have shown that this 

novel method outperforms state-of-the-art techniques in terms of both time complexity 

and segmentation accuracy. 

3. THEORETICAL BACKGROUND 

This section provides a theoretical foundation for the state-of-the-art image 

compression standards used for comparison with the proposed algorithms, including the 

description of the standard algorithms used in the proposed image compression and 

decompression methods. The state-of-the-art theoretical background provided in this section 

covers the compression standards for bi-level images, which are especially suitable for 

document images, as well as a general JPEG and JPEG2000 image compression standards. 

3.1. Compression standards for bi-level images 

Bi-level images are represented using only 1 bit per each pixel. This bit denotes a 

black or white color and has a value 0 or 1 depending on the color. For this reason, bi-

level images are also referred to as black and white images. Bi-level images usually 

contain a few specific types of elements such as text, halftone images, and line-art which 

includes graphs, equations, logos, and other similar features. First compression standards 

have been designed for facsimile (fax) images. Fax standards include Group 3 (G3), 

Group 4 (G4), and JBIG standard which is the basis of the later developed JBIG2 

standard. G3 standard includes the modified Huffman (MH) coding which combines the 

variable length codes of Huffman coding with standard RLE coding of the repetitive 

sequences, and the modified relative element address designate (READ) coding, also 


 Efficient Compression and Decompression Algorithms for OCR Systems 465 

 
called the modified READ (MR) coding. G4 standard uses the modified MR (MMR) 

coding, which similarly as G3 standard has MH as a basic coder. 

The JBIG compression standard recommended by the Joint Bi-Level Image Experts 

Group is a lossless compression standard used for binary images such as scanned text 

images, computer-generated text, fax transmissions, etc. This standard can work in three 

separate modes of operation: progressive, progressive-compatible sequential, and single-

progression sequential. Taking the coders into consideration, JBIG uses the arithmetic 

coding, exploiting the QM coder variant. Context-based prediction is used in the encoding 

process. In order to ensure a significantly higher compression ratio over the previously 

described compression standards, the modified lossy version of JBIG has been proposed 

and named JBIG2. Although the JBIG standard also supports a lossy compression, the lossy 

compression quality it provides is very low. On the other side, JBIG2 provides both higher 

compression ratio for lossless compression and lossy compression with a very high 

compression ratio. JBIG2 supports three basic coding modes: generic, halftone, and text 

coding mode. Generic coding mode uses either the MMR or MQ variant of arithmetic 

coding. Halftone coding is used for halftone images. Coding part is based on generic coding 

using a pattern, having a multi-level image as an output. The decoder obtains the halftone 

image using the multi-level image and previously used pattern. Finally, dictionary-based 

text coding is used for textual content. Each representative textual symbol is firstly encoded 

using the generic coding and is stored in the dictionary together with its position. Decoding 

is achieved in a straight-forward way, using the dictionary. Difference between the lossy 

and lossless text compression is in pattern matching type. Lossy compression uses a hard 

pattern matching and similar letters are coded with the same dictionary entry. Soft pattern 

matching is used for lossless compression where refinement coding is exploited in order to 

make a necessary difference between the already stored letter in the dictionary and the 

current letter. Different modes are used for different document regions. Sometimes text 

regions are classified as generic regions in order to obtain better results. In average JBIG2 

gives 3-5 times higher compression ratio than G4 compression standard and 2-4 times 

higher compression ratio compared with JBIG standard. 

3.2. JPEG and JPEG 2000 image compression standards 

JPEG and JPEG 2000 are well-known image compression standards used for comparison 

with the proposed methods and evaluation of their performance [61]. JPEG, which stands for 

Joint Photographic Experts Group, is a lossy image compression algorithm. The goal of the 

JPEG compression algorithm is to eliminate a high frequency colors in the image which 

cannot be observed by the human eye. This way original and compressed images would be 

usually the same visually, but the compressed image would be smaller in size. 

JPEG image compression algorithm consists of few steps: 

1) Image partitioning - The whole image is divided into blocks, size of 8 × 8. The 
choice of the block size is also an important part of this step. In case of blocks of 

bigger size, it is possible to happen that there would exist blocks with big areas of 

the similar color structure, but since blocks are observed as a whole, those 

similarities cannot be exploited to obtain a better compression results. On the other 

side, in case of smaller blocks the processing would be much slower because in the 

next step a Discrete Cosine Transform (DCT) needs to be performed on each block. 

For these reasons the blocks size of 8 × 8 are taken as an appropriate choice. 


466 B. ARIZANOVIĆ, V. VUĈKOVIĆ 
 

2) DCT - In this step, DCT is performed on each block matrix. DCT is a similar 
transform as a Discrete Fourier Transform (DFT) since both transforms map a 

function to frequency domain, except the fact that DCT uses only cosine function, 

without dealing with imaginary parts. After derivation of the DCT term, it is 

noticeable that values of DCT are half of DFT values: 

                                                                          𝑓𝑚

= 2 ∑ 𝑓𝑗cos (
𝑚𝜋

2
(𝑗

𝑁−1

𝑗=0

+
1

2
))                                                            (1) 

where fm is the DFT coefficient. The values of DCT can be defined as: 

                                                                   𝑔𝑚

= ∑ 𝑓𝑗cos (
𝑚𝜋

2
(𝑗

𝑁−1

𝑗=0

+
1

2
))                                                             (2) 

If DCT values are taken into consideration and represented as a matrix, values 

with lower frequency will be grouped in the upper left side, while the higher 

frequency values, which are not visible for the human eye, are in other parts of the 

matrix. The goal of the next step is to eliminate these high frequency values. 

3) Elimination of the high frequency values – For this task, it is necessary to multiply 

the DCT matrix by the appropriate mask matrix. In case of 8 × 8 blocks, the mask 
matrix that eliminates high frequency values would have the next form: 

[
 
 
1 1 1 1 0 0 0 0
1 1 1 0 0 0 0 0
1 1 0 0 0 0 0 0
1 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0]

 
4) Inverse DCT - Finally, the last step in this process is to apply the inverse DCT on 
each obtained block and use the obtained values to form the new image of the 

same dimensions as the original image. 

In order to describe the JPEG 2000 image compression standard and point out its 

advantages over the JPEG compression standard, its comparison with JPEG standard 

would be appropriate at this point: 

1) Transformation type - While the previously described JPEG algorithm uses the 8 

× 8 DCT, JPEG 2000 uses the wavelet transform with lifting implementation. 
Using the wavelet transform, the better energy compaction and resolution 

scalability is obtained.  


 Efficient Compression and Decompression Algorithms for OCR Systems 467 

 
2) Partitioning domain - While JPEG uses partitioning in the space domain by 
dividing the image into blocks and applying transformations on each block, JPEG 

2000 performs partitioning in the wavelet domain. This way the blocking artifacts 

which appear during partitioning in the space domain, as a result of partial 

application of DCT transform, would be eliminated. 

3) Entropy coding - JPEG algorithm encodes the DCT coefficients one by one, while 
JPEG 2000 encodes the wavelet coefficients bitplane by bitplane. In case of JPEG, 

the resulting bitstream cannot be truncated, while in case of JPEG 2000 truncation 

is allowed, which enables the bitstream scalability. 

4) Rate control - Compression ratio and the amount of distortion when JPEG image 
compression algorithm is used can be determined by the quantization module, while 

JPEG 2000 uses the quantization module only for conversion of float wavelet 

transform coefficients to integer coefficients, and the bitstream assembly module is 

used to determine the compression ratio and the amount of distortion. This allows final 

bitstreams of the certain compression ratio to be easily converted to bitstreams of 

another compression ratio without repeating the entropy coding and transformation 

process. 

3.3. Run-length encoding (RLE) algorithm 

RLE is a standard widely used lossless data compression algorithm. The logic of this 

algorithm is to replace each repeating of some specific pattern with a symbol which 

describes that pattern and a value which defines the number of consecutive repeats of that 

pattern in the given sequence. In literature, the application of this algorithm to text 

compression is usually explained. The simplest example of algorithm application is in case 

that pattern is a single character. Suppose that the next sequence of characters is given: 

𝑠𝑠𝑎𝑎𝑎𝑐𝑐𝑐𝑎𝑠𝑑𝑎𝑎𝑒𝑎𝑎𝑎𝑎𝑎𝑎𝑎𝑠𝑑𝑑𝑑𝑎𝑑𝑑𝑑𝑑𝑑𝑤𝑤𝑤𝑎 

In this case, the compressed text will be the following: 

2𝑠3𝑎3𝑐1𝑎1𝑠1𝑑2𝑎1𝑒7𝑎1𝑠3𝑑1𝑎5𝑑3𝑤1𝑎 

The original text has 35 characters, while the compressed text has 30 characters. The gain 

in this concrete example is not significant, but in practice algorithm can deal with a large 

amount of binary values and can be very efficient. The larger runs of the same values 

exist in sequence, the higher compression ratio will be achieved. In general case, the 

pattern does not necessarily need to be a single character. It could be a word or even a 

sentence. In such cases, it is also mandatory to use a delimiter between the compressed 

information for each pattern run, since it is unknown how many characters each pattern 

has and decompression would be impossible without having this information. The 

pseudo-code for general RLE algorithm is shown in the following listing. 

 
RUN-LENGTH-ENCODING 

Input: 

  sequence s. 

Output: 

  array Compressed. 

 1:    I := 0 

 2:    while I < LENGTH(s) do 


468 B. ARIZANOVIĆ, V. VUĈKOVIĆ 
 

 3:    CurrentPattern := s[I] 

 4:    RunLength := 0 

 5:    while I < LENGTH(s) AND s[I] = CurrentPattern do 

 6:    RunLength := RunLength + 1 

 7:    I := I + 1 

 8:    end while 

 9:    Compressed  RunLength 

10:    Compressed  CurrentPattern 

11:    end while 

12:    return Compressed 

 
The sign  is used to represent the assignment operator for array. 

3.4. Scanline fill algorithm 

Scanline fill algorithm belongs to the region filling algorithms. Instead of algorithms 

with flooding approach which fill the contour by coloring the connected pixels of the 

same color at pixel level, the scanline fill algorithm is defined at geometric level and fills 

the contour in a horizontal or vertical direction, i.e. row by row or column by column. 

Illustration of the scanline fill algorithm is shown in Fig. 1. 

 
Fig. 1 Illustration of the scanline fill algorithm 


 Efficient Compression and Decompression Algorithms for OCR Systems 469 

 
As it is shown in Fig. 1, the algorithm starts from a seed pixel and searches for the last 

pixel on the left side which color is the same as the seed pixel color. Once that pixel is 

found, the pixel row is processed from left to right until the pixel which color is not the 

same as the seed pixel color or the end of the pixel scanline is encountered. The color of 

each pixel in the current pixel row is changed to the fill color if its color is the same as 

the seed pixel color. Also, the pixels above and below the current pixel are checked and 

they are pushed to stack if their color is the same as the seed pixel color. Once the pixels 

above or below the current pixel are pushed to stack, the next pixels above and below are 

not considered until the new sequence of pixels which color is the same as the seed pixel 

color is not encountered. The next iteration starts when a new pixel is taken from the top 

of the stack. In fact, instead of pushing to stack the coordinates of the all individual pixels 

which need to be processed, this algorithm pushes the start coordinates of the line 

segments. This ensures that each pixel is checked once and leads to better time 

complexity results compared with flood fill algorithm, which is the main reason to use 

this algorithm for image decompression. 

4. PROPOSED ALGORITHMS 

This section proposes new image compression and decompression methods used in the 

pre-processing stage of the character segmentation system, in order to compress and 

decompress document images before the segmentation stage. As it has already been 

mentioned, previously presented character segmentation approach is extended by adding 

the image compression/decompression part in the pre-processing stage. This step gives the 

possibility to divide the character segmentation system into two independent parts. The first 

part is a pre-processing part having the document image compression and decompression as 

a final process, and the second part is a document image segmentation. The most evident 

gain achieved here is the possibility to execute two independent system parts in different 

moments. This is important due to several reasons. This way a document image 

compression and document image segmentation can be done on different machines. 

Additionally, the previously compressed/decompressed document images can be processed 

using different versions of the segmentation engine. This is very important feature since it 

also allows efficient testing of the segmentation engine. Finally, image compression allows 

efficient storing of document images which can save a significant disc space. 

In this section two image compression and decompression methods are proposed. The 

first method is completely based on RLE algorithm and can be used for both machine-

typed documents and machine-printed documents. The second method uses document 

character contour extraction in combination with scanline fill algorithm. The second 

method works with machine-printed documents, but its application on machine-typed 

documents is limited due to irregular structure of document image characters caused by 

low quality of documents. These image compression and decompression methods are 

presented in the following subsubsections. 

4.1. Image compression and decompression using RLE 

The first proposed image compression and decompression methods used in the pre-

processing stage of character segmentation system employ the RLE algorithm for data 


470 B. ARIZANOVIĆ, V. VUĈKOVIĆ 
 

compression. After the image binarization, the image consists of black and white pixels, 

thus the RLE algorithm proved to be excellent choice. This approach is general and can 

be used for all types of binarized images, but RLE algorithm gives better compression 

results in case of document images than e.g. in case of natural images. The reason for this 

lies in fact that RLE algorithm searches for runs of black and white pixels in each pixel 

scanline. In natural images these runs are short, while in document images the 

background has large runs of white pixels. Illustration of the RLE algorithm including the 

coding format is given in Fig. 2. 

 
Fig. 2 Document image compression using RLE algorithm: (a) Compression format for 

white pixel runs, (b) Compression format for black pixel runs, (c) Example of 

pixel scanline, (d) Compressed pixel scanline. 

Fig. 2 shows an example of image compression using RLE algorithm. RLE algorithm 

counts white and black pixels and stores the information about pixel runs in compressed 

file. Storing is achieved using 3 bytes in overall for information about the white pixels and 

2 bytes for information about the black pixels. In both cases the pixels are counted until the 

maximal value is reached. Since 2 bytes are used for white pixels (WHITE_RUN), this 

value is 216 = 65536, while in case of black pixels (BLACK_RUN) this value is 28 = 256. 

When these values are reached, the WHITE_LOOPS or BLACK_LOOPS byte is 

incremented and WHITE_RUN and BLACK_RUN values are set to 0. The whole process 

of counting is then repeated. Since white pixels are a part of the background and are 

dominating in document images, it is expected that 1 byte is not enough for storing the 

information about the number of consecutive white pixels. On the other side, black pixel 

runs are not expected to be too long since they represent document characters and some 

spaces between characters are expected, thus only 1 byte is used for storing this information. 

Document image decompression is straight forward. First byte is always multiplied by 

256 or 65536 in case of black pixels or white pixels, respectively. After that, this value is 

incremented by value of the next byte or 2 bytes. The obtained value represents the 

number of consecutive pixels of the same color in the current run of pixels. This process 

is repeated until the end of the compressed file. 


 Efficient Compression and Decompression Algorithms for OCR Systems 471 

 
4.2. Image compression and decompression using document character 

contour extraction and scanline fill algorithm 

The second proposed method for image compression and decompression employs the 

combination of the algorithm based on document character contour extraction and 

scanline fill algorithm. The compression algorithm uses the 2D processing of a document 

image, since character contours need to be obtained from all four sides. Document image 

is processed in horizontal and vertical direction and distances between the black pixels 

which represent starting and ending pixels of the black runs are stored in the compressed 

file. For this purpose, 2 bytes can be used to store the distance between two black pixels. 

It should be mentioned that in both compression methods the number of bytes used for 

storing the information about the white and black pixel runs is dependent primarily on the 

image dimensions. Small images are expected to have short runs, while large images are 

expected to have long runs of pixels of the same color. Therefore, 1 byte can be used for 

both white and black pixels in case of small images, while in case of large images 2 bytes 

are necessary. Another important factor is a structure of a document image. If textual 

content is dominating in a document image, background areas are not huge and even in 

large images 1 byte can be used for storing the information about pixel runs. On the other 

side, if background area is dominating, even in medium images 2 bytes would not be 

enough to store the information about pixel runs. 

The process of image decompression is more specific. After obtaining the offsets of 

black pixels which represent the contours of the characters, in the first step of 

decompression method contours are drawn to the output image. The second step uses the 

iterative scanline fill algorithm. The main idea here is to scan the whole output image and 

fill the contours which represent the background with background color, while character 

contours will be filled with black color. This is achieved by repeating execution of the 

scanline fill algorithm. After that, a background color of a document image is replaced 

with white color and the original binarized document image is obtained. The illustration 

of image decompression is shown in Fig. 3. 

It is assumed that the first pixel in a document image is a background pixel, thus the 

scanline fill algorithm starts execution from the first pixel. The next step is filling of the 

closed contours which are the document characters contours. In order to fill these 

contours, the algorithm searches for the next white pixel. When the next white pixel is 

found, the colors of the previous two or three pixels are checked. It is also assumed that 

contour edge is not wider than two pixels, but generally this does not need to be a case. In 

case that the previous two pixels have the black color and background color, or in case 

that the previous three pixels have black color, black color, and background color, the 

character contour is found and needs to be colored in black using scanline fill algorithm. 

This procedure is repeated until the end of a document image. The final image will 

contain only the background color and the black color. The final step is to change the 

background color back to white color and binarized document image will be obtained. 

 
472 B. ARIZANOVIĆ, V. VUĈKOVIĆ 
 

a) 

 
b) 

 
c) 

 
d) 

 
e) 

Fig. 3 Document image decompression using scanline fill algorithm: (a) Original image, 

(b) Resulting image after character contours extraction, (c) Resulting image after 

background filling, (d) Resulting image after character contours filling, (e) Final 

binarized image. 

5. IMPLEMENTATION 

This section provides pseudocodes for the proposed image compression and 

decompression methods. In order to achieve an efficient implementation, linear image 

representation could be used. Linear image representation is obtained by storing the image 

pixels linearly in a one-dimensional array. This representation is efficient since the memory 

organization is also linear and image pixels will be stored in successive memory locations, 

which will provide the fastest possible access to the image elements. The first presented 


 Efficient Compression and Decompression Algorithms for OCR Systems 473 

 
method for image compression and decompression uses the RLE data compression 

algorithm. The pseudocode of the image compression algorithm is shown below. 
 

RLE-IMAGE-COMPRESSION 

Input: 

  image f. 

Output: 

  array Compressed. 

 1:    CurrentColor := WHITE 

 2:    while not end of image f do 

 3:    RunsCount := CALCULATE-RUN-LENGTH(f, CurrentColor) 

 4:    Factor := RunsCount div MaxRun 

 5:    RunLength := RunsCount mod MaxRun 

 6:    Compressed  Factor 

 7:    if CurrentColor = BLACK then 

 8:    Compressed  RunLength 

 9:    else 

10:    Compressed  RunLength div 256 

11:    Compressed  RunLength mod 256 

12:    endif 

13:    REPLACE-MAX-RUN(MaxRun) 

14:    REPLACE-CURRENT-COLOR(CurrentColor) 

15:    end while 

16:    return Compressed 

 
This pseudocode represents a general application of the RLE algorithm for compression 

of binarized images. This algorithm provides lossless image compression and its application 

is not limited on document images, but as will be shown in the experimental section, this 

method provides better compression results when applied on document images. In order to 

provide the optimal compression, 2 bytes are used for storing the information about black 

pixel runs and 3 bytes are used for storing the information about white pixel runs. The 

Image decompression method is straightforward. The following listing shows the pseudocode 

of the decompression algorithm. 

 
RLE-IMAGE-DECOMPRESSION 

Input: 

  image f, array Compressed. 

 1:    CurrentColor := WHITE 

 2:    while not end of Compressed do 

 3:    RunLength := 0 

 4:    Factor := Compressed[Current] 

 5:    if CurrentColor = BLACK then 

 6:    RunLength := Compressed[Current + 1] 

 7:    else 

 8:    RunLength := Compressed[Current + 1] * 256 

 9:    RunLength := RunLength + Compressed[Current + 2] 

10:    endif 

11:    FILL-RUN(f, RunLength, CurrentColor) 

12:    REPLACE-CURRENT-COLOR(CurrentColor) 

13:    end while 

 
The image decompression algorithm reads the pixel runs information from the 

compressed file and regenerates the original image. Since the compression is lossless, the 

decompressed image will be the same as the original image, but results from the 


474 B. ARIZANOVIĆ, V. VUĈKOVIĆ 
 

experimental section will prove that image compression quality is actually unimportant for 

segmentation accuracy. The pseudocode from the previous listing corresponds to the 

pseudocode for image compression shown in the first listing, regarding the number of bytes 

used for representing the pixel runs. 

The second proposed method for document image compression and decompression uses 

the character contour based document image representation and the scanline fill algorithm. 

Contour image representation is used for image compression, while image decompression 

uses the iterative scanline fill algorithm. The image compression algorithm performs the 2D 

image processing to obtain the distances between the black pixels which form the character 

contours in a document image. Depending on the document image dimensions, 1 byte or 2 

bytes can be used to store the distance between two black pixels which represent edges of 

document character contours. The following listing shows the pseudocode for the document 

image compression algorithm based on character contour extraction. 

 
CONTOUR-IMAGE-COMPRESSION 

Input: 

  image f. 

Output: 

  array Compressed. 

 1:    for each pixel row in image f do 

 2:    CurrentColor := WHITE 

 3:    while not end of current pixel row do 

 4:    RunsCount := CALCULATE-RUN-LENGTH(f, CurrentColor) 

 5:    Compressed  RunsCount div 256 

 6:    Compressed  RunsCount mod 256 

 7:    REPLACE-CURRENT-COLOR(CurrentColor) 

 8:    end while 

 9:    end for 

10:    for each pixel column in image f do 

11:    CurrentColor := WHITE 

12:    while not end of current pixel column do 

13:    RunsCount := CALCULATE-RUN-LENGTH(f, CurrentColor) 

14:    Compressed  RunsCount div 256 

15:    Compressed  RunsCount mod 256 

16:    REPLACE-CURRENT-COLOR(CurrentColor) 

17:    end while 

18:    end for 

19:    return Compressed 

 
The previous pseudocode describes the algorithm for image compression which uses 

the 2D image analysis to obtain the pixels which represent edges of the document 

character contour. Generally, this algorithm can be used for all binarized images, but its 

application on document images is more effective. The reason is that binarized document 

images have less details compared with say, natural images. Also, document images have 

large areas with background color which will ensure a good compression ratio. With 

regards to compression quality, this algorithm gives worse results than other methods, but 

it does not affect the segmentation accuracy. This fact justifies the usage of this 

compression algorithm in the character segmentation system. The following listing shows 

the pseudocode for document image decompression based on the scanline fill algorithm. 

 
 Efficient Compression and Decompression Algorithms for OCR Systems 475 

 
SCANLINE-FILL-IMAGE-DECOMPRESSION 

Input: 

  Image f, array Compressed. 

 1:    Offset := 0 

 2:    while not end of row compression bytes do 

 3:    RunLength := Compressed[Current] * 256 

 4:    RunLength := RunLength + Compressed[Current + 1] 

 5:    Offset := Offset + RunLength 

 6:    SET-PIXEL-COLOR(f, Offset, BLACK) 

 7:    end while 

 8:    Offset := 0 

 9:    while not end of column compression bytes do 

10:    RunLength := Compressed[Current] * 256 

11:    RunLength := RunLength + Compressed[Current + 1] 

12:    Offset := Offset + RunLength 

13:    SET-PIXEL-COLOR(f, Offset, BLACK) 

14:    end while 

15:    SCANLINE-FILL(0, BLUE_COLOR, WHITE_COLOR) 

16:    for each pixel in f do 

17:    if GET-PIXEL-COLOR(f, Current) = WHITE then 

18: if GET-PIXEL-COLOR (f, Current – 1, Current - 2) = [BLACK, BLUE] OR 

GETPIXEL-COLOR(f, Current – 1, Current – 2, Current - 3) = [BLACK, 

BLACK, BLUE] then 

19:    SCANLINE-FILL (Current, BLACK, WHITE) 

20:    endif 

21:    endif 

22:    SCANLINE-FILL(0, WHITE, BLUE) 

 
As aforementioned, the pseudocode for image decompression using the scanline fill 

algorithm can be applied in the case of machine-printed documents. The efficiency of the 
decompression algorithm is highly influenced by the efficiency of the scanline fill 
algorithm implementation, since the scanline fill algorithm must be executed multiple 
times. In the pseudocode presented in the previous listing, two and three previous pixels 
of the current pixel are checked. In the general case, this conditional statement can be 
changed since it is possible that character contour edges are wider than 1 or 2 pixels as it 
is assumed here. Using the linear image representation, as suggested at the start of this 
section, it is possible to achieve a real time implementation of the proposed methods, 
which will make them suitable for use in the real time character segmentation system. 

6. EXPERIMENTS 

Proposed image compression and decompression methods, as a part of character 
segmentation system, are tested on several PC machines. Results are analyzed from 
different aspects in order to provide the complete insight into the extended character 
segmentation approach and its capabilities. Image compression and decompression methods 
are evaluated from the perspective of the image compression ratio and time complexity, to 
the perspective of the segmentation accuracy when specific compression methods are used. 
Evaluation of the image compression ratio is performed using the standard test set of 
images. Test set consists of six black and white images: Baboon, Barbara, Cameraman, 
Goldhill, Lena, and Peppers. Each pixel intensity value is represented using 3 bytes, one 
byte for red, green, and blue pixel intensity value component. In order to obtain the 
comparative results, JBIG2 and JPEG2000 image compression standards are used. For both 
image compression standards, the performances of their lossless modes are evaluated. 


476 B. ARIZANOVIĆ, V. VUĈKOVIĆ 
 

The most important metric for evaluating the compression methods is compression 

ratio. In case of images, compression ratio is a ratio between the original image file size and 

compressed image file size. Higher values for compression ratio mean that compression 

method is better regarding this metric. Comparison of the image compression ratio for 

standard set of images and different compression methods is given in Table 1. 

Table 1 Comparison of the compression ratio results for different compression methods 

Image  

(Image Dimensions) 

Image File Size 

(KB) 

Compression Ratio 

JPEG2000 

(Lossless) 
JBIG2 (Lossless) RLE 

Baboon (512x512) 769 3.544 46.048 8.010 
Barbara (512x512) 769 5.961 113.255 9.859 
Cameraman (256x256) 193 3.642 82.833 12.867 
Goldhill (512x512) 769 6.303 111.288 19.718 
Lena (512x512) 769 7.539 156.939 21.971 
Peppers (512x512) 769 8.640 201.837 30.76 

As it is visible from Table 1, JBIG2 compression gives much better compression ratio 

results than the proposed algorithm. This comes from the sophisticated nature of the JBIG2 

algorithm, which is specialized for black and white images. Compression ratio and time 

complexity are two most important measures for the quality of the compression algorithm. 

Although the proposed algorithms fail to surpass the JBIG2 compression ratio results, the time 

complexity results shown later will justify the usage of the proposed methods. Additionally, 

Table 1 shows that RLE based image compression method presented in this paper provides a 

higher compression ratio than JPEG2000 image compression standard in its lossless mode. 

RLE based method is not limited to document images, therefore it can be used for non-

document images as it is a case with standard test images. It is clear that this method ensures 

the possibility to store a huge amount of compressed document images efficiently without 

occupying a lot of disc space. The second presented compression method is limited to specific 

document images and performance of this method is analyzed later in this section. 

Previous results represent the general analysis of the image compression methods. 

Since the proposed image compression and decompression methods will be used in 

character segmentation system, their performances on document images should be 

analyzed. In order to perform this analysis, image compression methods are tested using 

two document images. These document images are machine-printed documents since the 

second method is limited on machine-printed documents which have the regular structure 

and character contours can be extracted correctly. Compression ratio results for these 

document images and different image compression methods are shown in Table 2. 

Table 2 Comparison of the image compression ratio for machine-printed document 

images for different image compression methods 

Image 

Dimensions 

Image File 

Size (KB) 

Compression Ratio 

JPEG2000 

(Lossless) 

JBIG2 

(Lossless) 
RLE 

Contour 

Extraction/Scanline Fill 

719x328 692 14.417 640.741 67.184 46.443 
1266x924 3429 9.741 357.933 27.878 19.373 


 Efficient Compression and Decompression Algorithms for OCR Systems 477 

 
As it is expected, the JBIG2 compression again achieves much better compression 
ratio results. In this case the compression ratio is much higher than in case of standard 
test set images because JBIG2 has a separate mode for compression of a textual content, 
as it is explained in Section 3. Taking the proposed methods into account, they perform 
very well on document images compared with JPEG2000 standard. In overall, RLE based 
method gives the second best results, while the contour extraction method provides 
worse, but still competitive results. It is important to mention that JPEG2000 image 
compression standard performs worse than RLE algorithm because of its nature. The 
color transformation stage of JPEG2000 algorithm generates two color channels which 
are being compressed in this stage. The channel related to black and white image features 
is not compressed the same way, thus the JPEG2000 compression gives worse results in 
case of black and white images, i.e. gives better compression ratio results in case of full-
color images with many color and contrast transitions. 

Contour extraction based compression in combination with scanline fill decompression 

is a lossy compression, since it cannot reconstruct the original image perfectly. Although it 

could imply that this method does not perform well enough, further analysis will deny this 

fact. In order to justify the usage of the second proposed image compression method, the 

segmentation results for document images previously compressed and decompressed using 

different methods are given in Table 3. 

Table 3 Comparison of the segmentation accuracy results for different image 

compression methods used in the pre-processing stage 

 
Segmentation Accuracy (%) 

JBIG2/JPEG2000 

(Lossless) 
RLE 

Contour Extraction/Scanline 

Fill 

Line Segmentation 81.54 81.54 80.32 
Word Segmentation 78.28 78.28 78.14 
Character Segmentation 87.08 87.08 86.92 

These results are obtained using the chosen ground-truth machine-printed documents. 

As expected, the results for JBIG2, JPEG2000, and RLE based compression are identical. 

The most important conclusion here is that contour extraction based compression in 

combination with scanline fill decompression gives slightly worse results than previous 

compression methods. The reason for this lies in sensitivity of the evaluation metrics and 

also in the specificity of the character segmentation technique. In general, this technique is 

not sensitive on small changes in document image structure and therefore the segmentation 

accuracy results are similar to those obtained using the lossless compression methods. Fig. 

4 shows the comparison of the original binarized image and image obtained after 

compression and decompression using the second proposed method. 


478 B. ARIZANOVIĆ, V. VUĈKOVIĆ 
 

a) 

 
b) 

 
c) 

 
d) 

Fig. 4 Comparison of the original images and images obtained after 

compression/decompression using the second method:  

(a) First original image, (b) First image after compression/decompression,  

(c) Second original image, (d) Second image after compression/decompression 

Visual results from Fig. 4 clearly show that compression quality is sometimes irrelevant. 

Although the second method is the worst among all analyzed methods regarding the 

compression quality and provides a lossy compression, differences between original and final 

images are negligible in case of character segmentation. In order to clearly demonstrate this 

conclusion, Fig. 5 shows the same images from Fig. 4 after being processed using the 

character segmentation algorithm. 

 
a) 

 
b) 

 
c) 

 
d) 

Fig. 5 Comparison of the original and compressed images after being processed  

using the character segmentation algorithm: (a) First original processed image,  

(b) First image processed after compression/decompression, (c) Second processed 

original image, (d) Second image processed after compression/decompression 


 Efficient Compression and Decompression Algorithms for OCR Systems 479 

 
Finally, a very important aspect of the image compression methods is time complexity. 

The image compression methods are intended for the usage in real-time character 

segmentation system and their processing time should be appropriate for that. In order to 

provide the reliable results, proposed image compression and decompression methods are 

tested on several PC machines. Comparison of the processing time for JBIG2 compression 

and compression/decompression performed using the proposed methods is given in Table 4. 

Table 4 Comparison of the processing time for JBIG2 compression and RLE based 

compression and decompression method (AMD Athlon™ X4 840 Quad Core 

Processor 3.1 GHz) 

Image 
dimensions 

(pixels) 

White Pixels/ 
Black Pixels 

(%) 

Processing Time (ms) 

JBIG2 Compression RLE Compression RLE Decompression 

719x328 93.09:6.91 9.36832 10.12405 0.26639 0.30149 0.23231 0.27278 
1266x924 83.51:16.49 44.64481 49.82263 2.04907 2.59600 1.94450 3.05122 
2632x3575 98.06:1.94 353.74121 382.56647 10.61898 15.07239 13.35769 21.82402 
2640x3612 98.69:1.31 372.48256 405.33698 10.79404 16.99354 13.31302 21.95595 

Time complexity comparison justifies the usage of the proposed algorithms in the 

authors‟ character segmentation system. Numerical results in Table 4 are obtained after 

10000 executions of the analyzed algorithms implementations. Table 4 shows the best 

(left) and average (right) processing time. RLE based compression algorithm provides up 

to 25 times faster compression than the JBIG2 compression standard. The reason for this 

lies in simplicity of the proposed compression method, as well as in the complexity of the 

JBIG2 compression. This advantage of the proposed compression method comes to the 

fore with a large number of documents that need to be processed in the “Nikola Tesla 

Museum” in Belgrade. In order to provide a reliable time complexity results, processing 

time are given for a set of different PC machines. These results are shown in Tables 5-14. 

Table 5 Processing time for RLE based compression and decompression method 

(AMD Athlon™ X4 840 Quad Core Processor 3.1 GHz) 

Image 
dimensions 

(pixels) 

White Pixels/ 
Black Pixels  

(%) 

Processing Time (ms) 

RLE Compression RLE Decompression 

719x328 93.09:6.91 0.26639 0.30149 0.23231 0.27278 
1266x924 83.51:16.49 2.04907 2.59600 1.94450 3.05122 

2632x3575 98.06:1.94 10.61898 15.07239 13.35769 21.82402 
2640x3612 98.69:1.31 10.79404 16.99354 13.31302 21.95595 

Table 6 Processing time for RLE based compression and decompression method 

(AMD Athlon™ 64 X2 Dual Core Processor 5200+ 2.6 GHz) 

Image 
dimensions 

(pixels) 

White Pixels/ 
Black Pixels  

(%) 

Processing Time (ms) 

RLE Compression RLE Decompression 

719x328 93.09:6.91 0.59617 0.64098 0.45145 0.48502 
1266x924 83.51:16.49 4.54918 4.98643 3.32389 3.93771 

2632x3575 98.06:1.94 18.97615 22.70397 23.82398 27.92950 
2640x3612 98.69:1.31 18.41798 35.85595 24.20699 42.99031 


480 B. ARIZANOVIĆ, V. VUĈKOVIĆ 
 

Table 7 Processing time for RLE based compression and decompression method 

(Intel® Core™ i3-4150 CPU @ 3.50GHz) 

Image 

dimensions 

(pixels) 

White Pixels/ 

Black Pixels  

(%) 

Processing Time (ms) 

RLE Compression RLE Decompression 

719x328 93.09:6.91 0.22140 0.23022 0.14838 0.15094 

1266x924 83.51:16.49 1.34219 1.39174 0.86508 0.90333 

2632x3575 98.06:1.94 7.57190 7.81398 5.38810 5.69849 

2640x3612 98.69:1.31 7.36194 7.66240 5.28898 5.67878 

Table 8 Processing time for RLE based compression and decompression method 

(Intel® Core™ i5-750 CPU @ 2.67GHz) 

Image 

dimensions 

(pixels) 

White Pixels/ 

Black Pixels  

(%) 

Processing Time (ms) 

RLE Compression RLE Decompression 

719x328 93.09:6.91 0.44596 0.45164 0.53324 0.53898 

1266x924 83.51:16.49 2.92499 2.99345 2.82125 2.86606 

2632x3575 98.06:1.94 15.47632 16.07258 21.01470 22.01499 

2640x3612 98.69:1.31 15.18539 15.83180 21.34199 21.78855 

Table 9 Processing time for RLE based compression and decompression method 

(Intel® Core™ i7-920 CPU @ 2.67GHz) 

Image 

dimensions 

(pixels) 

White Pixels/ 

Black Pixels  

(%) 

Processing Time (ms) 

RLE Compression RLE Decompression 

719x328 93.09:6.91 0.42999 0.44189 0.31279 0.31529 

1266x924 83.51:16.49 2.74480 2.84551 1.94092 1.95931 

2632x3575 98.06:1.94 15.46358 15.60705 11.13561 11.33740 

2640x3612 98.69:1.31 15.24224 15.38186 10.98920 11.21025 

Table 10 Processing time for RLE based compression and decompression method 

(Intel® Core™ i7-4700 MQ Processor 2.4 GHz) 

Image 

dimensions 

(pixels) 

White Pixels/ 

Black Pixels  

(%) 

Processing Time (ms) 

RLE Compression RLE Decompression 

719x328 93.09:6.91 0.32331 0.34429 0.21639 0.22497 

1266x924 83.51:16.49 1.93728 2.05009 1.24961 1.31144 

2632x3575 98.06:1.94 11.24646 11.64243 7.86414 8.18985 

2640x3612 98.69:1.31 10.96036 11.31501 7.70805 8.34529 


 Efficient Compression and Decompression Algorithms for OCR Systems 481 

 
Table 11 Processing time for RLE based compression and decompression method 

(Intel® Core™2 Quad Q9550 CPU @ 2.83GHz) 

Image 

dimensions 

(pixels) 

White Pixels/ 

Black Pixels  

(%) 

Processing Time (ms) 

RLE Compression RLE Decompression 

719x328 93.09:6.91 0.41861 0.42261 0.29742 0.29985 

1266x924 83.51:16.49 2.61796 2.66822 1.86799 1.96971 

2632x3575 98.06:1.94 14.93945 15.69766 13.77433 17.14845 

2640x3612 98.69:1.31 14.78641 15.59181 13.69504 17.03617 

Table 12 Processing time for RLE based compression and decompression method 

(Intel® Core™ i5-3470 CPU @ 3.20GHz) 

Image 

dimensions 

(pixels) 

White Pixels/ 

Black Pixels  

(%) 

Processing Time (ms) 

RLE Compression RLE Decompression 

719x328 93.09:6.91 0.24246 0.25897 0.17094 0.18088 

1266x924 83.51:16.49 1.42943 1.54486 1.00416 1.06919 

2632x3575 98.06:1.94 9.77028 10.26213 6.09392 6.58797 

2640x3612 98.69:1.31 9.61185 10.10193 5.96242 6.46359 

Table 13 Processing time for RLE based compression and decompression method 

(Intel® Celeron® E3400 CPU @ 2.60GHz) 

Image 

dimensions 

(pixels) 

White Pixels/ 

Black Pixels  

(%) 

Processing Time (ms) 

RLE Compression RLE Decompression 

719x328 93.09:6.91 0.45608 0.58744 0.34018 0.63256 

1266x924 83.51:16.49 3.00126 3.56548 2.50430 3.85496 

2632x3575 98.06:1.94 16.44169 20.02098 20.42599 27.38618 

2640x3612 98.69:1.31 16.07075 19.90696 20.32023 27.60395 

Table 14 Processing time for contour extraction based compression method and scanline 

fill decompression method (document image size of 719x328 pixels) 

PC Machine Specification 

Processing Time (ms) 

Contour 

Compression 

Scanline Fill 

Decompression 

AMD Athlon™ X4 840 Quad Core Processor 3.1 GHz 0.26672 0.29824 1.59406 2.06575 

AMD Athlon™ 64 X2 Dual Core Processor 5200+ 2.6 GHz 1.10936 1.24688 2.94814 6.71124 

Intel® Core™ i3-4150 CPU @ 3.50GHz 0.50409 0.52009 1.16008 1.18575 

Intel® Core™ i5-750 CPU @ 2.67GHz 1.05348 1.06509 1.94235 1.95962 

Intel® Core™ i7-920 CPU @ 2.67GHz 1.13896 1.22269 1.93016 1.97598 

Intel® Core™ i7-4700 MQ Processor 2.4 GHz 0.74069 0.75961 1.69607 1.72861 

Intel® Core™2 Quad Q9550 CPU @ 2.83GHz 0.89366 0.90396 1.85968 1.87623 

Intel® Core™ i5-3470 CPU @ 3.20GHz 0.53816 0.57638 1.19594 1.27253 
Intel® Celeron® E3400 CPU @ 2.60GHz 0.98342 1.82706 2.36290 3.41929 

Time complexity results shown in Tables 5-14 are also obtained after 10000 executions 

of analyzed algorithms implementations. In the previous tables, the average and the best 


482 B. ARIZANOVIĆ, V. VUĈKOVIĆ 
 

processing time are given. Provided results prove the superiority of the linear image 

representation used in the character segmentation system. All processing time are below 

50ms which is a quite good result for real-time usage. The first method also achieves very 

good results in case of document images with bigger number of black pixels which is a 

characteristic of documents with predominant textual content. The second method 

processing time is primarily affected by the number of closed contour which represent 

character borders and need to be filled using scanline fill algorithm. Each time the closed 

contour needs to be filled, scanline fill algorithm for region filling needs to be executed. 

Since the document image used for obtaining the results in Table 14 has huge areas of the 

same color and small number of characters, the second method proved to be very efficient. 

7. CONCLUSIONS 

This paper presents an image compression/decompression stage of the authors‟ existing 

character segmentation approach. In Section 2 the description of the other image 

compression methods and the authors‟ character segmentation approach is given. Section 3 

provides a theoretical background for bi-level image compression standards and a 

theoretical comparison of the JPEG and JPEG 2000 image compression standards used for 

evaluation of the proposed methods, as well as a description of the RLE data compression 

algorithm and scanline fill algorithm exploited for the proposed image compression and 

decompression algorithms. In Section 4 the image compression and decompression 

methods are presented. The presented image compression and decompression methods are 

adapted to document images and use the RLE data compression algorithm and document 

character contour extraction for image compression, and the scanline fill algorithm for 

document image decompression. The decoupling of the image compression/decompression 

stage allows preparation document images for further processing later, to keep the 

document images in compressed form and save storage memory, to use the improved 

character segmentation and recognition software independently from the pre-processing 

stage, and also to test independently the character segmentation and recognition stages. In 

Section 5, pseudocodes and suggestion for optimal implementation of the proposed image 

compression and decompression algorithms are given. In Section 6 a large set of 

experimental results is provided for image compression methods. The proposed image 

compression algorithms perform up to 25 times faster than the JBIG2 image compression 

and 4 times better than JPEG2000 image compression standard in its lossless mode 

regarding the image compression ratio, whereas the proposed algorithms give much worse 

compression ratio results compared with JBIG2 compression standard. The results proved 

that image compression quality does not affect the segmentation accuracy. Also, the 

proposed image compression and decompression methods proved to be suitable for a real 

time character segmentation system. The official evaluation of the complete system 

performance will be performed at the “Nikola Tesla Museum”, since the approach is 

designed for the needs of the museum, namely for conversion of the original Nikola Tesla 

documents to electronic form. Future work will be focused on algorithms improvement, 

particularly of the segmentation logic, further approach optimization and automation, and 

its integration into the complete OCR system. 


 Efficient Compression and Decompression Algorithms for OCR Systems 483 

 
Acknowledgments: This paper is supported by the Ministry of Education, Science and 

Technological Development of the Republic of Serbia (Project III44006-10), Mathematical Institute 

of Serbian Academy of Science and Arts (SANU), Museum of Nikola Tesla (providing original 

typewritten documents of Nikola Tesla), and Pattern Recognition & Image Analysis Research Lab 

(PRimA) (providing ground-truth historical machine-printed documents). 

REFERENCES 

[1] A. Andreopoulos and J. K. Tsotsos, “50 Years of object recognition: Directions forward”, Computer 
Vision and Image Understanding, vol. 117, no. 8, pp. 827-891, 2013. 

[2] N. Bourbakis, N. Pereira and S. Mertoguno, “Hardware design of a letter-driven OCR and document 
processing system”, Journal of Network and Computer Applications, vol. 19, no. 3, pp. 275-294, 1996. 

[3] S. Khoubyari and J. J. Hull, “Font and Function Word Identification in Document Recognition”, 
Computer Vision and Image Understanding, vol. 63, no. 1, pp. 66-74, 1996. 

[4] J. Mao and K. M. Mohiuddin, “Improving OCR performance using character degradation models and 
boosting algorithm”, Pattern Recognition Letters, vol. 18, no. 11-13, pp. 1415-1419, 1997. 

[5] A. Namane, A. Guessoum, E. H. Soubari and P. Meyrueis, “CSM neural network for degraded printed 
character optical recognition”, Journal of Visual Communication and Image Representation, vol. 25, no. 
5, pp. 1171-1186, 2014. 

[6] J. I. Olszewska, “Active contour based optical character recognition for automated scene understanding”, 
Neurocomputing, vol. 161, no. 5, pp. 65-71, 2015. 

[7] M. I. Razzak, F. Anwar, S. A. Husain, A. Belaid and M. Sher, “HMM and fuzzy logic: A hybrid approach 
for online Urdu script-based languages‟ character recognition”, Knowledge-Based Systems, vol. 23, no. 8, 

pp. 914-923, 2010. 
[8] G. Vamvakas, B. Gatos, N. Stamatopoulos and S. Perantonis, “A Complete Optical Character Recognition 

Methodology for Historical Documents”, IAPR International Workshop on Document Analysis Systems, 

vol. 1, 2008, pp. 525-532. 
[9] H. Fujisawa, “Forty years of research in character and document recognition-and industrial perspective”, 

Pattern Recognition, vol. 41, no. 8, pp. 2435-2446, 2008. 

[10] Y. Lu, “Machine Printed Character Segmentation - An Overview”, Pattern Recognition, vol. 28, no. 1, pp. 
67-80, 1995. 

[11] Y. Lu and M. Shridhar, “Character segmentation in handwritten words - An overview”, Pattern 
Recognition, vol. 29, no. 1, pp. 77-96, 1996. 

[12] Á. González and L. M. Bergasa, “A text reading algorithm for natural images”, Image and Vision 
Computing, vol. 31, no. 3, pp. 255-274, 2013. 

[13] D. Karatzas and A. Antonacopoulos, “Colour text segmentation in web images based on human 
perception”, Image and Vision Computing, vol. 25, no. 5, pp. 564-577, 2007. 

[14] J. Lim, J. Park and G. G. Medioni, “Text segmentation in color images using tensor voting”, Image and 
Vision Computing, vol. 25,  no. 5, pp. 671-685, 2007. 

[15] J. Min-Chul, S. Yong-Chul and S. N. Srihari, “Machine Printed Character Segmentation Method Using 
Side Profiles”, In Proceedings of the IEEE SMC ‟99 Conference on Systems, Man and Cybernetics, 1999. 

[16] N. Nikolaou, M. Makridis, B. Gatos, N. Stamatopoulos and N. Papamarkos, “Segmentation of historical 
machine-printed documents using Adaptive Run Length Smoothing and skeleton segmentation paths”, 

Image and Vision Computing, vol. 28, no. 4, pp. 590-604, 2010. 
[17] H. C. Park, S. Y. Ok, Y. J. Yu and H. G. Cho, “A word extraction algorithm for machine-printed 

documents using a 3D neighborhood graph model”, International Journal on Document Analysis and 

Recognition, vol. 4,  no. 2, pp. 115-130, 2001. 
[18] L. Zheng, A. H. Hassin and X. Tang, “A new algorithm for machine printed Arabic character 

segmentation”, Pattern Recognition Letters, vol. 25, no. 15, pp. 1723-1729, 2004. 

[19] A. Choudhary, R. Rishi and S. Ahlawat, “A New Character Segmentation Approach for Off-Line Cursive 
Handwritten Words”, Procedia Computer Science, First International Conference on Information 

Technology and Quantitative Management, vol. 17, pp. 88-95, 2013. 

[20] K. Fukushima and T. Imagawa, “Recognition and segmentation of connected characters with selective 
attention”, Neural Networks, vol. 6, no. 1, pp. 33-41, 1993. 

[21] E. B. Lacerda and C. A. B. Mello, “Segmentation of connected handwritten digits using Self-Organizing 
Maps”, Expert Systems with Applications, vol. 40, no. 15, pp. 5867-5877, 2013. 


484 B. ARIZANOVIĆ, V. VUĈKOVIĆ 
 

[22] H. Lee and B. Verma, “Binary segmentation algorithm for English cursive handwriting recognition”, 
Pattern Recognition, vol. 45, no. 4, pp. 1306-1317, 2012. 

[23] J. Oh, S. Joon, S. Sangkuk, L. Ji-Won, K. Nojun and K. Em, “Online recognition of handwritten music 
symbols”, International Journal on Document Analysis and Recognition (IJDAR), pp. 1-11, 2017. 

[24] T. Plötz and G. A. Fink, “Markov models for offline handwriting recognition: a survey”, International 
Journal on Document Analysis and Recognition (IJDAR), vol. 12, pp. 269-298, 2009. 

[25] A. Rehman and T. Saba, “Performance analysis of character segmentation approach for cursive script 
recognition on benchmark database”, Digital Signal Processing, vol. 21, no. 3, pp. 486-490, 2011. 

[26] N. Stamatopoulos, B. Gatos, G. Louloudis, U. Pal and A. Alaei, “ICDAR 2013 Handwriting Segmentation 
Contest”,  In Proceedings of the 12th International Conference on Document Analysis and Recognition 

(ICDAR), 2013. 
[27] O. Surinta, M. F. Karaaba, L. R. B. Schomaker and M. A. Wiering, “Recognition of handwritten 

characters using local gradient feature descriptors”, Engineering Applications of Artificial Intelligence, 

vol. 45, pp. 405-414, 2015. 
[28] J. Tan, J. Lai, C. Wang, W. Wang and X. Zuo, “A new handwritten character segmentation method based 

on nonlinear clustering”, Neurocomputing, vol. 89, pp. 213-219, 2012. 

[29] M. Younes and Y. Abdellah, “Segmentation of Arabic Handwritten Text to Lines”, In Proceedings of the 
Procedia Computer Science, International Conference on Advanced Wireless Information and 

Communication Technologies (AWICT 2015), vol. 73, pp. 115-121, 2015. 

[30] A. Antonacopoulos and D. Karatzas, “Semantics-based content extraction in typewritten historical 
documents”, In Proceedings of the 8th International Conference on Document Analysis and Recognition 

(ICDAR „05), 2005, pp. 48-53. 

[31] I. Bar-Yosef, A. Mokeichev, K. Kedem, I. Dinstein and U. Ehrlich, “Adaptive shape prior for recognition 
and variational segmentation of degraded historical characters”, Pattern Recognition, vol. 42, no. 12, pp. 

3348-3354, 2009. 

[32] M. R. Gupta, N. P. Jacobson and E. K. Garcia, “OCR binarization and image pre-processing for searching 
historical documents”, Pattern Recognition, vol. 40,  no. 2, pp. 389-397, 2007. 

[33] L. A. F. Fernandes and M. M. Oliveira, “Real-time line detection through an improved Hough transform 
voting scheme”, Pattern Recognition, vol. 41,  no. 1, pp. 299-314, 2008. 

[34] V. Shapiro, “Accuracy of the straight line Hough Transform: The non-voting approach”, Computer Vision 
and Image Understanding, vol. 103, no. 1, pp. 1-21, 2006. 

[35] C. Singh, N. Bhatia and A. Kaur, “Hough transform based fast skew detection and accurate skew 
correction methods”, Pattern Recognition, vol. 41, no. 12, pp. 3528-3546, 2008. 

[36] G. Bessho, K. Ejiri and J. F. Cullen, “Fast and accurate skew detection algorithm for a text document or a 
document with straight lines”, In Proc.  of the SPIE, vol. 2181, pp. 133-140, 1994. 

[37] Y. Cao, S. Wang and H. Li, “Skew detection and correction in document images based on straight-line 
fitting”, Pattern Recognition Letters, vol. 24, no. 12, pp. 1871-1879, 2003. 

[38] A. Fernández-Caballero, M. T. López and J. C. Castillo, “Display text segmentation after learning best-
fitted OCR binarization parameters”, Expert Systems with Applications, vol. 39, no. 4, pp. 4032-4043, 

2012. 
[39] H. Z. Eldin, M. A. Elhosseini and H. A. Ali, “Image compression algorithms in wireless multimedia 

sensor networks: A survey”, Ain Shams Engineering Journal, vol. 6, no. 2, pp. 481-490, 2015. 

[40] J. Mtimet and H. Amiri, “Arabic Textual Image Compression Approach”, Procedia Computer Science, 
vol. 35, pp. 118-126, 2014. 

[41] M. Wu, “Genetic algorithm based on discrete wavelet transformation for fractal image compression”, 
Journal of Visual Communication and Image Representation, vol. 25, no. 8, pp. 1835-1841, 2014. 

[42] L. Wang, J. Bai, J. Wu and G. Jeon, “Hyperspectral image compression based on lapped transform and 
Tucker decomposition”, Signal Processing: Image Communication, vol. 36, pp. 63-69, 2015. 

[43] A. M. Rufai, G. Anbarjafari and H. Demirel, “Lossy image compression using singular value 
decomposition and wavelet difference reduction”, Digital Signal Processing, vol. 24, pp. 117-123, 2014. 

[44] Z. Zheng, J. Zhao, H. Guo, L. Yang, X. Yu and W. Fang, “Character Segmentation System Based on C# 
Design and Implementation”, Procedia Engineering, International Workshop on Information and 
Electronics Engineering, vol. 29, pp. 4073-4078, 2012. 

[45] A. Sedighi and M. Vafadust, “A new and robust method for character segmentation and recognition in 
license plate images”, Expert Systems with Applications, vol. 38, no. 11, pp. 13497-13504, 2011. 

[46] M. Grafmüller and J. Beyerer, “Performance improvement of character recognition in industrial 
applications using prior knowledge for more reliable segmentation”, Expert Systems with Applications, 

vol. 40, no. 17, pp. 6955-6963, 2013. 


 Efficient Compression and Decompression Algorithms for OCR Systems 485 

 
[47] C. Papaodysseus, P. Rousopoulos, F. Giannopoulos, S. Zannos, D. Arabadjis, M. Panagopoulos, E. Kalfa, 
C. Blackwell and S. Tracy, “Identifying the writer of ancient inscriptions and Byzantine codices. A novel 

approach”, Computer Vision and Image Understanding, vol. 121, pp. 57-73, 2014. 

[48] N. B. Venkateswarlu and R. D. Boyle, “New segmentation techniques for document image analysis”, 
Image and Vision Computing, vol. 13, no. 7, pp. 573-583, 1995. 

[49] J. Li, M. Li, J. Pan, S. Chu and J. F. Roddick, “Gabor-based kernel self-optimization Fisher discriminant 
for optical character segmentation from text-image-mixed document”, Optik - International Journal for 
Light and Electron Optics, vol. 126, no. 21, pp. 3119-3124, 2015. 

[50] J. H. Bae, K. C. Jung, J. W. Kim and H. J. Kim, “Segmentation of touching characters using an MLP”, 
Pattern Recognition Letters, vol. 19, no. 8, pp. 701-709, 1998. 

[51] P. P. Roy, U. Pal, J. Lladós and M. Delalandre, “Multi-oriented touching text character segmentation in 
graphical documents using dynamic programming”, Pattern Recognition, vol. 45, no. 5, pp. 1972-1983, 2012. 

[52] O. Starostenko, C. Cruz-Perez, F. Uceda-Ponga and V. Alarcon-Aquino, “Breaking text-based CAPTCHAs 
with variable word and character orientation”, Pattern Recognition, vol. 48, no. 4, pp. 1101-1112, 2015. 

[53] V. Vuĉković and B. Arizanović, “Efficient character segmentation approach for machine-typed 
documents”, Expert Systems with Applications, vol. 80, pp. 210-231, 2017. 

[54] V. Vuĉković and B. Arizanović, “Automatic document skew pre-processor for character segmentation 
algorithm“, Facta Universitatis: Electronics and Energetics, vol. 30, no. 4, pp. 611-625, 2017. 

[55] V. Vuĉković, B. Arizanović and S. Le Blond, “Ultra-fast basic geometrical transformations on linear 
image data structure“, Expert Systems with Applications, vol. 91, pp. 322-346, 2018. 

[56] V. Vuĉković, B. Arizanović and S. Le Blond, “Generalized N-way iterative scanline fill algorithm for 
real-time applications“, Journal of Real-Time Image Processing, vol. 13, no. 4, pp. 1-19, 2018. 

[57] B. Žalik, D. Mongus and N. Lukaĉ, “A universal chain code compression method”, Journal of Visual 
Communication and Image Representation, vol. 29, pp. 8-15, 2015. 

[58] J. Zhu, Z. Wang, R. Zhong and S. Qu, “Dictionary based surveillance image compression”, Journal of 
Visual Communication and Image Representation, vol. 31, pp. 225-230, 2015. 

[59] A. J. Hussain, D. Al-Jumeily, N. Radi and P. Lisboa, “Hybrid Neural Network Predictive Wavelet Image 
Compression System”, Neurocomputing, vol. 151, no. 3, pp. 975-984, 2015. 

[60] R. T. Haweel, W. S. El-Kilani and H. H. Ramadan, “Fast approximate DCT with GPU implementation for 
image compression”, Journal of Visual Communication and Image Representation, vol. 40 (Part A), pp. 
357-365, 2016. 

[61] J. Li, “Image Compression - the Mathematics of JPEG 2000”, Microsoft Research, Communication 
Collaboration and Signal Processing.