

FACTA UNIVERSITATIS  

Series: Electronics and Energetics Vol. 28, No 2, June 2015, pp. 237 - 249 

DOI: 10.2298/FUEE1502237D 

NOVEL, LOW POWER, NONLINEAR DILATATION AND 

EROSION FILTERS REALIZED IN THE CMOS TECHNOLOGY

 

Rafał Długosz¹,², Andrzej Rydlewski³, Tomasz Talaśka¹

¹UTP University of Sciences and Technology, Faculty of Telecommunication, Computer Science and Electrical Engineering, Bydgoszcz, Poland

²DELPHI Automotive Company, Kraków, Poland

³Alcatel-Lucent, Coldra Woods, Chepstow Rd, Newport NP18 2YB

Abstract. In this paper we propose novel, binary-tree, asynchronous, nonlinear filters 

suitable for signal processing realized at the transistor level. Two versions of the filter 

have been proposed, namely the dilatation (Max) and the erosion (Min) one. In the 

proposed circuits an input signal (current) is sampled in a delay line, controlled by a 

multiphase clock. In the subsequent stage particular samples are converted to 1-bit 

digital signals with delays proportional to the values of these samples. In the last step 

the delays are compared in digital binary-tree structure in order to find either the Min 

or the Max value, depending on which filter is used. Both circuits have been simulated 

in the TSMC CMOS 0.18µm technology. To make the results reliable we applied the 

corner analysis procedure. The circuits were tested for temperatures ranging from -40 to 

120ºC, for different transistor models and supply voltages. The circuits offer a precision of 

about 99% at a typical detection time of 20 ns (for the Max filter) and 100 ns for the Min 

filter (the worst case scenario). The energy consumed per one input during a single 

calculation cycle equals 0.32 and 1.57 pJ, for the Max and Min filters, respectively. 

Key words: Nonlinear filters, CMOS realization, full-custom, binary-tree architecture 

1. INTRODUCTION 

Dilatation and erosion operations, often referred to as the Max and Min functions, 

respectively, are useful in many applications. These operations are commonly used in 

artificial neural networks (ANN) but also in signal and image processing [1]. To perform 

the competitive learning, which is common in some types of ANNs, the Min function is 

used to determine which of the neurons is located in the closest proximity to a provided 

learning pattern. In this case this operation is known as Winner Takes All (WTA), which 

is somewhat misleading, as from the formal point of view the Min function corresponds to 

the Loser Takes All (LTA) operation. However, the winning neuron is the one for which 

the distance is the smallest, which explains the convention. 

                                                           
Received August 19, 2014; received in revised form February 11, 2015  

Corresponding author: Rafał Długosz  

UTP University of Sciences and Technology, Faculty of Telecommunication, Computer Science and Electrical 

Engineering, ul. Kaliskiego 7, 85-796 Bydgoszcz, Poland 

(e-mail: rafal.dlugosz@gmail.com) 


238 R. DŁUGOSZ, A. RYDLEWSKI, T. TALAŚKA 

Another area in which the Min, as well as the Max operations are used is nonlinear 

filtering. Such filters are used, for example, to enhance signals or to correct shapes of the 

objects in pictures. The Min and the Max operations are in such applications denoted as 

erosion and dilatation, respectively. Both types of filters can be joined in series in order to 

perform more complex tasks, such as morphological opening and closing operations, 

commonly used in image processing to reconstruct the original form of a digital image from 

a noisy one. For such applications we can use Min/Max Detector Based (MDB) filters or 

Min/Max Exclusive Mean (MMEM) filters. These techniques can be used to achieve the best 

performance [2]-[4].  
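
As a point of reference, the erosion, dilatation, opening and closing operations mentioned above can be sketched in software for a 1-D signal. The sketch below is only an illustration: the function names, the centered window of odd width and the test signal are assumptions introduced here and are not part of the proposed circuit.

```python
from typing import List, Sequence


def _window(signal: Sequence[float], i: int, width: int) -> Sequence[float]:
    """Samples in a centered window around index i, truncated at the edges."""
    half = width // 2
    return signal[max(0, i - half): i + half + 1]


def erode(signal: Sequence[float], width: int) -> List[float]:
    """Erosion (Min): each output sample is the minimum of its window."""
    if width % 2 == 0:
        raise ValueError("this sketch assumes an odd window width")
    return [min(_window(signal, i, width)) for i in range(len(signal))]


def dilate(signal: Sequence[float], width: int) -> List[float]:
    """Dilatation (Max): each output sample is the maximum of its window."""
    if width % 2 == 0:
        raise ValueError("this sketch assumes an odd window width")
    return [max(_window(signal, i, width)) for i in range(len(signal))]


def opening(signal: Sequence[float], width: int) -> List[float]:
    """Erosion followed by dilatation: removes narrow positive spikes."""
    return dilate(erode(signal, width), width)


def closing(signal: Sequence[float], width: int) -> List[float]:
    """Dilatation followed by erosion: fills narrow negative dips."""
    return erode(dilate(signal, width), width)


# Hypothetical test signal with one positive spike (9) and one dip (0).
noisy = [3, 3, 9, 3, 3, 0, 3, 3]
opened = opening(noisy, 3)   # spike removed, dip preserved
closed = closing(noisy, 3)   # dip removed, spike preserved
```

Joining the two filters in series in the opposite order gives the complementary operation, which is why a pair of Min and Max blocks is sufficient for both opening and closing.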

A large similarity exists between the nonlinear dilatation (Max) / erosion (Min) 

filters and the WTA / LTA operations, as in both mentioned cases the core circuit fulfills 

exactly the same task. The task relies on searching for either the minimum or maximum 

signal among a set of the input signals. The difference exists in the input signals used in 

each of these cases. 

In ANNs all signals are independent as they come from separate neurons distributed 

over the input data space. On the other hand, the erosion/dilatation filters process samples 

of one signal stored in the delay line, as shown in Fig. 1. In this diagram a classic delay 

line is schematically shown to illustrate the idea. In the classic approach the samples are 

rewritten many times between memory cells. In software realizations this is not a problem, 

but in analog transistor level implementations this usually is a source of errors that have 

an impact on the quality of filtering. We faced this problem in our former projects of 

Finite Impulse Response (FIR) filters realized in switched-capacitor technique [5].  

To avoid the problem of reduced accuracy it is necessary to use filters in which 

the number of read/write operations is minimized. One of the possibilities in this regard is 

to use the so-called circular delay line [6], [7]. In this approach, the samples are stored in 

particular memory cells and remain there until they are replaced with new samples 

after M clock cycles, where M is also the number of samples stored in the whole delay 

line. We adapted this solution to the nonlinear filters presented in this paper. 
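
The difference between the classic and the circular delay line can be illustrated with a small behavioral model. This is a software sketch under simplifying assumptions (the class names and the per-sample write counter are introduced here for illustration); it only counts how many times each stored sample has been written, which in an analog realization corresponds to the number of error-prone copy operations.

```python
from typing import List, Optional, Tuple

# Each cell holds (sample value, number of times that sample was written).
Cell = Optional[Tuple[float, int]]


class ClassicDelayLine:
    """Shift-register style line: a new sample shifts all stored samples."""

    def __init__(self, length: int) -> None:
        self.cells: List[Cell] = [None] * length

    def push(self, sample: float) -> None:
        # Every stored sample is rewritten into the neighbouring cell.
        for i in range(len(self.cells) - 1, 0, -1):
            prev = self.cells[i - 1]
            self.cells[i] = None if prev is None else (prev[0], prev[1] + 1)
        self.cells[0] = (sample, 1)


class CircularDelayLine:
    """Circular line: a new sample overwrites only the oldest cell."""

    def __init__(self, length: int) -> None:
        self.cells: List[Cell] = [None] * length
        self.next_cell = 0

    def push(self, sample: float) -> None:
        self.cells[self.next_cell] = (sample, 1)
        self.next_cell = (self.next_cell + 1) % len(self.cells)


M = 8  # number of memory cells, as in the simulated filter
classic, circular = ClassicDelayLine(M), CircularDelayLine(M)
for n in range(20):
    classic.push(float(n))
    circular.push(float(n))

classic_writes = [cell[1] for cell in classic.cells]    # grows up to M
circular_writes = [cell[1] for cell in circular.cells]  # always 1
```

In the classic line the oldest sample has been copied M times before it leaves the line, whereas in the circular line every sample is written exactly once and stays in place for M clock cycles.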

In filters the input signal can be sampled in time domain (1-D signal), or particular 

samples can be, for example, pixels of an image (2-D signal). In this paper we focus on 

nonlinear filters used in the first situation. However, the proposed solution can be adapted 

to image filtering as well.  

 
Fig. 1 Nonlinear dilatation / erosion filtering of a 1-D signal in time domain 

Numerous Min/Max circuits have been reported in the literature, but two major types 

of architectures can be clearly distinguished. In the first group the Min/Max circuits are 

usually based on the current conveyor (CC) architecture [8]-[10]. In this approach all 

input signals (either currents or voltages) are compared in a single stage. Such circuits 


 Low Power Nonlinear Min/Max Filters Implemented in the CMOS Technology 239 

usually feature a simple structure, but suffer from limited accuracy that decreases when 

the number of inputs increases [9], [11]. This problem results mostly from the so-called 

'corner error', which occurs when two or more input signals have similar values. In this 

case an average value between these signals appears at the output of the filter.  

 
Fig. 2 Block diagram of the proposed Min / Max filter Binary-tree solution 

The second group of filters is based on the concept of the binary tree (BT) structure. 

In this case the competition between the input signals is conducted on particular layers of 

the tree. The number of layers equals log₂M, where M is the number of inputs, equal to 

the length of the delay line. Signals at each particular layer compete in pairs and always only 

one winning signal is allowed to take part in the competition at the next layer of the tree 

[9], [12], [13]. The binary-tree circuits usually are more complex than their CC counterparts. 

However, if precise comparators are used, they are able to properly distinguish signals that 

differ by very small amounts. The advantage of BT solutions is also evident in the fact that 

they are able to determine the address of the Max or Min signal, which is not possible in the 

CC circuits. This is an important feature in case of the application of such circuits in ANNs, 

in which the value of the winning signal is less important than the information which of 

the input signals has the smallest value. 
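
The pairwise competition in a binary tree, including the recovery of the winner's address, can be sketched as follows. This is a generic software illustration, not the transistor-level circuit: the function name, the tie-breaking rule (the first signal of a pair wins) and the example values are assumptions made here.

```python
from typing import List, Sequence, Tuple


def tournament(values: Sequence[float], pick_max: bool = True) -> Tuple[int, float, int]:
    """Return (address, value, number of layers) of the Max or Min input.

    The number of inputs is assumed to be a power of two.
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("number of inputs must be a power of two")

    contenders: List[int] = list(range(n))  # addresses still in the competition
    layers = 0
    while len(contenders) > 1:
        next_layer: List[int] = []
        # Signals compete in pairs; only the winner moves to the next layer.
        for a, b in zip(contenders[0::2], contenders[1::2]):
            if pick_max:
                winner = a if values[a] >= values[b] else b
            else:
                winner = a if values[a] <= values[b] else b
            next_layer.append(winner)
        contenders = next_layer
        layers += 1

    address = contenders[0]
    return address, values[address], layers


# Hypothetical input currents (arbitrary units) for an 8-input tree.
inputs = [4.0, 7.5, 1.2, 9.9, 3.3, 9.8, 2.0, 6.1]
max_result = tournament(inputs, pick_max=True)
min_result = tournament(inputs, pick_max=False)
```

For eight inputs the loop runs three times, which matches the log₂M layers stated above, and the address is available as a by-product of the competition.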

In typical BT solutions the signals (analog) at the outputs of particular layers of the 

tree are determined (calculated or copied) on the basis of signals provided from preceding 

layers [9], [12], [14]. This may be the source of errors [15] that accumulate at the top of 

the tree. In the proposed solution [16] this problem is less visible. At an early stage of the 

signal processing chain the analog input signals are converted to digital 1-bit signals with 

delays proportional to the values of the input signals. Then the comparison of the signals 

(their delays) is performed in a digital BT structure. In this way, the copying of analog 

signals between layers has been eliminated. 

The paper is organized as follows: In the next Section we propose two filters specific for 

the dilatation (Max) and erosion (Min) nonlinear operations. In the following Section we 

present verification of the proposed circuits by means of transistor level simulations. To 

provide reliable results we performed rigorous PVT (process, voltage, temperature) variation 

tests. The conclusions are formulated in the last Section.  



Fig. 3 Components of the proposed nonlinear filters: (a) input multiple-output CM used in 

circular delay line, (b) S&H memory element used in delay line, (c) current to time 

converter (ITC), (d) delay comparator used in dilatation (Max) filter, (e) delay 

comparator used in erosion (Min) filter, (f) address determination block (ADET) 

2. PROPOSED DILATATION AND EROSION NONLINEAR FILTERS 

Both nonlinear filters proposed in this paper are based on the same structure shown in 

Fig. 2. The circuit is composed of the analog part whose role is to prepare simplified 

signals for the subsequent digital BT structure. The circuit consists of several blocks, or 

groups of elements, presented in detail in Fig. 3.  

Analog part of the system 

In both filters the input current, Iin, is first sampled and held in the circular delay line. 

This delay line has been used to avoid multiple read and write operations of particular 

signal samples, which is a source of large errors in a classical delay line. In this approach 

particular samples are not rewritten between memory cells but remain in particular cells 

until they are replaced by new samples after M clock cycles, as described earlier. 

The delay line in the proposed filters works as follows: The input signal is copied M times 

by the use of the multiple output current mirror (CM), shown in Fig. 3(a). In this way each 

branch receives a separate copy of the input signal and thus data processing in particular 

branches is independent from each other. Particular samples of the input signal are stored 

in sample & hold (S&H) memory elements, shown in Fig. 3(b). To compensate for the charge 

injection effect across the storage capacitors, CST, which is typical in this case, we have used 

so-called dummy switches, swD. In such switches the inputs and outputs are shorted together, so 

they do not change the functionality of the circuit. These switches are controlled by clock 

signals of opposite polarity with respect to the memory switches, swM. 

The circular delay line is in this case controlled by an M-phase clock. The complexity 

of the clock can be viewed as a disadvantage. However, since the length of nonlinear 

filters usually does not exceed 8-10, it is not a significant problem, taking into account an 

increased precision of the circuit. 

Output signals from particular S&H elements, denoted as I'in,i, are provided to 

current-to-time converters (ITC), shown in Fig. 3(c), that convert them into binary 1-bit flags (F).  

These converters are also common for both filters. The flag signals occur at the outputs of 

particular ITCs with delays proportional to the values of the signal samples. Each of these 

blocks is composed of a PMOS-type cascoded CM, an integrating capacitor with reset 

function, and two NOT gates. The voltage across the capacitor increases at a rate 

proportional to the value of the I'in,i signal. The NOT gates change their logical 

states when the voltage across the capacitor reaches a value of about VDD/2.  

 
Fig. 4 A theoretical influence of transistor sizes on the gain error of the current mirror 

(due to threshold voltage mismatch) for the weak and strong inversion regions.  

 
To improve the precision of the circuit we have used the cascoded CMs to increase the 

accuracy of the copying operations. An additional problem in designing the CMs is 

how to determine the optimal sizes of transistors for particular values of the input currents 

(in the range up to 10 µA in this case). We faced a similar problem in our former 

projects [17], [18]. The sizes of transistors have a strong influence on the mismatch effect 

[19], for example the threshold voltage mismatch ∆VTH. The last parameter has, in turn, 

an impact on the theoretical gain error of the CM and thus on the precision of the circuit. 

In the weak inversion region the impact is usually larger than in the strong inversion 

region, as shown in Fig. 4, and therefore we bias the transistors so that 

their operating points lie in the strong inversion region. To make this possible we do not work 

with currents smaller than 1 µA.  

Increasing the sizes of transistors always reduces the mismatch effect. However, for 

given values of the input currents this also decreases the gate to source voltage, VGS, that 

in turn enlarges the gain error of the CM. For the currents being in the range up to 10µA 

optimal sizes of transistors are W / L=3 / 1 and 9 / 1 µm for NMOS and PMOS transistors, 

respectively.  
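
The trade-off described above can be summarized with the commonly used first-order mismatch relations. They are given here only as a textbook-level sketch; the symbols A_VT (technology matching constant), n (subthreshold slope factor) and U_T (thermal voltage) are not used elsewhere in this paper and are introduced as assumptions for illustration.

```latex
% Threshold-voltage mismatch shrinks with the gate area W L:
\sigma(\Delta V_{TH}) \approx \frac{A_{VT}}{\sqrt{W L}}

% Resulting relative gain error of a current mirror:
\left.\frac{\Delta I}{I}\right|_{\mathrm{strong\ inv.}}
    \approx \frac{2\,\Delta V_{TH}}{V_{GS} - V_{TH}},
\qquad
\left.\frac{\Delta I}{I}\right|_{\mathrm{weak\ inv.}}
    \approx \frac{\Delta V_{TH}}{n\,U_T}
```

Under these relations a larger gate area reduces ∆VTH, but for a fixed current a wider transistor also lowers VGS, so the denominator in the strong inversion expression shrinks. This is consistent with the existence of an optimal transistor size for a given current range.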



Digital binary tree structure 
 

The binary tree structure used in the proposed nonlinear filters is composed of delay 

comparators (DCMP), which are different for particular filters. Two versions of this block 

have been proposed. The circuit used in the dilatation filter is shown in Fig. 3(d), while 

the one used in the erosion filter in Fig. 3(e).  

Both circuits are built on the basis of the RS flip flop (RSFF) that is able to distinguish 

very small (at the level of 3-5ns) differences between delays of particular input signals. 

Depending on the mode of the filter (Min or Max) either the smaller or the larger of two 

input signals becomes the winner, which the DCMP signals by two digital signals, o1 and 

o2. In the overall BT the process of determination of the winning (or losing) signal is 

based on the competition performed at particular layers of the tree. To make it possible, 

DCMP blocks provide an additional signal - flag (F) of a given pair - that takes part in the 

competition at the following layer of the tree.  

Depending on the type of filter, between the F11 and F12 inputs and the output F there 

is only a single OR or AND gate. In the dilatation filter, as soon as either of the input 

flags becomes 1, a given DCMP immediately (with a delay below 0.5 ns) sends the flag F 

of the pair to the next layer of the tree. In the erosion filter, on the other hand, the AND gate 

forces a given DCMP to wait with sending the flag of the pair until both input flags 

become 1. This makes the erosion filter slower than the dilatation one. Another 

problem in this filter appears when the minimum signal is very small or zero. In this case the 

detection of this signal can take a very long time. To solve this problem we 

assume not only an upper limit of the input signal range but also a lower limit, which in 

this case equals 10% of the upper limit. If it cannot be guaranteed that the input signals are 

always larger than the lower limit, we can introduce a constant bias, added 

to each signal charging the integrating capacitor in the ITC block.   
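
The effect of the OR and AND gates on the detection time can be illustrated with a behavioral model in the delay domain. The sketch below is an assumption-laden illustration: the function names are hypothetical, every gate is given a fixed delay of 0.5 ns (the upper figure quoted above for the OR gate), and the flag arrival times are made-up example values.

```python
from typing import List, Sequence, Tuple


def dcmp(t1: float, t2: float, mode: str, gate_delay_ns: float = 0.5) -> Tuple[bool, float]:
    """Model of one delay comparator.

    Returns (first_input_wins, time at which the flag of the pair rises).
    """
    if mode == "max":
        # OR gate: the pair flag follows the EARLIER input flag
        # (larger current -> faster charging -> shorter delay).
        return t1 <= t2, min(t1, t2) + gate_delay_ns
    if mode == "min":
        # AND gate: the pair flag has to wait for the LATER input flag.
        return t1 >= t2, max(t1, t2) + gate_delay_ns
    raise ValueError("mode must be 'max' or 'min'")


def detect(flag_times_ns: Sequence[float], mode: str) -> Tuple[int, float]:
    """Propagate flags through the tree; return (address, detection time)."""
    n = len(flag_times_ns)
    if n == 0 or n & (n - 1):
        raise ValueError("number of inputs must be a power of two")
    layer: List[Tuple[int, float]] = list(enumerate(flag_times_ns))
    while len(layer) > 1:
        next_layer: List[Tuple[int, float]] = []
        for (a, ta), (b, tb) in zip(layer[0::2], layer[1::2]):
            first_wins, t_pair = dcmp(ta, tb, mode)
            next_layer.append((a if first_wins else b, t_pair))
        layer = next_layer
    return layer[0]


# Hypothetical flag arrival times (ns) at the outputs of eight ITC blocks.
flags = [30.0, 12.0, 45.0, 9.0, 60.0, 20.0, 90.0, 15.0]
max_address, max_time = detect(flags, "max")
min_address, min_time = detect(flags, "min")
```

In this model the dilatation filter finishes shortly after the earliest flag, while the erosion filter has to wait for the latest one, which is why bounding the smallest input current also bounds the detection time.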

The last operation performed in the proposed filters is determination of the address of 

the Min or the Max signal, depending on the type of the filter. The o1 and o2 signals from 

particular DCMP blocks are used by the ADET block (address determination), shown in 

Fig. 3(f), to determine the address. The o1 and o2 signals always have values that 

enable an unambiguous indication of the winning signal. 

Unfortunately, the problem with the RSFF is that it can hang ('0.5' states at both 

outputs) when two input flags arrive at almost the same time i.e. when the corresponding 

input currents are almost equal. In this case the values at both outputs of the RSFF are 

equal to about VDD/2. To avoid ambiguity in this case, a simple hierarchy mechanism has 

been introduced that is able to recognize the '0.5' states. In such situations the circuit 

arbitrarily decides that one of the input signals obtains the status of the 'winner'.  

The proposed arbitrary mechanism is based on asymmetrical NOT (NOTn and NOTp) 

gates. The gates have different threshold voltages obtained through proper transistor 

sizing. These voltages are equal to 0.25∙VDD/2 and 0.75∙VDD/2 for the NOTp and NOTn 

gates, respectively. When the RSFF hangs, the gates provide different output 

signals, which is detected by the XOR gate. This gate, through the configuration switches 

(controlled by 'swn' / 'swp' signals), controls the values of the o1 and o2 output signals. In 

this case the circuit arbitrarily connects the outputs of the RSFF to VDD ('1') and VSS ('0') 

supplies. This function does not introduce a substantial error, as in this case both analog 

input signals are almost equal (difference < 0.2%). Additionally, it is worth noting that the 

'0.5' states occur seldom in practice, so the mechanism is only an emergency solution. 
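
The idea of detecting the '0.5' state with two asymmetrical NOT gates and an XOR gate can be sketched behaviorally. The model below rests on several assumptions that are not taken from the schematic: both gates are assumed to sense the same RSFF output, the gates are ideal, and the two thresholds are placed at 0.25∙VDD and 0.75∙VDD, i.e. on either side of VDD/2. All names are hypothetical.

```python
from typing import Tuple

VDD = 1.8  # supply voltage used in the simulations [V]


def inverter(v_in: float, threshold: float) -> int:
    """Ideal NOT gate: logic 1 at the output when the input is below threshold."""
    return 0 if v_in > threshold else 1


def arbitrate(rs_out1: float, rs_out2: float, vdd: float = VDD) -> Tuple[int, int, bool]:
    """Return (o1, o2, hang_detected) for given RSFF output voltages."""
    # ASSUMPTION: thresholds on either side of VDD/2 (not from the schematic).
    low_threshold, high_threshold = 0.25 * vdd, 0.75 * vdd

    not_low = inverter(rs_out1, low_threshold)
    not_high = inverter(rs_out1, high_threshold)

    # The two gates disagree only when the input lies between the thresholds.
    hang_detected = bool(not_low ^ not_high)
    if hang_detected:
        # Arbitrary decision: outputs are tied to VDD ('1') and VSS ('0').
        return 1, 0, True

    # Normal operation: outputs are valid logic levels.
    o1 = 1 if rs_out1 > vdd / 2 else 0
    o2 = 1 if rs_out2 > vdd / 2 else 0
    return o1, o2, False


hung = arbitrate(0.9, 0.9)        # both RSFF outputs near VDD/2
normal_a = arbitrate(1.8, 0.0)    # first input wins
normal_b = arbitrate(0.0, 1.8)    # second input wins
```

The sketch only shows why two different switching thresholds are sufficient to recognize an intermediate level; the real circuit implements the same decision with transistor sizing and configuration switches.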



 
Fig. 5 Transistor level simulations of a single DCMP equipped with the proposed 

arbitrary mechanism. In the B state the arbitrary mechanism eliminates the ambiguity. 

 
Fig. 6 Simulations of the circular delay line with eight S&H memory elements. From top 

to bottom are presented: (1) an example input current with the amplitude of 2µA, 

(2) controlling clock signals (8 phases), (3) signal samples stored in particular 

S&H cells (voltages across the storage capacitors, CST), and (4) the supply current 

 

 
Fig. 7 Simulations of the BT block composed of the DCMP (Max mode) circuits for 

T=20ºC and VDD=1.8V. From top to bottom are presented: (1) VC voltages in the 

ITC circuits, (2) resultant flag signals, and (3) addresses of the samples that in a 

given time period have the maximum values 

 
Fig. 8 Simulations of the BT block composed of the DCMP (Min mode) circuits for 

T=20ºC and VDD=1.8V. From top to bottom are presented: (1) VC voltages in the 

ITC circuits, (2) resultant flag signals, and (3) addresses of the samples that in a 

given time period have the minimum values 



3. VERIFICATION OF THE PROPOSED NONLINEAR FILTERS 

The proposed circuit has been tested in several steps. At the beginning we tested the 

DCMP as a separate circuit. We put a special emphasis on the arbitrary mechanism, as 

this block has a crucial meaning. Illustrative results for the circuit used in the dilatation 

filter are shown in Fig. 5. The RS out1 and RS out2 signals can be either in a typical state 

(A), in which their values are '0' or '1', or in the '0.5' state (B), which is not desired. In 

case B (e.g. in the range from 47 to 52 µs) the outputs of the asymmetrical NOTn and NOTp 

gates provide different values. This state is detected by the XOR gate, which signals it by 

inverting the values of the 'swp' and 'swn' signals. As a result, the outputs o1 and o2 are 

arbitrarily connected to the VDD and VSS rails (logical '1' and '0'), respectively.  

 After verifying the DCMP block we have tested the performance of the overall filter 

composed of a delay line with 8 memory cells and the BT block with three layers (log₂8). 

The results for the dilatation as well as the erosion filters are presented in Figs. 6 – 8.  

 Fig. 6 illustrates the operation of the circular delay line. An example input current, 

a sine waveform with f = 10 kHz and an amplitude of 2 µA superimposed on a 3 µA DC component, is 

sampled and held in the memory cells (CST = 400 fF). Each sample remains in a given cell 

during eight subsequent clock cycles. The average supply current equals 70 µA, which 

means that the average power dissipation equals 126 µW (for VDD = 1.8 V).  

The performance of the overall circuit for both types of nonlinear filters is presented in 

Figs. 7 and 8. The top panels in both figures present the voltages across storage capacitors in 

particular ITC blocks. This phase is preceded by resetting the capacitors. After the Reset 

signal is released, the capacitors are charged from 0 to VDD by currents (samples of the 

input signal), whose values are stored in the corresponding S&H elements. The middle panels 

present the resultant delays of particular flags. Finally, the bottom panels show the addresses 

of the samples with the Max or Min values that appear at the outputs of the filters. 

 The input signals in both cases have been selected so as to present different 

scenarios. In Fig. 7 in the first cycle (between 50 and 60 µs) the I7 and I8 samples are 

almost equal. As a result, both corresponding flags are set to 1 within a short period of time, 

which activates the arbitrary mechanism. In this case the mechanism arbitrarily selects the I7 

signal as the winner. The next two cycles (60-70 and 70-80 µs) present a typical situation, 

in which differences between particular signals are larger. The detection time varies in 

this case between 5 and 20 ns. In most cases we expect that this time will be no greater 

than 20 ns. In the worst case scenario (not presented), i.e. for bottom values of all samples 

of the input signal (1µA) this time can reach even 100 ns. Taking into account the average 

power dissipation provided above, the energy consumed during one detection cycle per 

one input can be determined to be 1.57pJ and 0.32pJ in the worst case scenario and in a 

typical situation, respectively.  
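
The energy figures quoted above follow directly from the average supply current, the supply voltage and the detection times. The short calculation below reproduces them; the variable names are introduced here for illustration, and the results agree with the quoted 0.32 pJ and 1.57 pJ up to rounding.

```python
# Values taken from the simulation results reported in this Section.
supply_current_a = 70e-6      # average supply current [A]
vdd_v = 1.8                   # supply voltage [V]
n_inputs = 8                  # number of S&H cells / filter inputs
t_typical_s = 20e-9           # typical detection time [s]
t_worst_s = 100e-9            # worst-case detection time [s]

# Average power of the whole filter and its share per single input.
power_w = supply_current_a * vdd_v             # about 126 uW
power_per_input_w = power_w / n_inputs         # about 15.75 uW

# Energy per input consumed during one detection cycle.
energy_typical_j = power_per_input_w * t_typical_s   # about 0.315 pJ
energy_worst_j = power_per_input_w * t_worst_s       # about 1.575 pJ

print(f"P = {power_w * 1e6:.2f} uW, P/input = {power_per_input_w * 1e6:.2f} uW")
print(f"E typical = {energy_typical_j * 1e12:.3f} pJ, E worst = {energy_worst_j * 1e12:.3f} pJ")
```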

 In the case of the erosion filter we selected input signals for which the flags appear 

almost at the same time (a difference of 1 – 3 ns), as shown in Fig. 8. In each situation 

the arbitrary mechanism properly selects one of these signals as a winner (minimum signal 

in this case). Detection time is in this case longer than in case of the dilatation filter, as the 

circuit must wait until the flag of the smallest signal appears at the output of the ITC. We 

can assume that the detection time is in this case closer to the worst case scenario of the 

dilatation filter (100 ns), while the power dissipation remains the same. 

  

Fig. 9 Corner analysis: Simulations of the dilatation (Max) filter for different 

temperatures: (a) -40ºC , (b) 120 ºC. The meaning of particular diagrams seen from 

top to bottom is the same as in Figs. 7 and 8. 

After the initial verification of the circuit described above we performed a detailed 

corner analysis of the circuits. We tested the filters in wide ranges of particular PVT 

(process, voltage and temperature) parameters. The environment temperature varied in the 

range from -40ºC to 120 ºC, while the supply voltage in the range from 1.2 to 1.8 V. We 

tested the circuit for three transistor models, namely slow, fast and typical (SS, FF, TT). 

Fig. 9 presents selected simulation results for the same situation as in Fig. 7, but for 

different temperatures to illustrate the stability of the system. 



Table 1 Performance comparison of selected Min / Max circuits reported in literature. 

Reference                Process      VDD   No. of   P / (1 in)   Data rate f   Input range   FOM (f/P1in)
                         (CMOS µm)    [V]   inputs   [µW]         [MHz]         [µA]          [MHz/µW]
[20]                     0.5          3.3   8        106.25       5             3.3           0.047
[21]                     0.35         3.3   8        70           1             10            0.014
[12]                     0.8          6     8        120          2.8           50            0.023
[9]                      2.4          5     8        200          13.8          100           0.069
[15]                     0.6          3     8        283.75       20            70            0.070
This work (dilatation)   0.18         1.8   8        15.75        50            1 – 10        3.174
This work (erosion)      0.18         1.8   8        15.75        10            1 – 10        0.635

4. DISCUSSION OF RESULTS 

In this Section we compare the obtained results with performance of other Min/Max 

circuits reported in the literature. A straightforward comparison is not easy, as particular 

solutions were designed for different purposes and thus, to some extent, offer different 

features. Most of the reported circuits do not contain a delay line, as they have been 

designed for independent input signals directly provided to the BT block. In case of our 

circuit the memory cells used to store the signal samples contain additional branches that 

conduct current, thus increasing the power dissipation. The schematics presented in Fig. 3(a) to 

(c) show that each memory cell contains two additional branches, which almost doubles the 

power dissipation in comparison with the situation in which the same circuit (without 

the delay line) would be used to process independent signals. 

To facilitate the comparison of different solutions we define a Figure-of-Merit (FOM) 

as the data rate over the power dissipation for a single input. Such an assumption is justified, as 

the power dissipation increases approximately linearly with the number of inputs. Note 

that the number of all elements used in the circuit also increases linearly with the number 

of inputs. As discussed earlier, the number of layers in the binary tree equals log₂M. At 

each following layer the number of elements that serve as comparators (DCMP) is 

halved. For example, for 8 inputs the circuit has 3 layers with 4, 2, 1 comparators, 

respectively (7 comparators in total). For 128 inputs the number of comparators equals 

127. The number of ITCs and memory cells in the delay line equals the number of inputs. 
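
The FOM values and the comparator counts given above can be checked with a few lines of code. The sketch uses the power and data rate columns of Table 1; the dictionary layout and function names are introduced here for illustration only.

```python
from typing import Dict, List, Tuple

# (power per input [uW], data rate [MHz]) taken from Table 1.
TABLE_1: Dict[str, Tuple[float, float]] = {
    "[20]": (106.25, 5.0),
    "[21]": (70.0, 1.0),
    "[12]": (120.0, 2.8),
    "[9]": (200.0, 13.8),
    "[15]": (283.75, 20.0),
    "This work (dilatation)": (15.75, 50.0),
    "This work (erosion)": (15.75, 10.0),
}


def fom(power_per_input_uw: float, data_rate_mhz: float) -> float:
    """Figure-of-Merit: data rate over power dissipation per input [MHz/uW]."""
    return data_rate_mhz / power_per_input_uw


def comparators_per_layer(n_inputs: int) -> List[int]:
    """Number of DCMP blocks in consecutive layers of the binary tree."""
    counts: List[int] = []
    remaining = n_inputs
    while remaining > 1:
        remaining //= 2          # each layer halves the number of signals
        counts.append(remaining)
    return counts


foms = {name: fom(p, f) for name, (p, f) in TABLE_1.items()}
layers_8 = comparators_per_layer(8)            # 4, 2 and 1 comparators
total_128 = sum(comparators_per_layer(128))    # M - 1 comparators in total

for name, value in foms.items():
    print(f"{name:24s} FOM = {value:.3f} MHz/uW")
```

For a power-of-two number of inputs M the total number of comparators is M - 1, so the tree itself also scales approximately linearly with the number of inputs.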

We are aware that the proposed circuit has been realized in newer technology than 

other circuits of this type, presented in Table 1. However, as described earlier, to reduce 

the mismatch effect we oversized the transistors used in the analog part, which had some impact 

on the attainable data rate. The main source of the observable delay is the analog part of 

the system. In particular ITC blocks the currents with values between 1 and 10 µA 

have to charge the capacitors of 100 fF to a value of about 0.9 V, which enables generating 

the flag. This process takes about 9 to 90 ns, for 10 and 1 µA, respectively. If the circuit 

were realized in an older technology we would have to increase the supply voltage, so 

the process of charging the capacitors would take a longer time (let us assume 2-3 times). 
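
The quoted charging times follow from the constant-current charging relation t = C·V/I. The snippet below evaluates it for the values given above; the function name and default arguments are introduced here for illustration, and the current is assumed to be constant during charging.

```python
def charge_time_s(current_a: float,
                  capacitance_f: float = 100e-15,
                  v_target_v: float = 0.9) -> float:
    """Time needed for a constant current to charge C from 0 V to v_target."""
    return capacitance_f * v_target_v / current_a


t_fast_s = charge_time_s(10e-6)   # upper end of the input range, about 9 ns
t_slow_s = charge_time_s(1e-6)    # lower end of the input range, about 90 ns

print(f"10 uA: {t_fast_s * 1e9:.1f} ns, 1 uA: {t_slow_s * 1e9:.1f} ns")
```

Because the delay is inversely proportional to the current, the tenfold input range translates directly into a tenfold spread of the flag delays.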

The digital part of the system is very fast. The delay of a single layer of the BT block 

equals the delay of a single OR or AND gate only, as the flag of the pair is generated by 

these gates (Fig. 3 d-e). This delay in the CMOS 0.18µm technology does not exceed 1 ns 

for VDD = 1.8V. If the RSFF hangs, the arbitrary mechanism requires about 3 ns to decide 



which of the inputs is assumed to be the winner. However, this process is parallel to the 

process of propagating flags in the tree. In the BT block we use transistors with minimal 

lengths in a given technology. If we redesigned the circuit in an older technology, the 

propagation time of each layer of the tree would increase by a factor of (L1/L2)². In the 

CMOS 0.5µm technology, for example, this time would be about 7-10 times longer. We 

suppose that in this technology the delay of the overall circuit in the worst case scenario 

would not exceed 300 ns. This would reduce the FOM of our circuit by a factor of about 3. 

However, the obtained results would still be four times better than in an example circuit reported 

in [20], designed in CMOS 0.5µm technology.  
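
The scaling estimate above can be reproduced numerically. The calculation below is a rough check only: it assumes that L1 and L2 denote the minimal channel lengths of the older and the current technology, and that the comparison with [20] refers to the worst-case (erosion) FOM from Table 1. The variable names are hypothetical.

```python
def gate_delay_scale(l_old_um: float, l_new_um: float) -> float:
    """Factor by which the gate propagation time grows, (L1/L2)^2."""
    return (l_old_um / l_new_um) ** 2


# Moving the digital part from 0.18 um to 0.5 um.
scale = gate_delay_scale(0.5, 0.18)          # roughly 7.7

# FOM of the erosion filter reduced about 3 times, compared with [20].
fom_erosion = 0.635                          # Table 1 [MHz/uW]
fom_ref_20 = 0.047                           # Table 1 [MHz/uW]
fom_erosion_scaled = fom_erosion / 3.0
advantage = fom_erosion_scaled / fom_ref_20  # roughly 4.5

print(f"delay scale = {scale:.1f}x, advantage over [20] = {advantage:.1f}x")
```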

The provided delay times and calculations are for an example case of 8 inputs i.e. 3 

layers in the tree. In case of larger structures the delay of the analog part will remain the 

same, while the delay of the digital part will increase only moderately. This is one of the 

main advantages of the proposed solution. In other circuits of this type with analog BT, 

the delay is linearly proportional to the number of layers. 
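
How the overall delay grows with the number of inputs can be estimated with a simple model. It is a sketch under stated assumptions: the analog delay is fixed at the 90 ns worst-case charging time, every layer of the digital tree adds at most 1 ns, and the number of inputs is a power of two. The function name is hypothetical.

```python
def total_delay_ns(n_inputs: int,
                   analog_delay_ns: float = 90.0,
                   layer_delay_ns: float = 1.0) -> float:
    """Worst-case delay: fixed analog part plus one gate delay per tree layer."""
    if n_inputs < 2 or n_inputs & (n_inputs - 1):
        raise ValueError("number of inputs must be a power of two, at least 2")
    layers = (n_inputs - 1).bit_length()   # equals log2(n_inputs)
    return analog_delay_ns + layers * layer_delay_ns


delay_8 = total_delay_ns(8)       # 3 layers
delay_128 = total_delay_ns(128)   # 7 layers

print(f"8 inputs: {delay_8:.0f} ns, 128 inputs: {delay_128:.0f} ns")
```

In this model a sixteenfold increase in the number of inputs adds only four gate delays, because the digital part grows with the number of layers while the analog part stays constant.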

During the corner analysis we simulated the filters with smaller supply voltages. The 

circuit worked properly dissipating less power, but it was also much slower in this case. 

For VDD = 0.8V the digital part was approximately 10 times slower. Working with such 

supply voltages does not make sense as the energy consumed during one cycle does not 

decrease as fast as the dissipated power, precisely because of the reduced speed. Additionally, for such 

voltages the transistors used in the analog part work in the weak inversion region, which reduces 

the precision of the circuit. 

5. CONCLUSIONS 

Novel nonlinear dilatation and erosion filters have been proposed in the paper. The 

circuits are based on the binary tree concept. However, in contrast to typical solutions of 

this type, in which analog BT structures are used, in the proposed circuit we distinguish 

the analog part that converts the analog signals to 1-bit signals with different delays and 

the parallel and asynchronous digital BT block that determines which delay is the smallest 

or the largest, depending on the type of the filter. The proposed digital BT is much faster 

than its analog counterparts. It additionally eliminates propagation of analog signals in the 

tree, which takes place in other circuits of this type. As a result, the circuit offers a precision at the 

level exceeding 99% that is sufficient in many signal processing tasks.  

The proposed BT is very sensitive and is able to distinguish very small differences of 

delays of particular input signals. This is possible through an atypical use of the RS flip 

flops, which serve in this case as time comparators. In a typical application of the RS flip 

flops the '11' input state is not allowed. In our circuit we call this situation an emergency 

state that happens relatively seldom. Nevertheless, to avoid the situation in which this 

state prevents the calculation of the output sample of the filter, we propose an arbitrary 

mechanism that is able to handle this situation. 

The next step of the project will be the design and fabrication of a chip containing the 

filters, followed by its laboratory tests. This phase is necessary, as the noise can have some impact 

on the results.  



REFERENCES 

[1] M. Vemis, G. Economou, S. Fotopoulos, A. Khodyrev, "The Use of Boolean Functions and Logical 

Operations for Edge Detection in Images", Signal Processing, 1995, vol. 45, 161–172 

[2] R.A. Araujo, A.L.I. Oliveira, S. Soares, S. Meira, "Designing Dilation-Erosion Perceptrons with 

Differential Evolutionary Learning for Air Pressure Forecasting", In Proceedings of the International Joint 

Conference on Neural Networks, 2011, San Jose, California, USA, pp. 595–602 

[3] P.T. Jackway, M. Deriche, "Scale-Space Properties of the Multiscale Morphological Dilation-Erosion", 

IEEE Transactions on Pattern Analysis and Machine Intelligence, 1996, vol. 18, no. 1, pp.38–51 

[4] Joseph (Yossi) Gil and Ron Kimmel, "Efficient dilation, erosion, opening, and closing algorithms", 

IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, Iss. 12, December 2002, 

pp.1606–1617 

[5] A. Dąbrowski, R. Długosz, P. Pawłowski, “Integrated CMOS GSM Baseband Channel Selecting Filters 

Realized Using Switched Capacitor Finite Impulse Response Technique”, Elsevier Microelectronics 

Reliability Journal, vol. 46, no. 5–6, pp. 949–958, June 2006. 

[6] Sophocles J. Orfanidis, "Introduction to Signal Processing", previously published by Pearson Education, 

Inc. 1996-2009 by Prentice Hall, Inc. Previous ISBN 0-13-209172-0 

[7] R. Długosz, K. Iniewski, “Programmable Switched Capacitor Finite Impulse Response Filter with 

Circular Memory Implemented in CMOS 0.18μm Technology”, Journal of Signal Processing Systems 

(formerly the Journal of VLSI Signal Processing Systems for Signal, Image, and Video Technology), 

Springer New York, vol. 56, no. 2-3, pp. 295–306, September 2009. 

[8] W. W. Moses, E. Beuville, M. H. Ho, "A Winner-Take-All IC for determining the crystal of interaction 

in PET detectors", IEEE Transactions on Nuclear Science, vol. 43, no. 3, 1996, pp.1615–1618 

[9] A. Demosthenous, S. Smedley, J. Taylor, "A CMOS analog Winner-Takes-All network for large-scale 

applications", IEEE Transactions on Circuits and Systems-I: Fundamental Theory and Applications, 

vol. 45, no. 3, 1998, pp.300–304. 

[10] J. Ramirez-Angulo, J.E. Molinar-Solis, S. Gupta, R. G. Carvajal, A. J. Lopez-Martin, "A high-swing, 

high-speed CMOS WTA using differential flipped voltage followers", IEEE Transactions on Circuits 

and Systems II: Express Briefs, vol.54, no. 8, 2007, pp.668–672. 

[11] T. Serrano, B. Linares-Barranco, "A modular current-mode high-precision winner-take-all circuit", IEEE 

Transactions on Circuits and Systems-II: Analog and Digital Signal Processing, vol. 42, no. 2, 1995, 

pp.132–134. 

[12] K. Wawryn, B. Strzeszewski, "Current mode AB class WTA circuit", In the Proceedings of the IEEE 

International Conference on Electronics, Circuits and Systems (ICECS), 2001, pp. 293–296. 

[13] G. T. Tuttle, S. Fallahi, A. A. Abidi, "An 8-b CMOS vector A/D converter", IEEE International Solid-

State Circuit Conference (ISSCC), San Francisco, USA, 1993, pp. 38–39 

[14] R. Długosz, T. Talaśka, R.Wojtyna, "New binary-tree-based Winner-Takes-All circuit for learning on 

silicon Kohonen's networks", In Proceedings on the Int. Conf. on Signals and Electronic Systems 

(ICSES), Łódź, Poland, 2006, pp. 441–446. 

[15] B. Tomatsopoulos, A. Demosthenous, "Low power, low complexity CMOS multiple-input replicating 

current comparators and WTA/LTA circuits", In Proceedings on the European Conference on Circuit 

Theory and Design (ECCTD), vol. 3, no. 28, Cork, Ireland, 2005, pp. 241–244. 

[16] R. Długosz, A. Rydlewski, T. Talaśka, "Low Power Nonlinear MIN/MAX Filters Implemented in the 

CMOS Technology", In Proceedings on the 29th International Conference on Microelectronics, Beograd, 

Serbia, 12-14 May 2014, pp. 397–400. 

[17] R. Długosz, W. Pedrycz, "Łukasiewicz Fuzzy Logic Networks and Their Ultra Low Power Hardware 

Implementation", Elsevier Neurocomputing, vol. 73, Iss.7-9, pp.1222–1234, March 2010. 

[18] R. Długosz, T. Talaska, W. Pedrycz, "Current-Mode Analog Adaptive Mechanism for Ultra-Low Power 

Neural Networks", IEEE Transactions on Circuits and Systems–II: Express Briefs, vol. 58, Iss. 1, pp. 

31–35, January 2011. 

[19] M.J.M. Pelgrom, H.P. Tuinhout and M. Vertregt, "Transistor matching in analog CMOS applications", 

In Proceedings on the IEEE International Electron Devices Meeting, December 1998, pp. 915–918 

[20] Y.C. Hung, B.D. Liu, "High-reliability programmable WTA/LTA circuit of O(N) complexity using a 

single comparator", IEE Proceedings-Circuits Devices and Systems, vol. 151, no. 6, 2004, pp. 579–586. 

[21] Yu Chien-Cheng, Tang Yun-Ching, Liu Bin-Da, "Design of high performance CMOS current-mode 

winner-take-all circuit", In Proceedings on the International Conference on ASIC, Beijing, China, 2003, 

pp. 568–572.