FACTA UNIVERSITATIS
Series: Electronics and Energetics Vol. 28, No 2, June 2015, pp. 237 - 249
DOI: 10.2298/FUEE1502237D

NOVEL, LOW POWER, NONLINEAR DILATATION AND EROSION FILTERS REALIZED IN THE CMOS TECHNOLOGY

Rafał Długosz 1,2, Andrzej Rydlewski 3, Tomasz Talaśka 1
1 UTP University of Sciences and Technology, Faculty of Telecommunication, Computer Science and Electrical Engineering, Bydgoszcz, Poland
2 DELPHI Automotive Company, Kraków, Poland
3 Alcatel-Lucent, Coldra Woods, Chepstow Rd, Newport NP18 2YB

Abstract. In this paper we propose novel, binary-tree, asynchronous, nonlinear filters suitable for signal processing realized at the transistor level. Two versions of the filter are proposed, namely the dilatation (Max) filter and the erosion (Min) filter. In the proposed circuits an input signal (current) is sampled in a delay line controlled by a multiphase clock. In the subsequent stage particular samples are converted to 1-bit digital signals with delays inversely proportional to the values of these samples. In the last step the delays are compared in a digital binary-tree structure in order to find either the Min or the Max value, depending on which filter is used. Both circuits have been simulated in the TSMC CMOS 0.18 µm technology. To make the results reliable we applied the corner analysis procedure. The circuits were tested for temperatures ranging from -40 to 120 ºC, for different transistor models and supply voltages. The circuits offer a precision of about 99% at a typical detection time of 20 ns for the Max filter and 100 ns for the Min filter (the worst case scenario). The energy consumed per input during a single calculation cycle equals 0.32 and 1.57 pJ for the Max and Min filters, respectively.

Key words: Nonlinear filters, CMOS realization, full-custom, binary-tree architecture

1. INTRODUCTION
Dilatation and erosion operations, often referred to as the Max and Min functions, respectively, are useful in many applications.
These operations are commonly used in artificial neural networks (ANNs), as well as in signal and image processing [1]. To perform competitive learning, which is common in some types of ANNs, the Min function is used to determine which neuron is located closest to a given learning pattern. In this context the operation is known as Winner Takes All (WTA), which is somewhat misleading, since from the formal point of view the Min function corresponds to the Loser Takes All (LTA) operation. The convention arises because the winning neuron is the one for which the distance is the smallest.

Received August 19, 2014; received in revised form February 11, 2015
Corresponding author: Rafał Długosz
UTP University of Sciences and Technology, Faculty of Telecommunication, Computer Science and Electrical Engineering, ul. Kaliskiego 7, 85-796 Bydgoszcz, Poland (e-mail: rafal.dlugosz@gmail.com)

238 R. DŁUGOSZ, A. RYDLEWSKI, T. TALAŚKA

Another area in which the Min and Max operations are used is nonlinear filtering. Such filters are used, for example, to enhance signals or to correct the shapes of objects in pictures. In such applications the Min and Max operations are referred to as erosion and dilatation, respectively. Both types of filters can be connected in series to perform more complex tasks, such as the morphological opening and closing operations commonly used in image processing to reconstruct the original form of a digital image from a noisy image. For such applications we can use Min/Max Detector Based (MDB) filters or Min/Max Exclusive Mean (MMEM) filters. These techniques can be used to achieve the best performance [2]-[4].

A large similarity exists between the nonlinear dilatation (Max) / erosion (Min) filters and the WTA / LTA operations, as in both cases the core circuit fulfills exactly the same task. The task is to search for either the minimum or the maximum signal among a set of input signals.
The difference lies in the input signals used in each case. In ANNs all signals are independent, as they come from separate neurons distributed over the input data space. The erosion/dilatation filters, on the other hand, process samples of a single signal stored in a delay line, as shown in Fig. 1. This diagram shows a classic delay line schematically to illustrate the idea. In the classic approach the samples are rewritten many times between memory cells. In software realizations this is not a problem, but in analog transistor-level implementations it is usually a source of errors that affect the quality of filtering. We faced this problem in our former projects of Finite Impulse Response (FIR) filters realized in the switched-capacitor technique [5]. To avoid the reduced accuracy it is necessary to use filters in which the number of read/write operations is minimized. One possibility is to use the so-called circular delay line [6], [7]. In this approach the samples are stored in particular memory cells and remain there until they are replaced with new samples after M clock cycles, where M is also the number of samples stored in the whole delay line. We adopted this solution in the nonlinear filters presented in this paper.

In filters the input signal can be sampled in the time domain (1-D signal), or particular samples can be, for example, pixels of an image (2-D signal). In this paper we focus on nonlinear filters used in the first situation. However, the proposed solution can be adapted to image filtering as well.

Fig. 1 Nonlinear dilatation / erosion filtering of a 1-D signal in time domain

Numerous Min/Max circuits have been reported in the literature, but two major types of architectures can be clearly distinguished. In the first group the Min/Max circuits are usually based on the current conveyor (CC) architecture [8]-[10].
In this approach all input signals (either currents or voltages) are compared in a single stage. Such circuits usually feature a simple structure, but suffer from limited accuracy that decreases as the number of inputs increases [9], [11]. This problem results mostly from the so-called 'corner error', which occurs when two or more input signals have similar values. In this case an average value of these signals appears at the output of the filter.

Low Power Nonlinear Min/Max Filters Implemented in the CMOS Technology 239

Fig. 2 Block diagram of the proposed Min / Max filter

Binary-tree solution
The second group of filters is based on the concept of the binary tree (BT) structure. In this case the competition between the input signals is conducted at particular layers of the tree. The number of layers equals log2 M, where M is the number of inputs, which equals the length of the delay line. Signals at each layer compete in pairs, and only the winning signal of each pair is allowed to take part in the competition at the next layer of the tree [9], [12], [13]. The binary-tree circuits are usually more complex than their CC counterparts. However, if precise comparators are used, they are able to properly distinguish signals that differ by very small amounts. A further advantage of BT solutions is that they are able to determine the address of the Max or Min signal, which is not possible in the CC circuits. This is an important feature when such circuits are applied in ANNs, in which the value of the winning signal is less important than the information about which of the input signals has the smallest value.

In typical BT solutions the (analog) signals at the outputs of particular layers of the tree are determined (calculated or copied) on the basis of signals provided by the preceding layers [9], [12], [14]. This may be a source of errors [15] that accumulate at the top of the tree.
In the proposed solution [16] this problem is less visible. At an early stage of the signal processing chain the analog input signals are converted to digital 1-bit signals with delays inversely proportional to the values of the input signals. The comparison of the signals (their delays) is then performed in a digital BT structure. In this way the copying of analog signals between layers is eliminated.

The paper is organized as follows. In the next Section we propose two filters specific to the dilatation (Max) and erosion (Min) nonlinear operations. In the following Section we present verification of the proposed circuits by means of transistor-level simulations. To provide reliable results we performed rigorous PVT (process, voltage, temperature) variation tests. The conclusions are formulated in the last Section.

Fig. 3 Components of the proposed nonlinear filters: (a) input multiple-output CM used in circular delay line, (b) S&H memory element used in delay line, (c) current to time converter (ITC), (d) delay comparator used in dilatation (Max) filter, (e) delay comparator used in erosion (Min) filter, (f) address determination block (ADET)

2. PROPOSED DILATATION AND EROSION NONLINEAR FILTERS
Both nonlinear filters proposed in this paper are based on the same structure, shown in Fig. 2. The circuit includes an analog part whose role is to prepare simplified signals for the subsequent digital BT structure. It consists of several blocks, or groups of elements, presented in detail in Fig. 3.

Analog part of the system
In both filters the input current, Iin, is first sampled and held in the circular delay line. This delay line is used to avoid multiple read and write operations on particular signal samples, which are a source of large errors in a classical delay line.
In this approach particular samples are not rewritten between memory cells but remain in their cells until they are replaced by new samples after M clock cycles, as described earlier. The delay line in the proposed filters works as follows. The input signal is copied M times by means of the multiple-output current mirror (CM) shown in Fig. 3(a). In this way each branch receives a separate copy of the input signal, so data processing in particular branches is independent. Particular samples of the input signal are stored in sample & hold (S&H) memory elements, shown in Fig. 3(b). To compensate the charge injection effect across the storage capacitors, CST, which is typical in this case, we have used so-called dummy switches, swD. The inputs and outputs of such switches are shorted together, so they do not change the functionality of the circuit. These switches are controlled by clock signals of opposite polarity to those of the memory switches, swM. The circular delay line is controlled by an M-phase clock. The complexity of the clock can be viewed as a disadvantage. However, since the length of nonlinear filters usually does not exceed 8-10, this is not a significant problem given the increased precision of the circuit.

Output signals from particular S&H elements, denoted as I'in,i, are provided to current-to-time converters (ITC), shown in Fig. 3(c), which convert them into binary 1-bit flags (F). These converters are also common to both filters. The flag signals appear at the outputs of particular ITCs with delays inversely proportional to the values of the signal samples. Each of these blocks is composed of a PMOS-type cascoded CM, an integrating capacitor with a reset function, and two NOT gates. The voltage across the capacitor increases at a rate proportional to the value of the I'in,i signal.
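The behaviour of the analog front end described above can be illustrated with a short behavioural sketch in Python. It is not the authors' circuit: the class name `CircularDelayLine` and the function `itc_delay_ns` are hypothetical, and the converter is assumed to charge its capacitor with an ideal constant current. The default values of 100 fF and 0.9 V are the ones quoted in the Discussion section of the paper.

```python
from typing import List


class CircularDelayLine:
    """Behavioural model of an M-cell circular delay line (hypothetical helper).

    Each new sample overwrites only the oldest cell; stored samples are
    never copied from one cell to another.
    """

    def __init__(self, length: int) -> None:
        self.length = length
        self.cells: List[float] = [0.0] * length
        self.write_index = 0  # index of the clock phase that is active next

    def sample(self, current_ua: float) -> None:
        # One clock phase closes one memory switch and stores the sample.
        self.cells[self.write_index] = current_ua
        self.write_index = (self.write_index + 1) % self.length


def itc_delay_ns(current_ua: float, cap_ff: float = 100.0,
                 v_threshold: float = 0.9) -> float:
    """Time for a constant current to charge the capacitor: t = C * V / I.

    Units: fF * V / uA gives nanoseconds directly (1e-15 / 1e-6 = 1e-9).
    Ideal constant-current charging is an assumption of this sketch.
    """
    return cap_ff * v_threshold / current_ua


# Ten samples written into an 8-cell line: the two oldest are overwritten.
line = CircularDelayLine(8)
for k in range(10):
    line.sample(1.0 + k)  # 1, 2, ..., 10 uA

flag_delays_ns = [itc_delay_ns(i) for i in line.cells]
```

With these assumed values the converter yields about 9 ns for 10 µA and about 90 ns for 1 µA, so the largest sample produces the earliest flag.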
The NOT gates change their logical states when the voltage across the capacitor reaches a value of about VDD/2.

Fig. 4 A theoretical influence of transistor sizes on the gain error of the current mirror (due to threshold voltage mismatch) for the weak and strong inversion regions.

To improve the precision of the circuit we have used cascoded CMs, which increase the accuracy of the copying operations. An additional problem in designing the CMs is how to determine the optimal sizes of transistors for particular values of the input currents (in the range up to 10 µA in this case). We faced a similar problem in our former projects [17], [18]. The sizes of transistors have a strong influence on the mismatch effect [19], for example on the threshold voltage mismatch ∆VTH. This parameter has, in turn, an impact on the theoretical gain error of the CM and thus on the precision of the circuit. In the weak inversion region the impact is usually larger than in the strong inversion region, as shown in Fig. 4, and we therefore bias the transistors so that their operating points lie in the strong inversion region. To make this possible we do not work with currents smaller than 1 µA. Increasing the sizes of transistors always reduces the mismatch effect. However, for given values of the input currents it also decreases the gate-to-source voltage, VGS, which in turn enlarges the gain error of the CM. For currents in the range up to 10 µA the optimal sizes of transistors are W/L = 3/1 and 9/1 µm for the NMOS and PMOS transistors, respectively.

Digital binary tree structure
The binary tree structure used in the proposed nonlinear filters is composed of delay comparators (DCMP), which differ between the two filters. Two versions of this block have been proposed. The circuit used in the dilatation filter is shown in Fig. 3(d), while the one used in the erosion filter is shown in Fig. 3(e).
Both circuits are built on the basis of the RS flip flop (RSFF), which is able to distinguish very small (at the level of 3-5 ns) differences between the delays of particular input signals. Depending on the mode of the filter (Min or Max), either the smaller or the larger of the two input signals becomes the winner, which the DCMP indicates by means of two digital signals, o1 and o2.

In the overall BT the winning (or losing) signal is determined through the competition performed at particular layers of the tree. To make this possible, the DCMP blocks provide an additional signal, the flag (F) of a given pair, which takes part in the competition at the following layer of the tree. Depending on the type of filter, there is only a single OR or AND gate between the F11 and F12 inputs and the output F. In the dilatation filter, as soon as one of the input flags becomes 1, a given DCMP immediately (with a delay below 0.5 ns) sends the flag F of the pair to the next layer of the tree. In the erosion filter, on the other hand, the AND gate forces a given DCMP to wait until both input flags become 1 before it sends the flag of the pair. As a result, the erosion filter is slower than the dilatation filter. Another problem in this filter appears when the minimum signal is very small or zero. In this case the detection of this signal can take a very long time. To solve this problem we assume not only an upper limit of the input signal range but also a lower limit, which in this case equals 10% of the upper limit. If it cannot be guaranteed that the input signals are always larger than the lower limit, we can introduce a constant bias, added to each signal at the junction charging the integrating capacitor in the ITC block.

The last operation performed in the proposed filters is the determination of the address of the Min or the Max signal, depending on the type of the filter.
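The flag competition described above can be mimicked in software by a minimal sketch that works purely on flag arrival times. The function name `tree_detect`, the 0-based addresses, the fixed 0.5 ns gate delay and the rule that a tie goes to the first input of a pair are assumptions made for illustration; they are not taken from the transistor-level design.

```python
from typing import List, Tuple

GATE_DELAY_NS = 0.5  # assumed per-layer OR/AND delay (the text quotes "below 0.5 ns")


def tree_detect(flag_times_ns: List[float], mode: str) -> Tuple[int, float]:
    """Return (address, output flag time) of the Max or Min input.

    flag_times_ns[i] is the time at which the flag of input i goes high;
    a larger current yields an earlier flag. Addresses are 0-based.
    """
    if mode not in ("max", "min"):
        raise ValueError("mode must be 'max' or 'min'")
    n = len(flag_times_ns)
    if n == 0 or n & (n - 1):
        raise ValueError("number of inputs must be a power of two")

    # Each entry holds (address of the current winner, time of its pair flag).
    layer = list(enumerate(flag_times_ns))
    while len(layer) > 1:
        next_layer = []
        for (a1, t1), (a2, t2) in zip(layer[0::2], layer[1::2]):
            if mode == "max":
                # OR gate: the pair flag follows the earlier input flag.
                winner, t = (a1, t1) if t1 <= t2 else (a2, t2)
            else:
                # AND gate: the pair flag waits for the later input flag.
                winner, t = (a1, t1) if t1 >= t2 else (a2, t2)
            # Ties are resolved in favour of the first input (an assumption).
            next_layer.append((winner, t + GATE_DELAY_NS))
        layer = next_layer
    return layer[0]


# Example: eight currents in uA, converted to flag times with t = 90 ns*uA / I
currents_ua = [3.0, 7.0, 2.0, 9.0, 5.0, 1.0, 8.0, 4.0]
times_ns = [90.0 / i for i in currents_ua]

max_addr, max_time = tree_detect(times_ns, "max")
min_addr, min_time = tree_detect(times_ns, "min")
```

In this example the Max result is available roughly 80 ns earlier than the Min result, which reflects the speed difference between the two filters noted in the text.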
The o1 and o2 signals from particular DCMP blocks are used by the ADET (address determination) block, shown in Fig. 3(f), to determine the address. The o1 and o2 signals always have values that enable an unambiguous indication of the winning signal. Unfortunately, the RSFF can hang ('0.5' states at both outputs) when two input flags arrive at almost the same time, i.e. when the corresponding input currents are almost equal. In this case the values at both outputs of the RSFF are equal to about VDD/2. To avoid ambiguity, a simple hierarchy mechanism has been introduced that is able to recognize the '0.5' states. In such situations the circuit arbitrarily decides that one of the input signals obtains the status of the 'winner'.

The proposed arbitration mechanism is based on asymmetrical NOT (NOTn and NOTp) gates. The gates have different threshold voltages, obtained through proper transistor sizing. These voltages are equal to 0.25∙VDD/2 and 0.75∙VDD/2 for the NOTp and NOTn gates, respectively. When the RSFF hangs, the gates provide different output signals, which is detected by the XOR gate. Through the configuration switches (controlled by the 'swn' / 'swp' signals) this gate controls the values of the o1 and o2 output signals. In this case the circuit arbitrarily connects the outputs of the RSFF to the VDD ('1') and VSS ('0') supplies. This function does not introduce a substantial error, as in this case both analog input signals are almost equal (difference < 0.2%). Additionally, it is worth noting that the '0.5' states seldom occur in practice, so the mechanism is only an emergency solution.

Fig. 5 Transistor level simulations of a single DCMP equipped with the proposed arbitration mechanism. In the B state the arbitration mechanism eliminates the ambiguity.

Fig. 6 Simulations of the circular delay line with eight S&H memory elements.
From top to bottom are presented: (1) an example input current with the amplitude of 2 µA, (2) controlling clock signals (8 phases), (3) signal samples stored in particular S&H cells (voltages across the storage capacitors, CST), and (4) the supply current

Fig. 7 Simulations of the BT block composed of the DCMP (Max mode) circuits for T=20 ºC and VDD=1.8 V. From top to bottom are presented: (1) VC voltages in the ITC circuits, (2) resultant flag signals, and (3) addresses of the samples that in a given time period have the maximum values

Fig. 8 Simulations of the BT block composed of the DCMP (Min mode) circuits for T=20 ºC and VDD=1.8 V. From top to bottom are presented: (1) VC voltages in the ITC circuits, (2) resultant flag signals, and (3) addresses of the samples that in a given time period have the minimum values

3. VERIFICATION OF THE PROPOSED NONLINEAR FILTERS
The proposed circuit has been tested in several steps. First we tested the DCMP as a separate circuit. We put special emphasis on the arbitration mechanism, as this block is crucial. Illustrative results for the circuit used in the dilatation filter are shown in Fig. 5. The RS out1 and RS out2 signals can be either in a typical state (A), in which their values are '0' or '1', or in the '0.5' state (B), which is not desired. In case B (e.g. in the range from 47 to 52 µs) the outputs of the asymmetrical NOTn and NOTp gates provide different values. This state is detected by the XOR gate, which signals it by inverting the values of the 'swp' and 'swn' signals. As a result, the outputs o1 and o2 are arbitrarily connected to the VDD and VSS rails (logical '1' and '0'), respectively.

After verifying the DCMP block we tested the performance of the overall filter, composed of a delay line with 8 memory cells and the BT block with three layers (log2 8).
The results for the dilatation and the erosion filters are presented in Figs. 6 – 8. Fig. 6 illustrates the operation of the circular delay line. An example input current (a sine waveform with f = 10 kHz and an amplitude of 2 µA, superimposed on a 3 µA DC component) is sampled and held in the memory cells (CST = 400 fF). Each sample remains in a given cell during eight subsequent clock cycles. The average supply current equals 70 µA, which means that the average power dissipation equals 126 µW (for VDD = 1.8 V).

The performance of the overall circuit for both types of nonlinear filters is presented in Figs. 7 and 8. The top panels in both figures present the VC voltages across the capacitors in particular ITC blocks. This phase is preceded by resetting the capacitors. After the Reset signal is released, the capacitors are charged from 0 to VDD by currents (samples of the input signal) whose values are stored in the corresponding S&H elements. The middle panels present the resultant delays of particular flags. Finally, the bottom panels show the addresses of the samples with the Max or Min values that appear at the outputs of the filters.

The input signals in both cases have been selected so as to present different scenarios. In Fig. 7, in the first cycle (between 50 and 60 µs), the I7 and I8 samples are almost equal. As a result, both corresponding flags are set to 1 within a short time of each other, which activates the arbitration mechanism. In this case the mechanism arbitrarily selects the I7 signal as the winner. The next two cycles (60-70 and 70-80 µs) present a typical situation, in which the differences between particular signals are larger. The detection time in this case varies between 5 and 20 ns. In most cases we expect that this time will be no greater than 20 ns. In the worst case scenario (not presented), i.e. for the lowest values of all samples of the input signal (1 µA), this time can reach as much as 100 ns.
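The average power and the detection times quoted above imply an energy per input and per detection cycle, which the following small sketch recomputes. The function name `energy_per_input_pj` is hypothetical; the numbers (70 µA, 1.8 V, 8 inputs, 20 ns and 100 ns) are taken from the text.

```python
def energy_per_input_pj(avg_power_uw: float, n_inputs: int,
                        detection_time_ns: float) -> float:
    """Energy per input for one detection cycle, in picojoules.

    uW * ns = 1e-6 W * 1e-9 s = 1e-15 J (femtojoules), hence the / 1000.
    """
    power_per_input_uw = avg_power_uw / n_inputs
    return power_per_input_uw * detection_time_ns / 1000.0


avg_power_uw = 70.0 * 1.8            # 70 uA at VDD = 1.8 V, about 126 uW
power_per_input_uw = avg_power_uw / 8  # about 15.75 uW for 8 inputs

typical_pj = energy_per_input_pj(avg_power_uw, 8, 20.0)   # typical case, 20 ns
worst_pj = energy_per_input_pj(avg_power_uw, 8, 100.0)    # worst case, 100 ns
```

The results, about 0.315 pJ and 1.575 pJ, agree with the rounded figures of 0.32 pJ and 1.57 pJ reported in the paper.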
Taking into account the average power dissipation provided above, the energy consumed during one detection cycle per input can be determined to be 1.57 pJ in the worst case scenario and 0.32 pJ in a typical situation.

For the erosion filter we selected input signals for which the flags appear at almost the same time (a difference of 1 – 3 ns). This is shown in Fig. 8. In each situation the arbitration mechanism properly selects one of these signals as the winner (the minimum signal in this case). The detection time is longer than for the dilatation filter, as the circuit must wait until the flag of the smallest signal appears at the output of the ITC. We can assume that the detection time in this case is closer to the worst case scenario of the dilatation filter (100 ns), while the power dissipation remains the same.

Fig. 9 Corner analysis: Simulations of the dilatation (Max) filter for different temperatures: (a) -40 ºC, (b) 120 ºC. The meaning of particular diagrams seen from top to bottom is the same as in Figs. 7 and 8.

After the initial verification described above we performed a detailed corner analysis of the circuits. We tested the filters over wide ranges of particular PVT (process, voltage and temperature) parameters. The ambient temperature varied in the range from -40 ºC to 120 ºC, while the supply voltage varied in the range from 1.2 to 1.8 V. We tested the circuit for three transistor models, namely slow, fast and typical (SS, FF, TT). Fig. 9 presents selected simulation results for the same situation as in Fig. 7, but for different temperatures, to illustrate the stability of the system.

Table 1 Performance comparison of selected Min / Max circuits reported in literature.

Reference | Process (CMOS) [µm] | VDD [V] | No. of inputs | P per input, P1in [µW] | Data rate f [MHz] | Input range [µA] | FOM (f/P1in) [MHz/µW]
[20] | 0.5 | 3.3 | 8 | 106.25 | 5 | 3.3 | 0.047
[21] | 0.35 | 3.3 | 8 | 70 | 1 | 10 | 0.014
[12] | 0.8 | 6 | 8 | 120 | 2.8 | 50 | 0.023
[9] | 2.4 | 5 | 8 | 200 | 13.8 | 100 | 0.069
[15] | 0.6 | 3 | 8 | 283.75 | 20 | 70 | 0.070
This work (dilatation) | 0.18 | 1.8 | 8 | 15.75 | 50 | 1 – 10 | 3.174
This work (erosion) | 0.18 | 1.8 | 8 | 15.75 | 10 | 1 – 10 | 0.635

4. DISCUSSION OF RESULTS
In this Section we compare the obtained results with the performance of other Min/Max circuits reported in the literature. A straightforward comparison is not easy, as particular solutions were designed for different purposes and thus, to some extent, offer different features. Most of the reported circuits do not contain a delay line, as they have been designed for independent input signals provided directly to the BT block. In our circuit the memory cells used to store the signal samples contain additional branches that conduct current, thus increasing the power dissipation. The schemes presented in Fig. 3 (a) – (c) show that each memory cell contains two additional branches, which almost doubles the power dissipation in comparison with the situation in which the same circuit (without the delay line) would be used to process independent signals.

To facilitate the comparison of different solutions we define a Figure-of-Merit (FOM) as the data rate divided by the power dissipation for a single input. This normalization is justified, as the power dissipation increases approximately linearly with the number of inputs. Note that the number of all elements used in the circuit also increases linearly with the number of inputs. As discussed earlier, the number of layers in the binary tree equals log2 M. At each following layer the number of elements that serve as comparators (DCMP) is halved. For example, for 8 inputs the circuit has 3 layers with 4, 2 and 1 comparators, respectively (7 comparators in total). For 128 inputs the number of comparators equals 127.
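The FOM definition and the comparator count given above can be cross-checked against Table 1 with a few lines of Python. The sketch below only recomputes the published figures; the names `fom` and `comparator_count` are hypothetical, and the data are copied from the table.

```python
def fom(data_rate_mhz: float, power_per_input_uw: float) -> float:
    """Figure-of-Merit as defined in the text: data rate over power per input."""
    return data_rate_mhz / power_per_input_uw


def comparator_count(n_inputs: int) -> int:
    """Sum the comparators over all layers of the binary tree (M/2 + M/4 + ... + 1)."""
    total, layer = 0, n_inputs // 2
    while layer >= 1:
        total += layer
        layer //= 2
    return total


# (power per input [uW], data rate [MHz], FOM reported in Table 1)
table1 = {
    "[20]": (106.25, 5.0, 0.047),
    "[21]": (70.0, 1.0, 0.014),
    "[12]": (120.0, 2.8, 0.023),
    "[9]": (200.0, 13.8, 0.069),
    "[15]": (283.75, 20.0, 0.070),
    "This work (dilatation)": (15.75, 50.0, 3.174),
    "This work (erosion)": (15.75, 10.0, 0.635),
}

recomputed = {name: fom(rate, power) for name, (power, rate, _) in table1.items()}
max_deviation = max(abs(recomputed[name] - reported)
                    for name, (_, _, reported) in table1.items())
```

All recomputed values agree with the table to within 0.001 MHz/µW, and `comparator_count` reproduces the 7 and 127 comparators quoted for 8 and 128 inputs.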
The number of ITCs and memory cells in the delay line equals the number of inputs.

We are aware that the proposed circuit has been realized in a newer technology than the other circuits of this type presented in Table 1. However, as described earlier, to reduce the mismatch effect we oversized the transistors used in the analog part, which had some impact on the attainable data rate. The main source of the observable delay is the analog part of the system. In particular ITC blocks, currents with values between 1 and 10 µA have to charge capacitors of 100 fF to a value of about 0.9 V, which enables generation of the flag. This process takes about 9 ns for 10 µA and about 90 ns for 1 µA. If the circuit were realized in an older technology we would have to increase the supply voltage, so the process of charging the capacitors would take longer (let us assume 2-3 times longer).

The digital part of the system is very fast. The delay of a single layer of the BT block equals the delay of a single OR or AND gate only, as the flag of the pair is generated by these gates (Fig. 3 d-e). In the CMOS 0.18 µm technology this delay does not exceed 1 ns for VDD = 1.8 V. If the RSFF hangs, the arbitration mechanism requires about 3 ns to decide which of the inputs is assumed to be the winner. However, this process runs in parallel with the propagation of flags in the tree. In the BT block we use transistors with the minimal lengths available in a given technology. If we redesigned the circuit in an older technology, the propagation time of each layer of the tree would increase by a factor of (L1/L2)^2. In the CMOS 0.5 µm technology, for example, this time would be about 7-10 times longer. We suppose that in this technology the delay of the overall circuit in the worst case scenario would not exceed 300 ns. This would reduce the FOM of our circuit about 3 times.
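The scaling argument above can be made concrete with a rough estimate. This is only a sketch under stated assumptions: it takes the quadratic rule (L1/L2)^2 at face value, interprets L1 as the older and L2 as the newer channel length, uses the upper bound of 1 ns per layer, and adopts the pessimistic factor of 3 for the analog part. The function name `scaled_layer_delay_ns` is hypothetical.

```python
def scaled_layer_delay_ns(delay_ns: float, l_new_um: float, l_old_um: float) -> float:
    """Scale a gate delay from a newer to an older process by (L_old / L_new)^2."""
    return delay_ns * (l_old_um / l_new_um) ** 2


LAYERS = 3                 # 8 inputs, log2(8) layers
ANALOG_WORST_NS = 90.0     # ITC charging time for 1 uA in the 0.18 um design
ANALOG_SLOWDOWN = 3.0      # assumed upper end of the "2-3 times" range

scaling_factor = scaled_layer_delay_ns(1.0, 0.18, 0.5)          # roughly 7.7
digital_ns = LAYERS * scaled_layer_delay_ns(1.0, 0.18, 0.5)     # roughly 23 ns
total_worst_ns = ANALOG_WORST_NS * ANALOG_SLOWDOWN + digital_ns  # roughly 293 ns
```

Under these assumptions the estimate stays just below the 300 ns bound stated in the text, i.e. about three times the 100 ns worst case of the 0.18 µm design.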
However, the obtained results would still be four times better than those of the example circuit reported in [20], designed in the CMOS 0.5 µm technology.

The provided delay times and calculations are for an example case of 8 inputs, i.e. 3 layers in the tree. For larger structures the delay of the analog part will remain the same, while the delay of the digital part will increase only moderately. This is one of the main advantages of the proposed solution. In other circuits of this type with an analog BT, the delay is linearly proportional to the number of layers.

During the corner analysis we simulated the filters with smaller supply voltages. The circuit worked properly and dissipated less power, but it was also much slower. For VDD = 0.8 V the digital part was approximately 10 times slower. Working with such supply voltages does not make sense, because the reduced speed means that the energy consumed during one cycle does not decrease as fast as the dissipated power. Additionally, for such voltages the transistors used in the analog part work in the weak inversion region, which reduces the precision of the circuit.

5. CONCLUSIONS
Novel nonlinear dilatation and erosion filters have been proposed in the paper. The circuits are based on the binary tree concept. However, in contrast to typical solutions of this type, in which analog BT structures are used, in the proposed circuit we distinguish the analog part, which converts the analog signals to 1-bit signals with different delays, and the parallel, asynchronous digital BT block, which determines which delay is the smallest or the largest, depending on the type of the filter. The proposed digital BT is much faster than its analog counterparts. It additionally eliminates the propagation of analog signals in the tree, which occurs in other circuits of this type. As a result, the circuit offers a precision exceeding 99%, which is sufficient in many signal processing tasks.
The proposed BT is very sensitive and is able to distinguish very small differences between the delays of particular input signals. This is possible through an atypical use of the RS flip flops, which serve in this case as time comparators. In a typical application of the RS flip flops the '11' input state is not allowed. In our circuit we call this situation an emergency state, which happens relatively seldom. Nevertheless, to avoid a situation in which this state would prevent calculation of the output sample of the filter, we propose an arbitration mechanism that is able to handle it.

The next step of the project will be the design and fabrication of a chip containing the filters, followed by laboratory tests. This phase is necessary, as noise can have some impact on the results.

REFERENCES
[1] M. Vemis, G. Economou, S. Fotopoulos, A. Khodyrev, "The Use of Boolean Functions and Logical Operations for Edge Detection in Images", Signal Processing, 1995, vol. 45, 161–172
[2] R.A. Araujo, A.L.I. Oliveira, S. Soares, S. Meira, "Designing Dilation-Erosion Perceptrons with Differential Evolutionary Learning for Air Pressure Forecasting", In Proceedings of the International Joint Conference on Neural Networks, 2011, San Jose, California, USA, pp. 595–602
[3] P.T. Jackway, M. Deriche, "Scale-Space Properties of the Multiscale Morphological Dilation-Erosion", IEEE Transactions on Pattern Analysis and Machine Intelligence, 1996, vol. 18, no. 1, pp. 38–51
[4] Joseph (Yossi) Gil and Ron Kimmel, "Efficient dilation, erosion, opening, and closing algorithms", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, Iss. 12, December 2002, pp. 1606–1617
[5] A. Dąbrowski, R. Długosz, P. Pawłowski, "Integrated CMOS GSM Baseband Channel Selecting Filters Realized Using Switched Capacitor Finite Impulse Response Technique", Elsevier Microelectronics Reliability Journal, vol. 46, no. 5–6, pp.
949–958, June 2006.
[6] Sophocles J. Orfanidis, "Introduction to Signal Processing", previously published by Pearson Education, Inc. 1996-2009 by Prentice Hall, Inc. Previous ISBN 0-13-209172-0
[7] R. Długosz, K. Iniewski, "Programmable Switched Capacitor Finite Impulse Response Filter with Circular Memory Implemented in CMOS 0.18μm Technology", Journal of Signal Processing Systems (formerly the Journal of VLSI Signal Processing Systems for Signal, Image, and Video Technology), Springer New York, vol. 56, no. 2-3, pp. 295–306, September 2009.
[8] W. W. Moses, E. Beuville, M. H. Ho, "A Winner-Take-All IC for determining the crystal of interaction in PET detectors", IEEE Transactions on Nuclear Science, vol. 43, no. 3, 1996, pp. 1615–1618
[9] A. Demosthenous, S. Smedley, J. Taylor, "A CMOS analog Winner-Takes-All network for large-scale applications", IEEE Transactions on Circuits and Systems-I: Fundamental Theory and Applications, vol. 45, no. 3, 1998, pp. 300–304.
[10] J. Ramirez-Angulo, J.E. Molinar-Solis, S. Gupta, R. G. Carvajal, A. J. Lopez-Martin, "A high-swing, high-speed CMOS WTA using differential flipped voltage followers", IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 54, no. 8, 2007, pp. 668–672.
[11] T. Serrano, B. Linares-Barranco, "A modular current-mode high-precision winner-take-all circuit", IEEE Transactions on Circuits and Systems-II: Analog and Digital Signal Processing, vol. 42, no. 2, 1995, pp. 132–134.
[12] K. Wawryn, B. Strzeszewski, "Current mode AB class WTA circuit", In the Proceedings of the IEEE International Conference on Electronics, Circuits and Systems (ICECS), 2001, pp. 293–296.
[13] G. T. Tuttle, S. Fallahi, A. A. Abidi, "An 8-b CMOS vector A/D converter", IEEE International Solid-State Circuit Conference (ISSCC), San Francisco, USA, 1993, pp. 38–39
[14] R. Długosz, T. Talaśka, R. Wojtyna, "New binary-tree-based Winner-Takes-All circuit for learning on silicon Kohonen's networks", In Proceedings on the Int. Conf.
on Signals and Electronic Systems (ICSES), Łódź, Poland, 2006, pp. 441–446.
[15] B. Tomatsopoulos, A. Demosthenous, "Low power, low complexity CMOS multiple-input replicating current comparators and WTA/LTA circuits", In Proceedings on the European Conference on Circuit Theory and Design (ECCTD), vol. 3, no. 28, Cork, Ireland, 2005, pp. 241–244.
[16] R. Dlugosz, A. Rydlewski, T. Talaska, "Low Power Nonlinear MIN/MAX Filters Implemented in the CMOS Technology", In Proceedings on the 29th International Conference on Microelectronics, Beograd, Serbia, 12-14 May 2014, pp. 397–400.
[17] R. Długosz, W. Pedrycz, "Łukasiewicz Fuzzy Logic Networks and Their Ultra Low Power Hardware Implementation", Elsevier Neurocomputing, vol. 73, Iss. 7-9, pp. 1222–1234, March 2010.
[18] R. Długosz, T. Talaska, W. Pedrycz, "Current-Mode Analog Adaptive Mechanism for Ultra-Low Power Neural Networks", IEEE Transactions on Circuits and Systems–II: Express Briefs, vol. 58, Iss. 1, pp. 31–35, January 2011.
[19] M.J.M. Pelgrom, H.P. Tuinhout and M. Vertregt, "Transistor matching in analog CMOS applications", In Proceedings on the IEEE International Electron Devices Meeting, December 1998, pp. 915–918
[20] Y.C. Hung, B.D. Liu, "High-reliability programmable WTA/LTA circuit of O(N) complexity using a single comparator", IEE Proceedings-Circuits Devices and Systems, vol. 151, no. 6, 2004, pp. 579–586.
[21] Yu Chien-Cheng, Tang Yun-Ching, Liu Bin-Da, "Design of high performance CMOS current-mode winner-take-all circuit", In Proceedings on the International Conference on ASIC, Beijing, China, 2003, pp. 568–572.